JP2022040352A

JP2022040352A - Model update support system

Info

Publication number: JP2022040352A
Application number: JP2022006831A
Authority: JP
Inventors: 崇博瀧本; Takahiro Takimoto
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2018-09-13
Filing date: 2022-01-20
Publication date: 2022-03-10
Anticipated expiration: 2038-09-13
Also published as: JP7225444B2

Abstract

PROBLEM TO BE SOLVED: To provide a model update support system which can support update of a model.

SOLUTION: The model update support system according to an embodiment supports the update of a first model trained using a training data group that includes a plurality of exemplary data and a plurality of labels respectively assigned to the exemplary data. The model update support system includes a processor. Based on a classification certainty, calculated using the first model, that indicates the certainty of the classification of first data, and on a plurality of similarities between the first data and the respective exemplary data, the processor can output first information indicating that training of the first model is insufficient, or second information indicating that one of the labels is inappropriate.

SELECTED DRAWING: Figure 1

Description

An embodiment of the present invention relates to a model update support system.

Models trained using deep learning are used, for example, to classify data. To keep classifying data accurately over time, it is desirable to update the model as appropriate.

Japanese Patent Application Laid-Open No. 2003-132332

An embodiment of the present invention provides a model update support system capable of supporting the update of a model.

The model update support system according to the embodiment supports the update of a first model trained using a training data group that includes a plurality of exemplary data and a plurality of labels respectively assigned to the plurality of exemplary data. The model update support system includes a processing unit. Based on a classification certainty, calculated using the first model, that indicates the certainty of the classification of first data, and on a plurality of similarities that respectively indicate the similarity between the first data and the plurality of exemplary data, the processing unit can output first information indicating that training of the first model is insufficient, or second information indicating that one of the plurality of labels is inappropriate.

FIG. 1 is a schematic diagram showing the configuration of the model update support system according to the first embodiment.
FIG. 2 is a schematic diagram illustrating output by the model update support system according to the first embodiment.
FIG. 3 is a flowchart illustrating processing using the model update support system according to the first embodiment.
FIG. 4 is a schematic diagram illustrating output by the model update support system according to the first embodiment.
FIG. 5 is a schematic diagram illustrating output by the model update support system according to the first embodiment.
FIG. 6 is a schematic diagram showing the configuration of the model update support system according to the second embodiment.

Hereinafter, each embodiment of the present invention will be described with reference to the drawings.
In the present specification and the drawings, elements similar to those already described are given the same reference numerals, and their detailed description is omitted as appropriate.

FIG. 1 is a schematic diagram showing the configuration of the model update support system according to the first embodiment.
The model update support system 110 according to the first embodiment shown in FIG. 1 is used to support the update of a trained model.

For example, a trained model may be used to classify data. If the model has been sufficiently and appropriately trained for each classification, the model can infer the classification of input data more accurately.

However, for a certain classification, the model may not have been trained sufficiently or may not have been trained appropriately. In that case, data that should be inferred as belonging to that classification may be inferred as belonging to another classification. Alternatively, the data may be inferred as belonging to that classification, but with a low classification certainty.

The classification certainty is a value calculated when the model classifies data. The classification certainty indicates the certainty of the inferred classification. The higher the classification certainty, the more likely it is that the classification inferred by the model matches the actual classification.

Hereinafter, data for which the model could not infer the correct classification, or for which the correct classification was inferred but with a low classification certainty, is referred to as "abnormal" data. Data for which the model inferred the correct classification with a high classification certainty is referred to as "normal" data.

When abnormal data occurs while the model is classifying data, it is desirable to update (retrain) the model. However, an abnormality is often caused by the training data used to train the model or by the internal structure of the model. It is therefore not easy for the user to determine the cause of the abnormality.

The model update support system 110 is used to provide the user with information on the cause of the abnormality and thereby support the update of the model. Based on the information provided by the model update support system 110, the user can learn how the model should be updated.

The model update support system 110 according to the embodiment includes a processing unit 10. As shown in FIG. 1, the model update support system 110 may further include an acquisition unit 20, an output unit 30, a model storage unit 51, and a learning data storage unit 52.

The acquisition unit 20 acquires information such as video and audio as digital data and outputs it to the processing unit 10. The acquisition unit 20 includes, for example, at least one of an imaging device and a microphone. The acquisition unit 20 may store the acquired information in a storage unit (not shown). In this case, the processing unit 10 accesses that storage unit and refers to the acquired data.

The processing unit 10 includes, for example, a CPU (Central Processing Unit) and electronic circuits. The processing unit 10 includes a reception unit 11, a classification certainty calculation unit 12, a determination unit 13, a similarity calculation unit 14, and a factor selection unit 15.

For example, the acquisition unit 20 captures an image or records audio and thereby acquires first data. The reception unit 11 receives the first data output from the acquisition unit 20. When the reception unit 11 receives the first data, the classification certainty calculation unit 12 accesses the model storage unit 51 and the learning data storage unit 52.

The model storage unit 51 stores the trained first model. The learning data storage unit 52 stores the training data group used to train the first model. The training data group includes a plurality of training data. Each piece of training data includes one piece of exemplary data and one label indicating the classification of that exemplary data.
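
For illustration only, the structure of the training data group described above may be sketched as follows. The class name `TrainingExample`, the field names, the feature values, and the breed names used as labels are all hypothetical and do not appear in the embodiment.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrainingExample:
    """One piece of training data: exemplary data plus its label (hypothetical names)."""
    example: Tuple[float, ...]  # exemplary data, reduced here to a few feature values
    label: str                  # label indicating the classification of the example


# The training data group is a collection of (exemplary data, label) pairs.
training_data_group: List[TrainingExample] = [
    TrainingExample(example=(0.1, 0.9), label="shiba"),
    TrainingExample(example=(0.8, 0.2), label="poodle"),
]

# The labels can be listed separately, for example when checking them later.
labels = [item.label for item in training_data_group]
```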

The model storage unit 51 and the learning data storage unit 52 each include a storage medium such as a hard disk drive, a flash memory, or a network hard disk. A single storage medium may function as both the model storage unit 51 and the learning data storage unit 52.

The classification certainty calculation unit 12 inputs the first data to the first model and causes the first model to infer the classification of the first data. Based on the output of the first model at the time of this inference, the classification certainty calculation unit 12 calculates a first classification certainty. The first classification certainty indicates the certainty of the classification (first classification) of the first data inferred by the first model.

Further, the classification certainty calculation unit 12 sequentially inputs the plurality of exemplary data to the first model and calculates a plurality of classification certainties. The classification certainty calculation unit 12 outputs the first data, the first classification certainty, and the plurality of classification certainties based on the plurality of exemplary data to the determination unit 13.

The determination unit 13 determines whether the first classification certainty is sufficiently high based on the first classification certainty and the plurality of classification certainties. For example, the determination unit 13 calculates the average value and the variation of the plurality of classification certainties and sets a threshold value using the average value and the variation. The determination unit 13 compares the first classification certainty with the set threshold value. When the first classification certainty is equal to or higher than the threshold value, the determination unit 13 determines that the first data is normal. This means that the first classification certainty is sufficiently high and that the classification of the first data has likely been inferred correctly.

The method of setting the threshold value is not limited to this example. Instead of using the plurality of classification certainties, a value set in advance by the user may be used as the threshold value. In this case, the calculation of the plurality of classification certainties by the classification certainty calculation unit 12 and the calculation of the average value and the variation by the determination unit 13 are unnecessary.

A first classification certainty below the threshold value means that the first data is abnormal. In this case, the determination unit 13 outputs the first data to the similarity calculation unit 14 and outputs the plurality of classification certainties for the plurality of exemplary data to the factor selection unit 15.

The similarity calculation unit 14 calculates a plurality of similarities using the first data and the plurality of exemplary data. The plurality of similarities respectively indicate the similarity between the first data and each of the plurality of exemplary data. The similarity calculation unit 14 outputs the calculated similarities to the factor selection unit 15.

The factor selection unit 15 can select the first information or the second information based on the plurality of similarities. This example describes a case in which the factor selection unit 15 selects the first information or the second information as appropriate based on the plurality of similarities and at least some of the plurality of classification certainties. The first information indicates that training of the model is insufficient. The second information indicates that one of the labels included in the training data group is inappropriate.

When the factor selection unit 15 selects the first information or the second information, it outputs the selected information to the output unit 30. Depending on the plurality of classification certainties and the plurality of similarities, the factor selection unit 15 may select neither the first information nor the second information.

The output unit 30 outputs the first information or the second information in a form that the user can recognize. The output unit 30 includes at least one of a monitor, a speaker, and a printer. For example, the output unit 30 includes a monitor or a printer and outputs the first information or the second information in a visible form. The output unit 30 may output other information together with the first information or the second information. Examples of the other information include the first data, the first classification, the first classification certainty, first exemplary data similar to the first data, the classification of the first exemplary data, the classification certainty for the first exemplary data, a label determined to be inappropriate, and second exemplary data to which that label is assigned.

According to the model update support system 110 of the first embodiment, when the first data is abnormal, information indicating the cause can be provided to the user. The user can update the first model based on the provided information. For example, if the first model has been insufficiently trained for the first classification, the first model is retrained for the first classification. If a label is incorrect, the first model is retrained using training data in which the label has been corrected. As a result, data can subsequently be classified more accurately using the first model.

An example of the processing performed by the model update support system 110 is described more specifically below.
The first model is created, for example, by the following method. First, deep learning is applied to an untrained model for a task such as classifying input data by type. Next, pre-training is performed by inputting unlabeled exemplary data. Then, fine-tuning is performed using exemplary data that has been labeled for each type of data. Data to be classified is input to the resulting trained model, which classifies (labels) the data by deep learning.

The classification certainty calculation unit 12 inputs the first data to the trained first model and acquires an output vector from the first model. The classification certainty calculation unit 12 inputs the output vector to a softmax function and infers, as the classification of the first data, the classification for which the maximum value in the output is obtained. That maximum value is used as the classification certainty.
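
As a minimal sketch of this step, assuming the output vector of the first model is available as a list of raw scores, the classification and its certainty may be computed as follows. The function names `softmax` and `classify_with_certainty` and the class names passed to them are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw output scores into values that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_certainty(
    scores: Sequence[float], class_names: Sequence[str]
) -> Tuple[str, float]:
    """Return the inferred classification and its classification certainty."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda k: probs[k])
    # The largest softmax value is used as the classification certainty.
    return class_names[best], probs[best]


inferred, certainty = classify_with_certainty([2.0, 1.0, 0.1], ["A", "B", "C"])
```

In this sketch the classification with the largest score is always the one inferred, and the certainty falls between 0 and 1.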

The determination unit 13 acquires the plurality of classification certainties that the classification certainty calculation unit 12 calculates by sequentially inputting the plurality of exemplary data included in the training data group to the first model. The determination unit 13 determines whether the first data is normal or abnormal by comparing the first classification certainty (x) of the first data with the mean (μ) and the variance (σ) of those classification certainties. For example, when "Equation 1" below holds, the determination unit 13 determines that the first data is abnormal. α is a preset coefficient.
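
The determination may be sketched as follows. Because the exact form of "Equation 1" is not reproduced in this text, the sketch assumes one plausible form, x < μ − α·σ, which is consistent with the threshold comparison described for the determination unit 13. It also assumes that σ is taken as the population standard deviation, although the text calls it the variance. The function name `is_abnormal` and the default value of `alpha` are hypothetical.

```python
import statistics
from typing import Sequence


def is_abnormal(x: float, example_certainties: Sequence[float], alpha: float = 2.0) -> bool:
    """Judge the first data abnormal when its certainty x falls below the threshold.

    Assumed form (not taken from the source): x < mu - alpha * sigma.
    """
    mu = statistics.mean(example_certainties)
    sigma = statistics.pstdev(example_certainties)  # assumption: standard deviation as the spread
    threshold = mu - alpha * sigma
    return x < threshold


# Classification certainties obtained for the exemplary data (made-up values).
certainties = [0.90, 0.92, 0.88, 0.95, 0.91]
low_case = is_abnormal(0.50, certainties)   # well below the threshold
high_case = is_abnormal(0.93, certainties)  # above the threshold
```

A larger α lowers the threshold, so fewer data are judged abnormal.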

The similarity calculation unit 14 calculates a similarity between the first data and each of the plurality of exemplary data. The similarity calculation unit 14 calculates each similarity based on, for example, the Euclidean distance d expressed by "Equation 2" below. For example, a larger similarity value between the first data and a piece of exemplary data indicates that the two are more similar. Besides the Euclidean distance, cosine similarity or the like may be used to calculate the similarity.
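
The Euclidean distance referred to as "Equation 2" is presumably the standard one. With p and q denoting the two output vectors described in the following paragraph, each having i elements, it can be written as:

```latex
d(p, q) = \sqrt{\sum_{k=1}^{i} \left( p_k - q_k \right)^2}
```

The index k is introduced here only for the summation and is not a symbol from the original.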

In "Equation 2", p = (p1, p2, ..., pi) represents the output vector of the layer immediately before the final layer when the first model infers the first data. q = (q1, q2, ..., qi) represents the output vector of the layer immediately before the final layer when the first model infers the exemplary data. Alternatively, the output vector of a layer two or more layers before the final layer, or the output vector of the final layer, obtained when inferring the first data and the exemplary data, may be used.
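
A minimal sketch of the similarity calculation is shown below, assuming the two output vectors have already been extracted from the first model as lists of numbers. The text does not state how the distance d is converted into a similarity. The mapping 1 / (1 + d) used here is an assumption, chosen only because it grows as the distance shrinks. All function names are hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Standard Euclidean distance between two vectors of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of elements")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Assumed mapping from distance to similarity: identical vectors give 1.0."""
    return 1.0 / (1.0 + euclidean_distance(p, q))


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Alternative measure mentioned in the text; returns 0.0 for a zero vector."""
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(p, q)) / (norm_p * norm_q)
```

With either measure, a larger value indicates that the first data and the exemplary data are more similar, as the text requires.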

The factor selection unit 15 can select the first information or the second information.
For example, the first information includes the following first detailed information and second detailed information. The first detailed information indicates that, when the first model was trained, there was no training data for the first classification of the first data. The second detailed information indicates that, when the first model was trained, there was training data for the first classification, but training for the first classification is insufficient.

The factor selection unit 15 selects the first detailed information when a first condition is satisfied. The first condition is that the maximum value of the plurality of similarities is below a first threshold value. The exemplary data whose similarity to the first data is the maximum is the exemplary data most similar to the first data within the training data. A maximum similarity below the first threshold value indicates that even the exemplary data most similar to the first data in the training data group does not resemble the first data. This indicates that, when the first model was trained, the training data did not include data similar to the first data (data belonging to the first classification).

When the first condition is not satisfied, the factor selection unit 15 extracts a plurality of similar data from the plurality of exemplary data. The plurality of similar data includes the first exemplary data, for which the maximum value of the plurality of similarities was obtained. The plurality of similar data are those exemplary data that are relatively similar to the first data. The factor selection unit 15 refers to a plurality of reference certainties that respectively indicate the certainty of the classification of the plurality of similar data. The plurality of reference certainties are a subset of the plurality of classification certainties calculated by the classification certainty calculation unit 12.

The factor selection unit 15 calculates the average value and the variation of the plurality of reference certainties. The factor selection unit 15 selects the second detailed information when a second condition is satisfied. The second condition is that the average value is below a second threshold value or that the variation is equal to or greater than a third threshold value. The second condition may instead be that the average value is below the second threshold value and the variation is equal to or greater than the third threshold value.

A maximum similarity equal to or greater than the first threshold value indicates that the training data group includes data similar to the first data. On the other hand, an average reference certainty below the second threshold value, or a variation equal to or greater than the third threshold value, indicates that the first model has not been sufficiently trained with respect to the first data. That is, it indicates that, when the first model was trained, the training data group did not include enough exemplary data similar to the first data (exemplary data belonging to the first classification).

When neither the first condition nor the second condition is satisfied, the factor selection unit 15 determines that an inappropriate label included in the training data group is the cause of the abnormality and selects the second information.

Alternatively, when neither the first condition nor the second condition is satisfied, the factor selection unit 15 refers to the plurality of classifications that the first model inferred for the plurality of similar data. For each of the plurality of similar data, the factor selection unit 15 compares the label assigned to that similar data with the classification inferred for it. The factor selection unit 15 selects the second information when, for any of the similar data, the label and the inferred classification do not match. When the labels and the inferred classifications all match, the factor selection unit 15 selects no information and ends the processing.
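
The selection logic described above may be sketched as follows. This is only an illustration under stated assumptions. The similar data are taken to be the `top_k` exemplary data with the highest similarity, the variation is taken as the population standard deviation, and the "or" form of the second condition is used. The names `Cause`, `select_cause`, `top_k`, and the threshold parameters `t1`, `t2`, `t3` are hypothetical.

```python
import statistics
from enum import Enum
from typing import Sequence


class Cause(Enum):
    NO_TRAINING_DATA = "first detailed information"
    INSUFFICIENT_TRAINING = "second detailed information"
    INAPPROPRIATE_LABEL = "second information"
    NONE = "no information selected"


def select_cause(
    similarities: Sequence[float],  # similarity of each exemplary data to the first data
    certainties: Sequence[float],   # classification certainty of each exemplary data
    labels: Sequence[str],          # label assigned to each exemplary data
    inferred: Sequence[str],        # classification inferred by the model for each exemplary data
    t1: float, t2: float, t3: float,
    top_k: int = 3,                 # assumption: number of similar data to extract
) -> Cause:
    # First condition: even the most similar exemplary data is not similar enough.
    if max(similarities) < t1:
        return Cause.NO_TRAINING_DATA

    # Extract the similar data (assumed here to be the top_k most similar examples).
    order = sorted(range(len(similarities)), key=lambda k: similarities[k], reverse=True)
    similar = order[:top_k]
    reference = [certainties[k] for k in similar]

    # Second condition: low average certainty or large variation among the similar data.
    if statistics.mean(reference) < t2 or statistics.pstdev(reference) >= t3:
        return Cause.INSUFFICIENT_TRAINING

    # Otherwise compare each label with the inferred classification.
    if any(labels[k] != inferred[k] for k in similar):
        return Cause.INAPPROPRIATE_LABEL
    return Cause.NONE
```

The first branch that matches decides the result, mirroring the order in which the first condition, the second condition, and the label comparison are described.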

FIG. 2 is a schematic diagram illustrating output by the model update support system according to the first embodiment.
FIG. 3 is a flowchart illustrating processing using the model update support system according to the first embodiment.
FIGS. 4 and 5 are schematic diagrams illustrating output by the model update support system according to the first embodiment.
Here, an example is described in which an image of a dog is input to the first model and the first model infers the breed. In this example, the output unit 30 is a monitor.

For example, when the determination unit 13 determines that the first data is normal, the processing unit 10 causes the output unit 30 to display the first classification of the first data inferred by the first model and the first classification certainty.

FIG. 2 shows an output example for the case where the first data is normal. In the example shown in FIG. 2, the processing unit 10 displays the first data a1, the first classification a2, the first classification certainty a3, the certainty a4 of each classification for the first data, one or more exemplary data a5 similar to the first data, the classification a6 of that exemplary data inferred by the first model, the classification certainty a7, the label a8 assigned to the exemplary data, the certainty a9 of each classification for the exemplary data, and the like.

When the determination unit 13 determines that the input data is abnormal, the plurality of similarities between that data and the plurality of exemplary data are input to the factor selection unit 15, and the processing of the flowchart shown in FIG. 3 starts. The factor selection unit 15 causes the output unit 30 to display the first data determined to be abnormal and the classification certainty of each classification for the first data (step S1).

The user checks the first data displayed on the output unit 30 and determines whether the data has a visible defect (step S2). An example of a visible defect is a case where the entire image is blurred and the image itself cannot be recognized. When there is a visible defect, the cause of the data being determined abnormal is judged to be a defect at the time of image capture (step S3).

When there is no visible defect, the factor selection unit 15 causes the output unit 30 to display exemplary data having a relatively high similarity to the first data (step S4). The user determines whether the displayed exemplary data resembles the first data (step S5).

When the displayed exemplary data does not resemble the first data, the factor selection unit 15 determines that the cause of the abnormality is insufficient training of the first model (step S6) and selects the first detailed information. That is, it is determined that the first model has not been trained with respect to the first data.

Instead of the user determining whether the exemplary data resembles the first data, the factor selection unit 15 may make the determination using the similarities. For example, the factor selection unit 15 determines whether the first condition is satisfied, as described above. When the first condition is satisfied, the factor selection unit 15 selects the first detailed information.

When the displayed exemplary data resembles the first data (the first condition is not satisfied), the factor selection unit 15 determines whether the second condition is satisfied, as described above (step S7). When the second condition is satisfied, the factor selection unit 15 determines that the cause of the abnormality is insufficient training of the first model (step S8) and selects the second detailed information. More specifically, it is determined that the first model has been trained with respect to the first data, but not sufficiently.

When the second condition is not satisfied, the factor selection unit 15 determines, for each of the plurality of similar data, whether the classification inferred by the first model matches the label assigned to that similar data (step S9). When they do not match, the factor selection unit 15 determines that the cause of the abnormality is a labeling mistake (step S10) and selects the second information. When they match, the factor selection unit 15 determines that there is no problem regarding the abnormality (step S11) and ends the processing.

When the factor selection unit 15 selects at least one piece of information in the above process, it causes the output unit 30 to display that information. FIG. 4 illustrates a case in which the factor selection unit 15 causes the output unit 30 to display both the first information and the second information.

In the example of FIG. 4, the following are displayed for the first data: the first data b1, the first classification b2, the first classification certainty b3, and the certainty b4 of each classification for the first data. For the first information, the display includes exemplary data c1 similar to the first data, the classification c2 of the exemplary data c1 inferred by the first model, the classification certainty c3 indicating the certainty of the classification c2, the label c4 assigned to the exemplary data c1, the certainty c5 of each classification for the exemplary data c1, the first information c6, and the like. For the second information, the display includes exemplary data d1 for which a mismatch between label and classification was found, the classification d2 of the exemplary data d1 inferred by the first model, the classification certainty d3 indicating the certainty of the classification d2, the label d4 assigned to the exemplary data d1, the certainty d5 of each classification for the exemplary data d1, the second information d6, and the like.

The area displaying exemplary data relating to the first information or the second information may be displayed so as to be distinguishable from areas displaying other data. In the example shown in FIG. 4, a pattern is applied to the areas displaying the exemplary data c1 and the exemplary data d1 so that they can be distinguished from other data. Instead of patterns, mutually different colors or the like may be applied.

The processing unit 10 may cause the output unit 30 to output further information. For example, the processing unit 10 may cause the output unit 30 to display a saliency map indicating which part of the data the model responds to during inference. The processing unit 10 may display the R, G, and B components of the image of the first data separately. The image of the first data and the saliency map may also be displayed superimposed.
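The separate display of the R, G, and B components and the superimposed display can be illustrated with the following sketch. It assumes that the image is given as rows of `(r, g, b)` tuples with values from 0 to 255 and that a saliency map with values in the range 0 to 1 has already been computed by some other means. The function names, the red tint, and the blending weight `alpha` are assumptions introduced for illustration and do not appear in the embodiment.

```python
def split_channels(image):
    """Split an image (rows of (r, g, b) tuples) into three single-channel images."""
    return tuple([[px[i] for px in row] for row in image] for i in range(3))


def overlay(image, saliency, alpha=0.4):
    """Blend a saliency map onto an image as a red tint.

    Each pixel is mixed with pure red using the weight alpha * s, where s is
    the saliency value of that pixel, so pixels with s == 0 are unchanged.
    """
    heat = (255, 0, 0)  # assumed highlight color
    out = []
    for img_row, sal_row in zip(image, saliency):
        row = []
        for px, s in zip(img_row, sal_row):
            w = alpha * s
            row.append(tuple(round((1 - w) * c + w * h) for c, h in zip(px, heat)))
        out.append(row)
    return out
```

A display routine could pass each of the three images returned by `split_channels` and the result of `overlay` to the output unit as separate panels.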

The description so far has covered the case in which a single piece of data is input to the processing unit 10. A plurality of data (for example, a plurality of images) may be input to the processing unit 10. In that case, a histogram of classifications such as that shown in FIG. 5 may be presented for the plurality of data initially input. This allows the user to judge whether the classifications are biased. If a bias is found, the user can detect, for example, the possibility that inspections were carried out arbitrarily.
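A histogram of classifications of this kind can be computed with a simple count per classification. The following is a minimal sketch, assuming that the classifications inferred for the input data are available as a list of strings; the function name `classification_histogram` and the example labels are hypothetical.

```python
from collections import Counter


def classification_histogram(classifications):
    """Return {classification: (count, share)} ordered from most to least frequent."""
    counts = Counter(classifications)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.most_common()}


if __name__ == "__main__":
    # Print a simple text histogram so that any bias is visible at a glance.
    hist = classification_histogram(["ok", "ok", "ng", "ok"])
    for label, (n, share) in hist.items():
        print(f"{label:>4} {'#' * n} ({share:.0%})")
```

A share that is far larger for one classification than for the others is the kind of bias the user would look for.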

As described above, the model update support system provided with the processing unit 10 including the factor selection unit 15 can output first information indicating that training of the first model is insufficient, or second information indicating that one of the plurality of labels is not appropriate. This output is based on the classification certainty, calculated using the first model, that indicates the certainty of classification of the first data, and on the plurality of similarities that respectively indicate the similarity between the first data and the plurality of exemplary data. Providing the user with information indicating the cause of the abnormality makes it easier to update the first model.

In the example shown in FIG. 1, the processing unit 10 includes, in addition to the factor selection unit 15, a reception unit 11, a classification certainty calculation unit 12, a determination unit 13, and a similarity calculation unit 14. The configuration is not limited to this example, and the processing unit 10 need not include any unit other than the factor selection unit 15. For example, the similarities and the classification certainty may be calculated by another processing unit, and the calculation results may be input to the processing unit 10.

FIG. 6 is a schematic diagram showing the configuration of the model update support system according to the second embodiment.
In the model update support system 210 according to the second embodiment shown in FIG. 6, the processing unit 10 further includes a labeling unit 16 and an update unit 17. The model update support system 210 further includes an input unit 40.

The processing performed by the reception unit 11, the classification certainty calculation unit 12, the determination unit 13, the similarity calculation unit 14, and the factor selection unit 15 of the processing unit 10 is the same as in the model update support system 110. For example, the factor selection unit 15 selects the first information or the second information, and that information is output from the output unit 30. The user operates the input unit 40 with reference to the output first information or second information.

The input unit 40 includes at least one of a keyboard, a mouse, a touch panel, and a microphone (voice operation).

When the first information is output, the user performs, for example, an operation of adding learning data relating to the proper classification of the first data determined to be abnormal to the learning data storage unit 52. When the second information is output, the user performs an operation of inputting the correct label. When the user inputs a label, the labeling unit 16 assigns the input label to the exemplary data to which the second information relates. The labeling unit 16 stores that exemplary data and label in the learning data storage unit 52.

When the learning data group in the learning data storage unit 52 is changed, the update unit 17 updates (retrains) the first model in the model storage unit 51 using the changed learning data group. A change to the learning data group is, for example, the addition of learning data or the correction of a label. The update unit 17 stores the updated first model in the model storage unit 51.
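The interplay between the labeling unit 16, the learning data storage unit 52, and the update unit 17 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the embodiment: the class `LearningDataStore`, the function `update_model_if_changed`, and the `train` callback standing in for the actual retraining procedure are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class LearningDataStore:
    """Stand-in for the learning data storage unit: key -> (data, label)."""
    samples: Dict[str, Tuple[Any, str]] = field(default_factory=dict)
    changed: bool = False

    def add(self, key: str, data: Any, label: str) -> None:
        """Add learning data (user operation after the first information)."""
        self.samples[key] = (data, label)
        self.changed = True

    def relabel(self, key: str, new_label: str) -> None:
        """Correct a label (user operation after the second information)."""
        data, _ = self.samples[key]
        self.samples[key] = (data, new_label)
        self.changed = True


def update_model_if_changed(
    store: LearningDataStore,
    train: Callable[[List[Tuple[Any, str]]], Any],
) -> Optional[Any]:
    """Retrain only when the learning data group has been changed.

    Returns the retrained model, or None when nothing has changed.
    """
    if not store.changed:
        return None
    model = train(list(store.samples.values()))
    store.changed = False
    return model
```

Tracking a `changed` flag is one simple way to trigger retraining only after an addition or a label correction; the embodiment does not specify how the change is detected.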

Because the processing unit 10 includes the labeling unit 16 and the update unit 17, the system can not only provide the user with information indicating the cause of the abnormality but also update the first model to remedy that abnormality. This improves convenience for the user.

FIG. 6 shows an example in which a single processing unit functions as the reception unit 11, the classification certainty calculation unit 12, the determination unit 13, the similarity calculation unit 14, the factor selection unit 15, the labeling unit 16, and the update unit 17. The configuration is not limited to this example, and these functions may be realized by a plurality of processing units. For example, one processing unit may function as the reception unit 11, the classification certainty calculation unit 12, the determination unit 13, the similarity calculation unit 14, and the factor selection unit 15, and another processing unit may function as the labeling unit 16 and the update unit 17. A system including those processing units can be regarded as substantially including the processing unit 10.

According to each of the embodiments described above, a model update support system capable of outputting information indicating the cause of an abnormality can be provided.

The processing of the various data described above is executed, for example, based on a program (software). For example, a computer stores this program and reads it out, whereby the processing of the various information described above is performed.

The processing of the various information described above may be recorded, as a program that can be executed by a computer, on a magnetic disk (such as a flexible disk or a hard disk), an optical disc (such as a CD-ROM, CD-R, CD-RW, DVD-ROM, DVD±R, or DVD±RW), a semiconductor memory, or another recording medium.

For example, the information recorded on the recording medium can be read by a computer (or an embedded system). The recording format (storage format) of the recording medium is arbitrary. For example, the computer reads the program from the recording medium and causes the CPU to execute the instructions described in the program. The computer may acquire (or read) the program over a network.

At least part of the above information processing may be carried out by various software running on the computer based on a program installed on the computer (or embedded system) from the recording medium. This software includes, for example, an OS (operating system). This software may also include, for example, middleware running on a network.

The recording medium according to the embodiment stores a program that can cause a computer to execute the processing of the various information described above. The recording medium according to the embodiment also includes a recording medium on which a program downloaded via a LAN, the Internet, or the like is stored. The above processing may be performed based on a plurality of recording media.

The computer according to the embodiment includes one or more devices (for example, a personal computer). The computer according to the embodiment may include a plurality of devices connected by a network.

Embodiments of the present invention have been described above with reference to specific examples. However, the embodiments of the present invention are not limited to these specific examples. For example, the specific configuration of each element, such as the processing unit, the acquisition unit, the output unit, the input unit, and the storage unit, is included in the scope of the present invention as long as a person skilled in the art can carry out the present invention in the same manner and obtain the same effects by appropriately selecting the configuration from the publicly known range.

A combination of any two or more elements of the specific examples, to the extent technically possible, is also included in the scope of the present invention as long as it includes the gist of the present invention.

In addition, all model update support systems that a person skilled in the art can implement through appropriate design changes based on the model update support system described above as an embodiment of the present invention also belong to the scope of the present invention, as long as they include the gist of the present invention.

Furthermore, within the scope of the idea of the present invention, a person skilled in the art can conceive of various changes and modifications, and it is understood that such changes and modifications also belong to the scope of the present invention.

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are also included in the invention described in the claims and its equivalents.

10 processing unit, 11 reception unit, 12 classification certainty calculation unit, 13 determination unit, 14 similarity calculation unit, 15 factor selection unit, 16 labeling unit, 17 update unit, 20 acquisition unit, 30 output unit, 40 input unit, 51 model storage unit, 52 learning data storage unit, 110, 210 model update support system, S1 to S11 steps, a1 first data, a2 first classification, a3 first classification certainty, a4 certainty, a5 exemplary data, a6 classification, a7 classification certainty, a8 label, a9 certainty, b1 first data, b2 first classification, b3 first classification certainty, b4 certainty, c1 exemplary data, c2 classification, c3 classification certainty, c4 label, c5 certainty, c6 first information, d1 exemplary data, d2 classification, d3 classification certainty, d4 label, d5 certainty, d6 second information

Claims

A model update support system that supports updating of a first model trained using a learning data group including a plurality of exemplary data and a plurality of labels respectively assigned to the plurality of exemplary data, the model update support system comprising:
a processing unit capable of outputting first information indicating that training of the first model is not sufficient, or second information indicating that one of the plurality of labels is not appropriate, based on a classification certainty that is calculated using the first model and indicates the certainty of classification of first data, and on a plurality of similarities that respectively indicate the similarity between the first data and the plurality of exemplary data.

The model update support system according to claim 1, wherein, when outputting the first information, the processing unit outputs the first information, the first data, first exemplary data included in the plurality of exemplary data, and the classification certainty.

The model update support system according to claim 2, wherein
the first information includes:
first detailed information indicating that learning regarding the first data has not been performed on the first model; and
second detailed information indicating that learning regarding the first data is insufficient in the first model, and
the processing unit can output either the first detailed information or the second detailed information as the first information.

The model update support system according to claim 2 or 3, wherein
the processing unit refers to a plurality of reference confidence levels, calculated using the first model, that respectively indicate the certainty of classification of a plurality of similar data extracted from the plurality of exemplary data, the plurality of similar data including first similar data for which the maximum value of the plurality of similarities was obtained, and
the processing unit outputs the first information when at least one of a first condition and a second condition is satisfied, the first condition being that the maximum value of the plurality of similarities is less than a first threshold value, and the second condition being that the average value of the plurality of reference confidence levels is less than a second threshold value or that the variation of the plurality of reference confidence levels is equal to or greater than a third threshold value.

The model update support system according to claim 1, wherein
when outputting the second information, the processing unit outputs the second information, the one of the plurality of labels, second exemplary data to which the one of the plurality of labels is assigned, and a second classification of the second exemplary data inferred by the first model, and
the second classification is different from the one of the plurality of labels.

The model update support system according to claim 5, wherein the processing unit:
refers to a plurality of similar data extracted from the plurality of exemplary data, including first similar data for which the maximum value of the plurality of similarities was obtained, and to a plurality of classifications of the plurality of similar data inferred by the first model;
compares, for each of the plurality of similar data, the part of the plurality of labels assigned to the plurality of similar data with the plurality of classifications; and
when one of the plurality of labels does not match one of the plurality of classifications, outputs the similar data to which the one of the plurality of labels is assigned as the second exemplary data, and outputs the one of the plurality of classifications as the second classification.

The model update support system according to any one of claims 1 to 6, further comprising:
an output unit capable of outputting the first information or the second information; and
an input unit capable of receiving an operation relating to the first information or the second information,
wherein the processing unit updates the first model based on the operation when the operation is received.

A model update support system comprising a processing unit that:
accepts input of first data;
inputs the first data to a first model trained using a learning data group including a plurality of exemplary data and a plurality of labels respectively assigned to the plurality of exemplary data, and obtains a first classification certainty indicating the certainty of classification of the first data;
calculates a plurality of similarities respectively indicating the similarity between the first data and the plurality of exemplary data;
extracts first exemplary data from the plurality of exemplary data based on the plurality of similarities; and
is capable of outputting, in addition to the first data, the first classification certainty, and the first exemplary data, information relating to learning of the first model when the first data is determined to be abnormal based on the first classification certainty, wherein
when a first condition set based on the plurality of similarities, or a second condition set based on a plurality of reference confidence levels indicating the certainty of classification of some of the plurality of exemplary data, is satisfied, the information includes first information indicating that training of the first model is not sufficient, and
when neither the first condition nor the second condition is satisfied, the information includes second information indicating that one of the plurality of labels is not appropriate.

The model update support system according to claim 8, wherein
the processing unit extracts a plurality of similar data including the first exemplary data from the plurality of exemplary data based on the plurality of similarities,
the first exemplary data is the data for which the maximum similarity is obtained among the plurality of exemplary data,
the plurality of similar data include second exemplary data different from the first exemplary data, and
when the classification of the second exemplary data by the first model does not match the label assigned to the second exemplary data, the processing unit further outputs the second exemplary data and the label assigned to the second exemplary data.