JP7842774B2

JP7842774B2 - Learning device

Info

Publication number: JP7842774B2
Application number: JP2023553882A
Authority: JP
Inventors: 健李; 雅信本江; 圭祐大島
Original assignee: PFU Ltd
Current assignee: PFU Ltd
Priority date: 2021-10-15
Filing date: 2021-10-15
Publication date: 2026-04-08
Anticipated expiration: 2041-10-15
Also published as: WO2023062828A1; JPWO2023062828A1

Description

This disclosure relates to a learning device.

At waste treatment sites, large quantities of waste are carried on conveyor belts and processed every day. At these sites, the waste is sorted by hand. Although sorting is a simple task, it places a heavy burden on the workers who perform it (hereinafter sometimes referred to as "sorting workers"). For this reason, devices that sort waste automatically (hereinafter sometimes referred to as "waste sorting devices") have been developed.

International Publication No. 2016/084336
Japanese Unexamined Patent Application Publication No. 2019-074945

When a waste sorting device takes over the work of a sorting worker, the device can be expected to recognize each piece of waste on the conveyor belt and, based on the recognition result, use a robot hand or suction pad to extract the desired waste (hereinafter sometimes referred to as "desired waste") from the group of waste on the belt. When multiple types of waste are mixed together on the conveyor belt, the waste sorting device must therefore identify the type of each piece of waste. Recognition using a trained model generated by machine learning is effective for recognizing many types of waste.

However, machine learning requires a very large amount of training data. If the annotation work performed on image data of various types of waste to generate that training data has to be done by hand, by visually inspecting each image, generating the training data becomes extremely labor-intensive.

This disclosure therefore proposes a technique that can improve the efficiency of annotation work.

The learning device of this disclosure has a first annotation unit, a learning unit, and a model management unit. The first annotation unit performs a first annotation process that assigns feature information, which is information indicating the features of an object image, to the object image, using a second trained model replicated from a first trained model used by an object sorting device that sorts desired objects based on the recognition results of object images. The learning unit updates the second trained model by performing machine learning using, as training data, the object image and the feature information assigned by the first annotation process. The model management unit updates the first trained model using the updated second trained model.

The disclosed technique can improve the efficiency of annotation work.

Figure 1 shows an example of the configuration of the object sorting system according to Embodiment 1 of this disclosure.
Figures 2 to 12 each show an example of the operation of the learning device according to Embodiment 1 of this disclosure.

Embodiments of this disclosure are described below with reference to the drawings. In the following embodiments, identical components are denoted by the same reference numerals.

[Embodiment 1]
<Configuration of the object sorting system>
Figure 1 shows an example of the configuration of the object sorting system according to Embodiment 1 of this disclosure.

In Figure 1, the object sorting system 1 has a learning device 10, a camera 20, an object sorting device 30, a display 40, and an input device 50. The display 40 and the input device 50 are connected to the learning device 10. Examples of the input device 50 include a keyboard and a pointing device such as a mouse. The learning device 10, the camera 20, and the object sorting device 30 are connected to one another via a network.

The learning device 10 has a first annotation unit 11, a training data storage unit 12, a machine learning unit 13, an annotation trained model storage unit 14, a model management unit 15, a replication unit 16, and a second annotation unit 17.

The object sorting device 30 has a recognition trained model storage unit 31, an image recognition unit 32, and a desired object extraction unit 33.

The following description takes as an example the case where the object sorting system 1 shown in Figure 1 is installed at a waste treatment site where groups of waste are carried on a conveyor belt. In other words, the objects to be sorted by the object sorting system 1 are assumed to be waste. However, the object sorting system 1 may also be installed in an assembly plant or similar facility where groups of parts are carried on a conveyor belt. The objects to be sorted are not limited to waste, and the object sorting system 1 can be used for a variety of objects.

The camera 20 is positioned above the conveyor belt on which the waste is transported, and it photographs the waste as the belt carries it. The images captured by the camera 20 (hereinafter sometimes referred to as "captured images") are transmitted from the camera 20 to both the learning device 10 and the object sorting device 30. The first annotation unit 11 and the image recognition unit 32 receive the captured images transmitted from the camera 20.

In the object sorting device 30, the image recognition unit 32 uses the recognition trained model stored in the recognition trained model storage unit 31 to recognize the image of each piece of waste present in the captured image (hereinafter sometimes referred to as a "waste image"), and outputs the recognition result to the desired object extraction unit 33. The image recognition unit 32 recognizes the waste images by, for example, performing instance segmentation on the captured image.

The desired object extraction unit 33 extracts the desired waste from the group of waste transported by the conveyor belt, in accordance with the recognition result from the image recognition unit 32. Examples of the desired object extraction unit 33 include a robot hand and a suction pad.

In the learning device 10, meanwhile, the replication unit 16 operates only when the annotation trained model is initialized. At that time, the replication unit 16 replicates the recognition trained model stored in the recognition trained model storage unit 31 and stores the replica in the annotation trained model storage unit 14, thereby initializing the annotation trained model with the recognition trained model.
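The initialization performed by the replication unit 16 can be pictured with a minimal Python sketch. The `ModelStore` class, the dictionary standing in for a trained model, and the function name are all hypothetical and are not taken from the disclosure; the point illustrated is only that the annotation trained model starts as an independent copy of the recognition trained model.

```python
import copy


class ModelStore:
    """Hypothetical stand-in for a trained model storage unit (14 or 31)."""

    def __init__(self, model=None):
        self.model = model


def initialize_annotation_model(recognition_store, annotation_store):
    # Deep-copy the model so that later updates to the annotation trained
    # model do not alter the recognition trained model in use for sorting.
    annotation_store.model = copy.deepcopy(recognition_store.model)
    return annotation_store.model


# A plain dict is used here as an assumed placeholder for real model weights.
recognition_store = ModelStore({"weights": [0.1, 0.2], "version": 1})
annotation_store = ModelStore()
initialize_annotation_model(recognition_store, annotation_store)

# Simulate retraining of the annotation trained model only.
annotation_store.model["weights"][0] = 0.9
```

Because the copy is independent, the recognition trained model is unchanged until the model management unit 15 decides to update it.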

The first annotation unit 11 uses the annotation trained model stored in the annotation trained model storage unit 14 to recognize each waste image present in the captured image, and performs a process (hereinafter sometimes referred to as the "first annotation process") that assigns information indicating the features of each waste image (hereinafter sometimes referred to as "feature information") to that waste image. The first annotation unit 11 assigns feature information to waste images whose image recognition score is equal to or greater than a threshold TH1. The feature information includes information indicating the contour of the waste image (hereinafter sometimes referred to as "contour information") and information indicating the color of the waste image (hereinafter sometimes referred to as "color information"). The first annotation unit 11 associates each waste image with the feature information assigned to it and generates training data (hereinafter sometimes referred to as "first training data") that includes the waste images and the feature information. The first training data is thus the captured image with feature information assigned to each waste image in it. The first annotation unit 11 stores the generated first training data in the training data storage unit 12. The first annotation unit 11 recognizes the waste images by, for example, performing instance segmentation on the captured image.
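As an illustration of the score filtering in the first annotation process, the following Python sketch keeps only detections whose score is at least TH1 and packages them as training data. The `Detection` class, the value 0.8 for TH1, the dictionary layout, and the example coordinates are assumptions made for illustration; the disclosure does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical output of instance segmentation for one waste image."""
    label: str      # color information, e.g. "brown bottle"
    contour: list   # contour information as [x, y] points
    score: float    # image recognition score


TH1 = 0.8  # assumed value; the disclosure only names the threshold TH1


def first_annotation(image_name, detections, threshold=TH1):
    # Assign feature information only to detections scoring at least TH1.
    kept = [d for d in detections if d.score >= threshold]
    return {
        "image": image_name,
        "annotations": [{"label": d.label, "contour": d.contour} for d in kept],
    }


detections = [
    Detection("brown bottle", [[10, 10], [40, 10], [40, 90]], 0.93),
    Detection("other color bottle", [[60, 20], [90, 20], [90, 95]], 0.55),
]
first_training_data = first_annotation("XXX.bmp", detections)
```

In this sketch the second detection falls below TH1, so it receives no feature information and would be left for the operator-assisted second annotation process.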

The second annotation unit 17 acquires the first training data stored in the training data storage unit 12 and displays, on the display 40, the captured image in which feature information has been assigned to each waste image. The input device 50 is operated by an operator, who can use it to designate any location in the captured image shown on the display 40. The second annotation unit 17 uses the annotation trained model stored in the annotation trained model storage unit 14 to recognize the image at the designated location in the captured image (hereinafter sometimes referred to as the "designated area image"), and performs a process (hereinafter sometimes referred to as the "second annotation process") that assigns feature information to the designated area image. The second annotation unit 17 assigns feature information to designated area images whose image recognition score is equal to or greater than a threshold TH2, which is smaller than the threshold TH1 used by the first annotation unit 11. The second annotation unit 17 associates the designated area image with the feature information assigned to it and generates training data (hereinafter sometimes referred to as "second training data") that includes the designated area image and the feature information. The second training data is thus the captured image with feature information assigned to the designated area image. The second annotation unit 17 stores the generated second training data in the training data storage unit 12. The second annotation unit 17 recognizes the waste image by, for example, performing instance segmentation on the designated location in the captured image.
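The effect of the lower threshold TH2 can be sketched as follows. This is a hedged illustration only: the values of TH1 and TH2, the candidate dictionary layout, and the use of a bounding-box test to match a clicked point to a candidate are assumptions, since the disclosure does not say how the designated location is matched to a recognition result.

```python
TH1 = 0.8  # assumed value for the first annotation process
TH2 = 0.5  # assumed value; the disclosure only requires TH2 < TH1


def bounding_box(contour):
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    return min(xs), min(ys), max(xs), max(ys)


def second_annotation(candidates, click, threshold=TH2):
    """Return the best-scoring candidate at the clicked point, or None."""
    x, y = click
    hits = []
    for candidate in candidates:
        x0, y0, x1, y1 = bounding_box(candidate["contour"])
        inside = x0 <= x <= x1 and y0 <= y <= y1
        if inside and candidate["score"] >= threshold:
            hits.append(candidate)
    return max(hits, key=lambda c: c["score"]) if hits else None


# A candidate the model is only moderately confident about.
candidates = [
    {
        "label": "brown bottle",
        "contour": [[100, 50], [140, 50], [140, 150], [100, 150]],
        "score": 0.62,
    }
]
with_th2 = second_annotation(candidates, (120, 100))
with_th1 = second_annotation(candidates, (120, 100), threshold=TH1)
```

Here the candidate is accepted under TH2 but would have been rejected under TH1, which is the situation the lower threshold is meant to cover once an operator has pointed at the location.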

The machine learning unit 13 performs machine learning using the first training data and the second training data stored in the training data storage unit 12, and updates the annotation trained model stored in the annotation trained model storage unit 14 with the resulting trained model.

The model management unit 15 updates the recognition trained model stored in the recognition trained model storage unit 31 using the updated annotation trained model stored in the annotation trained model storage unit 14 (hereinafter sometimes referred to as the "updated annotation trained model"). For example, the model management unit 15 runs a recognition test, similar to the recognition process performed by the image recognition unit 32, using the updated annotation trained model, and updates the recognition trained model with the updated annotation trained model when the recognition accuracy reaches a target accuracy.
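The accuracy-gated update described for the model management unit 15 can be sketched in Python as below. The function names, the use of a callable as a model, the label-match accuracy measure, and the target value 0.9 are assumptions chosen for illustration; the disclosure states only that the update happens when recognition accuracy reaches a target accuracy.

```python
def recognition_accuracy(predict, test_set):
    # Fraction of test images whose predicted label matches the expected one.
    correct = sum(1 for image, expected in test_set if predict(image) == expected)
    return correct / len(test_set)


def maybe_promote(annotation_model, recognition_store, test_set, target_accuracy):
    """Replace the recognition trained model only if the target is reached."""
    accuracy = recognition_accuracy(annotation_model, test_set)
    if accuracy >= target_accuracy:
        recognition_store["model"] = annotation_model
        return True
    return False


# Hypothetical test set of (image id, expected label) pairs.
test_set = [
    ("img_a", "brown bottle"),
    ("img_b", "other color bottle"),
    ("img_c", "brown bottle"),
    ("img_d", "other color bottle"),
]


def updated_model(image):
    # Toy model that labels every test image correctly.
    return "brown bottle" if image in ("img_a", "img_c") else "other color bottle"


def weak_model(image):
    # Toy model that always answers "brown bottle" (50% accuracy here).
    return "brown bottle"


store = {"model": "previous recognition model"}
rejected = maybe_promote(weak_model, store, test_set, target_accuracy=0.9)
promoted = maybe_promote(updated_model, store, test_set, target_accuracy=0.9)
```

Gating the update this way keeps a poorly retrained annotation model from degrading the model that the object sorting device 30 relies on.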

<Operation of the learning device>
Figures 2 to 12 show examples of the operation of the learning device according to Embodiment 1 of this disclosure. In the following, the waste is assumed to be empty bottles, and the object sorting device 30 is assumed to sort each empty bottle on the conveyor belt into brown empty bottles (hereinafter sometimes referred to as "brown bottles") and empty bottles (hereinafter sometimes referred to as "other color bottles") having a color other than brown (hereinafter sometimes referred to as "other colors"). It is also assumed that the desired waste is brown bottles and that first training data and second training data for brown bottles are generated.

As shown in Figure 2, for example, the captured image I1 includes, as waste images, images of empty bottles (hereinafter sometimes referred to as "empty bottle images") B11 and B12. Empty bottle image B11 is an image of a brown bottle (hereinafter sometimes referred to as a "brown bottle image"), and empty bottle image B12 is an image of an other color bottle (hereinafter sometimes referred to as an "other color bottle image").

In the captured image I1, the first annotation unit 11 assigns contour information INb1 to the empty bottle image B11 and contour information INb2 to the empty bottle image B12. The contour information INb1 and INb2 each consist of a plurality of coordinate points [x1,y1], [x2,y2], and so on. The first annotation unit 11 also assigns label information INa1 and image information INc1 to the empty bottle image B11, and label information INa2 and image information INc2 to the empty bottle image B12. Label information INa1 contains the color information "brown bottle" for the empty bottle image B11, and label information INa2 contains the color information "other color bottle" for the empty bottle image B12. The image information INc1 and INc2 contain the image file name "XXX.bmp" of the captured image I1, together with the height "1536" and width "2048" that give the pixel size of the captured image I1. The label information INa1, contour information INb1, and image information INc1 form the annotation data AD1 for the empty bottle image B11, and the label information INa2, contour information INb2, and image information INc2 form the annotation data AD2 for the empty bottle image B12. The first annotation unit 11 associates the captured image I1 with the annotation data AD1 and AD2 and generates first training data that includes the captured image I1 and the annotation data AD1 and AD2. The first annotation unit 11 generates the annotation data AD1 and AD2 as, for example, a file in JSON format.
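A possible shape for the JSON annotation data is sketched below in Python. The key names ("label", "contour", "image", "file_name", "height", "width", "annotations") and the contour coordinates are assumptions made for illustration; only the file name "XXX.bmp", the height 1536, the width 2048, and the two label strings come from the description above.

```python
import json


def make_annotation(label, contour, file_name, height, width):
    # One annotation record: label information, contour information,
    # and image information, mirroring INa, INb, and INc in the text.
    return {
        "label": label,
        "contour": contour,
        "image": {"file_name": file_name, "height": height, "width": width},
    }


# Contour points below are placeholder values, not data from the disclosure.
ad1 = make_annotation(
    "brown bottle",
    [[120, 80], [180, 80], [180, 300], [120, 300]],
    "XXX.bmp", 1536, 2048,
)
ad2 = make_annotation(
    "other color bottle",
    [[400, 90], [470, 90], [470, 320], [400, 320]],
    "XXX.bmp", 1536, 2048,
)

annotation_json = json.dumps({"annotations": [ad1, ad2]}, indent=2)
parsed = json.loads(annotation_json)
```

Keeping the image information inside each record, as the description does for INc1 and INc2, lets a single record be interpreted without reference to the others.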

As another example, as shown in Figure 3, the captured image I2 includes empty bottle images B21, B22, B23, B24, and B25 as waste images. Having performed the first annotation process on the captured image I2, the first annotation unit 11 assigns the label information "brown bottle" and contour information CO1 to empty bottle image B21, the label information "other color bottle" and contour information CO2 to empty bottle image B22, the label information "brown bottle" and contour information CO4A to empty bottle image B24, and the label information "brown bottle" and contour information CO5 to empty bottle image B25.

Assume here that the correct color of empty bottle image B21 is brown, that of B22 is an other color, that of B24 is brown, and that of B25 is an other color. The label information assigned to empty bottle images B21, B22, and B24 is therefore correct, whereas the label information assigned to empty bottle image B25 is incorrect, as shown in Figure 3. Empty bottle image B25 is thus a waste image for which the first annotation process failed to assign feature information correctly (hereinafter sometimes referred to as a "failed image").

In addition, as shown in Figure 3, the contour indicated by the contour information CO4A assigned to empty bottle image B24 deviates from the correct contour of empty bottle image B24. Empty bottle image B24 is therefore also a failed image, like empty bottle image B25.

Furthermore, although the captured image I2 includes empty bottle image B23, no label information or contour information has been assigned to it, as shown in Figure 3. In other words, the first annotation unit 11 did not detect empty bottle image B23. Empty bottle image B23 is therefore also a failed image, like empty bottle images B24 and B25.

Also, as shown in Figure 3, the label information "brown bottle" and contour information CO6 have been assigned to region R6 in the captured image I2, even though region R6 does not contain an empty bottle image with the contour indicated by CO6. In other words, the first annotation unit 11 falsely detected region R6 as an empty bottle image.

In summary, in the captured image I2, empty bottle images B23, B24, and B25 are failed images, and region R6 has been falsely detected as an empty bottle image.

The operator therefore designates empty bottle image B23 by, for example, clicking on it in the captured image I2 with the pointer PO, which is displayed on the captured image I2 and moved by operating the input device 50, as shown in Figure 4. Alternatively, the operator designates empty bottle image B23 by using the pointer PO to draw a region RS enclosing the location of empty bottle image B23 in the captured image I2, as shown in Figure 5. When empty bottle image B23 has been designated through the input device 50, the second annotation unit 17 uses the annotation trained model stored in the annotation trained model storage unit 14 to assign contour information CO3 to empty bottle image B23, as shown in Figure 6.

As another example, the operator designates region R6 by using the pointer PO to click on the area enclosed by the contour indicated by contour information CO6 in the captured image I2, as shown in Figure 7. When region R6 has been designated through the input device 50, the second annotation unit 17 deletes the contour information CO6 that had been assigned to region R6, as shown in Figure 8.

As another example, the operator designates the label information of empty bottle image B25 by using the pointer PO to click on the label information "brown bottle" assigned to it in the captured image I2, as shown in Figure 9. When the label information has been designated through the input device 50, the second annotation unit 17 makes the designated label information editable. The operator then uses, for example, the keyboard to change the label information assigned to empty bottle image B25 from "brown bottle" to "other color bottle", as shown in Figure 10. In accordance with the operator's correction, the second annotation unit 17 changes the label information "brown bottle" assigned to empty bottle image B25 to "other color bottle", which indicates its correct color.

As another example, the operator uses the pointer PO to designate the contour indicated by contour information CO4A in the captured image I2, as shown in Figure 11. When the contour has been designated through the input device 50, the second annotation unit 17 makes the designated contour editable. The operator then uses the pointer PO to correct the contour of empty bottle image B24, as shown in Figure 12, for example by moving the vertices of the contour indicated by contour information CO4A. In accordance with the operator's correction, the second annotation unit 17 changes the contour information CO4A assigned to empty bottle image B24 to contour information CO4B, which indicates its correct contour.
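The three operator corrections described above (deleting a false detection, correcting a label, and moving a contour vertex) can be sketched as simple operations on a list of annotation records. The record layout, the "target" key, the function names, and all coordinates are hypothetical and serve only to make the corrections concrete.

```python
import copy


def delete_annotation(annotations, index):
    # Remove a falsely detected region, as with CO6 on region R6.
    return [a for i, a in enumerate(annotations) if i != index]


def relabel(annotations, index, new_label):
    # Correct wrong label information, as with empty bottle image B25.
    updated = copy.deepcopy(annotations)
    updated[index]["label"] = new_label
    return updated


def move_vertex(annotations, index, vertex, new_point):
    # Correct a contour by moving one vertex, as with CO4A becoming CO4B.
    updated = copy.deepcopy(annotations)
    updated[index]["contour"][vertex] = list(new_point)
    return updated


# Placeholder records; coordinates are illustrative only.
annotations = [
    {"target": "B24", "label": "brown bottle",
     "contour": [[10, 10], [50, 10], [50, 80], [10, 80]]},
    {"target": "B25", "label": "brown bottle",
     "contour": [[70, 10], [110, 10], [110, 80], [70, 80]]},
    {"target": "R6", "label": "brown bottle",
     "contour": [[130, 10], [170, 10], [170, 80], [130, 80]]},
]

corrected = delete_annotation(annotations, 2)
corrected = relabel(corrected, 1, "other color bottle")
corrected = move_vertex(corrected, 0, 2, (55, 85))
```

Each function returns a new list rather than editing in place, so the annotations produced by the first annotation process remain available for comparison.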

This concludes the description of Embodiment 1.

[Embodiment 2]
The training data storage unit 12, the annotation trained model storage unit 14, and the recognition trained model storage unit 31 are implemented in hardware by, for example, memory or storage. The first annotation unit 11, the machine learning unit 13, the model management unit 15, the replication unit 16, the second annotation unit 17, and the image recognition unit 32 are implemented in hardware by, for example, a processor such as a CPU (Central Processing Unit), DSP (Digital Signal Processor), FPGA (Field Programmable Gate Array), or ASIC (Application Specific Integrated Circuit).

This concludes the description of Embodiment 2.

As described above, the learning device of this disclosure (the learning device 10 in the embodiments) has a first annotation unit (the first annotation unit 11 in the embodiments), a learning unit (the machine learning unit 13 in the embodiments), and a model management unit (the model management unit 15 in the embodiments). The first annotation unit performs a first annotation process that assigns feature information, which is information indicating the features of an object image, to the object image, using a second trained model (the annotation trained model in the embodiments) replicated from a first trained model (the recognition trained model in the embodiments) used by an object sorting device (the object sorting device 30 in the embodiments) that sorts desired objects based on the recognition results of object images. The learning unit updates the second trained model by performing machine learning using, as training data, the object image and the feature information assigned by the first annotation process. The model management unit updates the first trained model using the updated second trained model.

Because the first annotation unit can annotate object images automatically, the efficiency of annotation work improves. In addition, because the trained model used for recognizing object images in the object sorting device is also used for the first annotation process, there is no need to prepare a separate trained model for the first annotation process, which makes annotation more efficient still. Because the second trained model is updated as needed, the accuracy of the first annotation process improves as the second trained model is updated. Moreover, because the first trained model is also updated as the second trained model is updated, the accuracy with which the object sorting device recognizes object images improves as well.

The learning device of this disclosure (the learning device 10 in the embodiments) also has a second annotation unit (the second annotation unit 17 in the embodiments). Among the failed images, which are object images for which the first annotation process failed to assign feature information, the second annotation unit performs a second annotation process that uses the second trained model to assign feature information to an arbitrarily designated failed image. The learning unit updates the second trained model by performing machine learning using, as training data, the object image, the feature information assigned by the first annotation process, and the feature information assigned by the second annotation process. The feature information includes, for example, contour information indicating the contour of the object image.

This improves the efficiency of the operator's annotation work on failed images. In particular, it reduces the effort required to assign contour information to failed images. In addition, because the second annotation process improves the accuracy of the second trained model, the accuracy of the first annotation process improves further.

Furthermore, the score threshold used when feature information is assigned by the second annotation process (the threshold TH2 in the embodiments) is smaller than the score threshold used when feature information is assigned by the first annotation process (the threshold TH1 in the embodiments).

As a result, the recognition of failed images in the second annotation process succeeds more often than the recognition of object images in the first annotation process.

1 Object sorting system
10 Learning device
11 First annotation unit
12 Training data storage unit
13 Machine learning unit
14 Annotation trained model storage unit
15 Model management unit
16 Replication unit
17 Second annotation unit
20 Camera
30 Object sorting device
31 Recognition trained model storage unit
32 Image recognition unit
33 Desired object extraction unit

Claims

A learning device comprising:
a first annotation unit that uses a second trained model, replicated from a first trained model used by an object sorting device that sorts desired objects based on recognition results of object images, to assign feature information, which is information indicating features of an object image, to the object image;
a second annotation unit that modifies the feature information based on a designation of the image, targeting an object image for which the first annotation unit failed to assign the feature information; and
a learning unit that updates the first trained model by performing machine learning using, as training data, the object image, the feature information assigned by the first annotation unit, and the feature information modified by the second annotation unit,
wherein the score threshold used when the feature information is modified by the second annotation unit is smaller than the score threshold used when the feature information is assigned by the first annotation unit.