JP2021042955A

JP2021042955A - Food inspection device, food inspection method and learning method of food reconstruction neural network for food inspection device

Info

Publication number: JP2021042955A
Application number: JP2017252576A
Authority: JP
Inventors: 幸太郎降旗; Kotaro Furuhata; 武荻野; Takeshi Ogino; 祐介水谷; Yusuke Mizutani; 田村　崇; Takashi Tamura; 崇田村; 満久太田; Mitsuhisa Ota; 義充今津; Yoshimitsu Imazu; ティネオアレハンドロハビエルゴンザレス; Alejandro Javier Gonzales Tineo; 勇太吉田; Yuta Yoshida; 洋平菅原; Yohei Sugawara
Original assignee: BRAINPAD Inc; QP Corp
Current assignee: BRAINPAD Inc; QP Corp
Priority date: 2017-12-27
Filing date: 2017-12-27
Publication date: 2021-03-18
Also published as: WO2019131945A1

Abstract

To provide a food inspection device capable of improving the robustness of food inspection for confirming that no foreign matter is mixed in.

SOLUTION: The food inspection device is provided with light irradiation means 3 that irradiates an inspection object A with light, imaging means 4 that captures an image of the inspection object A, and an identification processing device 6. The identification processing device 6 includes a food reconstruction neural network that has learned only non-defective food products in advance, reconstructs a food assumption image from the image obtained by the imaging means 4 using the food reconstruction neural network, calculates the difference between the image obtained by the imaging means 4 and the food assumption image, and identifies an inspection object A whose difference exceeds a preset threshold value as foreign matter.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a food inspection device, a food inspection method, and a learning method for a food reconstruction neural network of the food inspection device.

Conventionally, various techniques such as metal detection and X-ray inspection have been proposed for food inspection devices that detect foreign matter mixed into inspection objects such as food. In recent years, devices that detect foreign matter by applying image processing to a captured image have also been proposed (Patent Document 1).

Patent Document 1 (see claim 1) describes acquiring spectral images of a decayed-portion sample and a sound-portion sample of a perishable substance such as food, using the difference between the absorption spectra of the samples to create a calibration formula that determines whether a portion is decayed by a statistical method such as principal component analysis or regression analysis, and applying a spectral image of an unknown sample of the same kind of substance to the calibration formula to judge whether the unknown sample is decayed.

However, with the determination method of Patent Document 1, detection accuracy may drop for foreign matter that differs from the decayed-portion samples. Maintaining a given detection accuracy therefore requires machine learning on every conceivable kind of foreign matter, which can be difficult in practice because of constraints on time, labor, and computer processing power.

Patent Document 1: Japanese Unexamined Patent Publication No. 2005-201636

The present invention has been made in view of these circumstances, and its object is to provide a food inspection device, a food inspection method, and a learning method for a food reconstruction neural network of the food inspection device that can improve the robustness of food inspection for confirming that no foreign matter is mixed in.

To achieve this object, the present invention is understood through the following configurations.
(1) The present invention comprises light irradiation means that irradiates an inspection object with light, the inspection object being a food to be confirmed free of foreign matter; imaging means that captures an image of the inspection object; and an identification processing device. The identification processing device includes a food reconstruction neural network that has learned only non-defective food products in advance, reconstructs a food assumption image from the image obtained by the imaging means using the food reconstruction neural network, calculates the difference between the image obtained by the imaging means and the food assumption image, and identifies an inspection object whose difference exceeds a preset threshold as foreign matter.
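As a rough illustration of the comparison step in configuration (1), the following Python sketch marks pixels whose difference between the captured image and the food assumption image exceeds a threshold. The function names, the use of nested lists of single-channel pixel values, and the per-pixel absolute difference are assumptions made for illustration; the text does not specify how the difference is measured, and the reconstructed image is simply passed in here rather than produced by a network.

```python
def identify_foreign_pixels(captured, reconstructed, threshold):
    """Return a boolean mask: True where |captured - reconstructed| > threshold.

    Both images are assumed to be nested lists (rows of pixel values)
    of identical size. This is a sketch, not the patented procedure.
    """
    mask = []
    for row_c, row_r in zip(captured, reconstructed):
        mask.append([abs(c - r) > threshold for c, r in zip(row_c, row_r)])
    return mask


def contains_foreign_matter(mask):
    """True if at least one pixel was flagged as exceeding the threshold."""
    return any(any(row) for row in mask)


captured = [[10, 10], [10, 200]]      # hypothetical captured pixel values
reconstructed = [[12, 9], [11, 20]]   # hypothetical food assumption image
mask = identify_foreign_pixels(captured, reconstructed, threshold=50)
```

A network trained only on non-defective food is expected to reconstruct foreign matter poorly, so a large difference at a pixel suggests something the network has never seen.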

(2) The present invention is characterized in that, in configuration (1) above, a reconstruction convolutional neural network using an autoencoder is used as the food reconstruction neural network.

(3) The present invention is characterized in that, in configuration (1) above, color reconstruction using only luminance is used as the food reconstruction neural network.

(4) The present invention is a food inspection method that uses the food inspection device of any one of configurations (1) to (3) above to confirm that the food being inspected contains no foreign matter.

(5) The present invention is a learning method for the food reconstruction neural network of the food inspection device of configuration (1) above. The imaging means of the food inspection device images a food that is to be confirmed free of foreign matter. Learning is then performed so as to minimize the difference between a non-defective image, which is based on the original image obtained by the imaging means and shows a non-defective product of the food with no foreign matter, and the food assumption image that a convolutional neural network generates from that non-defective image.

(6) The present invention is the learning method of configuration (5) above, characterized in that the non-defective image is obtained by imaging only non-defective products of the food with the imaging means of the food inspection device.

(7) The present invention is the learning method of configuration (5) above, characterized in that the non-defective image is obtained by performing a process that removes the image of foreign matter from the original image.

According to the present invention, it is possible to provide a food inspection device, a food inspection method, and a learning method for a food reconstruction neural network of the food inspection device that can improve the robustness of food inspection for confirming that no foreign matter is mixed in.

FIG. 1 is a schematic view showing a food inspection device according to an embodiment of the present invention.
FIG. 2 is a schematic view showing the identification state of an inspection object.
FIG. 3 is a flow chart showing the processing procedure of the model learning stage of the identification processing method of Example 1 of the identification processing device according to the embodiment of the present invention.
FIG. 4 is an explanatory diagram of the learning data preparation stage (S110) according to Example 1.
FIG. 5 is an explanatory diagram of the average luminance calculation stage (S120) according to Example 1.
FIG. 6 is an explanatory diagram of the preprocessing stage (S210) according to Example 1.
FIG. 7 is an explanatory diagram of the model learning stage (S220) according to Example 1.
FIG. 8 is a flow chart showing the processing procedure of the detection stage of the identification processing method of Example 1 of the identification processing device according to the embodiment of the present invention.
FIG. 9 is an explanatory diagram of each stage (S310, S320, S330) of the detection stage (S300) according to Example 1.
FIG. 10 is a graph showing an example of the categorization of the ab space (RGB color gamut).
FIG. 11 is an explanatory diagram showing an example of the categorical representation of the ab value of a pixel.
FIG. 12 is a conceptual diagram showing an example of the structure of the neural network "model Net (f(θ))" according to Example 1.
FIG. 13 is an explanatory diagram of the preprocessing stage (S210) according to Example 2.
FIG. 14 is an explanatory diagram of the model learning stage (S220) according to Example 2.
FIG. 15 is an explanatory diagram of each stage (S310, S320, S330) of the detection stage (S300) according to Example 2.

Hereinafter, modes for carrying out the present invention (hereinafter referred to as "embodiments") will be described in detail with reference to the accompanying drawings. The same elements are given the same reference numerals throughout the description of the embodiments.

First, a food inspection device 1 that identifies, in-line, foreign matter F in a food B (inspection object A) that is to be confirmed free of foreign matter F will be described with reference to FIG. 1. FIG. 1 is a schematic view showing a food inspection device according to an embodiment of the present invention. The food inspection device 1 shown in FIG. 1 includes transport means 2, light irradiation means 3, imaging means 4, and an identification processing device 6.

The transport means 2 transports the inspection object A from an upstream process, through the inspection process at an inspection location C, to a downstream process, and is composed of a belt conveyor or the like. The transport means 2 transports the inspection object A at a speed of about 2 m/min to 20 m/min. The inspection object A includes non-defective product S of the food B and foreign matter F.

The light irradiation means 3 irradiates the inspection object A located at the inspection location C with light. The light irradiation means 3 may include a supplementary light 32.

The imaging means 4 images the inspection object A while it is being transported through the inspection location C, and is composed of a CCD camera, a hyperspectral camera, or the like.

The identification processing device 6 identifies foreign matter F in the food B within the inspection object A. It does so from the image D1 captured by the imaging means 4.

Based on the image D1, the identification processing device 6 identifies foreign matter F in-line in the inspection object A being transported by the transport means 2. The identification processing device 6 has learned this identification process in advance by deep learning.

FIG. 2 is a schematic view showing the identification state of an inspection object. In FIG. 2, foreign matter F1 and F2 are identified in-line in the inspection object A being transported by the transport means 2. The foreign matter F2 is mixed in with the non-defective portion S within the food B.

Next, the identification processing device 6 according to the present embodiment will be described by way of Examples 1 and 2.

[Example 1]
Example 1 of the identification processing device 6 according to the present embodiment will be described with reference to FIGS. 3 to 12. FIGS. 3 to 12 illustrate the identification processing method of Example 1 of the identification processing device 6 according to the present embodiment. The identification processing method consists of a model learning stage and a detection stage. The model learning stage trains the neural network used in the detection stage. In Example 1, the neural network is trained to perform color reconstruction using only luminance. The detection stage uses the neural network trained in the model learning stage to identify foreign matter F in the food B within the inspection object A being transported by the transport means 2.

<Model learning stage>
The model learning stage according to Example 1 will be described with reference to FIGS. 3 to 7. FIG. 3 is a flow chart showing the processing procedure of the model learning stage of the identification processing method of Example 1 of the identification processing device according to the present embodiment. As shown in FIG. 3, the model learning stage first performs the data preparation stage (S100) and then the learning stage (S200).

(Data preparation stage)
The data preparation stage (S100) consists of a learning data preparation stage (S110) and an average luminance calculation stage (S120). Each of these stages (S110, S120) is described below with reference to FIGS. 4 and 5, respectively.

(Data preparation stage: learning data preparation stage (S110))
FIG. 4 is an explanatory diagram of the learning data preparation stage (S110). This stage prepares the data used in the learning stage (S200) according to Example 1. To do so, a plurality (Nall) of learning images (RGB images) is first prepared. A learning image (RGB image) is an image that shows non-defective product S of the food B and shows no foreign matter F. The size of a learning image (RGB image) is "P1 x P2" pixels (px). The learning images (RGB images) may be images in which the imaging means 4 has captured only non-defective product S of the food B. In this case, the images D1 of the non-defective product S captured by the imaging means 4 serve as the Nall learning images (RGB images). Alternatively, the learning images (RGB images) may be images D1, captured by the imaging means 4, of non-defective product S of the food B together with foreign matter F, after a process that removes the image of the foreign matter F has been applied. In this case, the images D1 that remain after the removal process serve as the Nall learning images (RGB images). The process that removes the image of the foreign matter F may discard any image in which foreign matter F appears, or may erase the image region of the foreign matter F from such an image (for example, by filling the region with white).

Next, the data preparation stage (S100) is performed using the Nall learning images (RGB images). First, the Nall learning images (RGB images) are divided into X% of the Nall images and the remaining (100-X)% of the Nall images (S111). The X% portion forms the training set (Train set), and the (100-X)% portion forms the validation set (Validation set). The training set is used to train the neural network. The validation set is used to verify the quality of the neural network's training.
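The split in step S111 can be sketched in Python as follows. The function name, the shuffling before the split, and the fixed seed are illustrative assumptions; the text only states that X% of the images go to the training set and the rest to the validation set.

```python
import random


def split_train_validation(images, train_percent, seed=0):
    """Split images into a training set (X%) and a validation set (100-X)%.

    Shuffling before splitting is an assumption of this sketch; the source
    text does not say how images are assigned to each set.
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be strictly between 0 and 100")
    shuffled = list(images)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_percent / 100)
    return shuffled[:n_train], shuffled[n_train:]


all_images = list(range(10))  # stand-ins for Nall = 10 learning images
train_set, validation_set = split_train_validation(all_images, train_percent=80)
```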

Next, for both the training set and the validation set, each learning image (RGB image) is decomposed into sub-images (RGB images) of "P3 x P3" pixels (px) (S112, where P1 > P3 and P2 > P3). The sub-images (RGB images) decomposed from a single learning image (RGB image) must not overlap one another.
Step S112 (decomposition into sub-images) is optional. If it is not executed, the learning images (RGB images) of the training set and of the validation set are used as they are in the subsequent processing. Decomposing into sub-images, however, exponentially reduces the number of pixel combinations in the learning stage of step S200 described later, which shortens the training time of that stage.
The description of the subsequent processing assumes that step S112 (decomposition into sub-images) has been executed.
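A minimal Python sketch of the non-overlapping decomposition in step S112 follows. The function name is hypothetical, images are represented as nested lists for simplicity, and discarding any leftover border narrower than P3 is an assumption; the text does not say how remainders are handled.

```python
def split_into_sub_images(image, p3):
    """Cut an image (list of pixel rows) into non-overlapping p3 x p3 tiles.

    Tiles are taken on a regular grid with stride p3, so no two tiles share
    a pixel. Partial tiles at the right/bottom edge are dropped (assumption).
    """
    height = len(image)
    width = len(image[0])
    tiles = []
    for top in range(0, height - p3 + 1, p3):
        for left in range(0, width - p3 + 1, p3):
            tiles.append([row[left:left + p3] for row in image[top:top + p3]])
    return tiles


# A 4 x 6 toy "image" whose pixel value encodes its position.
toy_image = [[r * 6 + c for c in range(6)] for r in range(4)]
tiles = split_into_sub_images(toy_image, p3=2)
```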

(Data preparation stage: average luminance calculation stage (S120))
FIG. 5 is an explanatory diagram of the average luminance calculation stage (S120) according to Example 1. The identification processing device 6 performs the average luminance calculation stage (S120) on the Ntrain sub-images (RGB images) of the training set. First, the identification processing device 6 extracts the luminance channel (luminance ch) from each of the Ntrain sub-images (RGB images) of the training set (S121). Next, the identification processing device 6 calculates the average luminance (Lavg) from the extracted luminance channels of the Ntrain sub-images (S122). The average luminance (Lavg) is calculated by the following equation (1).

The average luminance (Lavg) calculated by the identification processing device 6 is used to normalize the input data used to train the neural network in the learning stage (S200).
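Since equation (1) is not shown in this text, the following Python sketch gives one plausible reading of step S122: a single scalar mean over every pixel of every training sub-image's luminance channel. The function name and the choice of a scalar (rather than, say, a per-pixel mean image) are assumptions.

```python
def average_luminance(luminance_channels):
    """Mean luminance over all pixels of all sub-images (assumed form of Lavg).

    luminance_channels: list of sub-images, each a list of rows of L values.
    """
    total = 0.0
    count = 0
    for channel in luminance_channels:
        for row in channel:
            total += sum(row)
            count += len(row)
    if count == 0:
        raise ValueError("no pixels to average")
    return total / count


# Two hypothetical 2 x 2 luminance channels from the training set.
l_avg = average_luminance([[[10, 20], [30, 40]], [[50, 50], [50, 50]]])
```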

(Learning stage)
The learning stage (S200) consists of a preprocessing stage (S210) and a model learning stage (S220). Each of these stages (S210, S220) is described below with reference to FIGS. 6 and 7, respectively.

(Learning stage: preprocessing stage (S210))
FIG. 6 is an explanatory diagram of the preprocessing stage (S210) according to Example 1. The preprocessing stage (S210) receives learning input images (RGB images, "Pin1 x Pin2" pixels). If step S112 (decomposition into sub-images) was executed in the learning data preparation stage (S110), the learning input images are the sub-images (Pin1 = Pin2 = P3). If step S112 was not executed, the learning input images are the learning images (Pin1 = P1, Pin2 = P2). For both the training set and the validation set, the preprocessing stage (S210) generates a pair of "input (x) and correct color (y)" from each learning input image (RGB image).

The identification processing device 6 applies a color space conversion from RGB space to Lab space to the learning input image (RGB image, "Pin1 x Pin2" pixels), producing the L value (luminance) of the L channel (Lch) and the a and b values (color) of the ab channels (abch) (S211).

Next, the identification processing device 6 normalizes the L value (luminance) of Lch (S212). The normalization is calculated by the following equation (2). The average luminance (Lavg) is the value calculated in the average luminance calculation stage (S120) described above. Equation (2) is one example of a normalization method.

The result of normalizing Lch is the input (x = {x_{i,j}}), where x_{i,j} is the normalized L value (luminance) of the pixel at position (i, j).

In step S212 described above, the L value (luminance) is normalized using the average luminance (Lavg), but the L value may instead be rescaled without subtracting the mean. This corresponds to setting the average luminance (Lavg) to 0 in equation (2), so that the numerator of equation (2) is simply the L value L_{i,j}. Normalizing with the mean, however, gives better learning efficiency in the learning stage of step S200 described later than rescaling alone.
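The two variants of step S212 can be sketched in Python as below. The text only establishes that the numerator of equation (2) is the L value minus Lavg; the denominator is not given here, so the `scale` parameter (defaulting to 100, the nominal upper bound of L in the CIELAB color space) is purely a hypothetical stand-in, as is the function name.

```python
def normalize_luminance(l_channel, l_avg, scale=100.0):
    """Return (L - l_avg) / scale for every pixel.

    `scale` is a hypothetical denominator chosen for illustration only.
    Passing l_avg=0.0 gives the mean-free rescaling variant.
    """
    return [[(value - l_avg) / scale for value in row] for row in l_channel]


l_channel = [[50.0, 100.0]]                                   # hypothetical L values
x_normalized = normalize_luminance(l_channel, l_avg=50.0)    # mean-centered
x_rescaled = normalize_luminance(l_channel, l_avg=0.0)       # Lavg set to 0
```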

The identification processing device 6 also converts the a and b values (color) of abch into a categorical representation and normalizes it (S213). The result of categorizing abch is the correct color (y).

The method of categorizing and normalizing the a and b values (color) of abch is described here with reference to FIGS. 10 and 11. FIG. 10 is a graph showing an example of the categorization of the ab space (RGB color gamut). FIG. 11 is an explanatory diagram showing an example of the categorized and normalized representation of the ab value of a pixel.

First, as shown in FIG. 10, the ab space is discretized in advance into evenly spaced category bases (the circles in FIG. 10), with an arbitrary number of categories (Q). Next, for each pixel of the image to be categorized, a predetermined number (five in the example of FIG. 10) of category bases (vk) closest to the pixel value (p) are determined. Then a vector (y') is obtained whose weights are based on the distance between each of these category bases (vk) and the pixel value (p). Finally, the weights are normalized across the classes to obtain the vector (y). The vector (y = {yk}) is calculated by the following equation (3) (in the example of equation (3), the number of category bases (vk) is five).

The vector (y) is the categorized and normalized representation of the pixel value (p). The example in FIG. 11 shows this representation, the vector (y = {yk}), for a pixel value (p: a value "-34", b value "65") with Q categories (that is, in equation (3), k is an integer from 1 to Q).
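The categorization described above might be sketched in Python as follows. Equation (3) is not shown in this text, so the way a distance is turned into a weight is an assumption: this sketch uses a Gaussian kernel with a hypothetical width `sigma`. The function name and the regular grid of bases with spacing 10 are likewise illustrative only.

```python
import math


def soft_encode_ab(pixel_ab, bases, num_neighbors=5, sigma=5.0):
    """Encode one (a, b) value as normalized weights over the category bases.

    Only the num_neighbors closest bases get a non-zero weight. The Gaussian
    weighting and sigma are assumptions, not taken from the source.
    """
    distances = [math.dist(pixel_ab, basis) for basis in bases]
    nearest = sorted(range(len(bases)), key=lambda k: distances[k])[:num_neighbors]
    # Subtracting the smallest squared distance keeps the largest weight at 1,
    # which avoids underflow and leaves the normalized result unchanged.
    d_min_sq = distances[nearest[0]] ** 2
    y = [0.0] * len(bases)
    for k in nearest:
        y[k] = math.exp(-(distances[k] ** 2 - d_min_sq) / (2.0 * sigma ** 2))
    total = sum(y)
    return [w / total for w in y]


# Hypothetical evenly spaced category bases covering part of the ab plane.
bases = [(a, b) for a in range(-100, 101, 10) for b in range(-100, 101, 10)]
y = soft_encode_ab((-34.0, 65.0), bases)  # the pixel value from the FIG. 11 example
```

The resulting vector has length Q (here the number of grid points), sums to 1, and is zero everywhere except at the five nearest bases.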

(Learning stage: model learning stage (S220))
FIG. 7 is an explanatory diagram of the model learning stage (S220) according to Example 1. The model learning stage (S220) trains the neural network "model Net (f(θ))" using the "input (x) and correct color (y)" pairs of the training set and the "input (x) and correct color (y)" pairs of the validation set.

The structure of the neural network "model Net (f(θ))" is described here with reference to FIG. 12. FIG. 12 is a conceptual diagram showing an example of the structure of the neural network "model Net (f(θ))" according to Example 1.

The neural network "model Net (f(θ))" shown in FIG. 12 is a convolutional neural network composed of, for example, eight layers. The first layer is the input layer. The second to fourth layers are convolution layers. The fifth to seventh layers are deconvolution layers, and the eighth layer is the output layer. The input image has a size of "Pin1 x Pin2" pixels and consists of na channels. In Example 1, the input image consists only of Lch (the luminance channel), so na = 1. The output size of the second layer is "P4 x P4 x 16". The output size of the third, fourth, and fifth layers is "P5 x P5 x 16". The output size of the sixth layer is "P4 x P4 x 16". The output size of the seventh layer is "Pin1 x Pin2 x 16". The output size of the eighth layer is "Pin1 x Pin2 x nb". In Example 1, nb is the number of abch categories (Q), so nb = Q. The magnitude relationship among Pin1, Pin2, P3, P4, and P5 is "Pin1 or Pin2 ≥ Pin2 or Pin1 > P4 > P5"; that is, the larger of Pin1 and Pin2 is at least the smaller of the two, which in turn is greater than P4, which is greater than P5.
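To make the layer sizes easier to follow, the short Python sketch below lists the output shape of each of the eight layers and checks the stated size relationship. It does not build or train a network; the function name and the concrete numbers used in the example call are hypothetical.

```python
def layer_output_shapes(pin1, pin2, p4, p5, q, na=1):
    """Return the (height, width, channels) output shape of layers 1 to 8.

    Shapes follow the description of FIG. 12; q is the number of ab
    categories (nb = Q) and na is the number of input channels.
    """
    if not (min(pin1, pin2) > p4 > p5):
        raise ValueError("expected min(Pin1, Pin2) > P4 > P5")
    return [
        (pin1, pin2, na),   # layer 1: input (L channel only when na = 1)
        (p4, p4, 16),       # layer 2: convolution
        (p5, p5, 16),       # layer 3: convolution
        (p5, p5, 16),       # layer 4: convolution
        (p5, p5, 16),       # layer 5: deconvolution
        (p4, p4, 16),       # layer 6: deconvolution
        (pin1, pin2, 16),   # layer 7: deconvolution
        (pin1, pin2, q),    # layer 8: output, one value per ab category
    ]


# Hypothetical sizes chosen only to satisfy the stated relationship.
shapes = layer_output_shapes(pin1=64, pin2=64, p4=32, p5=16, q=20)
```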

The model learning stage (S220) is described below with reference to FIG. 7.
The identification processing device 6 uses the "input (x) and correct color (y)" pairs of the training set to learn the parameters (θ) of the neural network "model Net (f(θ))" that minimize the categorical cross-entropy loss (S221, S222). In step S221, the input (x) is fed into the neural network "model Net (f(θ))", and the output of the network is taken as the predicted color (y^ = fθ(x)). In step S222, the categorical cross-entropy loss (E(y, y^)) is calculated from the predicted color (y^ = fθ(x)) and the correct color (y). The loss (E(y, y^)) is calculated by the following equation (4). Q is the number of categories used in the categorization of abch in step S213 described above. The element y_{i,j,k} of the correct color (y) is the value of the k-th category for the pixel at position (i, j) (that is, yk from equation (3)). The element y^_{i,j,k} of the predicted color (y^) is the value of the k-th category of the output of the neural network "model Net (f(θ))" at position (i, j).
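Equation (4) is not shown in this text. As a hedged illustration, the Python sketch below computes the standard categorical cross-entropy, summing -y_{i,j,k} log y^_{i,j,k} over all pixels (i, j) and categories k. Whether the source sums or averages over pixels is not stated, so the plain sum, the function name, and the small `eps` guard against log(0) are assumptions.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Sum of -y * log(y_hat) over all pixels (i, j) and categories k.

    y_true and y_pred are nested lists indexed [i][j][k].
    eps is a numerical guard added for this sketch only.
    """
    loss = 0.0
    for row_true, row_pred in zip(y_true, y_pred):
        for cats_true, cats_pred in zip(row_true, row_pred):
            for t, p in zip(cats_true, cats_pred):
                if t > 0.0:
                    loss -= t * math.log(max(p, eps))
    return loss


# One row of two pixels with Q = 2 categories (toy values).
y_true = [[[1.0, 0.0], [0.5, 0.5]]]
y_pred = [[[0.5, 0.5], [0.5, 0.5]]]
loss = categorical_cross_entropy(y_true, y_pred)
```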

Error backpropagation and stochastic gradient descent (SGD) are used to learn the parameters (θ) of the neural network "Model Net (f(θ))" that minimize the "Categorical Cross-entropy" loss E(y, ŷ).

The identification processing device 6 repeats the learning of the parameters (θ) of the neural network "Model Net (f(θ))" for a fixed number of iterations (epochs). At each repetition, the identification processing device 6 verifies the parameters (θ) using the "input (x) and correct color (y)" pairs of the validation set (Validation set). In this verification, steps S221 and S222 described above are performed on the validation pairs, and the "Categorical Cross-entropy" loss E(y, ŷ) is calculated. Whenever this loss E(y, ŷ) decreases, the latest parameters (θ) are saved as the best parameters. The saved best parameters are the parameters (θ) applied to the neural network "Model Net (f(θ))" used in the detection stage.
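The train, validate, and keep-the-best procedure can be sketched with a toy model. The following Python sketch uses a one-parameter model fθ(x) = θ·x and squared error in place of the network and the cross-entropy loss; the function name, learning rate, and data are hypothetical and serve only to show the control flow.

```python
import random

def train_with_validation(train_pairs, val_pairs, epochs=50, lr=0.05, seed=0):
    """Toy SGD loop for f_theta(x) = theta * x that keeps the best parameter.

    Squared error stands in for the cross-entropy loss of the text (assumption).
    """
    rng = random.Random(seed)
    theta = 0.0
    best_theta, best_val = theta, float("inf")
    for _ in range(epochs):
        pairs = list(train_pairs)
        rng.shuffle(pairs)                     # stochastic sample order
        for x, y in pairs:
            grad = 2.0 * (theta * x - y) * x   # d/dtheta of (theta*x - y)^2
            theta -= lr * grad                 # gradient descent update
        val_loss = sum((theta * x - y) ** 2 for x, y in val_pairs) / len(val_pairs)
        if val_loss < best_val:                # save only when validation loss decreases
            best_theta, best_val = theta, val_loss
    return best_theta, best_val
```

Only the saved best parameter, not the parameter from the final epoch, would be carried over to the detection stage.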

This concludes the description of the model learning stage according to Example 1. This stage produces the neural network "Model Net (f(θ))", which has learned only non-defective products S of food B.

<Detection stage>
The detection stage according to Example 1 is described with reference to FIGS. 8 and 9. FIG. 8 is a flowchart showing the processing procedure of the detection stage in the identification processing method of Example 1 of the identification processing device according to the present embodiment. As shown in FIG. 8, in the detection stage (S300), the preprocessing stage (S310) is performed first, followed by the prediction stage (S320) and then the post-processing stage (S330). FIG. 9 is an explanatory diagram of the stages (S310, S320, S330) of the detection stage (S300) according to Example 1. Each stage is described below with reference to FIG. 9.

(Detection stage: Preprocessing stage (S310))
The input image is an RGB image from the video D1 in which the imaging means 4 captured the inspection object A during transport. The size of the input image (RGB image) is "P1 × P2" pixels. In the preprocessing stage (S310), an "input (x) and correct color (y)" pair is generated from the input image (RGB image).

The identification processing device 6 applies color space conversion from RGB space to Lab space to the input image (RGB image), generating the L value (luminance) of the L channel and the a and b values (color) of the ab channels (S311). Next, the identification processing device 6 normalizes the L value (luminance) of the L channel (S312). This normalization is given by formula (2) above. The average luminance (Lavg) is the value calculated in the average luminance calculation stage (S120) described above. The normalized L channel is the input (x). The identification processing device 6 also categorizes and standardizes the a and b values (color) of the ab channels (S313). The categorization and standardization method is the same as in step S213 described above. The categorized ab channels are the correct color (y).
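The color space conversion of step S311 can be illustrated for a single pixel. The specification does not state which RGB primaries or white point are used, so the following Python sketch assumes 8-bit sRGB and the D65 white point; the function name `srgb_to_lab` is hypothetical. The normalization of formula (2) and the categorization of step S213 are defined elsewhere in the specification and are not reproduced here.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 white point assumed)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB to CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    # Scale by the D65 reference white before applying f
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    lightness = 116.0 * fy - 16.0   # L value (luminance channel)
    a = 500.0 * (fx - fy)           # a value (color)
    b_value = 200.0 * (fy - fz)     # b value (color)
    return lightness, a, b_value
```

Under these assumptions the L value is what would be normalized into the input (x), and the (a, b) pair is what would be categorized into the correct color (y).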

(Detection stage: Prediction stage (S320))
In the prediction stage (S320), a heat map (h) is generated from the "input (x) and correct color (y)" pair of the input image (RGB image), using the neural network "Model Net (f(θ))" that learned only non-defective products S of food B in the model learning stage described above.

First, the identification processing device 6 feeds the input (x) to the neural network "Model Net (f(θ))" and takes the output of the network as the predicted color (ŷ = fθ(x)) (S321). Next, the identification processing device 6 calculates, for each pixel, the reconstruction error (heat map (h)) between the predicted color (ŷ = fθ(x)) and the correct color (y) (S322). The heat map (h) is given by formula (5). Nk is the number of categories (Q). "hi,j" is the heat map value of the (i, j)-th pixel.
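Formula (5) is not reproduced in this text. Since the description mentions Nk, one plausible reading is a per-pixel error averaged over the Nk categories. The following Python sketch uses the mean squared difference; both the squared-difference form and the function name `heat_map` are assumptions, not the formula of the specification.

```python
def heat_map(y_true, y_pred):
    """Per-pixel mean squared difference over the Nk categories (assumed form)."""
    h = []
    for row_t, row_p in zip(y_true, y_pred):
        h_row = []
        for px_t, px_p in zip(row_t, row_p):
            nk = len(px_t)  # number of categories at this pixel
            h_row.append(sum((t - p) ** 2 for t, p in zip(px_t, px_p)) / nk)
        h.append(h_row)
    return h
```

A pixel whose predicted category distribution matches the correct color yields a value near zero, while a pixel the network cannot reconstruct yields a larger value.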

(Detection stage: Post-processing stage (S330))
In the post-processing stage (S330), the detection position (i, j) of the foreign matter F is obtained from the heat map (h) of the input image (RGB image).

First, the identification processing device 6 binarizes the heat map (h): pixels (i, j) whose heat map value "hi,j" exceeds a predetermined threshold Th1 are marked defective "1", and all other pixels are marked good "0" (S331). Next, the identification processing device 6 calculates the distances between defective "1" pixels and assigns defective "1" pixels whose distance is equal to or less than a predetermined threshold Th2 to the same cluster (S332). The identification processing device 6 then calculates the centroid of each cluster and outputs the pixel (i, j) at the centroid as the detection position (i, j) of the foreign matter F (S333).
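Steps S331 to S333 can be sketched as follows. The specification does not name a distance metric or a clustering algorithm, so this Python sketch assumes Euclidean distance and single-linkage grouping (pixels chained through neighbors within Th2 share a cluster); the function name and the rounding of the centroid to the nearest pixel are also assumptions.

```python
def detect_foreign_matter(h, th1, th2):
    """Binarize heat map h, cluster defective pixels, return cluster centroids."""
    # S331: pixels whose heat map value exceeds Th1 are defective ("1").
    defects = [(i, j) for i, row in enumerate(h)
               for j, v in enumerate(row) if v > th1]

    # S332: union-find grouping of defective pixels within distance Th2.
    parent = list(range(len(defects)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for a in range(len(defects)):
        for b in range(a + 1, len(defects)):
            (i1, j1), (i2, j2) = defects[a], defects[b]
            if ((i1 - i2) ** 2 + (j1 - j2) ** 2) ** 0.5 <= th2:
                parent[find(a)] = find(b)

    clusters = {}
    for idx, px in enumerate(defects):
        clusters.setdefault(find(idx), []).append(px)

    # S333: the centroid of each cluster, rounded to a pixel, is a detection position.
    positions = []
    for members in clusters.values():
        ci = round(sum(i for i, _ in members) / len(members))
        cj = round(sum(j for _, j in members) / len(members))
        positions.append((ci, cj))
    return sorted(positions)
```

The pairwise distance loop is quadratic in the number of defective pixels, which is acceptable for a sketch but would likely be replaced by a spatial index or connected-component labeling in a production line.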

This concludes the description of the detection stage according to Example 1. This stage outputs the detection position (i, j) at which foreign matter F was identified in the food B of the inspection object A being transported by the transport means 2.

In Example 1 described above, the neural network "Model Net (f(θ))" is a color-reconstruction convolutional neural network using the L value (luminance), and is an example of the food reconstruction neural network. The predicted color (ŷ = fθ(x)) is an example of the food assumption image.

When capturing the learning images (RGB images) used to train the neural network "Model Net (f(θ))", the lowest illuminance among the four corners of the angle of view of the location imaged by the imaging means 4 is preferably 80% or more, and more preferably 90% or more, of the illuminance at the center of the angle of view. Because a color-reconstruction convolutional neural network using the L value is strongly affected by the L value input to it, stabilizing the illuminance makes reproducible learning easier.
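The illuminance condition can be checked with simple arithmetic. In the following Python sketch the function names, the returned labels, and the lux values in the usage note are hypothetical; only the 80% and 90% ratios come from the description.

```python
def illuminance_uniformity(center, corners):
    """Return the lowest corner illuminance as a fraction of the center illuminance."""
    if center <= 0:
        raise ValueError("center illuminance must be positive")
    return min(corners) / center

def grade_illuminance(center, corners):
    """Classify the imaging location against the 80% and 90% ratios in the text."""
    ratio = illuminance_uniformity(center, corners)
    if ratio >= 0.9:
        return "more preferable"
    if ratio >= 0.8:
        return "preferable"
    return "below the preferred range"
```

For instance, a center reading of 1000 lx with corner readings of 850, 900, 950, and 990 lx gives a ratio of 0.85, which falls in the "preferable" range.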

In training the color-reconstruction convolutional neural network using the L value (luminance), features are extracted by obtaining a plurality of feature maps through convolution processing in the convolutional layers. The deconvolution layers then restore the pixel count of the original image size through deconvolution processing. Repeating these in a nested manner multiple times (second, third, ..., n-th layers) enables more conceptual feature extraction.
Sub-images (RGB images) are preferably used for this luminance-based color reconstruction learning. A sub-image (RGB image) is obtained by cropping the learning image to an image size of less than about 10 times the average pixel length of the food B appearing in it. The average pixel length of food B is obtained by measuring, for each imaged item of the target food B, the larger of its vertical and horizontal pixel counts, and averaging that value over a predetermined number of items (10 in this example).
In this learning, the L value (input (x)) of the learning image or sub-image is input first, and a color-reconstructed categorical representation is obtained from the n-th deconvolution layer. The network is trained so that the loss function (Categorical Cross-entropy) between the categorical representation of the original learning image or sub-image (correct color (y)) and the reconstructed categorical representation (predicted color (ŷ = fθ(x))) becomes as low as possible.
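The sub-image sizing rule described above reduces to a short calculation. The following Python sketch assumes each imaged food item is summarized by a bounding box of (height, width) in pixels; the function names and the box values in the usage note are hypothetical.

```python
def average_pixel_length(bounding_boxes):
    """Mean over items of the larger of height and width, in pixels."""
    lengths = [max(height, width) for height, width in bounding_boxes]
    return sum(lengths) / len(lengths)

def sub_image_size_limit(bounding_boxes, factor=10):
    """Upper bound on the sub-image side: about `factor` times the average pixel length."""
    return factor * average_pixel_length(bounding_boxes)
```

With two items measuring 20 × 30 and 40 × 10 pixels, the average pixel length is 35, so the sub-image side would be kept below about 350 pixels.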

In the detection stage using the color-reconstruction convolutional neural network with the L value (luminance), the L value of the input image is input to the first convolutional layer, and the difference is taken between the categorical representation obtained from the n-th layer (the food assumption image) and the categorical representation obtained from the input image. Pixels whose difference exceeds a predetermined threshold are detected as "objects that are not non-defective products (that is, foreign matter)".

This concludes the description of Example 1.

[Example 2]
Example 2 of the identification processing device 6 according to the present embodiment is described with reference to FIGS. 13 to 15. FIGS. 13 to 15 show the identification processing method of Example 2 of the identification processing device 6 according to the present embodiment. As in Example 1 described above, this identification processing method consists of a model learning stage and a detection stage. The model learning stage trains the neural network used in the detection stage. Example 2 uses a reconstruction convolutional neural network based on an autoencoder. The detection stage uses the neural network trained in the model learning stage to identify foreign matter F in the food B of the inspection object A being transported by the transport means 2.

<Model learning stage>
The model learning stage according to Example 2 is described with reference to FIGS. 13 and 14. As in the procedure of Example 1 shown in FIG. 3, the data preparation stage (S100) is performed first, followed by the learning stage (S200). In Example 2, however, the average luminance calculation stage (S120) is not executed in the data preparation stage (S100).

(Data preparation stage)
The data preparation stage (S100) according to Example 2 consists of the learning data preparation stage (S110). The learning data preparation stage (S110) according to Example 2 is the same as in Example 1 (FIG. 4) described above, so its description is omitted. As in Example 1, step S112 (decomposition into sub-images) may or may not be executed.

(Learning stage)
The learning stage (S200) according to Example 2 consists of a preprocessing stage (S210) and a model learning stage (S220). These stages (S210, S220) are described below with reference to FIGS. 13 and 14, respectively.

(Learning stage: Preprocessing stage (S210))
FIG. 13 is an explanatory diagram of the preprocessing stage (S210) according to Example 2. A learning input image (RGB image, "Pin1 × Pin2" pixels) is input to the preprocessing stage (S210). If step S112 (decomposition into sub-images) was executed in the learning data preparation stage (S110) according to Example 2, the learning input image is a sub-image (Pin1 = Pin2 = P3). If step S112 was not executed, the learning input image is a learning image (Pin1 = P1, Pin2 = P2). In the preprocessing stage (S210) according to Example 2, "input (x) and correct image (y)" pairs are generated from the learning input images (RGB images) of the training set (Train set) and of the validation set (Validation set).

The identification processing device 6 standardizes each RGB channel (ch) of the learning input image (RGB image, "Pin1 × Pin2" pixels) (S2110). This standardization is given by formula (6). Formula (6) is one example of a standardization method.

The standardized RGB channels are the input (x = {xi,j,k}), where "xi,j,k" is the standardized value of channel k of the (i, j)-th pixel. The input (x = {xi,j,k}) itself is used as the correct image (y) paired with it. That is, within one pair, the input (x) and the correct image (y) are identical.
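Formula (6) is not reproduced in this text. As a sketch under the assumption that standardization means scaling 8-bit channel values into the range 0 to 1, the "input (x) and correct image (y)" pair of an autoencoder can be built as follows. The function name and the divisor 255 are assumptions.

```python
def make_autoencoder_pair(rgb_image, max_value=255.0):
    """Scale each RGB channel to [0, 1] and return (x, y) with y identical to x.

    rgb_image is a nested list indexed [i][j] holding (R, G, B) tuples (assumed layout).
    """
    x = [[tuple(c / max_value for c in px) for px in row] for row in rgb_image]
    y = x  # the correct image is the input itself
    return x, y
```

Because the target equals the input, no labeling of the learning images is required beyond selecting non-defective products.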

Although each RGB channel is standardized in the preprocessing stage (S210) according to Example 2 described above, each RGB channel may instead be normalized using its mean value. The method of normalizing each RGB channel is the same as the method of normalizing the L value (luminance) in Example 1 described above.

(Learning stage: Model learning stage (S220))
FIG. 14 is an explanatory diagram of the model learning stage (S220) according to Example 2. In this stage, the neural network "CAE (f(θ))" is trained using the "input (x) and correct image (y)" pairs of the training set (Train set) and of the validation set (Validation set).

The structure of the neural network "CAE (f(θ))" according to Example 2 is the same as that of the neural network "Model Net (f(θ))" of Example 1 shown in FIG. 12, with the following differences. The input image is "Pin1 × Pin2" pixels in size and consists of the three RGB channels (that is, na = 3), and the output size of the eighth layer is "Pin1 × Pin2 × 3" (that is, nb = 3). In Example 2, both na and nb equal the number of RGB channels, "3" (na = nb = 3).

The model learning stage (S220) according to Example 2 is described below with reference to FIG. 14.
The identification processing device 6 uses the "input (x) and correct image (y)" pairs of the training set (Train set) to learn the parameters (θ) of the neural network "CAE (f(θ))" that minimize the "Binary Cross-entropy" loss (S2210, S2220). In step S2210, the input (x) is fed to the neural network "CAE (f(θ))", and the output of the network is taken as the predicted image (ŷ = fθ(x)). Next, in step S2220, the "Binary Cross-entropy" loss E(y, ŷ) is calculated from the predicted image (ŷ = fθ(x)) and the correct image (y). The loss E(y, ŷ) is given by formula (7). In the correct image (y), "yi,j,k" is the value of channel k (each of the RGB channels) of the (i, j)-th pixel. In the predicted image (ŷ), "ŷi,j,k" is the value of channel k (each of the RGB channels) of the (i, j)-th output of the neural network "CAE (f(θ))".
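Formula (7) is not reproduced in this text. A minimal sketch of the loss of step S2220, using the textbook form of binary cross-entropy summed over pixels (i, j) and channels k, is given below. The function name, the nested-list layout, and the absence of an averaging factor are assumptions; channel values are assumed to lie between 0 and 1.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Sum of -[y*log(y_hat) + (1-y)*log(1-y_hat)] over pixels and channels.

    y_true and y_pred are nested lists indexed [i][j][k] with values in [0, 1].
    """
    total = 0.0
    for row_t, row_p in zip(y_true, y_pred):
        for px_t, px_p in zip(row_t, row_p):
            for t, p in zip(px_t, px_p):
                p = min(max(p, eps), 1.0 - eps)  # keep both logarithms finite
                total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total
```

Unlike the categorical loss of Example 1, each channel value is treated here as an independent target between 0 and 1.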

Error backpropagation and stochastic gradient descent (SGD) are used to learn the parameters (θ) of the neural network "CAE (f(θ))" that minimize the "Binary Cross-entropy" loss E(y, ŷ).

The identification processing device 6 repeats the learning of the parameters (θ) of the neural network "CAE (f(θ))" for a fixed number of iterations (epochs). At each repetition, the identification processing device 6 verifies the parameters (θ) using the "input (x) and correct image (y)" pairs of the validation set (Validation set). In this verification, steps S2210 and S2220 described above are performed on the validation pairs, and the "Binary Cross-entropy" loss E(y, ŷ) is calculated. Whenever this loss E(y, ŷ) decreases, the latest parameters (θ) are saved as the best parameters. The saved best parameters are the parameters (θ) applied to the neural network "CAE (f(θ))" used in the detection stage.

This concludes the description of the model learning stage according to Example 2. This stage produces the neural network "CAE (f(θ))", which has learned only non-defective products S of food B.

<Detection stage>
The detection stage according to Example 2 is described with reference to FIG. 15. As in the procedure of the detection stage (S300) of Example 1 shown in FIG. 8, the preprocessing stage (S310) is performed first, followed by the prediction stage (S320) and then the post-processing stage (S330). FIG. 15 is an explanatory diagram of the stages (S310, S320, S330) of the detection stage (S300) according to Example 2. Each stage is described below with reference to FIG. 15.

(Detection stage: Preprocessing stage (S310))
The input image is an RGB image from the video D1 in which the imaging means 4 captured the inspection object A during transport. The size of the input image (RGB image) is "P1 × P2" pixels. In the preprocessing stage (S310), an "input (x) and correct image (y)" pair is generated from the input image (RGB image).

The identification processing device 6 standardizes each RGB channel (ch) of the input image (RGB image) (S3110). This standardization is given by formula (6) above. The standardized RGB channels are the input (x = {xi,j,k}), where "xi,j,k" is the standardized value of channel k of the (i, j)-th pixel. The input (x = {xi,j,k}) itself is used as the correct image (y) paired with it. That is, within one pair, the input (x) and the correct image (y) are identical.

(Detection stage: Prediction stage (S320))
In the prediction stage (S320), a heat map (h) is generated from the "input (x) and correct image (y)" pair of the input image (RGB image), using the neural network "CAE (f(θ))" that learned only non-defective products S of food B in the model learning stage according to Example 2 described above.

First, the identification processing device 6 feeds the input (x) to the neural network "CAE (f(θ))" and takes the output of the network as the predicted image (ŷ = fθ(x)) (S3210). Next, the identification processing device 6 calculates, for each pixel, the reconstruction error (heat map (h)) between the predicted image (ŷ = fθ(x)) and the correct image (y) (S3220). The heat map (h) is given by formula (8). "hi,j" is the heat map value of the (i, j)-th pixel.

As another way to calculate the heat map (h), the difference between the correct image (y) and the predicted image (ŷ = fθ(x)) may be converted to grayscale, and the grayscale values used as the heat map values.
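The grayscale alternative can be sketched as follows. The specification does not say how the difference is converted to grayscale, so this Python sketch assumes the absolute per-channel difference weighted by the common luma coefficients 0.299, 0.587, and 0.114; the function name and these weights are assumptions.

```python
def grayscale_heat_map(y_true, y_pred, weights=(0.299, 0.587, 0.114)):
    """Heat map from the grayscale value of the per-channel absolute difference."""
    h = []
    for row_t, row_p in zip(y_true, y_pred):
        h.append([
            # weighted sum of |R|, |G|, |B| differences at one pixel
            sum(w * abs(t - p) for w, t, p in zip(weights, px_t, px_p))
            for px_t, px_p in zip(row_t, row_p)
        ])
    return h
```

The resulting single-channel map can be passed to the same thresholding and clustering steps as any other heat map (h).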

(Detection stage: Post-processing stage (S330))
In the post-processing stage (S330), the detection position (i, j) of the foreign matter F is obtained from the heat map (h) of the input image (RGB image). The post-processing stage (S330) according to Example 2 is the same as the post-processing stage (S330) of Example 1 (FIG. 9) described above, so its description is omitted.

This concludes the description of the detection stage according to Example 2. This stage outputs the detection position (i, j) at which foreign matter F was identified in the food B of the inspection object A being transported by the transport means 2.

In Example 2 described above, the neural network "CAE (f(θ))" is a reconstruction convolutional neural network using an autoencoder, and is an example of the food reconstruction neural network. The predicted image (ŷ = fθ(x)) is an example of the food assumption image.

In training the reconstruction convolutional neural network using an autoencoder, features are extracted from the learning image or sub-image (input (x)) by obtaining a plurality of feature maps through convolution processing in the convolutional layers, and the image is then reconstructed through deconvolution processing in the deconvolution layers. Repeating the convolution and deconvolution processing in a nested manner multiple times enables more conceptual feature extraction.
This network is trained so that the output of the final deconvolution layer matches the originally input learning image or sub-image (correct image (y)) as closely as possible, that is, so that the loss function representing the difference becomes as low as possible. A loss function such as BCE (Binary Cross-entropy) can be used here.

This concludes the description of Example 2.

Although the food inspection device 1 shown in FIG. 1 includes the transport means 2, the transport means 2 may be omitted. For example, the inspection object A may be imaged and inspected while it falls. The inspection object A may also be imaged and inspected while it rotates as it falls, which makes it possible to image and inspect the inspection object A from multiple directions, or even from all directions.

以上、説明した実施形態の効果について述べる。
The effects of the embodiment described above are as follows.

The food inspection device 1 of the embodiment includes: light irradiation means 3 that irradiates an inspection object A with light, the inspection object A being food B that is to be confirmed free of foreign matter F; imaging means 4 that captures an image of the inspection object A; and an identification processing device 6. The identification processing device 6 includes a food reconstruction neural network that has been trained in advance only on non-defective products S of the food B. Using this network, it reconstructs a food assumption image from the image obtained by the imaging means 4, calculates the difference between the captured image and the food assumption image, and identifies an inspection object A whose difference exceeds a preset threshold as foreign matter. In this food inspection device 1, the food assumption image is "an image that predicts what would be captured if all of the imaged food B were non-defective products S", so an inspection object A for which the difference between the food assumption image and the captured image exceeds the preset threshold can be identified as foreign matter. Because foreign matter does not need to be learned, the device can handle arbitrary foreign matter, which improves the robustness of food inspection for confirming the absence of foreign matter, that is, its versatility in anomaly discrimination. In particular, when a color-reconstruction convolutional neural network that uses only the L value (luminance) is used as the food reconstruction neural network, the detailed design model of the network depends less on the inspection object than a reconstruction convolutional neural network based on an autoencoder does. The network can therefore be trained on site for a wide variety of inspection objects, without changing its detailed design model, and used to confirm the absence of foreign matter.
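
The identification step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: images are plain nested lists of grayscale values in [0, 1], the difference is taken as a per-pixel absolute difference, and `constant_reconstructor` is a hypothetical stand-in for the trained food reconstruction neural network. The names `difference_map` and `find_foreign_pixels` are likewise illustrative.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # assumed format: rows of grayscale values in [0, 1]


def difference_map(captured: Image, assumed: Image) -> Image:
    """Per-pixel absolute difference between captured and food assumption images."""
    return [
        [abs(c - a) for c, a in zip(captured_row, assumed_row)]
        for captured_row, assumed_row in zip(captured, assumed)
    ]


def find_foreign_pixels(
    captured: Image,
    reconstruct: Callable[[Image], Image],
    threshold: float,
) -> List[Tuple[int, int]]:
    """Return (row, column) positions whose difference exceeds the threshold."""
    assumed = reconstruct(captured)  # the "food assumption image"
    diff = difference_map(captured, assumed)
    return [
        (r, c)
        for r, row in enumerate(diff)
        for c, value in enumerate(row)
        if value > threshold
    ]


def constant_reconstructor(good_value: float) -> Callable[[Image], Image]:
    """Hypothetical stand-in for the trained network: predicts good food everywhere."""
    def reconstruct(image: Image) -> Image:
        return [[good_value for _ in row] for row in image]
    return reconstruct


# Good food has brightness 0.8; one dark pixel at row 1, column 2 is a contaminant.
captured = [
    [0.8, 0.8, 0.8],
    [0.8, 0.8, 0.1],
    [0.8, 0.8, 0.8],
]
foreign = find_foreign_pixels(captured, constant_reconstructor(0.8), threshold=0.3)
print(foreign)
```

Because the stand-in reconstructor only ever outputs the appearance of good food, it cannot reproduce the dark pixel, so only that position exceeds the threshold. No example of the contaminant was needed to detect it.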

In addition, when color reconstruction using only luminance is adopted as the food reconstruction neural network, training forces the network to solve the difficult problem of reconstructing color from luminance alone. As a result, learning of trivial rules is avoided, and an improvement in learning efficiency can be expected.
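
The separation of input and target in luminance-only color reconstruction can be sketched as below. This is an illustration under assumptions that are not taken from the embodiment: each pixel is assumed to be an (L, a, b) triple in a CIELAB-like representation, the color difference is measured as Euclidean distance in the (a, b) plane, and the function names `split_luminance` and `chroma_error` are hypothetical. Since the network input contains no color channels, copying the input to the output (a trivial rule) cannot produce the target.

```python
import math
from typing import List, Tuple

LabPixel = Tuple[float, float, float]  # assumed (L, a, b) representation
Chroma = Tuple[float, float]           # the (a, b) color components


def split_luminance(
    image: List[List[LabPixel]],
) -> Tuple[List[List[float]], List[List[Chroma]]]:
    """Split an image into the L-only network input and the (a, b) target."""
    luminance = [[pixel[0] for pixel in row] for row in image]
    chroma = [[(pixel[1], pixel[2]) for pixel in row] for row in image]
    return luminance, chroma


def chroma_error(
    predicted: List[List[Chroma]], actual: List[List[Chroma]]
) -> List[List[float]]:
    """Per-pixel Euclidean distance between predicted and actual (a, b) values."""
    return [
        [math.hypot(pa - aa, pb - ab) for (pa, pb), (aa, ab) in zip(p_row, a_row)]
        for p_row, a_row in zip(predicted, actual)
    ]


# Two pixels with the same luminance but different color.
image = [[(50.0, 10.0, 20.0), (50.0, 13.0, 24.0)]]
luminance, chroma = split_luminance(image)

# Hypothetical prediction: the color learned for good food at L = 50.
predicted = [[(10.0, 20.0), (10.0, 20.0)]]
errors = chroma_error(predicted, chroma)
print(luminance, errors)
```

The second pixel is indistinguishable from the first in the luminance input, yet its color differs from what good food of that luminance looks like, so its color error is nonzero.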

The identification processing device 6 of the embodiment may be implemented with dedicated hardware, or it may be configured as a computer system such as a personal computer that realizes the functions of the identification processing device 6 by executing a program for that purpose.

Peripheral devices such as an input device and a display device may be connected to the identification processing device 6. Here, the input device refers to a device such as a keyboard or a mouse, and the display device refers to a CRT (Cathode Ray Tube), a liquid crystal display device, or the like.
These peripheral devices may be connected directly to the identification processing device 6 or may be connected via a communication line.

Although the present invention has been described above using the embodiment, it goes without saying that the technical scope of the present invention is not limited to the scope described in the embodiment. It is apparent to those skilled in the art that various changes or improvements can be made to the embodiment. It is also apparent from the claims that forms incorporating such changes or improvements can be included in the technical scope of the present invention.

1 Food inspection device
2 Conveying means
3 Light irradiation means
32 Supplementary light
4 Imaging means
6 Identification processing device
A Inspection object
B Food
S Non-defective product
F Foreign matter

Claims

A food inspection device comprising:
light irradiation means for irradiating an inspection object with light, the inspection object being food to be confirmed free of foreign matter;
imaging means for capturing an image of the inspection object; and
an identification processing device,
wherein the identification processing device
includes a food reconstruction neural network trained in advance only on non-defective products of the food,
reconstructs a food assumption image from the image obtained by the imaging means using the food reconstruction neural network,
calculates a difference between the image obtained by the imaging means and the food assumption image, and
identifies an inspection object for which the difference exceeds a preset threshold as foreign matter.

The food inspection device according to claim 1,
wherein color reconstruction using only luminance is used as the food reconstruction neural network.

The food inspection device according to claim 1,
wherein a reconstruction convolutional neural network using an autoencoder is used as the food reconstruction neural network.

A food inspection method comprising confirming, with the food inspection device according to any one of claims 1 to 3,
that food to be inspected is free of foreign matter.

A learning method for the food reconstruction neural network of the food inspection device according to claim 1, the method comprising:
imaging, with the imaging means of the food inspection device, food to be confirmed free of foreign matter; and
performing learning so as to minimize a difference between a non-defective image and a food assumption image generated from that non-defective image by a convolutional neural network, the non-defective image being based on an original image obtained by the imaging means of the food inspection device and showing non-defective food with no foreign matter.

The learning method according to claim 5,
wherein the non-defective image is obtained by imaging only non-defective products of the food with the imaging means of the food inspection device.

The learning method according to claim 5,
wherein the non-defective image is obtained by performing processing that removes the image of foreign matter from the original image.
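
As an illustrative aside that is not part of the claims, the learning principle stated above (minimizing the difference between the generated food assumption image and the non-defective image) can be sketched in Python. The sketch rests on loud assumptions: a two-parameter linear model stands in for the convolutional neural network, the difference is measured as a mean squared difference, and plain gradient descent is used. The names `train_linear_reconstructor` and `mean_squared_difference`, the sample values, the learning rate, and the step count are all hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input value, value the reconstruction should produce)


def mean_squared_difference(samples: List[Sample], w: float, b: float) -> float:
    """Mean squared difference between the model output and the target values."""
    return sum((w * x + b - t) ** 2 for x, t in samples) / len(samples)


def train_linear_reconstructor(
    samples: List[Sample],
    learning_rate: float = 0.5,
    steps: int = 500,
) -> Tuple[float, float]:
    """Fit target ~ w * x + b by gradient descent on the mean squared difference."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(steps):
        # Gradients of the mean squared difference with respect to w and b.
        grad_w = sum(2 * (w * x + b - t) * x for x, t in samples) / n
        grad_b = sum(2 * (w * x + b - t) for x, t in samples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Training data drawn only from non-defective pixels (made-up values).
good_samples = [(0.0, 0.1), (0.25, 0.6), (0.5, 1.1), (0.75, 1.6), (1.0, 2.1)]
w, b = train_linear_reconstructor(good_samples)
loss = mean_squared_difference(good_samples, w, b)
print(w, b, loss)
```

Only non-defective samples appear in the training data, so the fitted model reproduces good food closely, while anything that departs from that relationship yields a large difference at inspection time.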