JP2021002259A

JP2021002259A - Object identification device and object identification method

Info

Publication number: JP2021002259A
Application number: JP2019116299A
Authority: JP
Inventors: Shigeki Koyanaka (古屋仲 茂樹); Kenichiro Kobayashi (小林 賢一郎)
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2019-06-24
Filing date: 2019-06-24
Publication date: 2021-01-07
Anticipated expiration: 2039-06-24
Also published as: JP7258345B2

Abstract

To provide an object identification device and an object identification method capable of accurately identifying an object with less learning data.

SOLUTION: The object identification device 1 includes a controller 13, which creates a first background image according to the density of a sample body and pastes a three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image. Further, for each pixel in a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, the controller creates image data by adding a parameter according to the pixel value of the pixel at the same position in the three-dimensional composite image, and identifies an object with a convolutional neural network model trained on that image data.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for identifying objects, for example in an object recycling process.

In recent years, resource conservation has become an important social issue, and technologies for sorting waste products have been devised to enable recycling (see Patent Document 1).

Patent Document 1: JP-A-2017-109161

Patent Document 1 discloses, among other things, a waste sorting system that uses machine learning with a convolutional neural network. To sort with high accuracy, however, a large amount of image data must be collected and learned by the machine learning unit 43.

The present invention has been made to solve this problem, and its object is to provide an object identification device and an object identification method capable of identifying objects with high accuracy using less training data.

To solve the above problem, the present invention provides an object identification device comprising identification means that creates a first background image corresponding to the density of a sample body, pastes a three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image, creates image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image, and identifies an object using a convolutional neural network model trained on that image data.

To solve the above problem, the present invention also provides an object identification method comprising: a first step of measuring the weight of a sample body; a second step of calculating the volume of the sample body from a three-dimensional image of the sample body; a third step of calculating the density of the sample body from the weight measured in the first step and the volume calculated in the second step; a fourth step of creating a first background image corresponding to the density calculated in the third step; a fifth step of pasting the three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image; a sixth step of creating image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image; a seventh step of training a convolutional neural network model using the image data as training data; and an eighth step of inputting the image data obtained by executing the first to sixth steps for an object to be identified into the trained convolutional neural network model obtained in the seventh step, thereby identifying the object.

According to the present invention, it is possible to obtain an object identification device and an object identification method capable of identifying objects with high accuracy using less training data.

FIG. 1 is a diagram showing the overall configuration of the object identification device 1 according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the object identification method according to the embodiment of the present invention.
FIG. 3 is a flowchart showing a specific example of the method of creating the three-dimensional composite image shown in FIG. 2.
FIG. 4 is a diagram for explaining the creation method shown in FIG. 3.
FIG. 5 is a flowchart showing a specific example of the method of creating the two-dimensional composite image shown in FIG. 2.
FIG. 6 is a diagram for explaining the creation method shown in FIG. 5.
FIG. 7 is a flowchart showing a specific example of the method of creating the image data shown in FIG. 2.
FIG. 8 is a diagram for explaining the creation method shown in FIG. 7.

Embodiments of the present invention are described in detail below with reference to the drawings. In the drawings, the same reference numerals denote the same or corresponding parts.

FIG. 1 is a diagram showing the overall configuration of the object identification device 1 according to an embodiment of the present invention. As shown in FIG. 1, the object identification device 1 includes: a belt-conveyor scale 4 that measures the weight of a waste product 2 supplied from a supply device 3 while conveying it; a belt conveyor 5 that conveys the waste product 2; a photosensor 6 for a three-dimensional (3D) camera; a line laser 7 for the 3D camera; a 3D camera 8; a photosensor 9 for a 2D camera; a 2D camera 10; sorting actuators 11 and 12 for sorting the waste products 2 into collection containers 14 to 16; and a control device 13 that supervises the operation of these components and identifies the waste product 2 by a method described later.

When the amount identified per unit time needs to be increased, it is preferable, for example, to arrange a plurality of the object identification devices 1 shown in FIG. 1 in parallel and operate them simultaneously.

An outline of the operation of the object identification device 1 is given below. Each waste product 2 supplied individually from the supply device 3 as a sample body is weighed by the belt-conveyor scale 4, and data indicating the result is transmitted to the control device 13. Then, when the photosensor 6 detects the waste product 2, the 3D camera 8 suspended above the conveyor is triggered by the detection signal, captures a 3D image of the waste product 2, and transmits it to the control device 13.

Next, when the photosensor 9 detects the waste product 2, the 2D camera 10 suspended above the conveyor is triggered by the detection signal, captures a 2D color image of the waste product 2, and transmits it to the control device 13.

The control device 13 identifies the waste product 2 by performing arithmetic processing based on the weight, the 3D image, and the 2D image of the waste product 2 acquired as described above. This processing is described in detail later.

For each identified waste product 2, the sorting actuators 11 and 12 control the position at which it drops from the belt conveyor 5 according to the identification result, so that it is stored in the corresponding one of the collection containers 14 to 16.

In FIG. 1, cylinders whose push-out operation can be switched on and off are used as the sorting mechanism, but the mechanism is not limited to this; for example, an air gun, an electromagnetically operated paddle, or a robot arm may be used instead.

In the above, the waste product 2 drops from the supply device 3 onto the belt conveyor 5 in a random state, in the sense that neither its position in the width direction of the belt conveyor 5 nor the angle between the direction of belt travel and the long side of the waste product 2 is constant.

The 3D camera 8 records a 3D image as digital data in a memory (not shown) by the so-called light-section method, in which a charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) sensor detects the change in the height-direction position of the reflected line of the line laser light on the surface of the waste product 2 as it moves in a fixed direction.

FIG. 2 is a flowchart showing the object identification method according to the embodiment of the present invention. The following describes, as an example, the case where the object identification method shown in FIG. 2 is carried out with the object identification device 1 shown in FIG. 1.

Needless to say, this object identification method is not limited to implementation with the object identification device 1 shown in FIG. 1 and can be applied widely.

First, in step S1, the weight of the sample body is measured. Specifically, for example, the belt-conveyor scale 4 measures the weight of the sample body as described above and transmits the resulting data to the control device 13.

Next, in step S2, the volume of the sample body is calculated from the three-dimensional image of the sample body. Specifically, for example, the control device 13 calculates the volume of the waste product 2 from the 3D image received as described above.

Next, in step S3, the control device 13, for example, calculates the density of the waste product 2 serving as the sample body from the weight measured in step S1 and the volume calculated in step S2.

Next, in step S4, the control device 13, for example, creates a first background image corresponding to the density calculated in step S3.

Next, in step S5, the control device 13, for example, pastes the three-dimensional image of the waste product 2 serving as the sample body onto the first background image to create a three-dimensional composite image.

FIG. 3 is a flowchart showing a specific example of the method of creating the three-dimensional composite image shown in FIG. 2. FIG. 4 is a diagram for explaining the creation method shown in FIG. 3. The method of creating the three-dimensional composite image is described in detail below with reference to FIGS. 3 and 4.

The description here assumes that the 3D image G0 shown in FIG. 4 has been obtained by capturing, with the 3D camera 8, an individual waste product 2 conveyed on the belt conveyor 5 shown in FIG. 1. The position and orientation of the waste product 2 in the image are not fixed and may vary.

Each pixel of the 3D image G0 represents the height level of the subject, the waste product 2. The control device 13 therefore extracts the pixels whose values are larger than the pixel value of the surface of the belt conveyor 5, which forms the background of the waste product 2, sums them, and converts the sum to actual dimensions to obtain the volume of the waste product 2.
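As a rough illustration of the volume calculation described above, the following Python sketch sums the heights of the pixels that rise above the belt surface and scales the result to physical units. The function name, the parameters `belt_level`, `mm_per_level`, and `pixel_area_mm2`, and the choice to subtract the belt level from each pixel before summing are assumptions made for illustration; the description only states that pixels above the belt value are summed and converted to actual size.

```python
import numpy as np


def estimate_volume(height_map, belt_level, mm_per_level, pixel_area_mm2):
    """Estimate the volume (mm^3) of an object from a height-coded 3D image.

    height_map     : 2D array, each pixel holds a height level
    belt_level     : pixel value of the bare conveyor surface (assumed known)
    mm_per_level   : physical height represented by one pixel level (assumed)
    pixel_area_mm2 : physical area covered by one pixel (assumed)
    """
    heights = np.asarray(height_map, dtype=np.float64)
    # Keep only pixels that rise above the conveyor surface.
    above_belt = heights > belt_level
    # Assumption: height is measured relative to the belt surface.
    level_sum = (heights[above_belt] - belt_level).sum()
    # Convert the summed levels to a physical volume.
    return float(level_sum * mm_per_level * pixel_area_mm2)
```

For example, with a belt level of 10 and two object pixels of value 14 and 12, the summed relative height is 6 levels, which is then multiplied by the two scale factors.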

Next, the control device 13 divides the weight measured by the belt-conveyor scale 4 by this volume to obtain a density value, and converts the density value into a relative pixel value, that is, a pixel level. The relative pixel value is, for example, a pixel level directly proportional to the density value, scaled so that the maximum density value corresponds to the maximum pixel level.
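The density-to-pixel-level conversion can be sketched as follows. This is a minimal example under stated assumptions: the function name, the 8-bit maximum level of 255, the `max_density` argument (the largest density expected among the objects handled), and the clamping of out-of-range values are all hypothetical choices, not taken from the description.

```python
def density_to_pixel_level(weight, volume, max_density, max_level=255):
    """Return (density, relative pixel value) for one sample body.

    The pixel level is directly proportional to the density, scaled so
    that max_density maps to max_level. Units of weight and volume are
    up to the caller; max_density must use the same units.
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    if max_density <= 0:
        raise ValueError("max_density must be positive")
    density = weight / volume
    level = round(max_level * density / max_density)
    # Assumption: densities above max_density are clamped to max_level.
    level = min(max(level, 0), max_level)
    return density, level
```

A uniform image filled with the returned level would then play the role of the first background image.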

The control device 13 then creates the relative-pixel-value-embedded background image G3 shown in FIG. 4, whose background is filled with the relative pixel value, and stores it in the memory.

In step S50 shown in FIG. 3, the control device 13 also creates a copy image G2 of the 3D image G0, as shown in FIG. 4, and stores it in the memory.

In step S51, the control device 13 converts the 3D image G0 into the binarized image G1 shown in FIG. 4. In step S52 it removes noise by image processing such as dilation and erosion, and in step S53 it detects the contour of the object (here, the waste product).

Next, in step S54, the control device 13 extracts the rectangle of minimum area among the rectangles enclosing the detected contour of the object, and detects the coordinates of its four corner vertices and its rotation angle. The rotation angle is the angle through which the rectangle must be rotated so that its long side becomes parallel to the vertical direction of the image, which is the direction of travel of the belt conveyor 5, as shown in the binarized image G1 of FIG. 4. Specifically, for example, when side 34 connecting vertex 3 and vertex 4 is the short side, the rotation angle is the angle θ in the figure; when side 34 is the long side, the rotation angle is (θ + 90°).
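The rule for choosing the rotation angle can be written as a small helper. In this sketch the angle θ and the two side lengths are taken as given, for instance from whatever minimum-area-rectangle routine is used; the function name, the argument names, and the handling of a square (treated like the short-side case) are assumptions for illustration.

```python
def alignment_rotation_deg(theta_deg, side34_length, adjacent_side_length):
    """Angle by which to rotate so the long side becomes vertical.

    theta_deg            : the angle θ measured for the rectangle
    side34_length        : length of side 34 (between vertex 3 and vertex 4)
    adjacent_side_length : length of the side adjacent to side 34
    """
    if side34_length <= adjacent_side_length:
        # Side 34 is the short side: rotating by θ is enough.
        return theta_deg
    # Side 34 is the long side: add a quarter turn.
    return theta_deg + 90.0
```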

Next, in step S55, the control device 13 uses the vertex coordinates detected earlier to crop the area of the minimum-area rectangle from the copy image G2. The dashed arrow drawn between step S54 and step S55 in FIG. 3 indicates that step S55 uses the vertex coordinates detected in step S54.

Next, in step S56, the control device 13 reads the relative-pixel-value-embedded background image G3 from the memory. In step S57, it rotates the cropped image by the rotation angle found earlier and pastes it at a predetermined position on the background image G3, creating a 3D image G4 in which the background has been converted and the image of the object is placed in a fixed direction at a fixed position (hereinafter, the "background-converted, fixed-direction, fixed-position 3D image"). The dashed arrow drawn between step S54 and step S57 in FIG. 3 indicates that step S57 uses the rotation angle detected in step S54.

Here, the "predetermined position" is, for example, as shown in FIG. 4, the position at which the center line of the relative-pixel-value-embedded background image G3 coincides with the center line of the rotated cropped image and the upper edge of the rotated cropped image lies a fixed distance d below the upper edge of the background image G3.
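The placement rule above can be sketched with NumPy array slicing. The sketch assumes that the "center line" is the vertical center line of each image, that `d` is given in pixels, and that the whole rectangular patch overwrites the background; the function name and these choices are illustrative assumptions.

```python
import numpy as np


def paste_at_fixed_position(background, patch, d):
    """Paste `patch` onto a copy of `background`.

    The patch is centered horizontally and its top edge is placed
    d pixels below the top edge of the background.
    """
    result = np.array(background, copy=True)
    patch = np.asarray(patch)
    bg_h, bg_w = result.shape[:2]
    p_h, p_w = patch.shape[:2]
    left = (bg_w - p_w) // 2  # align the vertical center lines
    top = d
    if d < 0 or left < 0 or top + p_h > bg_h:
        raise ValueError("patch does not fit on the background")
    result[top:top + p_h, left:left + p_w] = patch
    return result
```

Because the same rule is applied to the 3D and 2D images, the object ends up at the same pixel coordinates in both composites.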

Next, in step S6 shown in FIG. 2, image data is created by adding, to each pixel of the two-dimensional composite image obtained by pasting the two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image.

FIG. 5 is a flowchart showing a specific example of the method of creating the two-dimensional composite image shown in FIG. 2. FIG. 6 is a diagram for explaining the creation method shown in FIG. 5. The method of creating the two-dimensional composite image is described in detail below with reference to FIGS. 5 and 6.

First, a 2D color image G10 of the waste product captured in the 3D image G0, taken at the same image size and scale, is prepared together with a background image G13 of the surface of the belt conveyor 5 with no waste product on it.

In the same manner as the method shown in FIG. 3, the control device 13 creates the copy image G12 shown in FIG. 6 in step S60, converts the 2D color image G10 to grayscale and creates the binarized image G11 in step S61, and then executes steps S62 to S68 to create a 2D color image G14 in which the image of the object is placed in a fixed direction at a fixed position on the background image G13 (hereinafter, the "fixed-direction, fixed-position 2D color image"). In FIG. 5, the dashed arrow between step S65 and step S66 indicates that step S66 uses the vertex coordinates detected in step S65, and the dashed arrow between step S65 and step S68 indicates that step S68 uses the rotation angle detected in step S65.

FIG. 7 is a flowchart showing a specific example of the method of creating the image data shown in FIG. 2. FIG. 8 is a diagram for explaining the creation method shown in FIG. 7. The method of creating the image data is described in detail below with reference to FIGS. 7 and 8.

As shown in FIG. 7, the control device 13 uses the weight measured by the belt-conveyor scale 4 and the 3D image captured by the 3D camera 8, as described above, to create the background-converted, fixed-direction, fixed-position 3D image by the method shown in FIG. 3. Separately, the control device 13 uses the 2D color image captured by the 2D camera 10 to create the fixed-direction, fixed-position 2D color image by the method shown in FIG. 5.

The background-converted, fixed-direction, fixed-position 3D image and the fixed-direction, fixed-position 2D color image have the same image size, with matching numbers of pixels vertically and horizontally, and the same face of the same waste product 2 is pasted at the same position and in the same orientation on the respective background images.

The background-converted, fixed-direction, fixed-position 3D image is a grayscale image in which each pixel is represented by a single component. The fixed-direction, fixed-position 2D color image, on the other hand, is an RGB color image using the three color components red, green, and blue (or a CMYK color image using the four color components cyan, magenta, yellow, and black), in which each pixel is represented by three (or four) components.

In step S69, the control device 13 adds each pixel value of the background-converted, fixed-direction, fixed-position 3D image G4 as a parameter in the fourth (or fifth) channel of the pixel at the same position in the fixed-direction, fixed-position 2D color image G14 shown in FIG. 8, so that the image becomes a 4- (or 5-) channel array with four (or five) components per pixel. This creates a 4- (or 5-) channel image G15 (hereinafter, the "background-converted, fixed-direction, fixed-position 4ch image").

The parameter of the fourth (or fifth) component, which is not color information, is treated as a value indicating the transparency (or opacity) of each pixel. Specifically, in pixels showing the waste product 2 it takes a value corresponding to the three-dimensional shape (height) of the waste product 2, and in pixels showing the background it takes a value corresponding to the density of the waste product 2.
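The channel-stacking operation of step S69 can be sketched for the RGB case as follows. The function name and the input checks are illustrative assumptions; the only requirement drawn from the description is that both images have the same height and width, so that the 3D image value lands in the fourth channel of the pixel at the same position.

```python
import numpy as np


def build_four_channel_image(rgb_image, height_density_image):
    """Append a single-channel image to an RGB image as a 4th channel.

    rgb_image            : array of shape (H, W, 3), the 2D color composite
    height_density_image : array of shape (H, W), the 3D composite, whose
                           object pixels encode height and whose background
                           pixels encode density
    """
    rgb = np.asarray(rgb_image)
    extra = np.asarray(height_density_image)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb_image must have shape (H, W, 3)")
    if extra.shape != rgb.shape[:2]:
        raise ValueError("both images must have the same height and width")
    # dstack promotes the (H, W) array to (H, W, 1) and joins along the
    # channel axis, giving an (H, W, 4) array.
    return np.dstack([rgb, extra])
```

The resulting array has the same layout as an RGBA image, which is consistent with the description treating the added component as a transparency (or opacity) value.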

The background-converted, fixed-direction, fixed-position 4ch image G15 can therefore carry, as a single item of image data, more information about the features of each waste product 2 than an ordinary 2D color image.

Next, in step S7 shown in FIG. 2, a convolutional neural network model is trained using the image data as training data. Specifically, for example, a fixed number of sample bodies are drawn from the set of waste products 2 to be identified; product information indicating the characteristics of each waste product 2, such as item, model, and manufacturer, is collected; and a database is created that associates this information with the background-converted, fixed-direction, fixed-position 4ch image (hereinafter abbreviated as "4ch image") created for that waste product 2. The waste products 2 and their 4ch images are then classified into a plurality of groups by these characteristics according to the purpose of identification, an image data set is created in which a teacher label for training the deep convolutional neural network model, consisting of the group name, group number, or the like, is defined for each 4ch image, and the deep convolutional neural network model is trained on it.

This deep convolutional neural network model may be created from scratch, but an existing publicly available model can also be reused.

Next, in step S8 shown in FIG. 2, the image data obtained by executing steps S1 to S6 for the object to be identified is input to the trained convolutional neural network model obtained in step S7, which identifies the object. In the above example, a 4ch image created for a waste product 2 whose characteristics, that is, the group to which it belongs, are unknown is input to the trained deep convolutional neural network model with its optimized internal parameters. The model thereby identifies which group the waste product 2 belongs to, and the waste product 2 is sorted into the collection containers 14 to 16 according to the identification result.

The identification can also be carried out by the following method. For sample bodies whose characteristics are known, the image data created through steps S1 to S6 shown in FIG. 2 is input to the convolutional neural network model, and a list is created in advance that records the resulting output vector of an intermediate layer (the first output vector) in association with those characteristics.

Then, in step S8 shown in FIG. 2, the image data created for the object to be identified is input to the trained convolutional neural network model to obtain the output vector of the intermediate layer (the second output vector). The characteristics of the object are identified by finding, among the first output vectors recorded in the list, the one at the shortest distance from the second output vector.
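The list lookup described above amounts to a nearest-neighbor search over recorded vectors, which can be sketched in a few lines. The description does not say which distance is used; Euclidean distance is an assumption here, as are the function name and the representation of the list as `(vector, characteristic)` pairs.

```python
import math


def identify_by_nearest_vector(second_vector, recorded_list):
    """Return the characteristic whose recorded vector is closest.

    second_vector : intermediate-layer output for the object to identify
    recorded_list : list of (first_vector, characteristic) pairs built
                    beforehand from sample bodies with known characteristics
    """
    if not recorded_list:
        raise ValueError("recorded_list must not be empty")
    # Assumption: "shortest distance" is measured as Euclidean distance.
    _, characteristic = min(
        recorded_list,
        key=lambda entry: math.dist(second_vector, entry[0]),
    )
    return characteristic
```

For instance, with recorded entries `([0.0, 0.0, 1.0], "group A")` and `([1.0, 1.0, 0.0], "group B")`, a query of `[0.9, 0.8, 0.1]` lies closer to the second vector and would be assigned to "group B".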

As described above, the object identification device 1 and the object identification method according to the embodiment of the present invention use the 4ch image and can thereby identify objects with high accuracy using less training data than conventional approaches.

1 Object identification device
4 Belt-conveyor scale
8 3D camera
10 2D camera
13 Control device

Claims

1. An object identification device comprising identification means that:
creates a first background image corresponding to the density of a sample body and pastes a three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image;
creates image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image; and
identifies an object using a convolutional neural network model trained on the image data.

2. The object identification device according to claim 1, further comprising:
weight measuring means for measuring the weight of the sample body; and
3D imaging means for capturing the three-dimensional image,
wherein the identification means further calculates the volume of the sample body from the three-dimensional image captured by the 3D imaging means, and calculates the density of the sample body from the calculated volume and the weight measured by the weight measuring means.

3. The object identification device according to claim 2, further comprising 2D imaging means for capturing the two-dimensional image.

4. The object identification device according to claim 1, wherein the identification means pastes the three-dimensional image onto the first background image in a predetermined direction and at a predetermined position, and pastes the two-dimensional image onto the second background image in a predetermined direction and at a predetermined position.

5. The object identification device according to claim 1, wherein the two-dimensional image is an RGB color image or a CMYK color image.

6. An object identification method comprising:
a first step of measuring the weight of a sample body;
a second step of calculating the volume of the sample body from a three-dimensional image of the sample body;
a third step of calculating the density of the sample body from the weight measured in the first step and the volume calculated in the second step;
a fourth step of creating a first background image corresponding to the density calculated in the third step;
a fifth step of pasting the three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image;
a sixth step of creating image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image;
a seventh step of training a convolutional neural network model using the image data as training data; and
an eighth step of inputting the image data obtained by executing the first to sixth steps for an object to be identified into the trained convolutional neural network model obtained in the seventh step, thereby identifying the object.

7. The object identification method according to claim 6, wherein the three-dimensional image is pasted onto the first background image in a predetermined direction and at a predetermined position, and
the two-dimensional image is pasted onto the second background image in a predetermined direction and at a predetermined position.

8. The object identification method according to claim 6, wherein the two-dimensional image is an RGB color image or a CMYK color image.

9. The object identification method according to claim 6, further comprising:
a step of classifying the sample bodies in advance into a plurality of groups according to their characteristics; and
a step of creating a database whose elements are the image data created for each group through the first to sixth steps,
wherein, in the seventh step, the convolutional neural network model is trained on the database, and
in the eighth step, the group to which the object belongs is identified.

10. The object identification method according to claim 6, further comprising a step of creating in advance a list in which a first output vector of an intermediate layer, obtained by inputting into the convolutional neural network model the image data created through the first to sixth steps for a sample body whose characteristics are known, is recorded in association with those characteristics,
wherein, in the eighth step, the characteristics of the object are identified by specifying, from among the first output vectors recorded in the list, the first output vector at the shortest distance from a second output vector of the intermediate layer obtained by inputting into the convolutional neural network model the image data created through the first to sixth steps for the object.