JP7258345B2 - OBJECT IDENTIFICATION DEVICE AND OBJECT IDENTIFICATION METHOD

Info

Publication number: JP7258345B2
Application number: JP2019116299A
Authority: JP
Inventors: 茂樹古屋仲; 賢一郎小林
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2019-06-24
Filing date: 2019-06-24
Publication date: 2023-04-17
Anticipated expiration: 2039-06-24
Also published as: JP2021002259A

Description

The present invention relates to a technique for identifying objects, for example in an object recycling process.

In recent years, resource conservation has become an important social issue, and techniques for sorting waste products have been devised to enable recycling (see Patent Document 1).

Patent Document 1: JP 2017-109161 A

Patent Document 1 discloses, among other things, a waste sorting system that uses machine learning based on a convolutional neural network. To sort with high accuracy, however, a large amount of image data must be collected and learned by the machine learning unit 43.

The present invention has been made to solve this problem, and its object is to provide an object identification device and an object identification method capable of identifying objects with high accuracy from a smaller amount of training data.

To solve the above problem, the present invention provides an object identification device comprising identification means that creates a first background image corresponding to the density of a sample body; pastes a three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image; creates image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image; and identifies an object by means of a convolutional neural network model trained on that image data.

To solve the above problem, the present invention also provides an object identification method comprising: a first step of measuring the weight of a sample body; a second step of calculating the volume of the sample body from a three-dimensional image of the sample body; a third step of calculating the density of the sample body using the weight measured in the first step and the volume calculated in the second step; a fourth step of creating a first background image corresponding to the density calculated in the third step; a fifth step of pasting the three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image; a sixth step of creating image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image; a seventh step of training a convolutional neural network model with the image data as training data; and an eighth step of inputting the image data obtained by executing the first to sixth steps for an object to be identified into the trained convolutional neural network model obtained in the seventh step, thereby identifying the object.

According to the present invention, an object identification device and an object identification method capable of identifying objects with high accuracy from a smaller amount of training data can be obtained.

FIG. 1 is a diagram showing the overall configuration of an object identification device 1 according to an embodiment of the present invention.
FIG. 2 is a flowchart showing an object identification method according to an embodiment of the present invention.
FIG. 3 is a flowchart showing a specific example of the method of creating the three-dimensional composite image shown in FIG. 2.
FIG. 4 is a diagram for explaining the creation method shown in FIG. 3.
FIG. 5 is a flowchart showing a specific example of the method of creating the two-dimensional composite image shown in FIG. 2.
FIG. 6 is a diagram for explaining the creation method shown in FIG. 5.
FIG. 7 is a flowchart showing a specific example of the method of creating the image data shown in FIG. 2.
FIG. 8 is a diagram for explaining the creation method shown in FIG. 7.

Embodiments of the present invention are described in detail below with reference to the drawings. In the drawings, the same reference signs denote the same or corresponding parts.

FIG. 1 is a diagram showing the overall configuration of an object identification device 1 according to an embodiment of the present invention. As shown in FIG. 1, the object identification device 1 includes: a belt-conveyor weighing scale 4 that measures the weight of a waste product 2 supplied from a supply device 3 while conveying it; a belt conveyor 5 that conveys the waste product 2; a photosensor 6 for a three-dimensional (3D) camera; a line laser 7 for the 3D camera; a 3D camera 8; a photosensor 9 for a 2D camera; a 2D camera 10; sorting actuators 11 and 12 for sorting the waste products 2 into collection containers 14 to 16; and a control device 13 that performs overall control of these components and identifies the waste product 2 by a method described later.

When the amount identified per unit time needs to be increased, it is preferable to arrange, for example, a plurality of the object identification devices 1 shown in FIG. 1 in parallel and operate them simultaneously.

An outline of the operation of the object identification device 1 is given below. Each waste product 2 supplied individually from the supply device 3 as a sample body is weighed by the belt-conveyor weighing scale 4, and data indicating the result is transmitted to the control device 13. Next, when the photosensor 6 detects the waste product 2, the 3D camera 8 suspended above is triggered by the detection signal, captures a 3D image of the waste product 2, and transmits it to the control device 13.

Next, when the photosensor 9 detects the waste product 2, the 2D camera 10 suspended above is triggered by the detection signal, captures a 2D color image of the waste product 2, and transmits it to the control device 13.

The control device 13 identifies the waste product 2 by performing arithmetic processing based on the weight, 3D image, and 2D image acquired as described above. This processing is described in detail later.

According to the identification result, the sorting actuators 11 and 12 control the position at which the identified waste product 2 drops from the belt conveyor 5, so that it is stored in the corresponding one of the collection containers 14 to 16.

In FIG. 1, cylinders whose push-out operation can be switched on and off are used as the sorting mechanism, but the mechanism is not limited to this and may be, for example, an air gun, an electromagnetically operated paddle, or a robot arm.

Here, the waste products 2 drop from the supply device 3 onto the belt conveyor 5 in a random state, in the sense that neither their position in the width direction of the belt conveyor 5 nor the angle between the belt's direction of travel and the long side of the waste product 2 is constant.

The 3D camera 8 records a 3D image as digital data in a memory (not shown) by the so-called light-section method: a charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) sensor detects the change, in the height direction, of the position of the reflected line of the line laser light on the surface of the waste product 2 as it moves in a fixed direction.

FIG. 2 is a flowchart showing an object identification method according to an embodiment of the present invention. The following describes, as one example, the case in which the object identification method shown in FIG. 2 is implemented using the object identification device 1 shown in FIG. 1.

Needless to say, this object identification method is not limited to implementation with the object identification device 1 shown in FIG. 1 and can be applied widely.

First, in step S1, the weight of the sample body is measured. Specifically, for example, the belt-conveyor weighing scale 4 measures the weight of the sample body as described above and transmits the resulting data to the control device 13.

Next, in step S2, the volume of the sample body is calculated from the three-dimensional image of the sample body. Specifically, for example, the control device 13 calculates the volume of the waste product 2 based on the 3D image received as described above.

Next, in step S3, the control device 13, for example, calculates the density of the waste product 2 serving as the sample body, using the weight of the sample body measured in step S1 and the volume of the sample body calculated in step S2.

Next, in step S4, the control device 13, for example, creates a first background image corresponding to the density calculated in step S3.

Next, in step S5, the control device 13, for example, pastes the three-dimensional image of the waste product 2 serving as the sample body onto the first background image to create a three-dimensional composite image.

FIG. 3 is a flowchart showing a specific example of the method of creating the three-dimensional composite image shown in FIG. 2, and FIG. 4 is a diagram for explaining the creation method shown in FIG. 3. The method of creating the three-dimensional composite image is described in detail below with reference to FIGS. 3 and 4.

The following describes the case in which the 3D image G0 shown in FIG. 4 is obtained by imaging, with the 3D camera 8, an individual waste product 2 conveyed on the belt conveyor 5 shown in FIG. 1. The position and orientation of the waste product 2 in the image are not fixed and may vary.

Each pixel of the 3D image G0 represents the height level of the subject, the waste product 2. The control device 13 therefore extracts the pixels whose values are larger than the pixel value of the surface of the belt conveyor 5, which forms the background of the waste product 2, calculates their sum, and converts the result to actual dimensions to obtain the volume of the waste product 2.
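The volume calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the subtraction of the belt level before summing, and the two scale factors (`mm_per_level`, `mm2_per_pixel`) are assumptions introduced here to make the conversion to actual dimensions concrete.

```python
def volume_from_height_map(height_map, belt_level, mm_per_level, mm2_per_pixel):
    """Estimate object volume (mm^3) from a height-level image.

    height_map    : list of rows, each a list of integer height levels
    belt_level    : pixel value of the empty belt surface (background)
    mm_per_level  : assumed real height represented by one pixel level
    mm2_per_pixel : assumed real belt area covered by one pixel
    """
    # Keep only pixels higher than the belt surface and sum their heights
    # measured from the belt (the baseline subtraction is an assumption).
    total_levels = sum(
        value - belt_level
        for row in height_map
        for value in row
        if value > belt_level
    )
    # Convert the summed levels to real dimensions.
    return total_levels * mm_per_level * mm2_per_pixel
```

The two scale factors would come from calibrating the 3D camera and the belt speed; the source does not state how the conversion is performed.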

Next, the control device 13 calculates a density value by dividing the weight value measured by the belt-conveyor weighing scale 4 by the volume value, and converts this density value into a relative pixel value on the pixel-level scale. The relative pixel value is, for example, a pixel level directly proportional to the density value, scaled so that the maximum density value corresponds to the maximum pixel level.
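As a rough sketch of this conversion, the code below maps a density to a pixel level and fills a uniform background with it. The 8-bit maximum level (255), the `max_density` ceiling, the clipping of out-of-range values, and both function names are assumptions for illustration; the source only requires that the level be proportional to the density and that the maximum density map to the maximum level.

```python
def density_to_pixel_level(weight, volume, max_density, max_level=255):
    """Map density (weight / volume) linearly onto the range 0..max_level."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    density = weight / volume
    level = round(density / max_density * max_level)
    # Clip so that densities above the assumed maximum stay in range.
    return max(0, min(max_level, level))


def make_density_background(width, height, level):
    """Create a uniform background image whose every pixel equals `level`."""
    return [[level] * width for _ in range(height)]
```

For example, with `max_density=10.0`, a sample of weight 27.0 and volume 10.0 (density 2.7) maps to level 69.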

The control device 13 then creates the relative-pixel-value-embedded background image G3 shown in FIG. 4, whose background is filled with this relative pixel value, and stores it in the memory.

In step S50 shown in FIG. 3, the control device 13 also creates a copy image G2 of the 3D image G0, as shown in FIG. 4, and stores it in the memory.

In step S51, the control device 13 converts the 3D image G0 into the binarized image G1 shown in FIG. 4; in step S52, it removes noise by image processing such as dilation and erosion; and in step S53, it detects the contour of the object (here, the waste product).
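Steps S51 and S52 can be illustrated with a small pure-Python sketch. It assumes that binarization thresholds the height image at the belt level and that noise removal is a morphological opening (erosion followed by dilation) with a 3x3 window; the source names dilation and erosion only as examples and does not fix the order or window size. Contour detection (step S53) is not shown.

```python
def binarize(image, threshold):
    """Return a 0/1 mask: 1 where the pixel value exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def _morph(mask, op):
    """Apply `op` (min or max) over each pixel's 3x3 neighborhood."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # The window is truncated at the image border.
            window = [
                mask[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = op(window)
    return out


def erode(mask):
    return _morph(mask, min)


def dilate(mask):
    return _morph(mask, max)


def remove_noise(mask):
    """Morphological opening: removes specks smaller than the 3x3 window."""
    return dilate(erode(mask))
```

An opening removes isolated bright pixels while restoring the outline of larger regions; a production system would more likely rely on an image-processing library for these operations.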

Next, in step S54, the control device 13 extracts the rectangle of minimum area among the rectangles enclosing the detected contour of the object, and detects the coordinates of its four corner vertices and its rotation angle. The rotation angle is the angle through which the rectangle must be rotated so that its long side becomes parallel to the vertical direction of the image, which is the direction of movement of the belt conveyor 5, as shown in the binarized image G1 of FIG. 4. Specifically, for example, when the side 34 connecting vertex 3 and vertex 4 is a short side, the rotation angle is the angle θ in the figure; if the side 34 is instead a long side, the rotation angle is (θ + 90°).
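The choice between θ and (θ + 90°) can be written as a short helper. This sketch assumes that the vertices are numbered consecutively around the rectangle, so that side 23 is adjacent to side 34, and that a square is treated as the short-side case; neither detail is stated in the source, and the angle θ itself is taken as given from the rectangle detection.

```python
import math


def side_length(p, q):
    """Euclidean length of the side joining points p and q, each (x, y)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def rotation_angle(theta_deg, vertex2, vertex3, vertex4):
    """Return the angle that brings the rectangle's long side vertical.

    theta_deg is the angle reported for the rectangle; vertices 2, 3 and 4
    are assumed to be consecutive corners.
    """
    side_34 = side_length(vertex3, vertex4)
    side_23 = side_length(vertex2, vertex3)  # side adjacent to side 34
    if side_34 <= side_23:
        return theta_deg          # side 34 is the short side
    return theta_deg + 90.0       # side 34 is the long side
```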

Next, in step S55, the control device 13 uses the previously detected vertex coordinates to crop the minimum-area rectangle from the copy image G2. The dashed arrow drawn between step S54 and step S55 in FIG. 3 indicates that step S55 uses the vertex coordinates detected in step S54.

Next, in step S56, the control device 13 reads the relative-pixel-value-embedded background image G3 from the memory. In step S57, it rotates the cropped image by the rotation angle obtained above and pastes the result at a predetermined position on the background image G3, creating a 3D image G4 in which the background has been converted and the image of the object is placed in a fixed orientation at a fixed position (hereinafter, the "background-converted, fixed-orientation, fixed-position 3D image"). The dashed arrow drawn between step S54 and step S57 in FIG. 3 indicates that step S57 uses the rotation angle detected in step S54.

Here, the "predetermined position" is, for example as shown in FIG. 4, the position at which the center line of the relative-pixel-value-embedded background image G3 coincides with the center line of the rotated cropped image and the upper side of the rotated cropped image lies a fixed distance d below the upper side of the background image G3.
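The placement rule can be sketched as below, assuming the "center line" is the vertical center line of the image and that images are stored as lists of pixel rows. The function name is hypothetical, and the rotation of the cropped image is assumed to have been done already.

```python
def paste_at_fixed_position(background, patch, d):
    """Paste `patch` onto a copy of `background`.

    The patch is centered horizontally and its top edge is placed
    d rows below the top edge of the background.
    """
    bg_height, bg_width = len(background), len(background[0])
    patch_height, patch_width = len(patch), len(patch[0])
    if patch_width > bg_width or d + patch_height > bg_height:
        raise ValueError("patch does not fit on the background")

    x0 = (bg_width - patch_width) // 2      # align the vertical center lines
    result = [row[:] for row in background]  # leave the input untouched
    for r in range(patch_height):
        result[d + r][x0:x0 + patch_width] = patch[r]
    return result
```

Placing every sample at the same position and orientation removes pose variation from the training images, which is consistent with the stated aim of learning from less data.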

Next, in step S6 shown in FIG. 2, image data is created by adding, to each pixel of the two-dimensional composite image obtained by pasting the two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image.

FIG. 5 is a flowchart showing a specific example of the method of creating the two-dimensional composite image shown in FIG. 2, and FIG. 6 is a diagram for explaining the creation method shown in FIG. 5. The method of creating the two-dimensional composite image is described in detail below with reference to FIGS. 5 and 6.

First, a 2D color image G10, in which the waste product captured in the 3D image G0 is photographed at the same image size and scale, is prepared together with a background image G13 of the surface of the belt conveyor 5 with no waste product present.

In the same manner as the method shown in FIG. 3, the control device 13 creates the copy image G12 shown in FIG. 6 in step S60, converts the 2D color image G10 to a grayscale image and then creates the binarized image G11 in step S61, and executes steps S62 to S68 to create a 2D color image G14 in which the image of the object is placed in a fixed orientation at a fixed position on the background image G13 (hereinafter, the "fixed-orientation, fixed-position 2D color image"). In FIG. 5, the dashed arrow drawn between step S65 and step S66 indicates that step S66 uses the vertex coordinates detected in step S65, and the dashed arrow drawn between step S65 and step S68 indicates that step S68 uses the rotation angle detected in step S65.

FIG. 7 is a flowchart showing a specific example of the method of creating the image data shown in FIG. 2, and FIG. 8 is a diagram for explaining the creation method shown in FIG. 7. The method of creating the image data is described in detail below with reference to FIGS. 7 and 8.

As shown in FIG. 7, the control device 13 creates the background-converted, fixed-orientation, fixed-position 3D image by the method shown in FIG. 3, using the weight value measured by the belt-conveyor weighing scale 4 and the 3D image captured by the 3D camera 8 as described above. The control device 13 also creates the fixed-orientation, fixed-position 2D color image by the method shown in FIG. 5, using the 2D color image captured by the 2D camera 10.

The background-converted, fixed-orientation, fixed-position 3D image and the fixed-orientation, fixed-position 2D color image have the same image size, with matching numbers of pixels vertically and horizontally, and show the same face of the same waste product 2 pasted at the same position and in the same orientation on their respective background images.

The background-converted, fixed-orientation, fixed-position 3D image is a grayscale image in which each pixel is represented by a single component. The fixed-orientation, fixed-position 2D color image, on the other hand, is an RGB color image using the three color components red, green, and blue (or a CMYK color image using the four color components cyan, magenta, yellow, and black), in which each pixel is represented by three (or four) components.

In step S69, the control device 13 adds the value of each pixel of the background-converted, fixed-orientation, fixed-position 3D image G4 as a parameter in the fourth (or fifth) channel of the pixel at the same position in the fixed-orientation, fixed-position 2D color image G14 shown in FIG. 8, so that the image becomes a 4 (or 5) channel array with four (or five) components per pixel. This produces a 4 (or 5) channel image G15 (hereinafter, the "background-converted, fixed-orientation, fixed-position 4ch image").

The fourth (or fifth) component, which is separate from the color information, is treated as a value indicating the transparency (or opacity) of each pixel. Specifically, in pixels showing the waste product 2 it takes a value corresponding to the three-dimensional shape (height) of the waste product 2, and in pixels showing the background it takes a value corresponding to the density of the waste product 2.

The background-converted, fixed-orientation, fixed-position 4ch image G15 can therefore carry, as a single piece of image data, more information about the features of each individual waste product 2 than an ordinary 2D color image.
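The channel-combining step (S69) for the RGB case can be sketched as follows. Images are represented here as lists of rows, with RGB pixels as 3-tuples and the 3D composite image as single values; this representation and the function name are assumptions made for illustration.

```python
def stack_fourth_channel(rgb_image, depth_image):
    """Append each pixel of depth_image as a fourth channel of rgb_image.

    rgb_image   : rows of (r, g, b) tuples (the 2D composite image)
    depth_image : rows of single values (the 3D composite image), where
                  object pixels hold height and background pixels hold
                  the density-derived level
    """
    if len(rgb_image) != len(depth_image) or any(
        len(rgb_row) != len(depth_row)
        for rgb_row, depth_row in zip(rgb_image, depth_image)
    ):
        raise ValueError("images must have identical dimensions")

    return [
        [(r, g, b, a) for (r, g, b), a in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_image, depth_image)
    ]
```

Because the fourth value occupies the slot normally used for transparency, the result can be stored in any image format with an alpha channel, as the description of the parameter suggests.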

Next, in step S7 shown in FIG. 2, a convolutional neural network model is trained with the image data as training data. Specifically, for example, a certain number of sample bodies are extracted from the set of waste products 2 to be identified, product information indicating the characteristics of each waste product 2, such as item, model, and manufacturer, is collected, and a database is created that associates this information with the background-converted, fixed-orientation, fixed-position 4ch image (hereinafter abbreviated as the "4ch image") created for that waste product 2. The waste products 2 and their 4ch images are then classified into a plurality of groups by these characteristics according to the purpose of identification, an image data set is created in which a teacher label for training the deep convolutional neural network model, consisting of the group name, group number, or the like, is defined for each 4ch image, and the deep convolutional neural network model is trained on this data set.
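Defining teacher labels per 4ch image might look like the sketch below. It assumes that each sample has already been assigned a group name and that labels are integer group numbers assigned in alphabetical order of the names; the function name and the example group names are hypothetical, and the training of the network itself is not shown.

```python
def build_labeled_dataset(samples):
    """Attach an integer teacher label to every image.

    samples : list of (four_channel_image, group_name) pairs
    Returns (dataset, label_of), where dataset is a list of
    (four_channel_image, label) pairs and label_of maps each
    group name to its group number.
    """
    group_names = sorted({group for _, group in samples})
    label_of = {name: index for index, name in enumerate(group_names)}
    dataset = [(image, label_of[group]) for image, group in samples]
    return dataset, label_of
```

For instance, samples grouped as "smartphone" and "camera" would receive labels 1 and 0 respectively.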

This deep convolutional neural network model may be created from scratch, but an existing publicly available model can also be reused.

Next, in step S8 shown in FIG. 2, the image data obtained by executing steps S1 to S6 for an object to be identified is input into the trained convolutional neural network model obtained in step S7, which identifies the object. In the above example, a 4ch image created for a waste product 2 whose characteristics, that is, the group to which it belongs, are unknown is input into the trained deep convolutional neural network model with its optimized internal parameters. The model identifies the group to which the waste product 2 belongs, and the waste product 2 is sorted into one of the collection containers 14 to 16 according to the identification result.

The identification can also be performed by the following method. For sample bodies whose characteristics are known, the image data created through steps S1 to S6 shown in FIG. 2 is input into the convolutional neural network model, and a list is created in advance in which the resulting output vector of an intermediate layer (first output vector) is recorded in association with those characteristics.

Then, in step S8 shown in FIG. 2, the image data created for the object to be identified is input into the trained convolutional neural network model to obtain the output vector of the intermediate layer (second output vector). The characteristics of the object are identified by finding, among the first output vectors recorded in the list, the one at the shortest distance from the second output vector.
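This lookup is a nearest-neighbor search over the recorded vectors. The sketch below assumes Euclidean distance, which the source does not specify, and represents the list as pairs of (first output vector, characteristics); the function name is hypothetical, and obtaining the intermediate-layer vectors from the network is outside its scope.

```python
import math


def identify_by_nearest_vector(second_vector, reference_list):
    """Return the characteristics of the closest recorded first output vector.

    second_vector  : intermediate-layer output for the object to identify
    reference_list : list of (first_output_vector, characteristics) pairs
                     recorded for samples with known characteristics
    """
    if not reference_list:
        raise ValueError("reference list is empty")
    # Euclidean distance is an assumption; any metric could be substituted.
    _, characteristics = min(
        reference_list,
        key=lambda entry: math.dist(second_vector, entry[0]),
    )
    return characteristics
```

One practical consequence of this variant is that a new product type can be registered by appending its vector to the list, without retraining the network.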

As described above, the object identification device 1 and the object identification method according to the embodiment of the present invention use the 4ch image and can thereby identify objects with high accuracy from less training data than before.

1 object identification device
4 belt-conveyor weighing scale
8 3D camera
10 2D camera
13 control device

Claims

1. An object identification device comprising identification means that:
creates a first background image corresponding to the density of a sample body, and pastes a three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image;
creates image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image; and
identifies an object by means of a convolutional neural network model trained on the image data.

2. The object identification device according to claim 1, further comprising:
weight measuring means for measuring the weight of the sample body; and
3D imaging means for capturing the three-dimensional image,
wherein the identification means further calculates the volume of the sample body from the three-dimensional image captured by the 3D imaging means, and calculates the density from the calculated volume and the weight measured by the weight measuring means.

3. The object identification device according to claim 2, further comprising 2D imaging means for capturing the two-dimensional image.

4. The object identification device according to claim 1, wherein the identification means pastes the three-dimensional image onto the first background image in a predetermined orientation and at a predetermined position, and pastes the two-dimensional image onto the second background image in a predetermined orientation and at a predetermined position.

5. The object identification device according to claim 1, wherein the two-dimensional image is an RGB color image or a CMYK color image.

6. An object identification method comprising:
a first step of measuring the weight of a sample body;
a second step of calculating the volume of the sample body from a three-dimensional image of the sample body;
a third step of calculating the density of the sample body using the weight measured in the first step and the volume calculated in the second step;
a fourth step of creating a first background image corresponding to the density calculated in the third step;
a fifth step of pasting the three-dimensional image of the sample body onto the first background image to create a three-dimensional composite image;
a sixth step of creating image data by adding, to each pixel of a two-dimensional composite image obtained by pasting a two-dimensional image of the sample body onto a second background image, a parameter corresponding to the pixel value of the pixel at the same position in the three-dimensional composite image;
a seventh step of training a convolutional neural network model with the image data as training data; and
an eighth step of inputting the image data obtained by executing the first to sixth steps for an object to be identified into the trained convolutional neural network model obtained in the seventh step, thereby identifying the object.

7. The object identification method according to claim 6, wherein the three-dimensional image is pasted onto the first background image in a predetermined orientation and at a predetermined position, and the two-dimensional image is pasted onto the second background image in a predetermined orientation and at a predetermined position.

8. The object identification method according to claim 6, wherein the two-dimensional image is an RGB color image or a CMYK color image.

9. The object identification method according to claim 6, further comprising:
a step of classifying the sample bodies into a plurality of groups in advance according to their characteristics; and
a step of creating a database whose elements are the image data created for each of the groups through the first to sixth steps,
wherein, in the seventh step, the convolutional neural network model is trained on the database, and
in the eighth step, the group to which the object belongs is identified.

10. The object identification method according to claim 6, further comprising a step of creating in advance a list in which a first output vector of an intermediate layer, obtained by inputting into the convolutional neural network model the image data created through the first to sixth steps for a sample body whose characteristics are known, is recorded in association with the characteristics,
wherein, in the eighth step, the characteristics of the object are identified by finding, among the first output vectors recorded in the list, the first output vector at the shortest distance from a second output vector of the intermediate layer obtained by inputting into the convolutional neural network model the image data created through the first to sixth steps for the object.