JP7304070B2 - Information processing method, apparatus, and program for recognizing the arrangement state of a group of articles

Info

Publication number: JP7304070B2
Application number: JP2020008386A
Authority: JP
Inventors: 和之永田; 卓郎西
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2020-01-22
Filing date: 2020-01-22
Publication date: 2023-07-06
Anticipated expiration: 2040-01-22
Also published as: JP2021117531A

Description

The present invention relates to a technique for recognizing the arrangement state of a group of articles such as merchandise.

YOLO (You Only Look Once) (Non-Patent Document 1) and SSD (Single Shot MultiBox Detector) (Non-Patent Document 2) are known techniques for detecting individual articles in an image. To detect individual articles with these techniques, images of each type of article to be detected are learned, and the type and position of each article in the image to be processed are then detected. These techniques are basically built on a CNN (Convolutional Neural Network).

If YOLO or SSD is trained in the usual way on images of the individual articles to be detected, one would expect that an image of a group of articles aligned on, for example, a store shelf could be processed in the same way. However, it has been reported that these techniques are poor at detecting individual articles within a densely packed group of articles (Non-Patent Document 3).

For example, suppose that multiple articles of the same type are densely aligned on a store shelf for each type of article (for example, for each type of content, size, or shape), and that a robot or the like is to replenish articles or perform other processing. In that case, automatic recognition of the arrangement state of the group of articles is required. A store is only one example; the same applies to a warehouse or the like.

Non-Patent Document 1: J. Redmon, et al., "You Only Look Once: Unified, Real-Time Object Detection," arXiv:1506.02640v5, 9 May 2016
Non-Patent Document 2: Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg, "SSD: Single Shot MultiBox Detector," in ECCV 2016
Non-Patent Document 3: Goldman, et al., "Precise Detection in Densely Packed Scenes," Proc. of CVPR 2019

Accordingly, an object of the present invention, in one aspect, is to provide a novel technique used to recognize the arrangement state of a group of aligned articles.

An information processing method according to the present invention includes a process of detecting, from an input image, a first region of a group of articles continuing in the vertical direction, a second region of a group of articles continuing in the horizontal direction, and a third region of a group of articles continuing in the depth direction, by using a trained model. The model is trained to detect, in an image of a group of aligned articles, a region of a group of articles continuing in the vertical direction, which is parallel to the direction of gravity, a region of a group of articles continuing in the horizontal direction, and a region of a group of articles continuing in the depth direction, which is orthogonal to the vertical and horizontal directions.

The above information processing method may further include a process of identifying, in the input image, a fourth region for each individual article included in the group of articles, based on the state of overlap of at least two of the first region, the second region, and the third region.

According to one aspect, a new elemental technology for recognizing the arrangement state of a group of aligned articles is provided.

FIG. 1 is a diagram showing an outline of the system in the embodiment.
FIG. 2 is a diagram showing the definition of directions.
FIG. 3 is a diagram illustrating imaging directions.
FIGS. 4A to 4C, 5A to 5E, and 6A to 6C are diagrams for explaining machine learning of the arrangement direction detection unit.
FIG. 7 is a diagram showing the main processing flow according to the embodiment.
FIG. 8 is a diagram showing the processing flow of the individual article BB detection process.
FIGS. 9A to 9C and 10A to 10D are diagrams for explaining the superposition process for arrangement direction BBs.
FIGS. 11A to 11D are diagrams for explaining processing in the case of partial overlap.
FIGS. 12A and 12B are diagrams showing examples of detection results for individual article BBs.
FIG. 13 is a diagram showing the processing flow of the 3D arrangement pattern generation process.
FIGS. 14A to 14C are diagrams showing examples of a virtual network.
FIG. 15 is a diagram showing an example of a 3D arrangement pattern.
FIG. 16 is a diagram showing an example of a basic 3D arrangement pattern.
FIG. 17 is a diagram showing a display example of a missing item detection result.
FIG. 18 is a diagram showing the processing flow of the foreign object detection process.
FIGS. 19 and 20 are diagrams for explaining examples of the foreign object detection process.
FIG. 21 is a block configuration diagram of a computer device.

FIG. 1 shows an outline of a system according to an embodiment of the present invention. The system according to the present embodiment includes a robot 200 having a camera 210 that captures images (including moving images), and an information processing apparatus 100 that can communicate with the robot 200 wirelessly or by other means. The robot 200 moves, for example, along the aisles of a store, captures images of the store shelves, and transmits the captured image data to the information processing apparatus 100.

As shown in FIG. 2, it is assumed that individual articles (cylindrical articles in the example of FIG. 2, although any shape is acceptable) are densely aligned on, for example, a store shelf. Because the shelf is photographed from the store aisle, the direction parallel to the direction of gravity is defined as the vertical direction, the lateral direction of the shelf as the horizontal direction, and the direction of the shelf's depth, which is orthogonal to the vertical and horizontal directions, as the depth direction.

Further, as schematically shown in FIG. 3, the camera 210 captures images from position (1), where the side of the article group is viewed from the front, through position (2), where the side and top of the article group are viewed obliquely from above, to position (3), where the top of the article group is viewed from above. A moving image may be captured from position (1) to position (3), or a still image may be captured at each position; here, the former is assumed as an example.

The information processing apparatus 100 includes an image acquisition unit 101, an image data storage unit 102, an arrangement direction detection unit 103, a first data storage unit 104, an individual article detection unit 105, a second data storage unit 106, a 3D arrangement pattern generation unit 107, a third data storage unit 108, a missing item detection unit 109, and a foreign object detection unit 110.

The image acquisition unit 101 receives, from the robot 200, data of images captured by the camera 210 and stores the data in the image data storage unit 102. For each image frame stored in the image data storage unit 102, the arrangement direction detection unit 103 has a function of detecting a region containing a group of articles continuing in the vertical direction, a region containing a group of articles continuing in the horizontal direction, and a region containing a group of articles continuing in the depth direction.

The arrangement direction detection unit 103 is, for example, a trained model that has been machine-learned to detect a region containing a group of articles continuing in the vertical direction, a region containing a group of articles continuing in the horizontal direction, and a region containing a group of articles continuing in the depth direction.

In object detection using YOLO, SSD, or the like, the model is trained to detect individual objects in an input image. In the present embodiment, the mechanism of YOLO, SSD, or the like is used unchanged, but what is learned is different: the model learns the arrangement state of articles. More specifically, instead of recognizing articles individually, the model learns, in each image frame, a region containing a group of articles continuing in the vertical direction (hereinafter, a vertical BB (bounding box)), a region containing a group of articles continuing in the horizontal direction (hereinafter, a horizontal BB), and a region containing a group of articles continuing in the depth direction (hereinafter, a depth BB).

For example, as schematically shown in FIG. 4A, when an image is obtained in which articles are aligned three high in the vertical direction and four wide in the horizontal direction, three horizontal BBs, each containing four articles, are specified and learned, as schematically shown in FIG. 4B. Similarly, as schematically shown in FIG. 4C, four vertical BBs, each containing three articles, are specified and learned. As a result, when an image such as that of FIG. 4A is input, horizontal BBs such as those in FIG. 4B and vertical BBs such as those in FIG. 4C can be detected.
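As an illustration of the labels described for FIGS. 4A to 4C, the following is a minimal sketch that derives row and column group boxes from per-article boxes. The source only says that the group BBs are specified for learning and does not say how they are produced; the grid layout, the pixel sizes, and the names `enclosing_box` and `group_labels` are assumptions made for this example.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in image pixels; a hypothetical representation
Box = Tuple[float, float, float, float]

def enclosing_box(boxes: List[Box]) -> Box:
    """Smallest axis-aligned box that contains every box in `boxes`."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def group_labels(grid: List[List[Box]]) -> Tuple[List[Box], List[Box]]:
    """grid[r][c] is the box of the article in row r, column c (front view only)."""
    horizontal = [enclosing_box(row) for row in grid]              # one BB per row
    vertical = [enclosing_box([row[c] for row in grid])            # one BB per column
                for c in range(len(grid[0]))]
    return horizontal, vertical

# 3 rows x 4 columns of 10x20-pixel articles, mirroring the FIG. 4A example
grid = [[(c * 10, r * 20, c * 10 + 10, r * 20 + 20) for c in range(4)]
        for r in range(3)]
horizontal_bbs, vertical_bbs = group_labels(grid)
```

Under these assumptions the sketch yields three horizontal BBs and four vertical BBs, matching the counts given for FIGS. 4B and 4C.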

Further, as schematically shown in FIG. 5A, suppose an image is obtained in which the horizontal and vertical arrangement is the same as in FIG. 4A but articles are also aligned three deep in the depth direction. In that case, three horizontal BBs, each containing four articles, are specified and learned, as schematically shown in FIG. 5B. Similarly, as schematically shown in FIG. 5C, four vertical BBs, each containing three articles, are specified and learned. In addition, as schematically shown in FIG. 5D, four depth BBs, each containing three articles, are specified and learned. In the case of FIG. 5A, three more horizontal BBs, each containing four articles, are also specified and learned, as schematically shown in FIG. 5E. As a result, when an image such as that of FIG. 5A is input, horizontal BBs such as those in FIGS. 5B and 5E, vertical BBs such as those in FIG. 5C, and depth BBs such as those in FIG. 5D can be detected.

Although not illustrated, when an image is obtained that contains only the top surfaces of articles such as those shown in FIGS. 4A and 5A, four depth BBs, each containing three articles, are specified and learned, as shown in FIG. 5D. Similarly, three horizontal BBs, each containing four articles, are specified and learned, as shown in FIG. 5E. As a result, when an image containing only the top surfaces of the articles is input, four depth BBs and three horizontal BBs can be detected.

As schematically shown in FIG. 6A, when one article is missing from the state shown in FIG. 5A, BBs different from those shown in FIGS. 5B to 5E are specified and learned. As schematically shown in FIG. 6B, the topmost horizontal BB (h1) is specified and learned as a horizontal BB containing only two articles. The vertical BB (v1) in the second column from the right is likewise specified and learned as a vertical BB containing only two articles. What is learned is always a bounding box (BB) containing multiple articles.

Further, as schematically shown in FIG. 6C, among the horizontal BBs on the top surfaces of the articles, the frontmost horizontal BB (h2) is specified and learned as a horizontal BB containing only two articles. Similarly, the depth BB (d1) in the second column from the right is specified and learned as a depth BB containing only two articles. This is because what is learned is a bounding box (BB) containing multiple articles.

When an image such as that of FIG. 6A is input, horizontal BBs, vertical BBs, and depth BBs such as those shown in FIGS. 6B and 6C are detected.

The model is trained on a large number of such images with the horizontal BBs, vertical BBs, and depth BBs specified as described above. The same training is performed for various types of articles (including differences in packaging).

Returning to the description of FIG. 1, the arrangement direction detection unit 103 stores data (position, size, and the like) on the detected horizontal BBs, vertical BBs, and depth BBs in the first data storage unit 104. The data stored in the first data storage unit 104 may be output to an output device such as a display device. The same applies to the data stored in the second data storage unit 106 and the third data storage unit 108.

The individual article detection unit 105 uses the data stored in the first data storage unit 104 and the image data storage unit 102 to detect the region of each individual article (hereinafter, an individual article BB), and stores data (position, size, and the like) on the detected individual article BBs in the second data storage unit 106. The individual article detection unit 105 basically detects individual article BBs based on the state of overlap of the horizontal BBs, vertical BBs, and depth BBs. Supplementarily, it also makes use of a general object detection technique (a trained model machine-learned to detect individual articles with YOLO, SSD, or the like, as described above).

The 3D arrangement pattern generation unit 107 uses the data stored in the first data storage unit 104, the second data storage unit 106, and the image data storage unit 102 to identify the 3D arrangement pattern of a group of articles from multiple image frames, and stores data of the identified 3D arrangement pattern in the third data storage unit 108.

The missing item detection unit 109 uses the data stored in the third data storage unit 108 to execute a process of detecting missing articles, and outputs the processing result. The data stored in the image data storage unit 102, the first data storage unit 104, and the second data storage unit 106 may also be used.

The foreign object detection unit 110 uses the data stored in the image data storage unit 102, the first data storage unit 104, and the second data storage unit 106 to execute a process of detecting foreign objects (including articles in a different orientation) in the group of objects, and outputs the processing result.

Next, the processing performed by the information processing apparatus 100 is described with reference to FIGS. 7 to 20.

For each image frame acquired by the image acquisition unit 101 and stored in the image data storage unit 102, the arrangement direction detection unit 103 and the individual article detection unit 105 execute an individual article BB detection process that detects individual article BBs, and store data on the detected individual article BBs in the second data storage unit 106 (step S1). The individual article BB detection process is described in detail with reference to FIGS. 8 to 12.

Next, the 3D arrangement pattern generation unit 107 uses the data stored in the image data storage unit 102, the first data storage unit 104, and the second data storage unit 106 to execute a 3D arrangement pattern generation process, which generates 3D arrangement pattern data for a specific group of articles contained in multiple image frames, and stores the processing result in the third data storage unit 108 (step S3). The 3D arrangement pattern generation process is described in detail with reference to FIGS. 13 to 15.

Then, the missing item detection unit 109 checks whether it is configured, or has been instructed by the user, to execute the missing item detection process (step S5). If the process should be executed, it executes the missing item detection process and outputs the detection result (step S7). The missing item detection process is described later with reference to FIGS. 16 and 17.

Likewise, the foreign object detection unit 110 checks whether it is configured, or has been instructed by the user, to execute the foreign object detection process (step S9). If the process should be executed, it executes the foreign object detection process and outputs the detection result (step S11). The foreign object detection process is described later with reference to FIGS. 18 to 20.
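The control flow of steps S1 to S11 can be sketched as follows. This is only an outline under stated assumptions: the four processing stages are passed in as placeholder callables, and the function name `main_flow`, its parameters, and the returned dictionary are hypothetical rather than taken from the source.

```python
from typing import Callable, Dict, List

def main_flow(frames: List[str],
              detect_individual: Callable[[str], List[tuple]],
              build_pattern: Callable[[List[List[tuple]]], dict],
              detect_missing: Callable[[dict], list],
              detect_foreign: Callable[[List[List[tuple]]], list],
              run_missing: bool,
              run_foreign: bool) -> Dict[str, object]:
    """Run the stages in the order of steps S1 to S11 (placeholder stages)."""
    per_frame = [detect_individual(f) for f in frames]   # step S1: per-frame individual article BBs
    pattern = build_pattern(per_frame)                    # step S3: 3D arrangement pattern
    result: Dict[str, object] = {"pattern": pattern}
    if run_missing:                                       # step S5: configured or instructed?
        result["missing"] = detect_missing(pattern)       # step S7
    if run_foreign:                                       # step S9: configured or instructed?
        result["foreign"] = detect_foreign(per_frame)     # step S11
    return result
```

The two flags stand in for the "configured or instructed by the user" checks; how those settings are actually held is not stated in the source.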

Next, the individual article BB detection process is described with reference to FIGS. 8 to 12.

The arrangement direction detection unit 103 detects horizontal BBs, vertical BBs, and depth BBs in each image frame stored in the image data storage unit 102, and stores the detection results in the first data storage unit 104 (step S21). As described above, the arrangement direction detection unit 103 is a trained model machine-learned to detect horizontal BBs, vertical BBs, and depth BBs, and the detection is performed using that capability. Hereinafter, horizontal BBs, vertical BBs, and depth BBs are collectively called arrangement direction BBs.

Next, the individual article detection unit 105 determines, based on the data stored in the first data storage unit 104, whether any overlap exists among the detected arrangement direction BBs (step S23). For example, if only one arrangement direction BB is detected, no overlap exists. Even if multiple arrangement direction BBs are detected, they may be detected in isolation from one another, in which case no overlap exists either. In general, for a group of aligned articles such as those shown so far, the arrangement direction BBs are detected so that they overlap.

For example, arrangement direction BBs that overlap are processed in steps S25 to S33 (step S23: Yes route), and arrangement direction BBs that do not overlap are processed from step S35 onward (step S23: No route).
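The branch at step S23 can be illustrated with a simple axis-aligned overlap test. This is a minimal sketch, assuming boxes are given as `(x_min, y_min, x_max, y_max)` tuples and that touching edges do not count as overlap; the names `boxes_overlap` and `split_by_overlap` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format

def boxes_overlap(a: Box, b: Box) -> bool:
    """True when the two boxes share a region of non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def split_by_overlap(boxes: List[Box]) -> Tuple[List[Box], List[Box]]:
    """Separate boxes that overlap at least one other box from isolated ones."""
    overlapping, isolated = [], []
    for i, a in enumerate(boxes):
        if any(i != j and boxes_overlap(a, b) for j, b in enumerate(boxes)):
            overlapping.append(a)   # would follow the Yes route of step S23
        else:
            isolated.append(a)      # would follow the No route of step S23
    return overlapping, isolated

yes_route, no_route = split_by_overlap(
    [(0, 0, 40, 20), (0, 0, 10, 60), (100, 100, 120, 120)])
```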

When arrangement direction BBs overlap one another, the individual article detection unit 105 executes a superposition process on the arrangement direction BBs, identifies each product region (region of intersection) that satisfies predetermined conditions as an individual article BB, and stores data on the individual article BBs in the second data storage unit 106 (step S25).

The superposition process is performed according to the following four conditions.
Condition 1. Arrangement direction BBs of the same direction are not superimposed on each other.
Condition 2. For an overlap of a horizontal BB and a vertical BB, the intersection of the horizontal center line of the horizontal BB and the vertical center line of the vertical BB is contained in the cut-out BB.
Condition 3. For an overlap of a horizontal BB and a depth BB, the intersection of the horizontal center line of the horizontal BB and the vertical center line of the depth BB is contained in the cut-out BB.
Condition 4. For an overlap of a vertical BB and a depth BB, both the vertical center line of the vertical BB and the vertical center line of the depth BB are contained in the cut-out BB. In addition, a lower limit is set on the height of the cut-out region.
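The four conditions can be sketched as a single function that returns the cut-out region or nothing. This is an illustration under assumptions, not the patented implementation: boxes are axis-aligned in image coordinates, a vertical center line is treated as "contained" when its x-coordinate falls within the region's horizontal extent, and `DirBB`, `product_region`, and `min_height` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple

class DirBB(NamedTuple):
    kind: str   # "horizontal", "vertical", or "depth"
    x0: float
    y0: float
    x1: float
    y1: float

def product_region(a: DirBB, b: DirBB,
                   min_height: float) -> Optional[Tuple[float, float, float, float]]:
    """Return the intersection of a and b if it passes Conditions 1 to 4, else None."""
    if a.kind == b.kind:                                  # Condition 1: same direction
        return None
    x0, y0 = max(a.x0, b.x0), max(a.y0, b.y0)
    x1, y1 = min(a.x1, b.x1), min(a.y1, b.y1)
    if x0 >= x1 or y0 >= y1:                              # no overlap at all
        return None
    if "horizontal" in (a.kind, b.kind):                  # Conditions 2 and 3
        h, other = (a, b) if a.kind == "horizontal" else (b, a)
        cx = (other.x0 + other.x1) / 2                    # vertical center line of the other BB
        cy = (h.y0 + h.y1) / 2                            # horizontal center line of the horizontal BB
        ok = x0 <= cx <= x1 and y0 <= cy <= y1
    else:                                                 # Condition 4: vertical with depth
        ok = (all(x0 <= (k.x0 + k.x1) / 2 <= x1 for k in (a, b))
              and (y1 - y0) >= min_height)                # assumed form of the height lower limit
    return (x0, y0, x1, y1) if ok else None
```

For example, a row BB `DirBB("horizontal", 0, 0, 40, 20)` and a column BB `DirBB("vertical", 0, 0, 10, 60)` produce the region `(0, 0, 10, 20)`, whereas a column BB that only grazes the row's right edge is rejected because the center-line intersection lies outside the sliver.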

For example, as shown in FIG. 9A, when two horizontal BBs each containing two articles and two vertical BBs each containing two articles are detected, arrangement direction BBs of the same direction are not superimposed, in accordance with Condition 1. That is, horizontal BBs are not superimposed on each other, nor are vertical BBs or depth BBs. As a result, in FIG. 9A, the narrow vertical overlap region e2 and the narrow horizontal overlap region e1 that arise between articles are ignored.

Further, as shown in FIG. 9B, suppose there are a horizontal BB containing two articles and two horizontal BBs each containing three articles, and consider their overlap with a vertical BB containing the two articles at the right end. In accordance with Conditions 2 and 3, the minute overlap region e3 is ignored. That is, an overlap region that does not contain the intersection of the horizontal center line of the horizontal BB and the vertical center line of the vertical BB is ignored. The same applies to a combination of a horizontal BB and a depth BB.

Furthermore, as shown in FIG. 9C, when the frontmost articles are stacked only two high while the two rows behind them are stacked three high, a depth BB (d11) and a vertical BB (v11) are detected and give rise to an overlap region e4. This region is excluded under Condition 4 because its height is below the threshold.

On the other hand, as shown in FIG. 10A, the product region p12 of arrangement direction BBs of different directions, namely a vertical BB (v12) and a horizontal BB (h12), satisfies Conditions 1 and 2 above and is identified as an individual article BB.

Similarly, as shown in FIG. 10B, the product region p13 of a horizontal BB (h13) and a depth BB (d13) satisfies Conditions 1 and 3 above and is identified as an individual article BB.

Furthermore, as shown in FIG. 10C, the product region p14 of a depth BB (d14) and a vertical BB (v14) satisfies Conditions 1 and 4 above and is therefore identified as an individual article BB.

Further, as shown in FIG. 10D, when a horizontal BB (h15), a depth BB (d15), and a vertical BB (v15) are detected, the product region p15 is obtained for the same article from each of three combinations: the horizontal BB (h15) with the vertical BB (v15), the horizontal BB (h15) with the depth BB (d15), and the vertical BB (v15) with the depth BB (d15). The region satisfies Conditions 1 to 4 above and is identified as an individual article BB. However, when a product region substantially the same as one detected earlier in the superposition priority order is detected, the later one may simply be excluded.
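One way to exclude a later product region that is "substantially the same" as an earlier one is an intersection-over-union (IoU) test. The source does not define how sameness is judged, so the IoU criterion, the threshold of 0.8, and the names `iou` and `keep_first` are assumptions made for this sketch.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def keep_first(regions: List[Box], threshold: float = 0.8) -> List[Box]:
    """Keep regions in detection order, dropping any that nearly duplicates a kept one."""
    kept: List[Box] = []
    for r in regions:
        if all(iou(r, k) < threshold for k in kept):
            kept.append(r)
    return kept

# The second region nearly coincides with the first and is dropped.
unique = keep_first([(0, 0, 10, 20), (0, 0, 10, 19), (10, 0, 20, 20)])
```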

To detect individual article BBs efficiently and accurately, the superposition process is performed in order according to, for example, the following rules.
1. Superposition of horizontal BBs with other arrangement direction BBs is processed first, followed by superposition of vertical BBs with depth BBs.
2. For superposition of horizontal BBs with other arrangement direction BBs, horizontal BBs are selected from bottom to top, from left to right, and from front to back in the image. Within each horizontal BB, intersections with vertical BBs or depth BBs are checked from left to right, and the superposition process is performed.
3. For superposition of vertical BBs with depth BBs, vertical BBs are selected from left to right and from front to back in the image. Within each vertical BB, intersections with depth BBs are checked from the top, and the superposition process is performed.

With this processing, individual article BBs are detected for most of the portions where arrangement-direction BBs overlap. However, when the arrangement-direction BBs are sparse and overlap little, some individual article BBs are missed.

Therefore, the individual article detection unit 105 determines whether a partial overlap of arrangement-direction BBs exists (step S27). This assumes the following situation. For example, as shown schematically in FIG. 11A, a product region p16 has been detected because multiple arrangement-direction BBs overlap. If the difference region between the original arrangement-direction BB (h16) and the union region of the detected product regions (in FIG. 11A, the product region equals the union region) is larger than one detected product region, a partial overlap is judged to exist. If no partial overlap is detected at all, processing returns to the calling process.
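The test in step S27 can be sketched as an area comparison. The sketch below is only an illustration under assumptions: boxes are `(x1, y1, x2, y2)` tuples, the product regions are taken to lie inside the arrangement-direction BB (which holds when they are intersections with it), and "one detected product region" is interpreted as the mean area of the detected product regions. The function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed format


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def union_area(boxes: List[Box]) -> float:
    """Area covered by at least one box, computed on the grid formed by the box edges."""
    xs = sorted({x for b in boxes for x in (b[0], b[2])})
    ys = sorted({y for b in boxes for y in (b[1], b[3])})
    total = 0.0
    for x1, x2 in zip(xs, xs[1:]):
        for y1, y2 in zip(ys, ys[1:]):
            cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
            if any(b[0] <= cx <= b[2] and b[1] <= cy <= b[3] for b in boxes):
                total += (x2 - x1) * (y2 - y1)
    return total


def has_partial_overlap(arrangement_bb: Box, product_regions: List[Box]) -> bool:
    """True if the uncovered part of the BB is larger than one product region."""
    if not product_regions:
        return False  # no overlap at all is a separate case (step S35)
    difference = area(arrangement_bb) - union_area(product_regions)
    # Assumption: compare against the mean area of the detected product regions.
    reference = sum(area(p) for p in product_regions) / len(product_regions)
    return difference > reference


h16: Box = (0, 0, 90, 30)
p16: Box = (60, 0, 90, 30)
print(has_partial_overlap(h16, [p16]))
```

In this made-up FIG. 11A-like layout, two article widths of h16 remain uncovered, so a partial overlap is reported.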

On the other hand, if a partial overlap exists, the individual article detection unit 105 estimates, based on predetermined rules, the number of articles in the portion where no individual article BB has been identified, sets individual article BBs according to that number, and stores data on those individual article BBs in the second data storage unit 106 (step S29).

More specifically, as shown for example in FIG. 11A, when the horizontal BB (h16) and the vertical BB (v16) partially overlap, let L be the distance between the left (or right) side of the leftmost (or rightmost) individual article BB (p16) within the horizontal BB (h16) and the left (or right) side of the horizontal BB, and let W be the width of that individual article BB (p16). If the ratio L/W is close to 1 or greater, articles are presumed to be placed to the left (or right) of the leftmost (or rightmost) article. In this case, the number of articles is estimated as the natural number nearest to L/W. Once the number has been estimated, copies of the one detected individual article BB are placed at even intervals, as many as the estimated number, with overlap between them permitted.
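The L/W rule can be illustrated with a short sketch. The threshold used for "close to 1" (0.8), the round-half-up convention, and the way the copies are spread (the gap is split into equal slots and one copy is centered in each) are assumptions made for this example; the text only requires an even placement that tolerates overlap.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed format


def estimate_count(gap: float, size: float, near_one: float = 0.8) -> int:
    """Number of articles presumed to fill a gap of length `gap` (L) given article size (W)."""
    ratio = gap / size
    if ratio < near_one:  # assumed meaning of "close to 1 or greater"
        return 0
    # Round half up; Python's built-in round() rounds halves to the even neighbor.
    return max(1, int(ratio + 0.5))


def fill_left_gap(h_bb: Box, p_bb: Box) -> List[Box]:
    """Place copies of p_bb in the part of h_bb to the left of p_bb."""
    gap = p_bb[0] - h_bb[0]    # L
    width = p_bb[2] - p_bb[0]  # W
    n = estimate_count(gap, width)
    if n == 0:
        return []
    slot = gap / n  # each copy is centered in its own slot; copies may overlap or leave gaps
    boxes = []
    for i in range(n):
        cx = h_bb[0] + (i + 0.5) * slot
        boxes.append((cx - width / 2, p_bb[1], cx + width / 2, p_bb[3]))
    return boxes


h16: Box = (0, 0, 90, 30)
p16: Box = (60, 0, 90, 30)
print(fill_left_gap(h16, p16))
```

The same function shape applies to the right-hand gap and, with heights in place of widths, to the vertical and depth cases described for FIG. 11C and FIG. 11D.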

Also, as shown in FIG. 11B, when two vertical BBs (v17 and v18) partially overlap one horizontal BB (h17), two product regions (p17 and p18) are detected. In this case, let LC be the distance between the centers of two adjacent individual article BBs (p17 and p18) within the horizontal BB (h17), and let W be the average width of those two individual article BBs. If the ratio LC/W is close to 2 or greater, articles can be presumed to be placed between the two individual article BBs. The number of articles is then estimated as the natural number nearest to LC/W - 1. Once the number has been estimated, regions having the average size of the detected individual article BBs are placed at even intervals, as many as the estimated number, with overlap permitted.
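A sketch of the LC/W rule for the space between two detected articles follows. The 1.8 threshold for "close to 2", the round-half-up convention, and the choice to space the new centers evenly between the two known centers are assumptions for illustration; the function name is hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); an assumed format


def fill_between(p_a: Box, p_b: Box, near_two: float = 1.8) -> List[Box]:
    """Place average-sized boxes between two individual article BBs in one horizontal BB."""
    ca = (p_a[0] + p_a[2]) / 2
    cb = (p_b[0] + p_b[2]) / 2
    lc = abs(cb - ca)                                  # LC: distance between centers
    w = ((p_a[2] - p_a[0]) + (p_b[2] - p_b[0])) / 2    # W: average width
    ratio = lc / w
    if ratio < near_two:  # assumed meaning of "close to 2 or greater"
        return []
    n = max(1, int(ratio - 1 + 0.5))  # natural number nearest to LC/W - 1 (half rounds up)
    y1 = (p_a[1] + p_b[1]) / 2
    y2 = (p_a[3] + p_b[3]) / 2
    left = min(ca, cb)
    step = lc / (n + 1)  # new centers are evenly spaced between the two known centers
    return [
        (left + (i + 1) * step - w / 2, y1, left + (i + 1) * step + w / 2, y2)
        for i in range(n)
    ]


p17: Box = (0, 0, 30, 30)
p18: Box = (90, 0, 120, 30)
print(fill_between(p17, p18))
```

Here LC = 90 and W = 30, so LC/W - 1 = 2 and two boxes are placed between p17 and p18.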

Furthermore, as shown in FIG. 11C, when the horizontal BB (h20) and the vertical BB (v20) partially overlap, let L be the distance between the upper (or lower) side of the lowest (or highest) individual article BB (p20) within the vertical BB (v20) and the upper (or lower) side of the vertical BB (v20), and let H be the height of that individual article BB. If the ratio L/H is close to 1 or greater, articles can be presumed to be placed above (or below) the lowest (or highest) individual article BB (p20). In this case, the number of articles is estimated as the natural number nearest to L/H. Once the number has been estimated, copies of the one detected individual article BB are placed at even intervals, as many as the estimated number, with overlap permitted.

Furthermore, as shown in FIG. 11D, when the horizontal BBs (h21 and h22) and the depth BB (d21) partially overlap, let L be the distance between the lower (or upper) side of the frontmost (or rearmost) individual article BB (p21) within the depth BB (d21) and the lower (or upper) side of the depth BB, and let H be the height of that individual article BB (p21). If the ratio L/H is close to 1 or greater, articles are presumed to be placed in front of (or behind) the frontmost (or rearmost) individual article BB. In this case, the number of articles is estimated as the natural number nearest to L/H. Once the number has been estimated, copies of the one individual article BB used for the estimate are placed at even intervals, as many as the estimated number, with overlap permitted.

By setting individual article BBs in the difference region according to these rules, the portions that the product regions alone could not cover are filled in. This method of estimating the number of articles in portions where no individual article BB has been identified, based on the overlap of arrangement-direction BBs, is particularly effective for images that view the side of the article group head-on ((1) in FIG. 3) or view the top of the article group from above ((3) in FIG. 3).

Then, the individual article detection unit 105 determines whether any portion remains that the predetermined rules could not handle (step S31). The rules in step S29 can handle only those regions where the number of articles can be estimated, and vertical BBs and depth BBs in particular sometimes cannot be handled by them. Examples are the region of the vertical BB outside the individual article BB in FIGS. 11A and 11B, and the frontmost-row region in FIG. 11D. In these regions the articles appear at a size different from that of the detected individual article BB, so estimation based on the size of the individual article BB does not work. Accordingly, if an arrangement-direction BB contains a region of at least a predetermined size in which no individual article BB has been placed, it is judged that a portion the predetermined rules could not handle exists. If no such portion exists, processing returns to the calling process.

On the other hand, if such a portion exists, the individual article detection unit 105 performs general object detection within that arrangement-direction BB to identify individual article BBs and stores the result in the second data storage unit 106 (step S33). As noted above, general object detection is performed by a trained model that has been machine-learned in the conventional manner to detect the target articles from an image. Processing then returns to the calling process.

In addition, for any arrangement-direction BB that does not overlap any other arrangement-direction BB at all, the individual article detection unit 105 performs general object detection within that arrangement-direction BB, as in step S33, to identify individual article BBs and stores the result in the second data storage unit 106 (step S35).

Then, the individual article detection unit 105 determines whether the arrangement-direction BB processed in step S35 contains a region in which general object detection failed to detect an individual article BB (step S37). This determination is basically the same as in step S27. That is, if the difference region between the original arrangement-direction BB and the union region of the detected individual article BBs is larger than one detected individual article BB, it is judged that a region exists in which an individual article BB could not be detected. If no such region exists, processing returns to the calling process.

On the other hand, if it is judged that such a region exists, the individual article detection unit 105 executes the same processing as in step S29 (step S39). That is, if the number of articles can be estimated from the size, that many individual article BBs are set. Processing then returns to the calling process.

With the processing described above, individual article BBs can be detected mainly on the basis of how the arrangement-direction BBs overlap.

For example, as shown in FIG. 12A, for an image frame that views the article group from the side, an individual article BB, drawn with a dotted line, can be presented for each of the nine articles appearing in the frame. Likewise, as shown in FIG. 12B, for an image frame that views the article group obliquely from above, an individual article BB, drawn with a dotted line, can be presented for each of the 14 articles appearing in the frame.

When the 3D arrangement pattern generation processing described below is to be performed, data on how each individual article BB was detected is retained. For example, for a product region of a horizontal BB and a vertical BB, the data "horizontal and vertical" is retained. For an individual article BB set by estimating the number of articles within a horizontal BB, the data "horizontal" is retained.

Next, the 3D arrangement pattern generation processing is described with reference to FIGS. 13 to 15.

The 3D arrangement pattern generation unit 107 generates a virtual network for each image frame of the same article group, using the processing results for that frame (the data stored in the image data storage unit 102, the first data storage unit 104, and the second data storage unit 106) (step S51).

Specifically, one node is provided for each individual article BB and inherits the attribute described above (for example, "horizontal and vertical") of that individual article BB. Nodes for individual article BBs contained in the same horizontal BB are connected by virtual links in the horizontal direction, nodes for individual article BBs contained in the same vertical BB are connected by virtual links in the vertical direction, and nodes for individual article BBs contained in the same depth BB are connected by virtual links in the depth direction.
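The construction of a virtual network can be sketched as follows. The `Node` structure, the `groups` mapping from a direction name to the ID of the containing arrangement-direction BB, and the choice to link only neighboring nodes after sorting by image position are assumptions for this illustration, not details given in the text.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Node:
    node_id: int
    center: Tuple[float, float]  # (x, y) of the individual article BB in the image
    groups: Dict[str, int]       # direction -> ID of the arrangement-direction BB containing it

    @property
    def attribute(self) -> str:
        """Attribute such as 'horizontal and vertical', inherited from the detection."""
        return " and ".join(sorted(self.groups))


def build_virtual_network(nodes: List[Node]) -> Set[Tuple[str, int, int]]:
    """Return virtual links as (direction, node_id_a, node_id_b) triples."""
    members = defaultdict(list)
    for n in nodes:
        for direction, bb_id in n.groups.items():
            members[(direction, bb_id)].append(n)
    links = set()
    for (direction, _), group in members.items():
        # Assumption: order along x for horizontal BBs, along y for vertical and depth BBs.
        axis = 0 if direction == "horizontal" else 1
        ordered = sorted(group, key=lambda n: n.center[axis])
        for a, b in zip(ordered, ordered[1:]):
            links.add((direction, a.node_id, b.node_id))
    return links


# A 3 x 3 side view similar to FIG. 14A: row r lies in horizontal BB r, column c in vertical BB c.
nodes = [
    Node(r * 3 + c, (c * 10.0, r * 10.0), {"horizontal": r, "vertical": c})
    for r in range(3)
    for c in range(3)
]
links = build_virtual_network(nodes)
print(len(links), nodes[0].attribute)
```

For the 3 x 3 layout this gives six horizontal and six vertical links, and every node carries the "horizontal and vertical" attribute.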

In the example image frame of FIG. 12A, as shown in FIG. 14A, a node is set for each of the nine individual article BBs, and the nodes are connected by virtual links in the horizontal and vertical directions to generate a virtual network. Each node has the "horizontal and vertical" attribute.

In the example image frame of FIG. 12B, as shown in FIG. 14B, a node is set for each of the 14 individual article BBs, and virtual links are set in the horizontal, vertical, and depth directions, generating a virtual network different from that of FIG. 14A. In FIG. 14B, nodes n1 to n6 have the "horizontal and depth" attribute, nodes n7 and n8 have the "vertical and depth" attribute, and the other nodes have the "horizontal and vertical" attribute.

For an image frame that views such an article group from directly above, as shown schematically in FIG. 14C, a node is set for each of the nine individual article BBs, and the nodes are connected by virtual links in the horizontal and depth directions. Each node has the "horizontal and depth" attribute.

Next, the 3D arrangement pattern generation unit 107 associates the virtual networks generated for the individual image frames of the same article group with one another and integrates them (step S53).

For example, when the virtual networks of FIGS. 14A to 14C have been obtained, they are integrated on criteria such as the similarity of image features within individual article BBs being at or above a threshold, while confirming that the link structures match at least in part.

The partial network formed by the lower two tiers in FIG. 14A matches the partial network formed by the lower two tiers in FIG. 14B in link structure, including node attributes. The image features of the individual article BBs in those parts are also similar, so the correspondence between them can be identified. For the remaining part, the link structures do not match, and because the image features around node n5 in FIG. 14B take values close to those of the center node of the top tier in FIG. 14A, the node group of the virtual network in FIG. 14B is adopted in preference. Comparing the virtual network in FIG. 14C with that in FIG. 14B, the partial networks formed by the upper two rows of both match in link structure, including node attributes. The image features of the individual article BBs in those parts are also similar, so the correspondence between them can be identified. For the remaining part, the link structures do not match, and because the image features of the node directly below node n5 in FIG. 14B and of the center node of the bottom row in FIG. 14C take close values, the partial network of the virtual network in FIG. 14B is adopted in preference. For details of such processing, see, for example, Norikazu Tsunekawa et al., "Sequential 3D Geometric Modeling from 3D Boundaries of Moving Objects", Information Processing Society of Japan, Research Report, pp. 9-16, 2004.3.5.

As described above, the virtual networks are integrated by associating the virtual networks of multiple image frames with one another while taking the closeness of image features into account. In the example of FIGS. 14A to 14C, the virtual network shown in FIG. 14B is adopted as the result.

In other words, a node that newly appears when a viewpoint shift leaves the types of arrangement-direction BBs unchanged is regarded as an article that was outside the field of view and has come into view, and the new node and its virtual links are adopted.

A new node that appears when a viewpoint shift increases the number of types of arrangement-direction BBs is regarded as an article that was hidden and has become visible, and the new node and its virtual links are adopted.

Furthermore, a node that disappears when a viewpoint shift increases the number of types of arrangement-direction BBs is regarded as never having corresponded to an actual article, and the disappeared node is rejected. Conversely, a node that appears when the number of types of arrangement-direction BBs decreases is regarded as a misrecognition and is not adopted.
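The adoption rules for nodes across a viewpoint shift can be written as a small decision function. This is a sketch only: the function name, the string labels, and the "undecided" result for combinations the text does not address are assumptions.

```python
def decide_node(event: str, types_before: int, types_after: int) -> str:
    """Decide what to do with a node that appeared or disappeared after a viewpoint shift.

    types_before / types_after: number of kinds of arrangement-direction BBs
    (horizontal, vertical, depth) detected before and after the shift.
    """
    if event == "appeared":
        if types_after >= types_before:
            # Unchanged: the article was out of view. Increased: it was hidden.
            return "adopt"
        # The number of kinds decreased: treated as a misrecognition.
        return "reject"
    if event == "disappeared" and types_after > types_before:
        # Treated as never having corresponded to an actual article.
        return "reject"
    return "undecided"  # combination not covered by the rules in the text


print(decide_node("appeared", 2, 3))
```

A node that disappears while the number of kinds stays the same or decreases is left undecided here, because the text gives no rule for that case.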

Then, the 3D arrangement pattern generation unit 107 generates a 3D arrangement pattern from the integrated virtual network and stores data on that 3D arrangement pattern in the third data storage unit 108 (step S55).

For example, in the case of FIG. 14B, the eight articles placed frontmost are arranged exactly as the virtual network represents them. For the articles behind them, the virtual network represents the arrangement of the topmost articles but not the rest. The 3D arrangement pattern is therefore generated on the assumption that the invisible portions have the same arrangement as the visible portions represented in the virtual network, that is, the same number of articles in the horizontal, vertical, and depth directions.

In the case of FIG. 14B, a 3D arrangement pattern like the one shown schematically in FIG. 15 is generated. A 3D arrangement pattern represents, for example, each article as a cube. In this example, the bottom tier holds 3×3 articles, the second tier also holds 3×3 articles, and the top tier has no article at the frontmost center position but is otherwise arranged like the bottom and second tiers.
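One way to hold such a 3D arrangement pattern is as an occupancy map over grid cells. The sketch below is illustrative only; the `(column, tier, row)` index convention (left to right, bottom to top, front to back), the function name, and the way observed gaps are passed in are assumptions.

```python
from itertools import product
from typing import Dict, Set, Tuple

Cell = Tuple[int, int, int]  # (column, tier, row): left-to-right, bottom-to-top, front-to-back


def build_pattern(n_cols: int, n_tiers: int, n_rows: int, known_empty: Set[Cell]) -> Dict[Cell, bool]:
    """Occupancy map: hidden cells are presumed filled like the visible ones,
    except for cells observed to be empty."""
    return {
        cell: cell not in known_empty
        for cell in product(range(n_cols), range(n_tiers), range(n_rows))
    }


# FIG. 15-like example: 3 x 3 x 3 grid, front-center cell of the top tier observed empty.
pattern = build_pattern(3, 3, 3, known_empty={(1, 2, 0)})
print(sum(pattern.values()))
```

The bottom and second tiers each contribute nine articles and the top tier eight.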

Generating such a 3D arrangement pattern makes object recognition by the robot 200 easier.

Next, the missing-item detection processing performed by the missing-item detection unit 109 is described. Missing-item detection can be done in various ways. For example, once a 3D arrangement pattern like the one in FIG. 15 has been obtained, missing articles are identified by comparing it with a baseline 3D arrangement pattern or a past 3D arrangement pattern.

For example, when a past 3D arrangement pattern like the one in FIG. 16 is held, article f in FIG. 16 can be identified as the difference from the 3D arrangement pattern in FIG. 15. The missing item may then be indicated, for example as in FIG. 17, by drawing a frame (dotted line) in the image frame at the position corresponding to article f.
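Comparing a past pattern with the current one reduces to a set difference when each pattern is stored as the set of occupied cells. The sketch below assumes a `(column, tier, row)` cell index and assumes, for illustration, that article f occupies the front-center cell of the top tier; neither is stated in the text.

```python
from itertools import product
from typing import List, Set, Tuple

Cell = Tuple[int, int, int]  # (column, tier, row); an assumed index convention


def missing_items(past: Set[Cell], current: Set[Cell]) -> List[Cell]:
    """Cells occupied in the past pattern but empty in the current one."""
    return sorted(past - current)


past_pattern: Set[Cell] = set(product(range(3), range(3), range(3)))  # fully stocked
current_pattern: Set[Cell] = past_pattern - {(1, 2, 0)}               # one article gone

print(missing_items(past_pattern, current_pattern))
```

Each returned cell would then be mapped back to an image position to draw the dotted frame.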

As another method, the missing-item position may be identified by counting the individual article BBs contained in each arrangement-direction BB. In a case like FIG. 12B, each of the four horizontal BBs contains three individual article BBs, but while the two vertical BBs on either side each contain three, the middle vertical BB contains two. Similarly, the two depth BBs on either side each contain three individual article BBs, while the middle depth BB contains two. From this it is determined that an article is missing in the upper part of the middle vertical BB.
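The counting approach can be sketched with a frequency count per direction. Taking the most common count as the expected count is an assumption made for this example, as are the BB names.

```python
from collections import Counter
from typing import Dict


def shortfalls(counts: Dict[str, int]) -> Dict[str, int]:
    """For one direction, report the BBs holding fewer articles than the most common count."""
    expected = Counter(counts.values()).most_common(1)[0][0]
    return {name: expected - n for name, n in counts.items() if n < expected}


# Counts of individual article BBs per arrangement-direction BB, as described for FIG. 12B.
horizontal = {"h1": 3, "h2": 3, "h3": 3, "h4": 3}
vertical = {"v_left": 3, "v_middle": 2, "v_right": 3}
depth = {"d_left": 3, "d_middle": 2, "d_right": 3}

print(shortfalls(horizontal), shortfalls(vertical), shortfalls(depth))
```

The middle vertical BB and the middle depth BB are each one article short, which together locate the gap at the top front of the middle column.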

Next, the foreign-object detection processing is described with reference to FIGS. 18 to 20.

First, the foreign object detection unit 110 extracts, from the multiple image frames stored in the image data storage unit 102, specific image frames in which only vertical BBs and horizontal BBs were detected and specific image frames in which only horizontal BBs and depth BBs were detected (step S61). That is, it extracts image frames that view the article group from almost directly beside it and image frames that view it from almost directly above. Suppose, for example, that an image frame like the one in FIG. 19 has been obtained. The English character string "CUP" is printed on the side of each article; every article except the center one is placed in a posture in which the "C" is visible, while the center article is placed in a posture in which the "P" is visible.

Next, the foreign object detection unit 110 calculates, for each individual article BB identified in a specific image frame, a feature vector representing its shape and color in the image (step S63). Various such feature vectors are known; for example, a CS-LB descriptor (statistical information on image edges) and a hue distribution descriptor are used.

The foreign object detection unit 110 also calculates the average of the feature vectors for each specific image frame (step S65).

Further, for each specific image frame, the foreign object detection unit 110 identifies the individual article BBs whose feature vector lies at a distance from the average that is equal to or greater than a threshold (step S67). This identifies individual article BBs that differ from the average individual article BB in at least one of shape and color, that is, individual article BBs of foreign objects (including articles in a different posture).
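Steps S65 and S67 amount to a distance-from-mean test, sketched below. The Euclidean distance, the two-dimensional toy feature vectors, and the threshold value are assumptions; in practice the vectors would come from the shape and color descriptors of step S63.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise average of the feature vectors of one image frame (step S65)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def find_outliers(vectors: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Indices of individual article BBs whose distance from the mean is >= threshold (step S67)."""
    mean = mean_vector(vectors)
    return [i for i, v in enumerate(vectors) if math.dist(v, mean) >= threshold]


# Toy features for nine articles: eight look alike, the center one (index 4) differs.
features = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] + [[1.0, 0.0]] * 4
print(find_outliers(features, threshold=0.5))
```

Because the outlier itself pulls the mean toward it, a robust statistic such as the median could be substituted when several foreign objects are expected; that variation is not part of the described method.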

In the example of FIG. 19, the center article is identified as a foreign object (or an article in a different posture) and is indicated as such, as shown by the dotted frame in FIG. 20.

When a database with posture data is available for each article to be detected, foreign-object detection can also be performed by preparing a trained model that has machine-learned the data accumulated in that database so that it outputs an article ID and a posture type. That is, feeding the image of each individual article BB into this trained model yields an article ID and a posture type, so it suffices to identify the individual article BBs whose combination of article ID and posture type is in a relative minority. For such a method, see K. Suzuki, Y. Yoshiyasu, A. Gabas, F. Kanehiro, and E. Yoshida, “Toward 6 DOF Object Pose Estimation with Minimum Dataset,” Proc. of the 2019 IEEE/SICE International Symposium on System Integration, pp.462-467. 2019.

An embodiment of the present invention has been described above, but the present invention is not limited to it. For example, the processing flow is only an example; as long as the processing result does not change, the order of steps may be changed or multiple steps may be executed in parallel. The functional configuration in FIG. 1 is likewise only an example and may not match the program module configuration. The information processing apparatus 100 may be implemented on a single computer or on multiple computers. It may also be integrated with the robot 200 or installed at a remote location.

The information processing apparatus 100 described above is a computer device in which, as shown in FIG. 21, a memory 2501, a CPU (Central Processing Unit) 2503, a hard disk drive (HDD) 2505, a display control unit 2507 connected to a display device 2509, a drive device 2513 for a removable disk 2511, an input device 2515, and a communication control unit 2517 for connecting to a network are connected by a bus 2519. The HDD may instead be another storage device such as a solid state drive (SSD). The operating system (OS) and the application program for carrying out the processing of the embodiment of the present invention are stored in the HDD 2505 and are read from the HDD 2505 into the memory 2501 when executed by the CPU 2503. The CPU 2503 controls the display control unit 2507, the communication control unit 2517, and the drive device 2513 according to the processing of the application program so that they perform predetermined operations. Data in the middle of processing is stored mainly in the memory 2501 but may also be stored in the HDD 2505. In the embodiment of the present technology, the application program for carrying out the processing described above is stored on the computer-readable removable disk 2511 for distribution and is installed from the drive device 2513 onto the HDD 2505. It may also be installed onto the HDD 2505 via a network such as the Internet and the communication control unit 2517. Such a computer device realizes the various functions described above through the organic cooperation of hardware such as the CPU 2503 and the memory 2501 with programs such as the OS and the application program.

Data used in executing the processing described above, whether intermediate data or processing results, is stored in a storage device such as the memory 2501 or the HDD 2505.

The embodiment described above can be summarized as follows.

The information processing method according to the present embodiment includes a process of detecting, from an input image, a first region of a group of articles continuing in the vertical direction, a second region of a group of articles continuing in the horizontal direction, and a third region of a group of articles continuing in the depth direction, using a trained model. The trained model has been trained to detect, in images of groups of aligned articles, a region of articles continuing in the vertical direction (the direction parallel to gravity), a region of articles continuing in the horizontal direction, and a region of articles continuing in the depth direction, which is orthogonal to both the vertical and horizontal directions.

No such trained model has previously been constructed or used, and grasping the arrangement directions is an important element in recognizing the arrangement state of an aligned article group. Using this processing result is also expected to improve the accuracy of detecting the individual articles in an aligned article group.

The information processing method may include a process of identifying, in the input image, a fourth area for each individual article included in the article group, based on the state of overlap between at least two of the first area, the second area, and the third area. The first to third areas can be used in various ways, but focusing on their overlap in this way makes it possible to detect individual articles effectively.
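As a rough illustration of using overlap, the following Python sketch treats each detected area as an axis-aligned box and takes the intersection of a vertically continuous area with a horizontally continuous area as a candidate area for a single article. The box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions made for this example; the text above does not fix a particular data representation or overlap rule.

```python
from typing import List, Optional, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in image coordinates.
Box = Tuple[float, float, float, float]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping rectangle of two boxes, or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


def candidate_article_boxes(vertical_runs: List[Box], horizontal_runs: List[Box]) -> List[Box]:
    """Collect every overlap between a vertical run and a horizontal run."""
    candidates = []
    for v in vertical_runs:
        for h in horizontal_runs:
            box = intersect(v, h)
            if box is not None:
                candidates.append(box)
    return candidates


# One column of stacked articles crossing one row of articles on the bottom shelf.
column = (0, 0, 10, 30)
row = (0, 20, 30, 30)
print(candidate_article_boxes([column], [row]))  # [(0, 20, 10, 30)]
```

The same pairwise test could be applied to the third (depth-direction) area; which pairs to combine is a design choice left open here.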

In the process of identifying the fourth area described above, when any of the first area, the second area, and the third area has no overlap with another area, a second trained model, trained to detect the areas of individual articles in an article group, may be used to identify the fourth area of each individual article within that non-overlapping area. When few articles are arranged, no overlap may occur, and in that case general object detection is performed.

Also, in the process of identifying the fourth area described above, when the difference area between any one of the first to third areas and the union of the fourth areas detected within that area is larger than a single detected fourth area, a new fourth area may be identified within the difference area based on the size of the detected fourth areas. That is, when the areas do not overlap much, the remaining articles may be inferred from the portions that do overlap.
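The size-based inference can be sketched in a simplified, one-dimensional form. The Python example below assumes a horizontally continuous area and a list of article boxes already detected inside it, and fills any uncovered gap that is about one article wide or more with new boxes of the mean detected width. The box format, the function name, and the restriction to the horizontal axis are all assumptions for illustration, not details given in the text.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in image coordinates.
Box = Tuple[float, float, float, float]


def infer_boxes_in_gaps(run: Box, detected: List[Box]) -> List[Box]:
    """Fill horizontal gaps in `run` with boxes of the mean detected width."""
    if not detected:
        return []
    width = sum(b[2] - b[0] for b in detected) / len(detected)
    if width <= 0:
        return []
    inferred: List[Box] = []
    cursor = run[0]  # right edge of the part of the run covered so far
    # A zero-width sentinel at the right end of the run closes the last gap.
    sentinel = (run[2], run[1], run[2], run[3])
    for box in sorted(detected) + [sentinel]:
        gap = box[0] - cursor
        # Only gaps that can hold at least one whole article receive new boxes.
        for i in range(int(gap / width + 1e-9)):
            left = cursor + i * width
            inferred.append((left, run[1], left + width, run[3]))
        cursor = max(cursor, box[2])
    return inferred


run = (0, 0, 40, 10)
detected = [(0, 0, 10, 10), (10, 0, 20, 10)]
print(infer_boxes_in_gaps(run, detected))
```

In this example the two detected boxes cover the left half of the run, so two further boxes of the same width are inferred for the right half.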

Furthermore, the information processing method may further include a process of generating a virtual network in which the fourth areas are represented by nodes and the adjacency relationships between fourth areas in the vertical, horizontal, and depth directions are represented by links, and determining a three-dimensional arrangement pattern of the article group based on the generated virtual network. Grasping the three-dimensional arrangement pattern in this way makes operations on the article group, such as robot manipulation, easier to perform.
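One way to picture the step from virtual network to three-dimensional arrangement pattern is the Python sketch below. It assumes the links are already available as `(node_a, node_b, direction)` triples, where `node_b` is the next article after `node_a` in the named direction, and assigns integer grid coordinates to every reachable node by breadth-first traversal from a chosen root. The triple format, the direction names, and the traversal itself are assumptions for this example; the text does not specify how the pattern is derived from the network.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List, Tuple

Coord = Tuple[int, int, int]  # (horizontal, vertical, depth) grid index
Link = Tuple[Hashable, Hashable, str]

# Unit step taken when following a link in each direction.
STEP: Dict[str, Coord] = {
    "horizontal": (1, 0, 0),
    "vertical": (0, 1, 0),
    "depth": (0, 0, 1),
}


def layout_from_links(links: Iterable[Link], root: Hashable) -> Dict[Hashable, Coord]:
    """Assign grid coordinates to nodes by walking adjacency links from `root`."""
    adjacency: Dict[Hashable, List[Tuple[Hashable, Coord]]] = {}
    for a, b, direction in links:
        dx, dy, dz = STEP[direction]
        adjacency.setdefault(a, []).append((b, (dx, dy, dz)))
        adjacency.setdefault(b, []).append((a, (-dx, -dy, -dz)))
    coords: Dict[Hashable, Coord] = {root: (0, 0, 0)}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        x, y, z = coords[node]
        for neighbor, (dx, dy, dz) in adjacency.get(node, []):
            if neighbor not in coords:
                coords[neighbor] = (x + dx, y + dy, z + dz)
                queue.append(neighbor)
    return coords


links = [("a", "b", "horizontal"), ("a", "c", "vertical"), ("b", "d", "depth")]
print(layout_from_links(links, "a"))
# {'a': (0, 0, 0), 'b': (1, 0, 0), 'c': (0, 1, 0), 'd': (1, 0, 1)}
```

The resulting coordinate map is one possible encoding of the arrangement pattern; a robot planner could, for instance, read the extent of each axis from it.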

The information processing method may further include a process of identifying, based on the three-dimensional arrangement pattern, positions in that pattern where articles are missing. For example, missing articles can be detected easily from the difference from a past or reference three-dimensional arrangement pattern.
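The comparison with a reference pattern can be illustrated with a set difference. The Python sketch below assumes each arrangement pattern is given as a collection of occupied grid cells `(column, row, depth)`; this encoding and the function name are assumptions for the example.

```python
from typing import Iterable, List, Tuple

Cell = Tuple[int, int, int]  # (column, row, depth) grid index; an assumed encoding


def missing_positions(reference: Iterable[Cell], observed: Iterable[Cell]) -> List[Cell]:
    """Return the cells occupied in the reference pattern but empty in the observed one."""
    return sorted(set(reference) - set(observed))


# A 3 x 2 front face in the reference pattern, with one article removed.
reference = [(x, y, 0) for x in range(3) for y in range(2)]
observed = [cell for cell in reference if cell != (1, 1, 0)]
print(missing_positions(reference, observed))  # [(1, 1, 0)]
```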

The information processing method may further include a process of identifying missing positions of articles based on the number of fourth areas included in the first area, the number of fourth areas included in the second area, and the number of fourth areas included in the third area. A portion that breaks the regularity of these counts is a missing position.
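A minimal Python sketch of the count-based check follows. It assumes the number of article areas found in each continuous area has already been tallied under a hypothetical name such as `column_0`, takes the most frequent count as the expected one, and reports every area that falls short. Treating the most frequent count as the rule is an assumption of this example; the text only says that a break in the regularity of the counts marks a missing position.

```python
from collections import Counter
from typing import Dict


def count_shortfalls(counts: Dict[str, int]) -> Dict[str, int]:
    """Map each area whose article count is below the most common count to its shortfall."""
    if not counts:
        return {}
    expected = Counter(counts.values()).most_common(1)[0][0]
    return {name: expected - n for name, n in counts.items() if n < expected}


# Three columns that should each hold three stacked articles.
print(count_shortfalls({"column_0": 3, "column_1": 3, "column_2": 2}))  # {'column_2': 1}
```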

Furthermore, the information processing method may further include a process of identifying, based on the image data within the fourth areas, the fourth area of a foreign object or of an article in an abnormal posture within the article group. A foreign object or abnormal posture may be detected from, for example, feature vectors of the image data; alternatively, an article identifier and a posture identifier may be extracted from the image data, and the fourth areas that are in a comparative minority may be identified.
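The identifier-based variant can be sketched as a majority vote. The Python example below assumes that an article identifier and a posture identifier have already been extracted for each article area (how they are extracted is outside this sketch) and flags every area whose pair differs from the most common pair. The label values and function name are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Label = Tuple[str, str]  # (article identifier, posture identifier)


def minority_indices(labels: List[Label]) -> List[int]:
    """Return the indices of article areas whose label differs from the majority label."""
    if not labels:
        return []
    majority = Counter(labels).most_common(1)[0][0]
    return [i for i, label in enumerate(labels) if label != majority]


labels = [("tea", "upright")] * 4 + [("tea", "fallen"), ("cola", "upright")]
print(minority_indices(labels))  # [4, 5]
```

Here index 4 corresponds to an article in an abnormal posture and index 5 to a foreign object among the tea articles.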

A program for causing a computer to execute the information processing method described above can be created, and the program is stored in various storage media.

An information processing apparatus that executes the information processing method described above may be realized by a single computer or by a plurality of computers; these are collectively referred to as an information processing system, or simply a system.

100 Information processing device
101 Image acquisition unit
102 Image data storage unit
103 Arrangement direction detection unit
104 First data storage unit
105 Individual article detection unit
106 Second data storage unit
107 3D arrangement pattern generation unit
108 Third data storage unit
109 Missing item detection unit
110 Foreign object detection unit

Claims

1. A program for causing a computer to execute a process of detecting, from an input image, a first area of an article group that is continuous in the vertical direction, a second area of an article group that is continuous in the horizontal direction, and a third area of an article group that is continuous in the depth direction, by means of a trained model that has been trained to detect, in an image of article groups arranged in alignment, an area of an article group continuous in the vertical direction, which is the direction parallel to the direction of gravity, an area of an article group continuous in the horizontal direction, and an area of an article group continuous in the depth direction orthogonal to the vertical direction and the horizontal direction.

2. The program according to claim 1, further causing the computer to execute a process of identifying, in the input image, a fourth area for each individual article included in the article group, based on the state of overlap between at least two of the first area, the second area, and the third area.

3. The program according to claim 2, wherein, in the process of identifying the fourth area, when any of the first area, the second area, and the third area has no overlap with another area, the fourth area of each individual article is identified within the area having no overlap with another area by a second trained model that has been trained to detect the areas of individual articles in an article group.

4. The program according to claim 2 or 3, wherein, in the process of identifying the fourth area, when the difference area between any one of the first to third areas and the union of the fourth areas detected within that area is larger than a single detected fourth area, a new fourth area is identified within the difference area based on the size of the detected fourth areas.

5. The program according to any one of claims 2 to 4, further causing the computer to execute a process of generating a virtual network in which the fourth areas are represented by nodes and the adjacency relationships between the fourth areas in the vertical direction, the horizontal direction, and the depth direction are represented by links, and determining a three-dimensional arrangement pattern of the article group based on the generated virtual network.

6. The program according to claim 5, further causing the computer to execute a process of identifying, based on the three-dimensional arrangement pattern, missing positions of articles in the three-dimensional arrangement pattern.

7. The program according to any one of claims 2 to 4, further causing the computer to execute a process of identifying missing positions of articles based on the number of fourth areas included in the first area, the number of fourth areas included in the second area, and the number of fourth areas included in the third area.

8. The program according to any one of claims 2 to 4, further causing the computer to execute a process of identifying, based on the image data within the fourth areas, the fourth area of a foreign object or of an article in an abnormal posture within the article group.

9. An information processing method in which a computer executes a process of detecting, from an input image, a first area of an article group that is continuous in the vertical direction, a second area of an article group that is continuous in the horizontal direction, and a third area of an article group that is continuous in the depth direction, by means of a trained model that has been trained to detect, in an image of article groups arranged in alignment, an area of an article group continuous in the vertical direction, which is the direction parallel to the direction of gravity, an area of an article group continuous in the horizontal direction, and an area of an article group continuous in the depth direction orthogonal to the vertical direction and the horizontal direction.

10. The information processing method according to claim 9, further comprising a process of identifying, in the input image, a fourth area for each individual article included in the article group, based on the state of overlap between at least two of the first area, the second area, and the third area.

11. An information processing system comprising means for detecting, from an input image, a first area of an article group that is continuous in the vertical direction, a second area of an article group that is continuous in the horizontal direction, and a third area of an article group that is continuous in the depth direction, by means of a trained model that has been trained to detect, in an image of article groups arranged in alignment, an area of an article group continuous in the vertical direction, which is the direction parallel to the direction of gravity, an area of an article group continuous in the horizontal direction, and an area of an article group continuous in the depth direction orthogonal to the vertical direction and the horizontal direction.

12. The information processing system according to claim 11, further comprising means for identifying, in the input image, a fourth area for each individual article included in the article group, based on the state of overlap between at least two of the first area, the second area, and the third area.