JP2021533449A

JP2021533449A - Store realogram based on deep learning

Info

Publication number: JP2021533449A
Application number: JP2021504467A
Authority: JP
Inventors: Jordan E. Fisher; Daniel L. Fischetti; Michael S. Suswal; Nicholas J. Locascio
Original assignee: Standard Cognition, Corp.
Priority date: 2018-07-26
Filing date: 2019-07-25
Publication date: 2021-12-02
Anticipated expiration: 2039-07-25
Also published as: TWI779219B; CA3107446A1; TW202013240A; EP3827391A4; WO2020023798A1; EP3827391A1; JP7228671B2

Abstract

Systems and techniques are provided for tracking inventory items in an area of real space. A plurality of cameras or other sensors produce respective sequences of images of corresponding fields of view in the real space. The field of view of each camera overlaps with the field of view of at least one other camera. The system is coupled to the plurality of cameras and uses the sequences of images produced by at least two of the cameras to identify inventory events. An inventory event includes an item identifier, a location, and a timestamp. A plurality of cells having coordinates in the area of real space is stored in memory as a data set. The processing system uses respective counts of inventory events to calculate, at a scoring time, scores for inventory items having locations that match particular cells. [Selected drawing] FIG. 11A

Description

Priority application

This application claims the benefit of US Provisional Patent Application No. 62/703,785 (attorney docket number STCG 1006-1), filed July 26, 2018, and US Patent Application No. 16/256,355 (attorney docket number STCG 1007-1), filed January 24, 2019, both of which are incorporated herein by reference. US Patent Application No. 16/256,355 is a continuation-in-part of US Patent Application No. 15/945,473 (attorney docket number STCG 1005-1), filed April 4, 2018, which is a continuation-in-part of US Patent Application No. 15/907,112 (attorney docket number STCG 1002-1), filed February 27, 2018 (now US Patent No. 10,133,933, issued November 20, 2018), which is a continuation-in-part of US Patent Application No. 15/847,796 (attorney docket number STCG 1001-1), filed December 19, 2017 (now US Patent No. 10,055,853, issued on August 21), which claims the benefit of US Provisional Patent Application No. 62/542,077 (attorney docket number STCG 1000-1), filed August 7, 2017. These US patent applications are incorporated herein by reference.

The present invention relates to systems that track inventory items in an area of real space that includes inventory display structures.

Determining the quantity and location of the various inventory items stocked in inventory display structures in an area of real space, such as a shopping store, is required for efficient operation of the store. Subjects in the area of real space, such as customers, take items from shelves and put them in their shopping carts or baskets. Customers can also put items back on the same shelf, or on a different shelf, if they decide not to buy them. Thus, over a period of time, inventory items can be taken from their designated locations on shelves and dispersed to other shelves in the store. In some systems, the quantity of stocked items becomes available only after a significant delay, because receipts must be reconciled with the stock inventory. Delayed availability of information about the quantity of items stocked in a store can affect customers' purchase decisions, as well as the actions store management takes to order more of the inventory items in high demand.

It is desirable to provide a system that can more effectively and automatically report, in real time, the quantities of items stocked on shelves and identify the locations of items on the shelves.

A system, and a method for operating the system, are provided for tracking inventory items in an area of real space. A plurality of cameras or other sensors produce respective sequences of images of corresponding fields of view in the real space. The system includes processing logic that is coupled to the plurality of sensors and uses the sequences of images produced by at least two of the sensors to identify inventory events. The system tracks inventory items in the area of real space in response to the inventory events.

A system and method are provided for tracking inventory items in an area of real space. A plurality of cameras or other sensors produce respective sequences of images of corresponding fields of view in the real space. The field of view of each sensor overlaps with the field of view of at least one other sensor in the plurality of sensors. The system uses the sequences of images produced by at least two of the sensors to identify inventory events. The system tracks the locations of inventory items in the area of real space in response to the inventory events.

In one embodiment, an inventory event includes an item identifier, a put or take indicator, a location represented by positions along three axes of the area of real space, and a timestamp. The system can include, or have access to, memory storing a data set that defines a plurality of cells having coordinates in the area of real space. The system includes logic that matches the locations of inventory items with the coordinates of cells and maintains data representing the inventory items matched with cells in the plurality of cells. The area of real space can include a plurality of inventory locations. The coordinates of cells in the plurality of cells can be correlated with inventory locations, or portions of inventory locations, in the plurality of inventory locations. The system includes logic that uses respective counts of inventory events to calculate, at a scoring time, scores for inventory items having locations that match particular cells. The logic that calculates the scores for a cell uses a sum of the puts and takes of inventory items, weighted by the separation between the timestamps of the puts and takes and the scoring time. The system includes logic to store the scores in memory.
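The inventory event record and the time-weighted score described above can be illustrated with a minimal Python sketch. It is not the scoring formula of the embodiment: the exponential decay, the `half_life` parameter, the choice to count puts and takes equally as evidence that an item is located at a cell, and the names `InventoryEvent` and `score_items` are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryEvent:
    item_id: str      # item identifier (for example, a SKU)
    is_put: bool      # True for a put, False for a take
    position: tuple   # (x, y, z) along the three axes of the real space
    timestamp: float  # seconds since an arbitrary epoch


def score_items(events, scoring_time, half_life=3600.0):
    """Score the items of events that were already matched to one cell.

    Assumption: each put or take adds a weight that halves every
    `half_life` seconds of separation from the scoring time.
    """
    scores = defaultdict(float)
    for event in events:
        age = scoring_time - event.timestamp
        if age < 0:
            continue  # ignore events that occur after the scoring time
        scores[event.item_id] += 0.5 ** (age / half_life)
    return dict(scores)


events = [
    InventoryEvent("ketchup", True, (1.0, 2.0, 1.5), 0.0),
    InventoryEvent("ketchup", False, (1.0, 2.0, 1.5), 3600.0),
    InventoryEvent("mustard", False, (1.0, 2.0, 1.5), 3600.0),
]
scores = score_items(events, scoring_time=3600.0)
```

With these inputs the older ketchup event contributes half the weight of the recent one, so recent activity dominates the score of the cell.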

In one embodiment, the system includes logic to render a display image representing cells in the plurality of cells and the scores for the cells. In this embodiment, the scores are represented by variations of color in the display image representing the cells. The system includes logic to select a set of inventory items for each cell based on the scores. In one embodiment, the area of real space includes a plurality of inventory locations, and the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations. In this embodiment, the data set in memory defines the plurality of cells having coordinates in the area of real space.
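Selecting a set of inventory items for each cell based on the scores could look like the following sketch. The threshold `min_score`, the cap `max_items`, and the function name `select_items` are hypothetical; the text above does not state the selection rule.

```python
def select_items(scores, min_score=0.5, max_items=3):
    """Return the highest-scoring item identifiers for one cell.

    Assumption: items below `min_score` are dropped and at most
    `max_items` are kept, in descending order of score.
    """
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, score in ranked if score >= min_score][:max_items]


cell_scores = {"ketchup": 1.5, "mustard": 0.2, "relish": 0.9}
selected = select_items(cell_scores)
```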

The system can include, or have access to, memory storing a planogram that identifies inventory locations in the area of real space and the inventory items to be positioned at those inventory locations. The planogram can also include information about the portions of inventory locations designated for particular inventory items. The planogram can be produced based on a plan for the arrangement of inventory items on the inventory locations in the area of real space.

The system includes logic to maintain data representing the inventory items matched with cells in the plurality of cells. The system can also include logic to determine misplaced items by comparing the data representing the inventory items matched with cells to the planogram.
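The comparison against the planogram can be sketched as a set difference per cell. The dictionary layout (cell identifier mapped to a set of item identifiers) and the name `find_misplaced` are assumptions for illustration, not the data structures of the embodiment.

```python
def find_misplaced(observed_by_cell, planogram_by_cell):
    """Return {cell_id: items observed in the cell but not planned for it}.

    Both arguments map a cell identifier to a set of item identifiers.
    Cells absent from the planogram are treated as having no planned items.
    """
    misplaced = {}
    for cell_id, observed in observed_by_cell.items():
        planned = planogram_by_cell.get(cell_id, set())
        extra = set(observed) - set(planned)
        if extra:
            misplaced[cell_id] = extra
    return misplaced


observed = {"c1": {"ketchup"}, "c2": {"ketchup", "mustard"}}
planogram = {"c1": {"ketchup"}, "c2": {"mustard"}}
misplaced = find_misplaced(observed, planogram)
```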

As discussed herein, the system can generate and store in memory a data structure, referred to herein as a "realogram", that identifies the locations of inventory items in the area of real space based on an accumulation of data about the items and their locations in detected inventory events. The data in the realogram can be compared to the data in a planogram to determine how inventory items are disposed in the area compared to the plan, for example to locate misplaced items. The realogram can also be processed to locate inventory items in three-dimensional cells and to correlate those cells with inventory locations in the store, as can be determined, for example, from the planogram or other maps of the inventory locations. The realogram can also be processed to track activity related to particular inventory items at various locations in the area. Other uses of realograms are possible as well.

Systems and methods are provided for tracking inventory items in an area of real space that includes inventory display structures. The system includes a plurality of cameras disposed above the inventory display structures. The cameras produce respective sequences of images of the inventory display structures in corresponding fields of view in the real space. The field of view of each camera overlaps with the field of view of at least one other camera in the plurality of cameras. A data set defines a plurality of cells having coordinates in the area of real space. The data set is stored in memory. The system processes the sequences of images produced by the plurality of cameras to find the locations of inventory events in three dimensions in the area of real space. The system includes logic that, in response to an inventory event, determines the nearest cell in the data set based on the location of the inventory event. The system includes logic that uses respective counts of inventory events to calculate, at a scoring time, scores for the inventory items associated with inventory events having locations that match particular cells.

In one embodiment, the system includes logic to select a set of inventory items for each cell based on the scores. In one embodiment, an inventory event includes an item identifier, a put or take indicator, a location represented by positions along three axes of the area of real space, and a timestamp. In one embodiment, the system includes a data set defining a plurality of cells represented as a two-dimensional grid having coordinates in the area of real space. The cells can be correlated with portions of the front plan of inventory locations. The processing system includes logic that determines the nearest cell based on the location of the inventory event. In one embodiment, the system includes a data set defining a plurality of cells represented as a three-dimensional grid having coordinates in the area of real space. The cells can be correlated with portions of the volume on inventory locations. The processing system includes logic that determines the nearest cell based on the location of the inventory event. The put indicator identifies that an item was placed on an inventory location, and the take indicator identifies that an item was taken off an inventory location.

In one embodiment, the logic that processes the sequences of images produced by the plurality of cameras comprises image recognition engines. The image recognition engines generate data sets representing elements in the images that correspond to hands. The system includes logic that performs analysis of the data sets from the sequences of images from at least two cameras to determine the locations of inventory events in three dimensions. The image recognition engines comprise convolutional neural networks.

In one embodiment, the system includes logic that calculates the scores for a cell using a sum of the takes and puts of inventory items, weighted by the separation between the timestamps of the puts and takes and the scoring time. The scores are stored in memory. In one embodiment, the logic that determines the nearest cell in the data set based on the location of the inventory event includes calculating the distances from the location of the inventory event to cells in the data set and matching the inventory event with a cell based on the calculated distances.
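The distance-based matching of an inventory event to a cell can be sketched as follows. Representing each cell by a single center point, using Euclidean distance, and the name `nearest_cell` are assumptions; the embodiment may measure distance to a cell differently.

```python
import math


def nearest_cell(event_position, cell_centers):
    """Return the identifier of the cell whose center is closest.

    `event_position` is an (x, y, z) tuple and `cell_centers` maps a cell
    identifier to the (x, y, z) center of that cell (hypothetical layout).
    """
    return min(
        cell_centers,
        key=lambda cell_id: math.dist(event_position, cell_centers[cell_id]),
    )


cells = {"A": (0.0, 0.0, 0.0), "B": (1.0, 0.0, 0.0), "C": (0.0, 1.0, 0.0)}
matched = nearest_cell((0.8, 0.1, 0.0), cells)
```

The same function works for a two-dimensional grid if both the event position and the cell centers are given as (x, y) pairs.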

Methods and computer program products that can be executed by computer systems are also described herein.

The functions described herein, including but not limited to identifying inventory events together with the items associated with them, linking the events to cells in a plurality of cells having coordinates in the area of real space, and updating the store realogram, present complex problems of computer engineering relating, for example, to the type of image data to be processed, what processing of the image data to perform, and how to determine actions from the image data with high reliability.

Other aspects and advantages of the present invention can be seen on review of the drawings, the detailed description, and the claims that follow.

An architectural-level schematic of a system in which a store inventory engine and a store realogram engine track inventory items in an area of real space that includes inventory display structures.

A side view of an aisle in a shopping store, showing a subject, an inventory display structure, and a camera arrangement in the shopping store.

A perspective view of the inventory display structure in the aisle of FIG. 2A, showing a subject taking an item from a shelf in the inventory display structure.

Examples of 2D and 3D maps of a shelf in an inventory display structure.

An example data structure for storing joint information of subjects.

An example data structure for storing a subject, including information about its associated joints.

A top view of the inventory display structure of a shelf unit in the aisle of FIG. 2A in the shopping store, showing selection of a shelf in the inventory display structure based on the location of an inventory event indicating an item taken from the shelf.

An example of a log data structure that can be used to store a subject's shopping cart, or the inventory items stocked on a shelf or in the shopping store.

A flowchart showing process steps for determining the inventory items on shelves and in the shopping store based on the locations of puts and takes of inventory items.

An example architecture in which the technique presented in the flowchart of FIG. 8 can be used to determine the inventory items on shelves in an area of real space.

An example architecture in which the technique presented in the flowchart of FIG. 8 can be used to update a store inventory data structure.

Discretization of shelves in a portion of an inventory display structure using a two-dimensional (2D) grid.

An example of a realogram using a three-dimensional (3D) grid of shelves, showing the locations of an inventory item dispersed, after one day, from its designated location on a portion of a shelf in an inventory display structure to other locations on the same shelf and to locations on different shelves in other inventory display structures in the shopping store.

An example showing the realogram of FIG. 11A displayed on the user interface of a computing device.

A flowchart showing process steps for calculating a realogram of the inventory items stocked on shelves of inventory display structures in a shopping store, based on the locations of puts and takes of inventory items.

A flowchart showing process steps for using a realogram to determine restocking of inventory items.

An example user interface displaying a restock notification for an inventory item.

A flowchart showing process steps for using a realogram to determine planogram compliance.

An example user interface displaying a misplaced-item notification for an inventory item.

A flowchart showing process steps for using a realogram to adjust confidence score probabilities of inventory item predictions.

A camera and computer hardware arrangement configured for hosting the inventory consolidation engine and the store realogram engine of FIG. 1.

The following description is presented to enable a person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the invention. Thus, the invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.

[System overview]

The system and various embodiments of the subject technology are described with reference to FIGS. 1 to 13. The system and processes are described with reference to FIG. 1, an architectural-level schematic of a system in accordance with an embodiment. Because FIG. 1 is an architectural diagram, certain details are omitted to improve the clarity of the description.

The discussion of FIG. 1 is organized as follows. First, the elements of the system are described, followed by their interconnections. Then, the use of the elements in the system is described in greater detail.

FIG. 1 provides a block-diagram-level illustration of a system 100. The system 100 includes cameras 114, network nodes hosting image recognition engines 112a, 112b, and 112n, a store inventory engine 180 deployed in a network node 104 (or nodes) on the network, a store realogram engine 190 deployed in a network node 106 (or nodes) on the network, a network node 102 hosting a subject tracking engine 110, a maps database 140, an inventory events database 150, a planogram and inventory database 160, a realogram database 170, and one or more communication networks 181. A network node can host only one image recognition engine, or several image recognition engines as described herein. The system can also include a subject database and other supporting data.

As used herein, a network node is an addressable hardware device or virtual device that is attached to a network and is capable of sending, receiving, or forwarding information over a communications channel to or from other network nodes. Examples of electronic devices that can be deployed as hardware network nodes include all varieties of computers, workstations, laptop computers, handheld computers, and smartphones. Network nodes can be implemented in a cloud-based server system. More than one virtual device configured as a network node can be implemented using a single physical device.

For the sake of clarity, only three network nodes hosting image recognition engines are shown in the system 100. However, any number of network nodes hosting image recognition engines can be connected to the subject tracking engine 110 through the network 181. Similarly, the image recognition engines, the subject tracking engine, the store inventory engine, the store realogram engine, and the other processing engines described herein can execute using more than one network node in a distributed architecture.

The interconnection of the elements of the system 100 is described next. The network 181 couples the network nodes 101a, 101b, and 101n hosting the image recognition engines 112a, 112b, and 112n respectively, the network node 104 hosting the store inventory engine 180, the network node 106 hosting the store realogram engine 190, the network node 102 hosting the subject tracking engine 110, the maps database 140, the inventory events database 150, the inventory database 160, and the realogram database 170. The cameras 114 are connected to the subject tracking engine 110 through the network nodes hosting the image recognition engines 112a, 112b, and 112n. In one embodiment, the cameras 114 are installed in a shopping store (such as a supermarket), with sets of cameras 114 (two or more) having overlapping fields of view positioned over each aisle to capture images of the real space in the store. In FIG. 1, two cameras are arranged over aisle 116a, two cameras are arranged over aisle 116b, and three cameras are arranged over aisle 116n. The cameras 114 are installed over the aisles with overlapping fields of view. In such an embodiment, the cameras are configured with the goal that customers moving in the aisles of the shopping store are present in the field of view of two or more cameras at any moment in time.

The cameras 114 can be synchronized in time with each other, so that images are captured at the same time, or close in time, and at the same image capture rate. The cameras 114 can send respective continuous streams of images at a predetermined rate to the network nodes hosting the image recognition engines 112a-112n. Images captured at the same time, or close in time, by all the cameras covering an area of real space are synchronized in the sense that the synchronized images can be identified in the processing engines as representing different views of subjects having fixed positions in the real space. For example, in one embodiment, the cameras send image frames at a rate of 30 frames per second (fps) to the respective network nodes hosting the image recognition engines 112a-112n. Each frame has a timestamp, the identity of the camera (abbreviated as "camera ID"), and a frame identity (abbreviated as "frame ID"), along with the image data. Other embodiments of the disclosed technology can use different types of sensors, such as infrared image sensors, radio-frequency (RF) image sensors, ultrasound sensors, thermal sensors, and lidar, to generate this data. Multiple types of sensors can be used, including for example infrared or RF image sensors, in addition to the cameras 114 that generate RGB color output. The multiple sensors are synchronized in time with each other, so that frames are captured by the sensors at the same time, or close in time, and at the same frame capture rate. In all of the embodiments disclosed herein, sensors other than cameras, or sensors of multiple types, can be used to produce the sequences of images utilized.
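The per-frame metadata and the notion of synchronized images can be sketched as follows. The `Frame` fields mirror the timestamp, camera ID, and frame ID named in the text; bucketing timestamps into intervals of 1/fps and requiring at least two cameras per bucket are assumptions made for illustration, not the synchronization mechanism of the embodiment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Frame:
    camera_id: str
    frame_id: int
    timestamp: float  # capture time in seconds
    image: bytes = b""


def group_synchronized(frames, fps=30.0):
    """Group frames captured in the same 1/fps interval.

    Returns a list of {camera_id: Frame} dictionaries, ordered by time,
    keeping only intervals seen by at least two cameras (assumption).
    """
    buckets = defaultdict(dict)
    for frame in frames:
        slot = round(frame.timestamp * fps)  # index of the capture interval
        buckets[slot][frame.camera_id] = frame
    return [group for _, group in sorted(buckets.items()) if len(group) >= 2]


frames = [
    Frame("cam1", 0, 0.000), Frame("cam2", 0, 0.001),
    Frame("cam1", 1, 0.0334), Frame("cam2", 1, 0.0331),
    Frame("cam1", 2, 0.0667),
]
groups = group_synchronized(frames)
```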

The cameras installed over an aisle are connected to respective image recognition engines. For example, in FIG. 1, the two cameras installed over aisle 116a are connected to the network node 101a hosting the image recognition engine 112a. Likewise, the two cameras installed over aisle 116b are connected to the network node 101b hosting the image recognition engine 112b. In the illustrated example, each image recognition engine 112a-112n hosted in a network node 101a-101n separately processes the image frames received from one camera.

In one embodiment, each image recognition engine 112a, 112b, and 112n is implemented as a deep learning algorithm such as a convolutional neural network (abbreviated CNN). In such an embodiment, the CNN is trained using a training database 150. In the embodiments described herein, image recognition of subjects in the real space is based on identifying and grouping the joints recognizable in the images, where the groups of joints can be attributed to individual subjects. For this joints-based analysis, the training database 150 holds a large collection of images for each of the different types of joints of the subjects. In the example embodiment of a shopping store, the subjects are the customers moving in the aisles between the shelves. In the example embodiment, during training of the CNN, the system 100 is referred to as a "training system". After the CNN is trained using the training database 150, it is switched to production mode to process images of customers in the shopping store in real time.

In an example embodiment, during production, the system 100 is referred to as a runtime system (also referred to as an inference system). The CNN in each image recognition engine produces arrays of joint data structures for the images in its respective stream of images. In the embodiments described herein, an array of joint data structures is produced for each processed image, so that each image recognition engine 112a-112n produces an output stream of arrays of joint data structures. These arrays of joint data structures from cameras having overlapping fields of view are further processed to form groups of joints and to identify such groups of joints as subjects. The system can identify and track a subject using an identifier "subject ID" while the subject is present in the area of real space.

The subject tracking engine 110 is hosted on the network node 102 and, in this example, receives continuous streams of arrays of joint data structures for the subjects from the image recognition engines 112a-112n. The subject tracking engine 110 processes the arrays of joint data structures and translates the coordinates of the elements in the arrays, which correspond to images in different sequences, into candidate joints having coordinates in the real space. For each set of synchronized images, the combination of candidate joints identified throughout the real space can be considered, for the purposes of analogy, to be like a galaxy of candidate joints. At each succeeding point in time, the movement of the candidate joints is recorded, so that the galaxy changes over time. The subject tracking engine 110 identifies the subjects in the area of real space at a moment in time.

The tracking engine 110 uses logic to identify groups or sets of candidate joints having coordinates in real space as subjects in the real space. For the purposes of analogy, each set of candidate points is like a constellation of candidate joints at each point in time. The constellations of candidate joints can move over time. A time sequence analysis of the output of the subject tracking engine 110 over a period of time identifies the movements of subjects in the area of real space.

In an example embodiment, the logic to identify sets of candidate joints comprises heuristic functions based on the physical relationships among the joints of subjects in real space. These heuristic functions are used to identify sets of candidate joints as subjects. A set of candidate joints comprises individual candidate joints that have relationships, according to the heuristic parameters, with other individual candidate joints, and subsets of candidate joints in a given set that have been identified, or can be identified, as an individual subject.

In the example of a shopping store, the customers (also referred to as subjects above) move in the aisles and in open spaces. The customers take items from inventory locations on the shelves of inventory display structures. In one example of inventory display structures, shelves are arranged at different levels (or heights) from the floor, and inventory items are stocked on the shelves. The shelves can be fixed to a wall or placed as freestanding shelves forming aisles in the shopping store. Other examples of inventory display structures include pegboard shelves, magazine shelves, rotating shelves, warehouse shelves, and refrigerated shelving units. Inventory items can also be stocked in other types of inventory display structures such as stacking wire baskets and dump bins. Customers can also put items back on the same shelf from which they were taken, or on another shelf.

The system includes the store inventory engine 180 (hosted on the network node 104) to update the inventory at inventory locations in the shopping store as customers put items on the shelves and take items from them. The store inventory engine updates the inventory data structure of an inventory location by indicating the identifiers (such as stock keeping units, or SKUs) of the inventory items placed at that location. The inventory integration engine also updates the inventory data structure of the shopping store by updating the quantities of the items stocked in the store. The inventory location data and the store inventory data, along with the customers' inventory data (also referred to as the log data structure of inventory items, or the shopping cart data structure), are stored in the inventory database 160.
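The update performed on each put or take can be sketched as follows. This is a minimal illustration, not the engine described in the specification: the function name `apply_inventory_event`, the shelf identifiers, and the in-memory `Counter` layout are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def apply_inventory_event(location_inventory, store_inventory, shelf_id, sku, kind):
    """Apply one 'put' or 'take' event to both inventory structures (hypothetical helper)."""
    if kind not in ("put", "take"):
        raise ValueError(f"unknown event kind: {kind!r}")
    delta = 1 if kind == "put" else -1
    location_inventory[shelf_id][sku] += delta  # per-location count, keyed by SKU
    store_inventory[sku] += delta               # quantity stocked on shelves store-wide

# Assumed starting state: 12 ketchup bottles on a shelf labelled "B204-1".
location_inventory = defaultdict(Counter)
location_inventory["B204-1"]["ketchup"] = 12
store_inventory = Counter({"ketchup": 12})

# A customer takes a bottle, then puts it back on a different shelf.
apply_inventory_event(location_inventory, store_inventory, "B204-1", "ketchup", "take")
apply_inventory_event(location_inventory, store_inventory, "A202-3", "ketchup", "put")
```

After these two events the store-wide quantity is unchanged, while the per-location counts show that one bottle has moved between shelves.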

The store inventory engine 180 provides the status of the inventory items at inventory locations. However, it is difficult to determine, at any given time, which inventory item is placed on which portion of a shelf. This is important information for the managers and employees of the shopping store. Inventory items can be arranged at inventory locations according to a planogram, which identifies the shelves and the positions on the shelves where the inventory items are planned to be stocked. For example, ketchup bottles may be stocked on a predetermined left portion of all the shelves in an inventory display structure, forming a column-wise arrangement. Over time, customers take ketchup bottles from the shelves and place them in their baskets or shopping carts. Some customers may put ketchup bottles back on a different portion of the same shelf in the same inventory display structure. Customers may also put ketchup bottles back on shelves of other inventory display structures in the shopping store. The store realogram engine 190 (hosted on the network node 106) generates a realogram that can be used to identify the portions of the shelves where ketchup bottles are positioned at a time "t". The system can use this information to generate notifications to employees that give the locations of misplaced ketchup bottles.
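The comparison between planned and observed placement can be sketched as below. The dictionary layouts and the name `find_misplaced` are hypothetical; the specification does not define the planogram or realogram formats at this point.

```python
def find_misplaced(planogram, realogram):
    """Return (sku, portion) pairs observed outside their planned shelf portions.

    planogram: sku -> set of shelf portions where the item is planned (assumed layout)
    realogram: shelf portion -> set of skus observed there at time t (assumed layout)
    """
    misplaced = []
    for portion in sorted(realogram):
        for sku in sorted(realogram[portion]):
            if portion not in planogram.get(sku, set()):
                misplaced.append((sku, portion))
    return misplaced

# Shelf portions are labelled (unit, shelf number, side) purely for illustration.
planogram = {
    "ketchup": {("B204", 1, "left")},
    "mustard": {("A202", 3, "right")},
}
realogram_at_t = {
    ("B204", 1, "left"): {"ketchup"},
    ("A202", 3, "right"): {"ketchup", "mustard"},
}
notifications = find_misplaced(planogram, realogram_at_t)
```

Each returned pair could feed an employee notification of the kind described above.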

This information can also be used across the inventory items in the area of real space to generate a data structure, referred to herein as a realogram, that tracks the locations of the inventory items in the area of real space over time. The realogram of the shopping store generated by the store realogram engine 190, which reflects the current status of the inventory items and, in some embodiments, the status of the inventory items at specified times "t" over an interval of time, can be saved in the realogram database 170.

The actual communication path through the network 181 to the network node 104 hosting the store inventory engine 180 and to the network node 106 hosting the store realogram engine 190 can be point-to-point over public and/or private networks. The communications can occur over a variety of networks 181, e.g., private networks, VPN, MPLS circuit, or the Internet, and can use appropriate application programming interfaces (APIs) and data interchange formats, e.g., Representational State Transfer (REST), JavaScript™ Object Notation (JSON), Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), Java™ Message Service (JMS), and/or Java Platform Module System. All of the communications can be encrypted. The communication is generally over a network such as a LAN (local area network), WAN (wide area network), telephone network (Public Switched Telephone Network (PSTN)), Session Initiation Protocol (SIP), wireless network, point-to-point network, star network, token ring network, hub network, or the Internet (including the mobile Internet), via protocols such as EDGE, 3G, 4G LTE, Wi-Fi, and WiMAX. Additionally, a variety of authorization and authentication techniques, such as username/password, Open Authorization (OAuth), Kerberos, SecureID, and digital certificates, can be used to secure the communications.

The technology disclosed herein can be implemented in the context of any computer-implemented system, including a database system, a multi-tenant environment, or a relational database implementation such as an Oracle™ compatible database implementation, an IBM DB2 Enterprise Server™ compatible relational database implementation, a MySQL™ or PostgreSQL™ compatible relational database implementation, or a Microsoft SQL Server™ compatible relational database implementation, or a NoSQL™ non-relational database implementation such as a Vampire™ compatible non-relational database implementation, an Apache Cassandra™ compatible non-relational database implementation, a BigTable™ compatible non-relational database implementation, or an HBase™ or DynamoDB™ compatible non-relational database implementation. In addition, the technology disclosed can be implemented using different programming models such as MapReduce™, bulk synchronous programming, and MPI primitives, or different scalable batch and stream management systems such as Apache Storm™, Apache Spark™, Apache Kafka™, Apache Flink™, Truviso™, Amazon Elasticsearch Service™, Amazon Web Services (AWS)™, IBM Info-Sphere™, Borealis™, and Yahoo! S4™.

[Camera placement]

The cameras 114 are arranged to track multi-joint subjects (or entities) in a three-dimensional (abbreviated 3D) real space. In the example embodiment of a shopping store, the real space can include the area of the shopping store where items for sale are stacked on shelves. A point in the real space can be represented in an (x, y, z) coordinate system. Each point in the area of real space to which the system is applied is covered by the fields of view of two or more cameras 114.

In a shopping store, the shelves and other inventory display structures can be arranged in a variety of ways, such as along the walls of the shopping store, in rows forming aisles, or in a combination of the two arrangements. FIG. 2A shows an arrangement of shelf unit A 202 and shelf unit B 204 forming the aisle 116a, viewed from one end of the aisle 116a. Two cameras, camera A 206 and camera B 208, are positioned over the aisle 116a at a predetermined distance from the ceiling 230 and the floor 220 of the shopping store, above inventory display structures such as shelf unit A 202 and shelf unit B 204. The cameras 114 are disposed over, and have fields of view encompassing, respective parts of the inventory display structures and the floor area in the real space. As shown in FIG. 2A, the field of view 216 of camera A 206 and the field of view 218 of camera B 208 overlap each other. The coordinates in real space of the members of a set of candidate joints identified as a subject identify the location of the subject in the floor area.

In the example embodiment of a shopping store, the real space can include all of the floor 220 in the shopping store. The cameras 114 are placed and oriented such that areas of the floor 220 and the shelves can be seen by at least two cameras. The cameras 114 also cover the floor space in front of the shelves 202 and 204. The camera angles are selected to include steep, straight-down, and angled perspectives, which give more complete body images of the customers. In one embodiment, the cameras 114 are configured at a height of eight feet or more throughout the shopping store. FIG. 13 presents an illustration of such an embodiment.

In FIG. 2A, a subject 240 is standing by the inventory display structure shelf unit B 204, with one hand positioned close to a shelf (not visible) in the shelf unit B 204. FIG. 2B is a perspective view of the shelf unit B 204 with four shelves, shelf 1, shelf 2, shelf 3, and shelf 4, positioned at different heights from the floor. The inventory items are stocked on these shelves.

[3D scene generation]

A location in the real space is represented as an (x, y, z) point of the real space coordinate system. Here x and y represent positions on a two-dimensional (2D) plane, which can be the floor 220 of the shopping store, and the value z is, in one configuration, the height of the point above the 2D plane at the floor 220. The system combines 2D images from two or more cameras to generate the three-dimensional positions of joints and of inventory events (puts of items on shelves and takes of items from shelves) in the area of real space. This section describes the process for generating the 3D coordinates of joints and inventory events. The process is also referred to as 3D scene generation.

Before using the system 100 in training or inference mode to track inventory items, two types of camera calibration are performed: internal calibration and external calibration. In internal calibration, the internal parameters of the cameras 114 are calibrated. Examples of internal camera parameters include focal length, principal point, skew, and fisheye coefficients. A variety of techniques for internal camera calibration can be used. One such technique is presented by Zhang in "A flexible new technique for camera calibration", published in IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 22, No. 11, November 2000.

In external calibration, the external camera parameters are calibrated in order to generate mapping parameters for translating the 2D image data into 3D coordinates in real space. In one embodiment, one multi-joint subject, such as a person, is introduced into the real space. The multi-joint subject moves through the real space on a path that passes through the field of view of each of the cameras 114. At any given point in the real space, the multi-joint subject is present in the fields of view of at least two cameras forming a 3D scene. The two cameras, however, have different views of the same 3D scene in their respective two-dimensional (2D) image planes. A feature in the 3D scene, such as the left wrist of the multi-joint subject, is viewed by the two cameras at different positions in their respective 2D image planes.

A point correspondence is established between every pair of cameras having overlapping fields of view for a given scene. Since each camera has a different view of the same 3D scene, a point correspondence is two pixel locations (one location from each camera with an overlapping field of view) that represent the projection of the same point in the 3D scene. For external calibration, many point correspondences are identified for each 3D scene using the results of the image recognition engines 112a-112n. The image recognition engines identify the position of a joint as the (x, y) coordinates, such as row and column numbers, of pixels in the 2D image planes of the respective cameras 114. In one embodiment, a joint is one of 19 different types of joints of the multi-joint subject. As the multi-joint subject moves through the fields of view of different cameras, the tracking engine 110 receives, per image, the (x, y) coordinates of each of the 19 different types of joints of the multi-joint subject used for the calibration from the cameras 114.
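With a single calibration subject in view, pairing joints across two synchronized frames reduces to matching on joint type. The sketch below assumes each engine's output has already been reduced to a mapping from joint type to pixel coordinates; that layout and the name `point_correspondences` are illustrative, not taken from the specification.

```python
def point_correspondences(joints_a, joints_b):
    """Pair the pixel positions of joints seen by both cameras in synchronized frames.

    joints_a, joints_b: joint type -> (x, y) pixel position for the single
    calibration subject (assumed layout). Joints missing from either view are skipped.
    """
    shared = sorted(set(joints_a) & set(joints_b))
    return [(joints_a[name], joints_b[name]) for name in shared]

# Illustrative detections from camera A and camera B at the same moment.
frame_a = {"left_wrist": (412, 230), "right_wrist": (455, 241), "neck": (430, 150)}
frame_b = {"left_wrist": (688, 301), "neck": (701, 214)}
pairs = point_correspondences(frame_a, frame_b)
```

Accumulating `pairs` over many frames yields the large set of corresponding points that external calibration needs for each camera pair.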

For example, consider an image from camera A and an image from camera B, both taken at the same moment in time and with overlapping fields of view. There are pixels in the image from camera A that correspond to pixels in the synchronized image from camera B. Consider that there is a specific point on some object or surface in view of both camera A and camera B, and that this point is captured in a pixel of both image frames. In external camera calibration, a multitude of such points are identified and referred to as corresponding points. Since there is one multi-joint subject in the fields of view of camera A and camera B during calibration, the key joints of this multi-joint subject, for example the center of the left wrist, are identified. If these key joints are visible in the image frames from both camera A and camera B, they are assumed to represent corresponding points. This process is repeated for many image frames to build up a large collection of corresponding points for all pairs of cameras with overlapping fields of view. In one embodiment, images are streamed from all cameras at a rate of 30 FPS (frames per second) or more, at a resolution of 720 pixels, in full RGB (red, green, and blue) color. These images are in the form of one-dimensional arrays (also referred to as flat arrays).
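A flat array can be restored to a frame before joint positions are read off as row and column numbers. The sketch below assumes a 1280 × 720 frame with interleaved 8-bit RGB values in row-major order; the specification states only "720 pixels" and "flat array", so the width, byte layout, and the name `unflatten_frame` are assumptions.

```python
import numpy as np

# Assumed frame geometry: the text gives only "720 pixels" and full RGB color.
WIDTH, HEIGHT, CHANNELS = 1280, 720, 3

def unflatten_frame(flat):
    """Reshape a one-dimensional RGB array into (rows, columns, channels)."""
    flat = np.asarray(flat, dtype=np.uint8)
    if flat.size != WIDTH * HEIGHT * CHANNELS:
        raise ValueError(f"expected {WIDTH * HEIGHT * CHANNELS} values, got {flat.size}")
    return flat.reshape(HEIGHT, WIDTH, CHANNELS)

# Mark the red channel of the pixel at row 1, column 2 in a blank flat frame.
flat_frame = np.zeros(WIDTH * HEIGHT * CHANNELS, dtype=np.uint8)
flat_frame[(1 * WIDTH + 2) * CHANNELS + 0] = 255
frame = unflatten_frame(flat_frame)
```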

The large number of images collected above for the multi-joint subject can be used to determine the corresponding points between cameras with overlapping fields of view. Consider two cameras A and B with overlapping fields of view. The plane passing through the camera centers of cameras A and B and the joint location (also referred to as the feature point) in the 3D scene is called the "epipolar plane". The intersection of the epipolar plane with the 2D image planes of cameras A and B defines the "epipolar line". Given these corresponding points, a transformation is determined that can accurately map a corresponding point from camera A to an epipolar line in camera B's field of view that is guaranteed to intersect the corresponding point in the image frame of camera B. The transformation is generated using the image frames collected above for the multi-joint subject. It is known in the art that this transformation is non-linear. The general form is furthermore known to require compensation for the radial distortion of each camera's lens, as well as the non-linear coordinate transformation moving to and from the projected space. In external camera calibration, an approximation to the ideal non-linear transformation is determined by solving a non-linear optimization problem. This non-linear optimization function is used by the subject tracking engine 110 to identify the same joints in the outputs (arrays of joint data structures) of the different image recognition engines 112a-112n, which process the images of cameras 114 with overlapping fields of view. The results of the internal and external camera calibration are stored in the calibration database 170.
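The epipolar constraint itself can be illustrated for an idealized pinhole pair. The sketch below ignores lens distortion and builds the fundamental matrix from a known relative pose instead of solving the non-linear optimization described above, so it shows the geometry only; the function names and the example intrinsics are assumptions.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]])

def fundamental_from_pose(K_a, K_b, R, t):
    """F with x_b^T F x_a = 0, assuming X_b = R @ X_a + t and no lens distortion."""
    E = skew(t) @ R  # essential matrix for calibrated cameras
    return np.linalg.inv(K_b).T @ E @ np.linalg.inv(K_a)

def epipolar_line(F, pixel_a):
    """Line (a, b, c) in camera B's image, scaled so that a^2 + b^2 = 1."""
    line = F @ np.array([pixel_a[0], pixel_a[1], 1.0])
    return line / np.hypot(line[0], line[1])

def distance_to_line(line, pixel_b):
    """Perpendicular pixel distance from pixel_b to a normalized line."""
    return abs(line @ np.array([pixel_b[0], pixel_b[1], 1.0]))

def project(K, point):
    """Pinhole projection of a 3D point given in the camera's own frame."""
    u, v, w = K @ np.asarray(point, dtype=float)
    return u / w, v / w

# Illustrative intrinsics and relative pose (not values from the specification).
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([-1.0, 0.0, 0.1])

wrist_in_a = np.array([0.5, 0.2, 4.0])  # a joint in camera A's frame
F = fundamental_from_pose(K, K, R, t)
line_in_b = epipolar_line(F, project(K, wrist_in_a))
residual = distance_to_line(line_in_b, project(K, R @ wrist_in_a + t))
```

The residual is zero up to rounding, which is the sense in which the epipolar line is guaranteed to pass through the corresponding point.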

A variety of techniques can be used to determine the relative positions of the points in the images of the cameras 114 in the real space. For example, Longuet-Higgins published "A computer algorithm for reconstructing a scene from two projections" in Nature, Volume 293, 10 September 1981. This paper presents the computation of the three-dimensional structure of a scene from a correlated pair of perspective projections when the spatial relationship between the two projections is unknown. The Longuet-Higgins paper presents a technique to determine the position of each camera in the real space with respect to the other cameras. Additionally, the technique allows triangulation of the multi-joint subject in the real space, identifying the value of the z-coordinate (the height from the floor) using images from cameras 114 with overlapping fields of view. An arbitrary point in the real space, for example the end of a shelf unit in one corner of the real space, is designated as the (0, 0, 0) point of the (x, y, z) coordinate system of the real space.
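Triangulation from two overlapping views can be sketched with the standard linear (direct linear transformation) method. This is a swapped-in textbook technique under the assumption that a 3 × 4 projection matrix is available for each camera and that pixels are undistorted; it is not the procedure of the Longuet-Higgins paper or of the specification.

```python
import numpy as np

def triangulate(P_a, P_b, pixel_a, pixel_b):
    """Linear triangulation of one point from two views.

    P_a, P_b: 3x4 projection matrices mapping homogeneous real-space points to pixels.
    pixel_a, pixel_b: the corresponding (u, v) pixel positions in each image.
    """
    P_a, P_b = np.asarray(P_a, dtype=float), np.asarray(P_b, dtype=float)
    (ua, va), (ub, vb) = pixel_a, pixel_b
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.stack([
        ua * P_a[2] - P_a[0],
        va * P_a[2] - P_a[1],
        ub * P_b[2] - P_b[0],
        vb * P_b[2] - P_b[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]            # right singular vector of the smallest singular value
    return X[:3] / X[3]   # back to inhomogeneous (x, y, z)

def project_with(P, point):
    """Project a real-space point to pixel coordinates with a 3x4 matrix."""
    u, v, w = np.asarray(P, dtype=float) @ np.append(point, 1.0)
    return u / w, v / w

# Illustrative cameras: camera A at the origin, camera B shifted along x.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
P_a = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_b = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

joint = np.array([0.5, 0.2, 4.0])
recovered = triangulate(P_a, P_b, project_with(P_a, joint), project_with(P_b, joint))
```

With noise-free pixels the recovered point matches the original; with real detections the result is a least-squares estimate.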

In one embodiment of the technology, the parameters of the external calibration are stored in two data structures. The first data structure stores the intrinsic parameters. The intrinsic parameters represent a projective transformation from 3D coordinates into 2D image coordinates. The first data structure contains the intrinsic parameters per camera, as shown below. The data values are all floating point numbers. This data structure stores a 3×3 intrinsic matrix, represented as "K", and the distortion coefficients. The distortion coefficients include six radial distortion coefficients and two tangential distortion coefficients. Radial distortion occurs when light rays bend more near the edges of a lens than they do at its optical center. Tangential distortion occurs when the lens and the image plane are not parallel. The following data structure shows the values for the first camera only. Similar data is stored for all the cameras 114.

{
  1: {
    K: [[x, x, x], [x, x, x], [x, x, x]],
    distortion_coefficients: [x, x, x, x, x, x, x, x]
  },
  ......
}
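How K and the eight distortion coefficients combine when projecting a point can be sketched as follows. The specification does not give the coefficient ordering or the distortion formula, so this example assumes the rational model and the ordering (k1, k2, p1, p2, k3, k4, k5, k6) used by OpenCV; the function name `project_point` is hypothetical.

```python
import numpy as np

def project_point(K, distortion_coefficients, point_cam):
    """Project a 3D point in the camera frame to pixel coordinates.

    Assumes the coefficient order (k1, k2, p1, p2, k3, k4, k5, k6): six radial
    terms (k1..k6) and two tangential terms (p1, p2). This ordering is an
    assumption borrowed from OpenCV, not stated in the specification.
    """
    k1, k2, p1, p2, k3, k4, k5, k6 = distortion_coefficients
    x, y, z = point_cam
    xn, yn = x / z, y / z                      # normalized image coordinates
    r2 = xn * xn + yn * yn
    radial = (1 + k1 * r2 + k2 * r2**2 + k3 * r2**3) / (1 + k4 * r2 + k5 * r2**2 + k6 * r2**3)
    xd = xn * radial + 2 * p1 * xn * yn + p2 * (r2 + 2 * xn * xn)
    yd = yn * radial + p1 * (r2 + 2 * yn * yn) + 2 * p2 * xn * yn
    u, v, w = np.asarray(K, dtype=float) @ np.array([xd, yd, 1.0])
    return u / w, v / w

# Illustrative intrinsic matrix: focal length 1000 px, principal point (640, 360).
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
no_distortion = [0.0] * 8
mild_radial = [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

ideal_pixel = project_point(K, no_distortion, (0.4, -0.2, 2.0))
distorted_pixel = project_point(K, mild_radial, (0.4, -0.2, 2.0))
```

With all coefficients at zero the function reduces to the plain pinhole projection; a positive k1 pushes the pixel away from the principal point.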

The second data structure stores, per pair of cameras, a 3×3 fundamental matrix (F), a 3×3 essential matrix (E), a 3×4 projection matrix (P), a 3×3 rotation matrix (R), and a 3×1 translation vector (t). This data is used to convert points in one camera's reference frame to another camera's reference frame. For each pair of cameras, eight homography coefficients are also stored to map the plane of the floor 220 from one camera to another. The fundamental matrix is a relationship between two images of the same scene that constrains where the projections of points from the scene can occur in both images. The essential matrix is also a relationship between two images of the same scene, with the condition that the cameras are calibrated. The projection matrix gives a vector space projection from the 3D real space to a subspace. The rotation matrix is used to perform a rotation in Euclidean space. The translation vector "t" represents a geometric transformation that moves every point of a figure or a space by the same distance in a given direction. The homography floor coefficients are used to combine images of the features of subjects on the floor 220 viewed by cameras with overlapping fields of view. The second data structure is shown below. Similar data is stored for all pairs of cameras. As before, x represents a floating point number.

{
  1: {
    2: {
      F: [[x, x, x], [x, x, x], [x, x, x]],
      E: [[x, x, x], [x, x, x], [x, x, x]],
      P: [[x, x, x, x], [x, x, x, x], [x, x, x, x]],
      R: [[x, x, x], [x, x, x], [x, x, x]],
      t: [x, x, x],
      homography_floor_coefficients: [x, x, x, x, x, x, x, x]
    }
  },
  .......
}
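Two of the per-pair entries can be put to use as sketched below. The specification does not say how the eight homography coefficients are laid out or in which direction R and t act, so the example assumes the coefficients are the first eight row-major entries of a 3 × 3 homography whose last entry is fixed to 1, and that R and t map camera 1 coordinates to camera 2 coordinates. All function names are hypothetical.

```python
import numpy as np

def homography_from_coefficients(coefficients):
    """Build a 3x3 homography from eight coefficients, fixing h33 = 1 (assumed layout)."""
    h = np.append(np.asarray(coefficients, dtype=float), 1.0)
    return h.reshape(3, 3)

def map_floor_pixel(coefficients, pixel):
    """Map a floor pixel seen by one camera to the other camera's image."""
    H = homography_from_coefficients(coefficients)
    u, v, w = H @ np.array([pixel[0], pixel[1], 1.0])
    return u / w, v / w

def to_other_camera_frame(R, t, point):
    """Convert a 3D point between camera reference frames as R @ point + t (assumed direction)."""
    return np.asarray(R, dtype=float) @ np.asarray(point, dtype=float) + np.asarray(t, dtype=float)

# Illustrative values only: a homography that shifts floor pixels by (+5, -3),
# and a pose that is a pure translation of one unit along x.
shift_coefficients = [1.0, 0.0, 5.0, 0.0, 1.0, -3.0, 0.0, 0.0]
mapped_pixel = map_floor_pixel(shift_coefficients, (10.0, 20.0))
moved_point = to_other_camera_frame(np.eye(3), [1.0, 0.0, 0.0], [0.5, 0.2, 4.0])
```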

[2D map and 3D map]

An inventory location in the shopping store, such as a shelf, can be identified by a unique identifier (e.g., a shelf ID). Similarly, the shopping store can be identified by a unique identifier (e.g., a store ID). The two-dimensional (2D) and three-dimensional (3D) map database 140 identifies the inventory locations in the area of real space along their respective coordinates. For example, in a 2D map, the locations in the map define two-dimensional regions on a plane formed perpendicular to the floor 220, i.e., the XZ plane, as shown in FIG. 3. The map defines the area of an inventory location where inventory items are positioned. In FIG. 3, a 2D view 360 of shelf 1 in the shelf unit B 204 shows the area formed by the four coordinate positions (x1, z1), (x1, z2), (x2, z2), and (x2, z1), which defines the 2D region in which inventory items are positioned on shelf 1. Similar 2D regions are defined for all inventory locations in all shelf units (or other inventory display structures) in the shopping store. This information is stored in the map database 140.

In a 3D map, the locations in the map define three-dimensional regions in 3D real space defined by X, Y, and Z coordinates. The map defines the volume of an inventory location where inventory items are placed. In FIG. 3, a 3D view 350 of shelf 1 in shelf unit B 204 shows the volume formed by the eight coordinate positions (x1, y1, z1), (x1, y1, z2), (x1, y2, z1), (x1, y2, z2), (x2, y1, z1), (x2, y1, z2), (x2, y2, z1), and (x2, y2, z2) that define the 3D region in which inventory items are placed on shelf 1. Similar 3D regions are defined for the inventory locations in all shelf units in the shopping store and are stored in the map database 140 as a 3D map of the real space (the shopping store). As shown in FIG. 3, the coordinate positions along the three axes can be used to calculate the length, depth, and height of an inventory location.
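
The calculation of the dimensions of an inventory location from its corner coordinates can be sketched as follows. This is a minimal example that assumes an axis-aligned region. The function name `location_dimensions` and the numeric values are hypothetical, and which axis corresponds to length, depth, or height depends on the coordinate convention of the store map.

```python
def location_dimensions(corners):
    """Return the extent of a 3D region along x, y, and z from its corner points."""
    xs, ys, zs = zip(*corners)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


# Hypothetical coordinates for shelf 1 (placeholder values, arbitrary units).
x1, x2, y1, y2, z1, z2 = 0.0, 1.5, 2.0, 2.5, 1.0, 1.5
corners = [(x, y, z) for x in (x1, x2) for y in (y1, y2) for z in (z1, z2)]
extent_x, extent_y, extent_z = location_dimensions(corners)
```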

In one embodiment, the map identifies a configuration of units of volume that correlate with portions of inventory locations on the inventory display structures in the area of real space. Each portion is defined by starting and ending positions along the three axes of the real space. A similar configuration of portions of inventory locations can also be generated using a 2D map of inventory locations that divides the front view of the display structures.

[Joint data structure]

The image recognition engines 112a-112n receive the sequences of images from the cameras 114 and process the images to generate corresponding arrays of joint data structures. The system includes processing logic that uses the sequences of images produced by the plurality of cameras to track the locations of a plurality of subjects (or customers in the shopping store) in the area of real space. In one embodiment, the image recognition engines 112a-112n classify each element of an image as one of 19 possible joint types, which can be used to identify subjects in the area who may be taking or putting inventory items. The possible joints can be grouped into two categories: foot joints and non-foot joints. The 19th type of joint classification covers all non-joint features of the subject (that is, elements of the image not classified as a joint). In other embodiments, the image recognition engine may be configured to identify the locations of hands specifically. Also, other techniques, such as a user check-in procedure or biometric identification processing, can be deployed for the purpose of identifying the subjects and linking the subjects with the detected locations of their hands as they move through the store.
Foot joints:
Ankle joint (left and right)
Non-foot joints:
Neck
Nose
Eyes (left and right)
Ears (left and right)
Shoulders (left and right)
Elbows (left and right)
Wrists (left and right)
Hips (left and right)
Knees (left and right)
Not a joint

An array of joint data structures for a particular image classifies elements of the particular image by joint type, time of the particular image, and the coordinates of the elements in the particular image. In one embodiment, the image recognition engines 112a-112n are convolutional neural networks (CNNs), the joint type is one of the 19 types of joints of the subjects, the time of the particular image is the timestamp of the image generated by the source camera 114 for the particular image, and the coordinates (x, y) identify the position of the element on a 2D image plane.

The output of the CNN is a matrix of confidence arrays for each image per camera. The matrix of confidence arrays is transformed into an array of joint data structures. A joint data structure 400, as shown in FIG. 4, is used to store the information of each joint. The joint data structure 400 identifies the x and y positions of the element in the particular image in the 2D image space of the camera from which the image is received. A joint number identifies the type of joint identified. For example, in one embodiment, the values range from 1 to 19. A value of 1 indicates that the joint is a left ankle, a value of 2 indicates that the joint is a right ankle, and so on. The type of joint is selected using the confidence array for that element in the output matrix of the CNN. For example, in one embodiment, if the value corresponding to the left ankle joint is the highest in the confidence array for that image element, then the value of the joint number is "1".

A confidence number indicates the degree of confidence of the CNN in predicting that joint. If the value of the confidence number is high, the CNN is confident in its prediction. An integer ID is assigned to the joint data structure to uniquely identify it. Following the above mapping, the output matrix 540 of confidence arrays per image is converted into an array of joint data structures for each image. In one embodiment, the joint analysis includes performing a combination of k-nearest neighbors, mixture of Gaussians, and various image morphology transformations on each input image. The result comprises arrays of joint data structures, which can be stored in the form of a bit mask in a ring buffer that maps image numbers to bit masks at each moment in time.
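
The conversion of one confidence array into one joint data structure, as described above, might look like the following sketch. The field names of `JointDataStructure` and the helper `to_joint` are hypothetical, since the layout in FIG. 4 is not reproduced here; only the joint number range (1 to 19), the selection of the highest confidence value, the (x, y) position, and the integer ID come from the description.

```python
from dataclasses import dataclass

JOINT_TYPES = 19  # 18 joint types plus the "not a joint" class


@dataclass
class JointDataStructure:
    joint_id: int      # integer ID that uniquely identifies this record
    x: int             # x position of the element in the 2D image
    y: int             # y position of the element in the 2D image
    joint_number: int  # 1..19, for example 1 = left ankle, 2 = right ankle
    confidence: float  # confidence of the CNN for the selected joint type


def to_joint(joint_id, x, y, confidence_array):
    """Pick the joint type with the highest confidence for one image element."""
    if len(confidence_array) != JOINT_TYPES:
        raise ValueError("expected one confidence value per joint type")
    best = max(range(JOINT_TYPES), key=lambda i: confidence_array[i])
    return JointDataStructure(joint_id, x, y, best + 1, confidence_array[best])


# Placeholder confidence array in which the left ankle scores highest.
conf = [0.0] * JOINT_TYPES
conf[0], conf[1] = 0.9, 0.05
joint = to_joint(7, 120, 45, conf)
```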

[Subject tracking engine]

The tracking engine 110 is configured to receive the arrays of joint data structures generated by the image recognition engines 112a-112n, corresponding to images in sequences of images from cameras having overlapping fields of view. The arrays of joint data structures per image are sent by the image recognition engines 112a-112n to the tracking engine 110 via the network 181. The tracking engine 110 translates the coordinates of the elements in the arrays of joint data structures corresponding to the different image sequences into candidate joints having coordinates in real space. A location in real space is covered by the fields of view of two or more cameras. The tracking engine 110 comprises logic to identify sets of candidate joints having coordinates in real space (constellations of joints) as subjects in the real space. In one embodiment, the tracking engine 110 accumulates the arrays of joint data structures from the image recognition engines for all the cameras at a given moment in time and stores this information as a dictionary in the subject database 140, to be used for identifying constellations of candidate joints. The dictionary can be arranged in the form of key-value pairs, where the keys are camera IDs and the values are arrays of joint data structures from the camera. In such an embodiment, this dictionary is used in heuristics-based analysis to determine candidate joints and to assign joints to subjects. In such an embodiment, the high-level inputs, processing, and outputs of the tracking engine 110 are illustrated in Table 1. Details of the logic applied by the subject tracking engine 110 to create subjects by combining candidate joints and to track the movement of subjects in the area of real space are presented in U.S. Patent No. 10,055,853, issued August 21, 2018, "Recognizing and Tracking Subjects Using an Image Recognition Engine", which is incorporated herein by reference.

Table 1: Inputs, processes, and outputs from the subject tracking engine 110 in an exemplary embodiment.

[Subject data structure]

The subject tracking engine 110 uses heuristics to connect joints of subjects identified by the image recognition engines 112a-112n. In doing so, the subject tracking engine 110 creates new subjects and updates the locations of existing subjects by updating their respective joint locations. The subject tracking engine 110 uses triangulation techniques to project the locations of joints from 2D space coordinates (x, y) to 3D real space coordinates (x, y, z). FIG. 5 shows the subject data structure 500 used to store a subject. The data structure 500 stores subject-related data as a key-value dictionary. The key is a frame ID and the value is another key-value dictionary, in which the key is the camera ID and the value is a list of 18 joints (of the subject) with their locations in real space. The subject data is stored in the subject database. Every new subject is also assigned a unique identifier that is used to access the subject's data in the subject database.
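
A minimal sketch of the nested key-value layout described for the subject data structure 500 is given below, assuming plain Python dictionaries. The helper names and the joint number used for the right wrist are hypothetical; the text only specifies that the outer key is a frame ID, the inner key is a camera ID, and the value is a list of joints with their real space locations.

```python
RIGHT_WRIST = 14  # hypothetical joint number; the source only fixes 1 and 2


def add_joints(subject, frame_id, camera_id, joints):
    """Store a list of (joint_number, (x, y, z)) tuples under frame ID and camera ID."""
    subject.setdefault(frame_id, {})[camera_id] = list(joints)


def joint_position(subject, frame_id, camera_id, joint_number):
    """Return the real space position of a joint, or None if it is not stored."""
    for number, position in subject.get(frame_id, {}).get(camera_id, []):
        if number == joint_number:
            return position
    return None


subject = {}
add_joints(subject, frame_id=1001, camera_id=3,
           joints=[(1, (0.4, 2.1, 0.1)), (RIGHT_WRIST, (0.6, 2.0, 1.1))])
wrist = joint_position(subject, 1001, 3, RIGHT_WRIST)
```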

In one embodiment, the system identifies the joints of a subject and creates a skeleton of the subject. The skeleton is projected into the real space, indicating the position and orientation of the subject in the real space. This is also referred to as "pose estimation" in the field of machine vision. In one embodiment, the system displays the orientations and positions of subjects in the real space on a graphical user interface (GUI). In one embodiment, the subject identification and image analysis are anonymous, that is, the unique identifier assigned to a subject created through joint analysis does not reveal personally identifying information about the subject, as described above.

In this embodiment, the constellations of joints of the identified subjects, produced by time sequence analysis of the joint data structures, can be used to locate the hands of the subjects. For example, the location of a wrist joint alone, or a location based on a projection of a combination of a wrist joint with an elbow joint, can be used to identify the location of a hand of an identified subject.

[Inventory event]

FIG. 6 shows the subject 240 taking an inventory item from a shelf in the shelf unit B 204 in a top view 610 of the aisle 116a. The disclosed technology uses the sequences of images produced by at least two cameras in the plurality of cameras to find the location of an inventory event. The joints of a single subject can appear in image frames of multiple cameras in the respective image channels. In the example of a shopping store, the subjects move in the area of real space, take items from inventory locations, and also put items back on the inventory locations. In one embodiment, the system predicts inventory events (puts or takes, also referred to as plus or minus events) using a pipeline of convolutional neural networks referred to as WhatCNN and WhenCNN.

The data sets comprising subjects identified by the joints in the subject data structures 500 and the corresponding image frames from the sequences of image frames per camera are given as input to a bounding box generator. The bounding box generator implements logic to process the data sets to specify bounding boxes that include images of the hands of identified subjects in the images in the sequences of images. The bounding box generator identifies the locations of hands in each source image frame per camera using, for example, the locations of the wrist joints (for the respective hands) and the elbow joints in the multi-joint subject data structures 500 corresponding to the respective source image frames. In one embodiment, in which the coordinates of the joints in the subject data structure indicate the locations of joints in 3D real space coordinates, the bounding box generator maps the joint locations from 3D real space coordinates to 2D coordinates in the image frames of the respective source images.

The bounding box generator creates bounding boxes for hands in the image frames in a circular buffer per camera 114. In one embodiment, the bounding box is a 128 pixel (width) by 128 pixel (height) portion of the image frame, with the hand located at the center of the bounding box. In other embodiments, the size of the bounding box is 64 pixels by 64 pixels or 32 pixels by 32 pixels. For m subjects in an image frame from a camera, there can be a maximum of 2m hands, and thus 2m bounding boxes. In practice, however, fewer than 2m hands are visible in an image frame because of occlusions by other subjects or other objects. In one example embodiment, the hand locations of subjects are inferred from the locations of the elbow and wrist joints. For example, the right hand location of a subject is extrapolated using the location of the right elbow (identified as p1) and the right wrist (identified as p2) as extrapolation_amount x (p2 - p1) + p2, where extrapolation_amount equals 0.4. In another embodiment, the joints CNNs 112a-112n are trained using left and right hand images. Therefore, in such an embodiment, the joints CNNs 112a-112n directly identify the locations of hands in the image frames per camera. The hand locations per image frame are used by the bounding box generator to create a bounding box per identified hand.
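
The extrapolation of the hand location and the construction of a bounding box centered on it can be sketched as follows. The extrapolation amount of 0.4 and the 128 pixel box size come from the description; the function names and the pixel coordinates are hypothetical, and the sketch does not clip the box to the image borders, which a real implementation would likely need.

```python
EXTRAPOLATION_AMOUNT = 0.4
BOX_SIZE = 128  # pixels; 64 or 32 in other embodiments


def hand_position(elbow, wrist, amount=EXTRAPOLATION_AMOUNT):
    """Extrapolate the hand as amount * (p2 - p1) + p2, with p1 = elbow, p2 = wrist."""
    return tuple(amount * (w - e) + w for e, w in zip(elbow, wrist))


def hand_bounding_box(hand, size=BOX_SIZE):
    """Return (left, top, right, bottom) of a size x size box centered on the hand."""
    half = size // 2
    cx, cy = round(hand[0]), round(hand[1])
    return (cx - half, cy - half, cx + half, cy + half)


p1 = (100.0, 200.0)  # right elbow in image coordinates (placeholder)
p2 = (150.0, 250.0)  # right wrist in image coordinates (placeholder)
hand = hand_position(p1, p2)
box = hand_bounding_box(hand)
```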

WhatCNN is a convolutional neural network trained to process the specified bounding boxes in the images to generate a classification of the hands of the identified subjects. One trained WhatCNN processes the image frames from one camera. In the example embodiment of the shopping store, for each hand in each image frame, the WhatCNN identifies whether the hand is empty. The WhatCNN also identifies a SKU (stock keeping unit) number of the inventory item in the hand, a confidence value indicating that the item in the hand is a non-SKU item (that is, it does not belong to the shopping store inventory), and the context of the hand location in the image frame.

The outputs of the WhatCNN models for all the cameras 114 are processed by a single WhenCNN model for a predetermined window of time. In the example of a shopping store, the WhenCNN performs time series analysis for both hands of subjects to identify whether a subject took a store inventory item from a shelf or put a store inventory item on a shelf. The disclosed technology uses the sequences of images produced by at least two cameras in the plurality of cameras to find the location of an inventory event. The WhenCNN analyzes data sets from the sequences of images from at least two cameras to determine the locations of inventory events in three dimensions and to identify the item associated with each inventory event. A time series analysis of the output of WhenCNN per subject over a period of time is performed to identify inventory events and their times of occurrence. A non-maximum suppression (NMS) algorithm is used for this purpose. As one inventory event (that is, a put or take of an item by a subject) is detected by WhenCNN multiple times (both from the same camera and from multiple cameras), the NMS removes superfluous events for a subject. NMS is a rescoring technique comprising two main tasks: a "matching loss" that penalizes superfluous detections, and "joint processing" of neighbors to determine whether a better detection is close by.
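
The effect of removing superfluous detections of the same inventory event can be illustrated with a simple greedy temporal suppression. This is not the rescoring formulation with a matching loss described above; it is a simplified stand-in chosen for illustration. The detection fields (`subject_id`, `event`, `frame_id`, `score`) and the window of 30 frames are assumptions.

```python
def temporal_nms(detections, window=30):
    """Greedy suppression over time.

    Keeps the highest-scoring detection and drops other detections of the
    same subject and event type that lie within `window` frames of it.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        duplicate = any(
            k["subject_id"] == det["subject_id"]
            and k["event"] == det["event"]
            and abs(k["frame_id"] - det["frame_id"]) <= window
            for k in kept
        )
        if not duplicate:
            kept.append(det)
    return sorted(kept, key=lambda d: d["frame_id"])


# Placeholder detections: one take by subject 1 seen three times, a later
# take by subject 1, and a take by subject 2.
detections = [
    {"subject_id": 1, "event": "take", "frame_id": 100, "score": 0.7},
    {"subject_id": 1, "event": "take", "frame_id": 105, "score": 0.9},
    {"subject_id": 1, "event": "take", "frame_id": 110, "score": 0.6},
    {"subject_id": 1, "event": "take", "frame_id": 300, "score": 0.8},
    {"subject_id": 2, "event": "take", "frame_id": 104, "score": 0.5},
]
events = temporal_nms(detections)
```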

The true events of takes and puts for each subject are further processed by calculating an average of the SKU logits for the 30 image frames prior to the image frame with the true event. Finally, the arguments of the maxima (abbreviated arg max or argmax) are used to determine the largest value. The inventory item classified by the argmax value is used to identify the inventory item put on the shelf or taken from the shelf. The disclosed technology attributes an inventory event to a subject by assigning the inventory event associated with the inventory item to a log data structure (or shopping cart data structure) of the subject. The inventory item is added to a log of SKUs (also referred to as a shopping cart or basket) of the respective subject. The image frame identifier "frame ID" of the image frame that resulted in the inventory event detection is also stored with the identified SKU. The logic to attribute the inventory event to a subject matches the location of the inventory event to the location of one of the customers in the plurality of customers. For example, the image frame can be used to identify the 3D position of the inventory event, represented by the position of the subject's hand at at least one point in time in the sequence classified as an inventory event using the subject data structure 500, which can then be used to determine the inventory location from which the item was taken or on which it was put. The disclosed technology uses the sequences of images produced by at least two cameras in the plurality of cameras to find the location of an inventory event and create an inventory event data structure. In one embodiment, the inventory event data structure stores an item identifier, a put or take indicator, coordinates in three dimensions of the area of real space, and a timestamp. In one embodiment, the inventory events are stored in the inventory event database 150.
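
The averaging of SKU logits over the 30 frames before the event, followed by an argmax, can be sketched as below. The function name `classify_item`, the SKU labels, and the logit values are hypothetical; the sketch assumes one list of logits per frame, indexed by frame number, and excludes the event frame itself.

```python
def classify_item(sku_logits_per_frame, event_frame, sku_ids, window=30):
    """Average the SKU logits over the frames before the event and take the argmax."""
    start = max(0, event_frame - window)
    frames = sku_logits_per_frame[start:event_frame]
    if not frames:
        raise ValueError("no frames available before the event")
    n = len(sku_ids)
    averages = [sum(frame[i] for frame in frames) / len(frames) for i in range(n)]
    best = max(range(n), key=lambda i: averages[i])
    return sku_ids[best], averages[best]


sku_ids = ["sku-ketchup", "sku-mustard"]  # placeholder identifiers
# Frames 0..39 favor the first SKU; frames 40..49 favor the second.
logits = [[2.0, 1.0]] * 40 + [[0.0, 5.0]] * 10
item, score = classify_item(logits, event_frame=40, sku_ids=sku_ids)
```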

The locations of inventory events (puts and takes of inventory items by subjects in an area of space) can be compared with a planogram or other map of the store to identify an inventory location, such as a shelf, from which the subject has taken the item or on which the subject has placed the item. An illustration 660 shows the determination of a shelf in a shelf unit by calculating the shortest distance from the position of the hand 640 associated with the inventory event. This determination of the shelf is then used to update the inventory data structure of the shelf. An example inventory data structure 700 (also referred to as a log data structure) is shown in FIG. 7. This inventory data structure stores the inventory of a subject, a shelf, or a store as a key-value dictionary. The key is the unique identifier of the subject, shelf, or store, and the value is another key-value dictionary, in which the key is an item identifier such as a stock keeping unit (SKU) and the value is a number identifying the quantity of the item, along with the "frame ID" of the image frame that resulted in the inventory event prediction. The frame identifier ("frame ID") can be used to identify the image frame that resulted in the identification of an inventory event, which in turn resulted in the association of the inventory item with the subject, shelf, or store. In other embodiments, a "camera ID" identifying the source camera can also be stored in combination with the frame ID in the inventory data structure 700. In one embodiment, the "frame ID" is the subject identifier, because the frame has the subject's hand in the bounding box. In other embodiments, other types of identifiers can be used to identify subjects, such as a "subject ID" that explicitly identifies a subject in the area of real space.

When the shelf inventory data structure is consolidated with the subject's log data structure, the shelf inventory is reduced to reflect the quantity of the item taken by the customer from the shelf. If an item was put on the shelf by a customer, or stocked on the shelf by an employee, the item is added to the inventory data structure of the respective inventory location. Over a period of time, this processing results in updates to the shelf inventory data structures for all inventory locations in the shopping store. The inventory data structures of the inventory locations in the area of real space are consolidated to update the inventory data structure of the area of real space, which indicates the total number of items of each SKU in the store at that moment in time. In one embodiment, such updates are performed after each inventory event. In another embodiment, the store inventory data structure is updated periodically.

Details of the implementation of the WhatCNN and WhenCNN to detect inventory events are presented in U.S. Patent Application No. 15/907,112, filed February 27, 2018, "Detection of Placing and Taking Goods Using Image Recognition", which is incorporated herein by reference as if fully set forth herein.

[Real-time shelf and store inventory update]

FIG. 8 is a flowchart presenting the process steps for updating the shelf inventory data structure in an area of real space. The process starts at step 802. At step 804, the system detects a take or put event in the area of real space. The inventory event is stored in the inventory event database 150. The inventory event record includes an item identifier such as a SKU, a timestamp, and the location of the event in the three-dimensional area of real space, indicating positions along the three dimensions x, y, and z. The inventory event also includes a put or take indicator, identifying whether the subject has put the item on a shelf (also referred to as a plus inventory event) or taken the item from a shelf (also referred to as a minus inventory event). The inventory event information is combined with the output from the subject tracking engine 110 to identify the subject associated with this inventory event. The result of this analysis is then used to update the subject's log data structure (also referred to as a shopping cart data structure) in the inventory database 160. In one embodiment, a subject identifier (for example, a "subject ID") is stored in the inventory event data structure.

The system can use the location of the hand of the subject associated with the inventory event (step 806) to find the location of the nearest shelf in an inventory display structure (also referred to as a shelf unit above) at step 808. The store inventory engine 180 calculates the distance of the hand to the two-dimensional (2D) regions or areas on the xz planes (perpendicular to the floor 220) of the inventory locations in the shopping store. The 2D regions of the inventory locations are stored in the map database 140 of the shopping store. Suppose the hand is represented by a point E(x_event, y_event, z_event) in the real space. The shortest distance D from the point E in the real space to any point P on the plane can be determined by projecting the vector PE onto a normal vector n to the plane. Existing mathematical techniques can be used to calculate the distance of the hand to all the planes representing the 2D regions of inventory locations.
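
The projection of the vector PE onto the plane normal n, and the choice of the nearest inventory location, can be sketched as follows. The function names, the shelf identifiers, and the coordinates are hypothetical. The sketch measures the distance to the infinite plane containing each 2D region and ignores the bounds of the region, which is a simplification.

```python
import math


def distance_to_plane(e, p, n):
    """Length of the projection of the vector PE onto the plane normal n."""
    pe = [ei - pi for ei, pi in zip(e, p)]
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0.0:
        raise ValueError("normal vector must be non-zero")
    return abs(sum(a * b for a, b in zip(pe, n))) / norm


def nearest_location(e, planes):
    """planes maps a location ID to (point_on_plane, normal_vector)."""
    return min(planes, key=lambda key: distance_to_plane(e, *planes[key]))


# Two hypothetical shelf planes parallel to the xz plane (normal along y).
planes = {
    "shelf_1": ((0.0, 2.0, 0.0), (0.0, 1.0, 0.0)),
    "shelf_2": ((0.0, 5.0, 0.0), (0.0, 2.0, 0.0)),
}
hand_e = (1.0, 3.0, 1.2)  # point E(x_event, y_event, z_event), placeholder
closest = nearest_location(hand_e, planes)
```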

In one embodiment, the disclosed technology matches the location of an inventory event with an inventory location by executing a procedure that includes calculating the distance from the location of the inventory event to the inventory locations on inventory display structures and matching the inventory event with an inventory location based on the calculated distance. For example, the inventory location (such as a shelf) with the shortest distance from the location of the inventory event is selected, and this shelf's inventory data structure is updated at step 810. In one embodiment, the location of the inventory event is determined by the position of the hand of the subject along the three coordinates of the real space. If the inventory event is a take event (or a minus event) indicating that a bottle of ketchup was taken by the subject, the shelf's inventory is updated by decreasing the number of ketchup bottles by one. Similarly, if the inventory event is a put event indicating that the subject put a bottle of ketchup on the shelf, the shelf's inventory is updated by increasing the number of ketchup bottles by one. The store's inventory data structure is also updated accordingly. The quantity of items put on the inventory locations is incremented by the same number in the store inventory data structure. Likewise, the quantity of items taken from the inventory locations is subtracted from the store's inventory data structure in the inventory database 160.
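
The update of the shelf and store inventory for a single take or put event can be sketched with nested dictionaries keyed by location or store identifier and then by SKU. The dictionary layout follows the key-value description of the inventory data structure; the function name, the identifiers, and the quantities are hypothetical, and the sketch does not guard against negative quantities.

```python
def apply_inventory_event(inventory, location_id, store_id, sku, frame_id, is_put):
    """Add one item for a put event, or remove one for a take event,
    in both the shelf inventory and the store inventory."""
    delta = 1 if is_put else -1
    for key in (location_id, store_id):
        entry = inventory.setdefault(key, {}).setdefault(
            sku, {"quantity": 0, "frame_id": None})
        entry["quantity"] += delta
        entry["frame_id"] = frame_id  # frame that led to this event


inventory = {
    "shelf_1": {"sku-ketchup": {"quantity": 5, "frame_id": 10}},
    "store_1": {"sku-ketchup": {"quantity": 40, "frame_id": 10}},
}
# A take event (minus event) for one bottle of ketchup.
apply_inventory_event(inventory, "shelf_1", "store_1", "sku-ketchup",
                      frame_id=99, is_put=False)
```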

At step 812, the system checks whether a planogram is available for the shopping store, or whether a planogram is otherwise known to be available. A planogram is a data structure that maps inventory items to inventory locations in the shopping store, and it can be based on a plan for the placement of inventory items in the store. If a planogram is available for the shopping store, then at step 814 the item put on the shelf by the subject is compared with the items planned for that shelf in the planogram. In one embodiment, the disclosed technology includes logic to determine misplaced items when an inventory event is matched with an inventory location that does not match the planogram. For example, if the SKU of the item associated with the inventory event matches the placement of inventory items at the inventory location, the location of the item is correct (step 816); otherwise the item is misplaced. In one embodiment, at step 818 a notification is sent to an employee to remove the misplaced item from its current inventory location (such as a shelf) and move it to its correct inventory location according to the planogram. At step 820, the system checks whether the subject is leaving the shopping store, using the subject's speed, orientation, and proximity to the store exit. If the subject is not about to leave the store (step 820), the process resumes at step 804. Otherwise, if it is determined that the subject is leaving the store, then at step 822 the subject's log data structure (or shopping cart data structure) and the store's inventory data structure are consolidated.
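The planogram comparison of steps 812 to 816 can be illustrated with a minimal sketch. The dictionary layout, the function name `is_misplaced`, and the location and SKU strings are assumptions made for illustration only; the description does not specify how the planogram is encoded.

```python
from typing import Optional

# Hypothetical planogram layout: inventory location id -> SKUs planned for it.
Planogram = dict[str, set[str]]


def is_misplaced(planogram: Optional[Planogram], location_id: str, sku: str) -> Optional[bool]:
    """Compare a put event against the planogram.

    Returns None when no planogram is available (step 812), False when the
    SKU is planned for the location (step 816), and True otherwise.
    """
    if planogram is None:
        return None
    planned_skus = planogram.get(location_id, set())
    return sku not in planned_skus


# Illustrative data only.
planogram = {"shelf_3": {"sku_ketchup"}, "shelf_4": {"sku_mustard"}}
```

A True result would correspond to the case in which a notification is sent to an employee at step 818.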

In one embodiment, the consolidation includes subtracting the items in the subject's shopping cart data structure from the store inventory data structure if these items were not subtracted from the store inventory at step 810. At this step, the system can also identify items in the subject's shopping cart data structure that have low identification confidence scores and send a notification to a store employee positioned near the store exit. The employee can then confirm the items with low identification confidence scores in the customer's shopping cart. This process does not require the store employee to compare every item in the customer's shopping cart with the customer's shopping cart data structure; only the items with low confidence scores are identified by the system to the store employee, who then confirms them. The process ends at step 824.
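As a rough sketch of the exit check described above, the following selects only the low-confidence entries of a shopping cart data structure for confirmation by an employee. The `CartEntry` fields and the threshold value of 0.5 are hypothetical; the description does not state how a "low" score is defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CartEntry:
    sku: str
    quantity: int
    confidence: float  # identification confidence, assumed to lie in [0, 1]


def items_to_confirm(cart: list[CartEntry], threshold: float = 0.5) -> list[str]:
    """Return the SKUs whose identification confidence is below the threshold."""
    return [entry.sku for entry in cart if entry.confidence < threshold]


# Illustrative cart: only the second entry would be shown to the employee.
cart = [
    CartEntry("sku_1", 2, 0.97),
    CartEntry("sku_3", 4, 0.31),
    CartEntry("sku_4", 1, 0.88),
]
```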

[Architecture for real-time shelf and store inventory updates]

FIG. 9A shows an example architecture of a system in which customer inventory, inventory location (e.g., shelf) inventory, and store inventory (e.g., store-wide) data structures are updated using puts and takes of items by customers in the shopping store. Because FIG. 9A is an architectural diagram, certain details are omitted to improve the clarity of the description. The system shown in FIG. 9A receives image frames from a plurality of cameras 114. As described above, in one embodiment the cameras 114 can be synchronized in time with each other, so that images are captured at the same time, or close in time, and at the same image capture rate. Images captured at the same time, or close in time, by all the cameras covering an area of real space are synchronized in the sense that the processing engines can identify them as representing different views, at a moment in time, of subjects having fixed positions in the real space. The images are stored in a circular buffer 902 of image frames per camera.
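A per-camera circular buffer of image frames can be sketched as follows. This is an illustrative assumption about one possible realization of buffer 902, using a bounded `collections.deque`; the class name, the capacity, and the frame representation are hypothetical.

```python
from collections import deque


class FrameBuffers:
    """One fixed-capacity circular buffer of (timestamp, frame) pairs per camera."""

    def __init__(self, camera_ids, capacity):
        # A deque with maxlen silently discards the oldest entry when full.
        self._buffers = {cid: deque(maxlen=capacity) for cid in camera_ids}

    def push(self, camera_id, timestamp, frame):
        self._buffers[camera_id].append((timestamp, frame))

    def frames(self, camera_id):
        """Return the buffered frames for a camera, oldest first."""
        return list(self._buffers[camera_id])


buffers = FrameBuffers(camera_ids=["cam_a", "cam_b"], capacity=3)
for t in range(5):
    buffers.push("cam_a", t, f"frame_{t}")
```

After five pushes into a buffer of capacity three, only the three most recent frames remain, which is the behavior expected of a circular buffer.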

A "subject identification" subsystem 904 (also referred to as the first image processor) processes image frames received from the cameras 114 to identify and track subjects in the real space. The first image processor includes a subject image recognition engine that detects joints of subjects in the real space. The joints are connected to form a subject, which is then tracked as it moves in the real space. The subjects are anonymous and are tracked using an internal identifier, the "subject ID".

A "region proposal" subsystem 908 (also referred to as the third image processor) includes a foreground image recognition engine. It receives corresponding sequences of images from the plurality of cameras 114 and recognizes semantically significant objects in the foreground (i.e., customers, their hands, and inventory items) as they relate to puts and takes of inventory items over time in the images from each camera. The region proposal subsystem 908 also receives the output of the subject identification subsystem 904. The third image processor processes the sequences of images from the cameras 114 to identify and classify foreground changes represented in the images in the corresponding sequences. It then processes the identified foreground changes to make a first set of detections of takes of inventory items by identified subjects and of puts of inventory items on inventory display structures by identified subjects. In one embodiment, the third image processor comprises a convolutional neural network (CNN) model such as the WhatCNN described above. The first set of detections is also referred to as foreground detection of puts and takes of inventory items. In this embodiment, the outputs of WhatCNN are processed by a second convolutional neural network (WhenCNN) to make the first set of detections, which identifies put events of inventory items on inventory locations and take events of inventory items from inventory locations in inventory display structures by customers and store employees. Details of the region proposal subsystem are presented in U.S. Patent Application No. 15/907,112, filed February 27, 2018, titled "Item Put and Take Detection Using Image Recognition", which is incorporated herein by reference as if fully set forth herein.

In another embodiment, the architecture includes a "semantic difference extraction" subsystem (also referred to as the second image processor) that can be used in parallel with the third image processor to detect puts and takes of inventory items and to associate these puts and takes with subjects in the shopping store. This semantic difference extraction subsystem includes a background image recognition engine, which receives corresponding sequences of images from the plurality of cameras and recognizes semantically significant differences in the background (i.e., inventory display structures such as shelves) as they relate to puts and takes of inventory items over time in the images from each camera. The second image processor receives the output of the subject identification subsystem 904 and image frames from the cameras 114 as input. Details of the "semantic difference extraction" subsystem are presented in U.S. Patent No. 10,127,438, filed April 4, 2018, titled "Predicting Inventory Events Using Semantic Diffing", and in U.S. Patent Application No. 15/945,473, filed April 4, 2018, titled "Predicting Inventory Events Using Foreground/Background Processing", both of which are incorporated herein by reference as if fully set forth herein. The second image processor processes the identified background changes to make a second set of detections of takes of inventory items by identified subjects and of puts of inventory items on inventory display structures by identified subjects. The second set of detections is also referred to as background detection of puts and takes of inventory items. In the example of a shopping store, the second set of detections identifies inventory items taken from inventory locations or put on inventory locations by customers or store employees. The semantic difference extraction subsystem includes logic to associate the identified background changes with identified subjects.

The system described in FIG. 9A includes selection logic to process the first and second sets of detections to generate a log data structure that includes lists of inventory items for identified subjects. For a put or take in the real space, the selection logic selects the output from either the semantic difference extraction subsystem or the region proposal subsystem 908. In one embodiment, the selection logic makes the selection using a confidence score generated by the semantic difference extraction subsystem for the first set of detections and a confidence score generated by the region proposal subsystem for the second set of detections. The output of the subsystem with the higher confidence score for a particular detection is selected and used to generate a log data structure 700 (also referred to as a shopping cart data structure) that includes a list of the inventory items, and their quantities, associated with the identified subject. The shelf and store inventory data structures are updated using the subject's log data structure as described above.
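The selection between the two subsystems can be sketched as a comparison of confidence scores, followed by an update of the subject's log data structure. The `Detection` fields, the tie-breaking rule (the first argument wins on equal scores), and the dictionary form of the log are assumptions for illustration, not details given in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    subject_id: int
    sku: str
    action: str        # "take" or "put"
    confidence: float
    source: str        # hypothetical label of the producing subsystem


def select_detection(a: Detection, b: Detection) -> Detection:
    """Keep the detection with the higher confidence score (a wins ties)."""
    return a if a.confidence >= b.confidence else b


def apply_to_log(log: dict[str, int], det: Detection) -> dict[str, int]:
    """A take adds one unit to the subject's log; a put removes one unit."""
    delta = 1 if det.action == "take" else -1
    quantity = log.get(det.sku, 0) + delta
    if quantity > 0:
        log[det.sku] = quantity
    else:
        log.pop(det.sku, None)
    return log


foreground = Detection(7, "sku_1", "take", 0.62, "region_proposal")
background = Detection(7, "sku_2", "take", 0.91, "semantic_difference")
chosen = select_detection(foreground, background)
log = apply_to_log({}, chosen)
```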

A subject exit detection engine 910 determines whether a customer is moving toward the exit door and sends a signal to the store inventory engine 190. The store inventory engine determines whether one or more items in the customer's log data structure 700 have a low identification confidence score as determined by the second or third image processor. If so, the inventory consolidation engine sends a notification to a store employee positioned near the exit to confirm the items purchased by the customer. The inventory data structures of the subjects, the inventory locations, and the shopping store are stored in the inventory database 160.

FIG. 9B shows another architecture of a system in which customer inventory, inventory location (e.g., shelf) inventory, and store inventory (e.g., store-wide) data structures are updated using puts and takes of items by customers in the shopping store. Because FIG. 9A is an architectural diagram, certain details are omitted to improve the clarity of the description. As described above, the system receives image frames from a plurality of synchronized cameras 114. WhatCNN 914 uses image recognition engines to determine the items in the hands of customers in the area of real space (such as a shopping store). In one embodiment, there is one WhatCNN per camera 114, performing image processing on the sequence of image frames produced by the respective camera. WhenCNN 912 performs a time series analysis of the outputs of the WhatCNNs to identify put or take events. The inventory events, along with the item and hand information, are stored in the database 918. This information is then combined by the person-item attribution component 920 with the customer information generated by the customer tracking engine 110 (referred to above as the subject tracking engine 110). The log data structures 700 for the customers in the shopping store are generated by linking the customer information stored in the database 918.

The disclosed technology uses the sequences of images produced by the plurality of cameras to detect the departure of a customer from the area of real space. In response to detecting the departure, the disclosed technology updates the store inventory in memory for the items associated with inventory events attributed to that customer. When the exit detection engine 910 detects the departure of customer "C" from the shopping store, the items purchased by customer "C", as shown in the log data structure 922, are consolidated with the store's inventory data structure 924 to generate an updated store inventory data structure 926. For example, as shown in FIG. 9B, the customer purchased two units of item 1, four units of item 3, and one unit of item 4. The quantity of each item purchased by customer "C", as shown in the log data structure 922, is subtracted from the store inventory 924 to generate the updated store inventory data structure 926, which shows that the quantity of item 1 is reduced from 48 to 46 and that the quantities of items 3 and 4 are likewise reduced by the quantities of item 3 and item 4 purchased by customer "C". Because customer "C" did not purchase item 2, the quantity of item 2 in the updated store inventory data structure 926 remains the same as in the current store inventory data structure 924.
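The consolidation in the FIG. 9B example can be written as a short sketch. Only the item 1 quantities (48 before, 46 after) and the purchased quantities come from the description; the starting quantities of items 2, 3, and 4 below are made-up values, and the dictionary representation is an assumption.

```python
def consolidate(store_inventory: dict[str, int], customer_log: dict[str, int]) -> dict[str, int]:
    """Return a new store inventory with the customer's quantities subtracted."""
    updated = dict(store_inventory)  # leave the current inventory untouched
    for sku, quantity in customer_log.items():
        updated[sku] = updated.get(sku, 0) - quantity
    return updated


# Item 1 starts at 48 as in FIG. 9B; the other starting values are hypothetical.
store_inventory_924 = {"item_1": 48, "item_2": 30, "item_3": 20, "item_4": 10}
customer_log_922 = {"item_1": 2, "item_3": 4, "item_4": 1}
store_inventory_926 = consolidate(store_inventory_924, customer_log_922)
```

Item 2 does not appear in the customer's log, so its quantity is carried over unchanged.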

In one embodiment, detection of the customer's departure also triggers updating of the inventory data structures of the inventory locations (such as shelves in the shopping store) from which the customer took items. In such an embodiment, the inventory data structures of the inventory locations are not updated immediately after a take or put inventory event, as described above. Instead, when the system detects the customer's departure, it traverses the inventory events associated with the customer and links each inventory event to its inventory location in the shopping store. The inventory data structures of the inventory locations determined by this process are then updated. For example, if the customer took two units of item 1 from inventory location 27, the inventory data structure of inventory location 27 is updated by reducing the quantity of item 1 by two. Note that an inventory item can be stocked at multiple inventory locations in the shopping store. The system identifies the inventory location corresponding to each inventory event, and therefore updates the inventory location from which the item was taken.
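The deferred update of inventory locations can be sketched by traversing a customer's inventory events at departure and accumulating a net change per location and item. The event tuple layout `(location_id, sku, action, quantity)` is an assumption for illustration.

```python
from collections import defaultdict


def location_deltas(events):
    """Accumulate the net quantity change per (location, SKU) for one customer.

    Each event is a hypothetical tuple (location_id, sku, action, quantity),
    where action is "take" or "put".
    """
    deltas = defaultdict(int)
    for location_id, sku, action, quantity in events:
        sign = -1 if action == "take" else 1
        deltas[(location_id, sku)] += sign * quantity
    return dict(deltas)


# Two takes of item 1 from inventory location 27, as in the example above.
customer_events = [
    (27, "item_1", "take", 1),
    (27, "item_1", "take", 1),
]
deltas = location_deltas(customer_events)
```

Keying the result by location and SKU reflects the note that the same item can be stocked at several inventory locations.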

[Store realogram]

The locations of inventory items throughout the real space in the store, including at the inventory locations in the shopping store, change over a period of time as customers take items from inventory locations and put items that they do not want to buy back at the same position on the same shelf from which the item was taken, at a different position on the same shelf, or on a different shelf. The disclosed technology uses the sequences of images produced by at least two cameras in the plurality of cameras to identify inventory events and, in response to the inventory events, tracks the locations of inventory items in the area of real space. In some embodiments, the items in the shopping store are arranged according to a planogram that identifies the inventory locations (such as shelves) on which particular items are planned to be placed. For example, as shown in illustration 910 of FIG. 10, the left halves of shelf 3 and shelf 4 are designated for an item (stocked in the form of cans). Consider that the inventory locations are stocked according to the planogram at the beginning of the day or of another inventory tracking interval (identified by time t = 0).

The disclosed technology can calculate a "realogram" of the shopping store at any time "t", which is a real-time map of the locations of inventory items in the area of real space and which, in some embodiments, can additionally be correlated with the inventory locations in the store. A realogram can be used to create a planogram by identifying the inventory items and their positions in the store and mapping them to inventory locations. In one embodiment, the system or method can create a data set defining a plurality of cells having coordinates in the area of real space. The system or method can divide the real space into the data set defining the plurality of cells, using the length of the cells along the coordinates of the real space as an input parameter. In one embodiment, the cells are represented as a two-dimensional grid having coordinates in the area of real space. For example, the cells can correlate with a 2D grid (e.g., at 1-foot spacing) of the front plan of the inventory locations in a shelf unit (also referred to as an inventory display structure), as shown in illustration 960 of FIG. 10. Each grid cell is defined by its starting and ending positions on the coordinates of the two-dimensional plane, such as the x and z coordinates, as shown in FIG. 10. This information is stored in the map database 140.
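A minimal sketch of dividing the front plane of a shelf unit into a 2D grid follows, with the cell length as the input parameter and each cell defined by its start and end positions along x and z. The `Cell` class, the function names, and the shelf dimensions are illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    x_start: float
    x_end: float
    z_start: float
    z_end: float

    def contains(self, x: float, z: float) -> bool:
        return self.x_start <= x < self.x_end and self.z_start <= z < self.z_end


def make_grid(width: float, height: float, cell_len: float) -> list[Cell]:
    """Divide a width x height plane into cells of side cell_len (edge cells are clipped)."""
    nx = math.ceil(width / cell_len)
    nz = math.ceil(height / cell_len)
    return [
        Cell(i * cell_len, min((i + 1) * cell_len, width),
             k * cell_len, min((k + 1) * cell_len, height))
        for k in range(nz)
        for i in range(nx)
    ]


def find_cell(grid: list[Cell], x: float, z: float) -> Optional[Cell]:
    """Return the cell containing the point, or None if it lies outside the grid."""
    return next((cell for cell in grid if cell.contains(x, z)), None)


# A hypothetical shelf front 4 units wide and 2 units high, at 1-unit spacing.
grid = make_grid(width=4.0, height=2.0, cell_len=1.0)
```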

In another embodiment, the cells are represented as a three-dimensional (3D) grid having coordinates in the area of real space. In one example, the cells can correlate with volumes on inventory locations (or portions of inventory locations) in the shelf units in the shopping store, as shown in FIG. 11A. In this embodiment, the map of the real space identifies a configuration of units of volume that can be correlated with portions of inventory locations on inventory display structures in the area of real space. This information is stored in the map database 140. The store realogram engine 190 uses the inventory event database 150 to calculate the realogram of the shopping store at time "t" and stores it in the realogram database 170. The realogram of the shopping store indicates, for any time t, the inventory items associated with inventory events that are matched to cells by their locations, using the timestamps of the inventory events stored in the inventory event database 150. An inventory event includes an item identifier, a put or take indicator, the location of the inventory event represented by positions along the three axes of the area of real space, and a timestamp.
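The inventory event record and its matching to 3D cells at a time t can be sketched as follows. The field names, the use of integer cell indices obtained by flooring each coordinate, and the cell length are assumptions; the description lists only the contents of an inventory event.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryEvent:
    sku: str          # item identifier
    is_put: bool      # True for a put, False for a take
    position: tuple   # (x, y, z) in the area of real space
    timestamp: float


def cell_of(position, cell_len):
    """Index of the cubic cell of side cell_len that contains the position."""
    return tuple(math.floor(coordinate / cell_len) for coordinate in position)


def items_by_cell(events, t, cell_len):
    """Map each cell to the SKUs of events at or before time t located in it."""
    cells = {}
    for event in events:
        if event.timestamp <= t:
            cells.setdefault(cell_of(event.position, cell_len), set()).add(event.sku)
    return cells


# Illustrative events only.
events = [
    InventoryEvent("sku_ketchup", True, (2.5, 0.2, 7.9), 10.0),
    InventoryEvent("sku_ketchup", False, (0.4, 0.1, 3.3), 50.0),
]
```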

The illustration in FIG. 11A shows that, at the beginning of day 1 at t = 0, the left portion of the inventory locations in the first shelf unit (forming a column-wise arrangement) contains "ketchup" bottles. This column of cells (or grids) is shown in black in the graphic visualization; the cells can also be rendered in other colors, such as dark green. All other cells are left blank, that is, not filled with any color, indicating that they do not contain the item. In one embodiment, the visualization of items in the cells of the realogram is generated for one item at a time, indicating its location in the store (within the cells). In another embodiment, the realogram displays the locations of a set of items on inventory locations, using different colors to distinguish them. In such an embodiment, a cell can have multiple colors corresponding to the several items associated with inventory events matched to that cell. In another embodiment, other graphical or text-based visualizations are used to indicate the inventory items in cells, such as by listing the SKUs or names in the cells.

The system calculates a SKU score (also referred to as a score) at a scoring time for inventory items having locations matched to a particular cell, using the respective counts of inventory events. The calculation of the score for a cell uses sums of puts and takes of inventory items, weighted by the separation between the timestamps of the puts and takes and the scoring time. In one embodiment, the score is a weighted average of the inventory events per SKU. In other embodiments, different scoring calculations can be used, such as a count of inventory events per SKU. In one embodiment, the system displays the realogram as an image representing the cells in the plurality of cells and the scores for the cells. For example, consider the scoring time t = 1 (e.g., one day later) in the illustration in FIG. 11A. The realogram at time t = 1 represents the scores for the "ketchup" item by different shades of black. The store realogram at time t = 1 shows that all four columns of the first shelf unit and the second shelf unit (behind the first shelf unit) contain the "ketchup" item. Cells with higher SKU scores for "ketchup" bottles are rendered in a darker gray than cells with lower scores for "ketchup" bottles. Cells with a score of zero for ketchup are left blank and are not filled with any color. The realogram therefore presents real-time information about the locations of ketchup bottles on inventory locations in the shopping store after time t = 1 (e.g., after one day). The frequency of realogram generation can be set by the shopping store's management according to its requirements. The realogram can also be generated on demand by store management. In one embodiment, the item location information generated by the realogram is compared with the store planogram to identify misplaced items. A notification can be sent to a store employee, who can then put the misplaced inventory items back on their designated inventory locations as indicated in the store planogram.
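The rendering rule described above (darker shading for higher scores, blank cells for a score of zero) can be illustrated with a small sketch. The linear mapping to an 8-bit gray level and the normalization by a maximum score are assumptions; the description does not specify how scores are converted to shades.

```python
from typing import Optional


def shade_for_score(score: float, max_score: float) -> Optional[int]:
    """Map a cell score to an 8-bit gray level, where 0 is black.

    Returns None for a zero (or negative) score, meaning the cell is left blank.
    """
    if score <= 0 or max_score <= 0:
        return None
    fraction = min(score / max_score, 1.0)
    return round(255 * (1.0 - fraction))


# Illustrative scores for three cells, with a hypothetical maximum of 10.
shades = [shade_for_score(s, 10.0) for s in (0.0, 2.0, 10.0)]
```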

In one embodiment, the system renders a display image representing the cells in the plurality of cells and the scores for the cells. FIG. 11B shows a computing device with the realogram of FIG. 11A rendered on a user interface display 1102. The realogram can be displayed on other types of computing devices, such as tablets and mobile computing devices. The system can use variations in color in the display image representing the cells to indicate the scores for the cells. For example, in FIG. 11A, the column of cells containing "ketchup" at t = 0 can be represented by dark green cells in that column. At t = 1, the "ketchup" bottles are distributed over multiple cells beyond the first column of cells. The system can represent these cells using different shades of green to indicate the scores of the cells. Darker shades of green indicate higher scores, and lighter green cells indicate lower scores. The user interface displays other information produced by the system and provides tools to invoke and display its functions.

[Calculation of store realogram]

FIG. 12 is a flowchart presenting process steps for calculating the realogram of shelves in an area of real space at a time t, which can be adapted to other types of inventory display structures. The process starts at step 1202. At step 1204, the system retrieves inventory events in the area of real space from the inventory event database 150. An inventory event record includes an item identifier, a put or take indicator, the location of the inventory event represented by positions in three dimensions (such as x, y, and z) of the area of real space, and a timestamp. The put or take indicator identifies whether the customer (also referred to as a subject) put the item on a shelf or took the item from a shelf. A put event is also referred to as a plus inventory event, and a take event is also referred to as a minus inventory event. At step 1206, the inventory event is combined with the output from the subject tracking engine 110 to identify the hand of the subject associated with this inventory event.

The system uses the position of the subject's hand associated with the inventory event (step 1206) to determine a location. In some embodiments, at step 1208 the inventory event can be matched with the nearest shelf, or other likely inventory location, in a shelf unit or inventory display structure. Process step 808 in the flowchart of FIG. 8 presents details of a technique that can be used to determine the location on the nearest shelf to the hand position. As explained in the technique of step 808, the shortest distance D from a point E in the real space to the plane containing any point P (the plane representing the front region of the shelf in the xz plane) can be determined by projecting the vector PE onto the normal vector n to the plane. The intersection of the vector PE with the plane gives the point on the shelf nearest to the hand. The location of this point is stored in a "point cloud" data structure (step 1210) as a tuple that includes the 3D position of the point in the area of real space, the SKU of the item, and the timestamp, the latter two being obtained from the inventory event record. If there are more inventory event records in the inventory event database 150 (step 1211), process steps 1204 to 1210 are repeated. Otherwise, the process continues at step 1214.
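The projection can be made concrete with a short sketch that computes the distance from the hand position E to the shelf plane and the nearest point on that plane. The function name and the sample coordinates are illustrative; the normal vector n need not have unit length here, and the plane is taken as the xz plane through the origin only for the example.

```python
import math


def closest_point_on_plane(e, p, n):
    """Project point e onto the plane through p with normal n.

    Returns (closest_point, distance), where distance is the length of the
    projection of the vector PE onto n.
    """
    pe = [ei - pi for ei, pi in zip(e, p)]
    n_norm_sq = sum(ni * ni for ni in n)
    # Signed length of PE along n, in units of |n| squared.
    scale = sum(a * b for a, b in zip(pe, n)) / n_norm_sq
    closest = tuple(ei - scale * ni for ei, ni in zip(e, n))
    distance = abs(scale) * math.sqrt(n_norm_sq)
    return closest, distance


# Hand at E = (1, 2, 3); shelf front taken as the xz plane with normal along y.
closest, distance = closest_point_on_plane(e=(1, 2, 3), p=(0, 0, 0), n=(0, 1, 0))

# A point cloud entry as described above: (3D position, SKU, timestamp).
point_cloud_entry = (closest, "sku_ketchup", 10.0)
```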

The disclosed technology includes a data set, stored in memory, that defines a plurality of cells having coordinates in the area of real space. A cell defines an area of real space bounded by start and end positions along the coordinate axes. The area of real space includes a plurality of inventory locations, and the coordinates of cells in the plurality of cells can be correlated with inventory locations in the plurality of inventory locations. The disclosed technology matches the locations of inventory items associated with inventory events to the coordinates of cells, and maintains data representing the inventory items matched with cells in the plurality of cells. In one embodiment, the system determines the nearest cell in the data set based on the location of the inventory event by executing a procedure (such as described in step 808 of the flowchart of FIG. 8) that calculates the distance from the location of the inventory event to cells in the data set and matches the inventory event with a cell based on the calculated distance. This matching of the event location to the nearest cell gives the location of the point cloud data, identifying the cell in which the point cloud data resides (step 1212). In one embodiment, cells can be mapped to portions of inventory locations (such as shelves) in inventory display structures, so the mapping also identifies the portion of the shelf. As described above, cells can be represented as a 2D grid or a 3D grid of the area of real space. The system includes logic that calculates a score, at a scoring time, for inventory items having locations matching a particular cell. In one embodiment, the score is based on counts of inventory events. In this embodiment, the score for a cell uses the sum of puts and takes of the inventory item, weighted by the separation between the timestamps of the puts and takes and the scoring time. For example, the score can be a weighted moving average per SKU (also referred to as the SKU score), calculated per cell from the "point cloud" data points mapped to the cell, as given by equation (1).
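The cell-matching step (step 1212) can be sketched as follows. This is only an illustration under assumptions: the `Cell` record, its field names, and the functions `distance_to_cell` and `nearest_cell` are hypothetical, and the cells are assumed to form a 2D grid on the xz plane, each bounded by start and end positions along the two axes.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Cell:
    """Hypothetical 2D cell bounded by start and end positions along x and z."""
    cell_id: str
    x_start: float
    x_end: float
    z_start: float
    z_end: float


def distance_to_cell(cell: Cell, x: float, z: float) -> float:
    """Euclidean distance from (x, z) to the cell rectangle; 0.0 if the point is inside."""
    dx = max(cell.x_start - x, 0.0, x - cell.x_end)
    dz = max(cell.z_start - z, 0.0, z - cell.z_end)
    return (dx * dx + dz * dz) ** 0.5


def nearest_cell(cells: Iterable[Cell], x: float, z: float) -> Cell:
    """Match an event location to the cell with the smallest calculated distance."""
    return min(cells, key=lambda c: distance_to_cell(c, x, z))
```

A linear scan keeps the sketch short; for a regular grid the matching cell could instead be found directly from the coordinates.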

The SKU score calculated by equation (1) is the sum of the scores of all point cloud data points for the SKU in the cell, with each data point weighted by point_t, the time in days elapsed since the timestamp of the put or take event. Consider two point cloud data points for a "ketchup" item in a grid. The first data point has a timestamp indicating that its inventory event occurred two days before the time "t" at which the realogram is calculated, so the value of point_t is "2". The second data point corresponds to an inventory event that occurred one day before time "t", so point_t is "1". The score of ketchup for the cell (identified by a cell ID that maps to a shelf identified by a shelf ID) is then calculated by applying equation (1) to these two data points.

As the point cloud data points corresponding to inventory events become older (that is, as more days pass since the event), their contribution to the SKU score decreases. At step 1216, the top "N" SKUs with the highest SKU scores are selected for each cell. In one embodiment, the system includes logic that selects a set of inventory items for each cell based on the scores. For example, the value of "N" can be set to 10 to select the top ten inventory items per cell by SKU score; in this embodiment, the realogram stores the top ten items per cell. The updated realogram at time t, indicating the top "N" SKUs per cell in the shelves at time t, is stored in the realogram database 170 at step 1218. The process ends at step 1220.
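The scoring and top-N selection of steps 1214 to 1218 can be sketched as below. Equation (1) itself is not reproduced in this text, so the weighting used here is an assumption: each data point contributes 0.5 raised to the power of point_t (a one-day half-life), which only illustrates the stated property that older events contribute less. All function names and the `(cell_id, sku, timestamp_days)` tuple layout are hypothetical.

```python
from collections import defaultdict


def point_weight(point_t_days, half_life_days=1.0):
    """ASSUMED decay: weight halves every `half_life_days`. Not taken from equation (1)."""
    return 0.5 ** (point_t_days / half_life_days)


def sku_scores(points, scoring_time_days):
    """Sum the decayed weights of point cloud data points per (cell, SKU).

    points -- iterable of (cell_id, sku, timestamp_days) tuples
    """
    scores = defaultdict(float)
    for cell_id, sku, timestamp_days in points:
        point_t = scoring_time_days - timestamp_days
        if point_t < 0:
            continue  # ignore events that occur after the scoring time
        scores[(cell_id, sku)] += point_weight(point_t)
    return dict(scores)


def top_n_per_cell(scores, n=10):
    """Keep the N highest-scoring SKUs for each cell, best first."""
    per_cell = defaultdict(list)
    for (cell_id, sku), score in scores.items():
        per_cell[cell_id].append((sku, score))
    return {
        cell_id: sorted(items, key=lambda item: item[1], reverse=True)[:n]
        for cell_id, items in per_cell.items()
    }
```

Under this assumed weight, a data point from two days before the scoring time counts half as much as one from the day before; a different decay function could be substituted in `point_weight` without changing the rest of the sketch.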

In another embodiment, the disclosed technology does not use the 2D or 3D maps of shelf portions stored in the map database 140 to calculate the point cloud data in the shelf portions corresponding to inventory events. In this embodiment, the 3D real space representing the shopping store is partitioned into cells represented as 3D cubes (for example, 1-foot cubes). The 3D position of the hand is mapped to a cell using its position along each of the three axes. The SKU scores for all items are calculated per cell using equation (1) above. The resulting realogram shows the items in cells in the real space representing the store without requiring the positions of shelves in the store. In this embodiment, a point cloud data point can be at the same position, in real-space coordinates, as the hand position corresponding to the inventory event, or at the position of a cell in the area close to or encompassing the hand position. This is because a map of the shelves may not be available, so the hand position is not mapped to the nearest shelf. For the same reason, the point cloud data points in this embodiment are not necessarily coplanar. All point cloud data points within a unit of volume (for example, 1 cubic foot) in the real space are included in the calculation of the SKU scores.
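Mapping a 3D hand position to a cube-shaped cell can be sketched with integer division along each axis. The function names, the cell-index convention (integer triples counted from the origin of the real-space coordinates), and the `(position, sku, timestamp)` event layout are assumptions made for illustration.

```python
import math
from collections import defaultdict


def cube_cell_index(position, cell_size=1.0):
    """Map a 3D position to the integer index of the cube that contains it.

    cell_size is the cube edge length in the same unit as the position
    (for example, 1.0 for 1-foot cubes when positions are in feet).
    """
    return tuple(math.floor(c / cell_size) for c in position)


def group_by_cell(events, cell_size=1.0):
    """Collect (sku, timestamp) pairs per cube; no shelf map is consulted.

    events -- iterable of (position, sku, timestamp) tuples
    """
    cells = defaultdict(list)
    for position, sku, timestamp in events:
        cells[cube_cell_index(position, cell_size)].append((sku, timestamp))
    return dict(cells)
```

Because positions are not projected onto a shelf plane here, data points that fall in the same cube are grouped together even when they are not coplanar.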

In some embodiments, the realogram is computed iteratively and can be used for time-of-day analysis of activity in the store, or to generate an animation (such as a stop-motion animation) that displays the movement of inventory items in the store over time.

[Example applications of store realograms]

A store realogram can be used in many operations of a shopping store. A few example applications of the realogram are presented in the following paragraphs.

[Restocking of inventory items]

FIG. 13A presents one such application of the store realogram: determining whether an inventory item needs to be restocked at an inventory location (such as a shelf). The process starts at step 1302. At step 1304, the system retrieves the realogram at scoring time "t" from the realogram database 170. In one example, this is the most recently generated realogram. The SKU scores for all cells in the realogram are compared with a threshold score at step 1306. If an SKU score is above the threshold (step 1308), the process repeats steps 1304 and 1306 for the next inventory item "i". In embodiments that include a planogram, or where a planogram is otherwise available, the SKU score for inventory item "i" is compared with the threshold for the cells that match the placement of inventory item "i" in the planogram. In another embodiment, the SKU score for an inventory item is calculated after filtering out "put" inventory events. In this embodiment, the SKU score reflects the "take" events of inventory item "i" per cell in the realogram, which can then be compared with the threshold. In another embodiment, a count of "take" inventory events per cell can be used as the score that is compared with a threshold to determine whether inventory item "i" needs restocking. In this embodiment, the threshold is the minimum count of the inventory item that needs to be stocked at the inventory location.

If the SKU score of inventory item "i" is below the threshold, an alert notification indicating that inventory item "i" needs to be restocked is sent to the store manager or other designated employees (step 1310). The system can also identify the inventory locations at which the inventory item needs to be restocked by matching the cells whose SKU score is below the threshold to inventory locations. In other embodiments, the system can check the stock level of inventory item "i" in the stock room of the shopping store to determine whether inventory item "i" needs to be ordered from a wholesaler. The process ends at step 1312. FIG. 13B presents an example user interface displaying a restock alert notification for an inventory item. The alert notification can also be displayed on the user interfaces of other types of devices, such as tablets and mobile computing devices. The alert can also be sent to designated recipients via email, SMS (short message service) on a mobile phone, or a notification to a store application installed on a mobile computing device.
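The threshold comparison of steps 1306 to 1310 can be sketched as a scan over the realogram. The realogram layout (`{cell_id: {sku: score}}`), the per-SKU `thresholds` mapping, the default threshold value, and the function name are all hypothetical; sending the alert itself is left out.

```python
def items_to_restock(realogram, thresholds, default_threshold=0.5):
    """Return (sku, cell_id, score) for every entry whose score is below its threshold.

    realogram  -- {cell_id: {sku: score}} at scoring time t (assumed layout)
    thresholds -- {sku: threshold}; SKUs not listed use default_threshold
    """
    alerts = []
    for cell_id, sku_scores in realogram.items():
        for sku, score in sku_scores.items():
            if score < thresholds.get(sku, default_threshold):
                # The cell identifies where the item needs restocking.
                alerts.append((sku, cell_id, score))
    return alerts
```

In an embodiment with a planogram, the scan would be restricted to the cells that match the planned placement of each item; that filter is omitted here for brevity.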

[Misplaced inventory items]

In embodiments that include a planogram, or where a planogram of the store is otherwise available, the realogram is compared with the planogram to check planogram compliance by identifying misplaced items. In such an embodiment, the system includes a planogram specifying the placement of inventory items at inventory locations in the area of real space. The system includes logic that maintains data representing the inventory items matched with cells in the plurality of cells. The system determines misplaced items by comparing the data representing inventory items matched with cells against the placement of inventory items at inventory locations specified in the planogram. FIG. 14 presents a flowchart for determining planogram compliance using the realogram. The process starts at step 1402. At step 1404, the system retrieves the realogram for inventory item "i" at scoring time "t". The scores of inventory item "i" in all cells of the realogram are compared with the placement of inventory item "i" in the planogram (step 1406). If the realogram shows an SKU score for inventory item "i" above a threshold in cells that do not match the placement of inventory item "i" in the planogram (step 1408), the system identifies these items as misplaced. An alert or notification about the items that do not match the planogram placement is sent to a store employee, who can take the misplaced items from their current location and put them back at their designated inventory location (step 1410). If no misplaced items are identified at step 1408, process steps 1404 and 1406 are repeated for the next inventory item "i".
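The compliance check of steps 1406 and 1408 can be sketched as follows. The data layouts are assumptions: the realogram is taken to be `{cell_id: {sku: score}}` and the planogram to be `{sku: set of cell_ids}` listing the cells where each item is planned to be; the function name is hypothetical.

```python
def misplaced_items(realogram, planogram, threshold):
    """Return (sku, cell_id) pairs scoring above threshold in cells outside the planogram.

    realogram -- {cell_id: {sku: score}} at scoring time t (assumed layout)
    planogram -- {sku: set of cell_ids where the item is planned to be}
    """
    found = []
    for cell_id, sku_scores in realogram.items():
        for sku, score in sku_scores.items():
            planned_cells = planogram.get(sku, set())
            if score > threshold and cell_id not in planned_cells:
                found.append((sku, cell_id))
    return found
```

Each returned pair gives the current cell of a misplaced item; the planogram entry for the same SKU gives the designated location to which it can be returned.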

In one embodiment, a store app displays the location of items on a store map and guides the store employee to a misplaced item. The store app then displays the correct location of the item on the store map and can guide the employee to the correct shelf portion to put the item at its designated location. In another embodiment, the store app can also guide a customer to inventory items based on a shopping list entered in the store app. The store app can use the real-time locations of inventory items from the realogram to guide the customer to the nearest inventory location holding the item on the map. In this example, the nearest location of an inventory item can be that of a misplaced item, one that is not positioned at an inventory location according to the store planogram. FIG. 14B presents an example user interface displaying an alert notification for a misplaced inventory item "i" on the user interface display 1102. As described above for FIG. 13B, different types of computing devices and alert notification mechanisms can be used to send this information to store employees.

[Improving inventory item prediction accuracy]

Another application of the realogram is to improve the prediction of inventory items by the image recognition engine. The flowchart in FIG. 15 presents example process steps for adjusting inventory item predictions using the realogram. The process starts at step 1502. At step 1504, the system receives a prediction confidence score probability for item "i" from the image recognition engine. As described above, WhatCNN is an example image recognition engine that identifies inventory items in the hands of subjects (or customers). WhatCNN outputs a confidence score (or confidence value) probability for the predicted inventory item. At step 1506, the confidence score probability is compared with a threshold. If the probability value is above the threshold, indicating higher confidence in the prediction (step 1508), the process is repeated for the next inventory item "i". Otherwise, if the confidence score probability is below the threshold, the process continues at step 1510.

The realogram for inventory item "i" at scoring time "t" is retrieved at step 1510. In one example, this can be the most recent realogram; in another example, a realogram whose scoring time "t" matches or is closer in time to the inventory event can be retrieved from the realogram database 170. At step 1512, the SKU score of inventory item "i" at the location of the inventory event is compared with a threshold. If the SKU score is above the threshold (step 1514), the image recognition engine's prediction of inventory item "i" is accepted (step 1516), and the log data structure of the customer associated with the inventory event is updated accordingly: if the inventory event is a "take" event, inventory item "i" is added to the customer's log data structure; if it is a "put" event, inventory item "i" is removed from the customer's log data structure. If the SKU score is below the threshold (step 1514), the image recognition engine's prediction is rejected (step 1518). As a result, if the inventory event is a "take" event, inventory item "i" is not added to the customer's log data structure; similarly, if it is a "put" event, inventory item "i" is not removed from the customer's log data structure. The process ends at step 1520. In another embodiment, the SKU score of inventory item "i" can be used to adjust an input parameter to the image recognition engine for determining the item prediction confidence score. WhatCNN, a convolutional neural network (CNN), is an example of an image recognition engine that predicts inventory items.
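The decision flow of FIG. 15 can be sketched as a single function. Several points are assumptions rather than statements from the text: the customer's log data structure is modeled as a dictionary of item quantities, a prediction whose confidence is already above the confidence threshold is treated as accepted without consulting the realogram, and the function and parameter names are hypothetical.

```python
def apply_inventory_event(log, sku, event_type, confidence, sku_score,
                          confidence_threshold, score_threshold):
    """Accept or reject a predicted item and update the customer's log.

    log        -- {sku: quantity}, an assumed model of the log data structure
    event_type -- "take" or "put"
    confidence -- confidence score probability from the image recognition engine
    sku_score  -- realogram SKU score of the item at the event location
    Returns True if the prediction was accepted, False if it was rejected.
    """
    if event_type not in ("take", "put"):
        raise ValueError(f"unknown event type: {event_type!r}")

    # Low-confidence predictions are cross-checked against the realogram.
    accepted = confidence > confidence_threshold or sku_score > score_threshold
    if not accepted:
        return False  # rejected: the log is left unchanged

    if event_type == "take":
        log[sku] = log.get(sku, 0) + 1
    elif log.get(sku, 0) > 0:  # "put": remove one unit if the customer holds any
        log[sku] -= 1
        if log[sku] == 0:
            del log[sku]
    return True
```

The realogram acts as a prior on which items are plausible at the event location: a weak visual prediction is kept only when that item has recently been seen in the same cell.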

[Network Configuration]

FIG. 16 presents the architecture of a network hosting the store realogram engine 190, which is hosted on the network node 106. In the illustrated embodiment, the system includes a plurality of network nodes 101a, 101b, 101n, and 102. In such an embodiment, the network nodes are also referred to as processing platforms. The processing platforms (network nodes) 103, 101a to 101n, and 102 and the cameras 1612, 1614, 1616, ... 1618 are connected to the network 1681. A similar network hosts the store inventory engine 180, which is hosted on the network node 104.

FIG. 13 shows a plurality of cameras 1612, 1614, 1616, ... 1618 connected to the network. A large number of cameras can be deployed in a particular system. In one embodiment, the cameras 1612 to 1618 are connected to the network 1681 using Ethernet®-based connectors 1622, 1624, 1626, and 1628, respectively. In such an embodiment, the Ethernet-based connectors have a data transfer speed of 1 gigabit per second, also referred to as Gigabit Ethernet. It is understood that in other embodiments, the cameras 114 are connected to the network using other types of network connections, which can have faster or slower data transfer rates than Gigabit Ethernet. Also, in alternative embodiments, a set of cameras can be connected directly to each processing platform, and the processing platforms can be coupled to a network.

The storage subsystem 1630 stores the basic programming and data constructs that provide the functionality of certain embodiments of the present invention. For example, the various modules implementing the functionality of the store realogram engine 190 can be stored in the storage subsystem 1630. The storage subsystem 1630 is an example of a computer-readable memory comprising a non-transitory data storage medium, having computer instructions stored in the memory and executable by a computer to perform all or any combination of the data processing and image processing functions described herein, including logic to calculate realograms for the area of real space by the processes described herein. In other examples, the computer instructions can be stored in other types of memory, including portable memory, that comprise a non-transitory data storage medium or media readable by a computer.

These software modules are generally executed by the processor subsystem 1650. The host memory subsystem 1632 typically includes a number of memories, including a main random access memory (RAM) 1634 for storing instructions and data during program execution and a read-only memory (ROM) 1636 in which fixed instructions are stored. In one embodiment, the RAM 1634 is used as a buffer for storing the point cloud data structure tuples generated by the store realogram engine 190.

The file storage subsystem 1640 provides persistent storage for program and data files. In an example embodiment, the storage subsystem 1640 includes four 120-gigabyte (GB) solid state disks (SSDs) in a RAID 0 (redundant array of independent disks) arrangement identified by the numeral 1642. In the example embodiment, the map data in the map database 140, the inventory event data in the inventory event database 150, the inventory data in the inventory database 160, and the realogram data in the realogram database 170 that is not in RAM are stored in the RAID 0 array. In the example embodiment, the hard disk drive (HDD) 1646 has slower access speed than the RAID 0 1642 storage. The solid state disk (SSD) 1644 contains the operating system and related files for the store realogram engine 190.

In an example configuration, four cameras 1612, 1614, 1616, and 1618 are connected to the processing platform (network node) 103. Each camera has a dedicated graphics processing unit, GPU 1 1662, GPU 2 1664, GPU 3 1666, and GPU 4 1668, respectively, to process the images sent by that camera. It is understood that fewer or more than three cameras can be connected per processing platform. Accordingly, fewer or more GPUs are configured in the network node so that each camera has a dedicated GPU for processing the image frames received from the camera. The processor subsystem 1650, the storage subsystem 1630, and the GPUs 1662, 1664, and 1666 communicate using the bus subsystem 1654.

The network interface subsystem 1670 is connected to the bus subsystem 1654, forming part of the processing platform (network node) 104. The network interface subsystem 1670 provides an interface to outside networks, including an interface to corresponding interface devices in other computer systems. The network interface subsystem 1670 allows the processing platform to communicate over the network, either by cables (or wires) or wirelessly. A number of peripheral devices, such as user interface output devices and user interface input devices, are also connected to the bus subsystem 1654, forming part of the processing platform 104. These subsystems and devices are intentionally not shown in FIG. 13 to improve the clarity of the description. Although the bus subsystem 1654 is shown schematically as a single bus, alternative embodiments of the bus subsystem can use multiple buses.

In one embodiment, the cameras 114 can be implemented using Chameleon3 1.3 MP Color USB3 Vision (Sony ICX445) cameras, having a resolution of 1288 × 964, a frame rate of 30 FPS, and 1.3 megapixels per image, with a varifocal lens having a working distance of 300 mm to infinity and a field of view, with a 1/3-inch sensor, of 98.2° to 23.8°.

The technology described herein also includes a system for tracking inventory items in an area of real space including inventory display structures, along with corresponding methods and computer program products. The system comprises: a plurality of cameras or other sensors disposed above the inventory display structures, sensors in the plurality of sensors producing respective sequences of images of the inventory display structures in corresponding fields of view in the real space, the field of view of each sensor overlapping with the field of view of at least one other sensor in the plurality of sensors; memory storing a data set that defines a plurality of cells having coordinates in the area of real space; and a processing system coupled to the plurality of sensors. The processing system includes logic that processes the sequences of images produced by the plurality of sensors to find the locations of inventory events in three dimensions in the area of real space and, in response to an inventory event, determines the nearest cell in the data set based on the location of the inventory event. The processing system also includes logic that calculates a score, at a scoring time, for inventory items associated with inventory events having locations matching a particular cell, using respective counts of inventory events. The system can include logic to select a set of inventory items for each cell based on the scores. An inventory event can include an item identifier, a put or take indicator, a location represented by positions along three axes of the area of real space, and a timestamp. The system can include a data set defining a plurality of cells represented as a two-dimensional grid having coordinates in the area of real space, the cells correlating with portions of the front view of inventory locations, and the processing system includes logic that determines the nearest cell based on the location of the inventory event. The system can include a data set defining a plurality of cells represented as a three-dimensional grid having coordinates in the area of real space, the cells correlating with portions of volume on inventory locations, and the processing system includes logic that determines the nearest cell based on the location of the inventory event. The put indicator identifies that an item has been placed on an inventory location, and the take indicator identifies that an item has been taken from an inventory location. The logic that processes the sequences of images produced by the plurality of sensors can comprise image recognition engines that generate data sets representing elements in the images corresponding to hands, and can execute analysis of the data sets from sequences of images from at least two sensors to determine the locations of inventory events in three dimensions. The image recognition engines can comprise convolutional neural networks. The logic that calculates a score for a cell can use the sum of puts and takes of inventory items, weighted by the separation between the timestamps of the puts and takes and the scoring time, and can store the scores in the memory. The logic that determines the nearest cell in the data set based on the location of the inventory event can execute a procedure that includes calculating the distance from the location of the inventory event to cells in the data set and matching the inventory event with a cell based on the calculated distance.

Any data structures and code described or referenced above are stored, according to many implementations, in computer-readable memory comprising a non-transitory computer-readable storage medium, which can be any device or medium that can store code and/or data for use by a computer system. This includes, but is not limited to, volatile memory, non-volatile memory, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), and DVDs (digital versatile discs or digital video discs), or other media capable of storing computer-readable media now known or later developed.

The preceding description is presented to enable the making and use of the disclosed technology. Various modifications to the disclosed implementations will be apparent, and the general principles defined herein may be applied to other implementations and applications without departing from the spirit and scope of the disclosed technology. Thus, the disclosed technology is not intended to be limited to the implementations shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein. The scope of the disclosed technology is defined by the appended claims.

Claims

A system for tracking inventory items in an area of real space, comprising:
a plurality of sensors and a processing system coupled to the plurality of sensors,
wherein sensors in the plurality of sensors generate respective sequences of images of corresponding fields of view in the real space, the field of view of each sensor overlapping with the field of view of at least one other sensor in the plurality of sensors, and
wherein the processing system includes logic that uses the sequences of images generated by at least two sensors in the plurality of sensors to identify inventory events and, in response to the inventory events, tracks locations of inventory items in the area of real space and counts the inventory items at the locations.

The system of claim 1, wherein the inventory event comprises an item identifier, a put or take indicator, a location represented by positions in three dimensions of the area of real space, and a timestamp.

The system according to claim 1, further comprising a data set, stored in memory, that defines a plurality of cells having coordinates in the area of real space, and
logic that matches locations of inventory items with the coordinates of cells and maintains data representing the inventory items matched with cells in the plurality of cells.

The system according to claim 3, wherein the area of real space contains a plurality of inventory locations, and
the system includes logic that matches inventory locations in the plurality of inventory locations with the coordinates of the cells in the plurality of cells that are correlated with those inventory locations.

The system according to claim 3, wherein the processing system includes logic that calculates, at a scoring time, scores for inventory items having locations matched with a particular cell, using respective counts of inventory events.

The system according to claim 5, wherein the logic for calculating the score of a cell uses the sum of puts and takes of inventory items, weighted by the separation between the timestamps of the puts and takes and the scoring time.

The system according to claim 5, wherein the processing system includes logic that renders a display image representing cells in the plurality of cells and the scores of the cells.

The system according to claim 5, wherein the processing system includes logic that selects a set of inventory items for each cell based on the scores.

The system according to claim 3, wherein the area of real space contains a plurality of inventory locations, and the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations,
wherein the memory contains the data set that defines the plurality of cells having coordinates in the area of real space, and a planogram that specifies the arrangement of inventory items at the inventory locations in the area of real space, and
wherein the logic that maintains the data representing the inventory items matched with cells in the plurality of cells includes logic that determines misplaced items by comparing the data representing the inventory items matched with the cells against the arrangement of inventory items at the inventory locations specified in the planogram.

The system according to claim 3, wherein the area of real space contains a plurality of inventory locations, and the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations, and
wherein the logic that maintains the data representing the inventory items matched with cells in the plurality of cells includes logic that determines, for a particular inventory item matched with a particular cell, whether the count of the particular inventory item in the cell is below a threshold for restocking the inventory item at the inventory location correlated with the particular cell.

A method for tracking inventory items in an area of real space, the method comprising:
using a plurality of sensors to generate respective sequences of images of corresponding fields of view in the real space, wherein the field of view of each sensor overlaps with the field of view of at least one other sensor;
identifying inventory events using the sequences of images generated by at least two sensors in the plurality of sensors; and
in response to the identification of the inventory events, tracking locations of inventory items in the area of real space and counting the inventory items at the locations.

The method of claim 11, wherein the inventory event comprises an item identifier, a put or take indicator, a location represented by positions along three axes of the area of real space, and a timestamp.

The method of claim 11, further comprising:
specifying a plurality of cells having coordinates in the area of real space;
storing the plurality of cells as a data set in memory;
matching locations of inventory items with the coordinates of cells; and
maintaining data representing the inventory items matched with cells in the plurality of cells.

The method of claim 13, wherein the area of real space contains a plurality of inventory locations, and
the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations.

The method of claim 13, further comprising calculating, at a scoring time, scores for inventory items having locations matched with a particular cell, using respective counts of inventory events.

The method of claim 15, further comprising calculating the score of a cell using the sum of puts and takes of inventory items, weighted by the separation between the timestamps of the puts and takes and the scoring time, and storing the score in memory.

The method of claim 15, further comprising rendering a display image representing cells in the plurality of cells and the scores of the cells.

The method of claim 15, further comprising selecting a set of inventory items for each cell based on the scores.

The method of claim 13, wherein the area of real space contains a plurality of inventory locations, and the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations, the method further comprising:
specifying the plurality of cells having coordinates in the area of real space, and storing the specified cells as a data set in memory;
specifying the arrangement of inventory items at the inventory locations in the area of real space as a planogram, and storing the planogram in memory;
maintaining data representing the inventory items matched with cells in the plurality of cells; and
determining misplaced items by comparing the data representing the inventory items matched with the cells against the arrangement of inventory items at the inventory locations specified in the planogram.

The method of claim 13, wherein the area of real space contains a plurality of inventory locations, and the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations, the method further comprising:
maintaining data representing the inventory items matched with cells in the plurality of cells; and
determining, for a particular inventory item matched with a particular cell, whether the count of the particular inventory item in the cell is below a threshold for restocking the inventory item at the inventory location correlated with the particular cell.

A non-transitory computer-readable storage medium storing computer program instructions for tracking inventory items in an area of real space,
wherein the instructions, when executed on a processor, implement a method comprising:
using a plurality of sensors to generate respective sequences of images of corresponding fields of view in the real space, wherein the field of view of each sensor overlaps with the field of view of at least one other sensor;
identifying inventory events using the sequences of images generated by at least two sensors in the plurality of sensors; and
in response to the identification of the inventory events, tracking locations of inventory items in the area of real space and counting the inventory items at the locations.

The non-transitory computer-readable storage medium of claim 21, wherein the inventory event comprises an item identifier, a put or take indicator, a location represented by positions along three axes of the area of real space, and a timestamp.

The non-transitory computer-readable storage medium of claim 21, wherein the method implemented further comprises:
specifying a plurality of cells having coordinates in the area of real space;
storing the plurality of cells as a data set in memory;
matching locations of inventory items with the coordinates of cells; and
maintaining data representing the inventory items matched with cells in the plurality of cells.

The non-transitory computer-readable storage medium according to claim 23, wherein the area of real space contains a plurality of inventory locations, and
the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations.

The non-transitory computer-readable storage medium of claim 24, wherein the method implemented further comprises
calculating, at a scoring time, scores for inventory items having locations matched with a particular cell, using respective counts of inventory events.

The non-transitory computer-readable storage medium according to claim 25, wherein the method implemented further comprises
calculating the score of a cell using the sum of puts and takes of inventory items, weighted by the separation between the timestamps of the puts and takes and the scoring time, and storing the score in memory.

The non-transitory computer-readable storage medium of claim 25, wherein the method implemented further comprises
rendering a display image representing cells in the plurality of cells and the scores of the cells.

The non-transitory computer-readable storage medium of claim 25, wherein the method implemented further comprises
selecting a set of inventory items for each cell based on the scores.

The non-transitory computer-readable storage medium according to claim 23, wherein the area of real space contains a plurality of inventory locations, and the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations, and
wherein the method implemented further comprises:
specifying the plurality of cells having coordinates in the area of real space, and storing the specified cells as a data set in memory;
specifying the arrangement of inventory items at the inventory locations in the area of real space as a planogram, and storing the planogram in memory;
maintaining data representing the inventory items matched with cells in the plurality of cells; and
determining misplaced items by comparing the data representing the inventory items matched with the cells against the arrangement of inventory items at the inventory locations specified in the planogram.

The non-transitory computer-readable storage medium according to claim 23, wherein the area of real space contains a plurality of inventory locations, and the coordinates of cells in the plurality of cells are correlated with inventory locations in the plurality of inventory locations, and
wherein the method implemented further comprises:
maintaining data representing the inventory items matched with cells in the plurality of cells; and
determining, for a particular inventory item matched with a particular cell, whether the count of the particular inventory item in the cell is below a threshold for restocking the inventory item at the inventory location correlated with the particular cell.