JP2023170655A

JP2023170655A - information processing system

Info

Publication number: JP2023170655A
Application number: JP2022082553A
Authority: JP
Inventors: 成吉谷井; Seikichi Tanii
Original assignee: Marketvision Co Ltd
Current assignee: Marketvision Co Ltd
Priority date: 2022-05-19
Filing date: 2022-05-19
Publication date: 2023-12-01
Anticipated expiration: 2042-05-19
Also published as: JP7339630B1

Abstract

PROBLEM TO BE SOLVED: To provide an information processing system that improves recognition accuracy when identifying a product from image information. SOLUTION: An information processing system for identifying a product shown in image information includes a first learning processing unit that creates a first learning model by machine learning using first annotation data, in which data obtained by masking the inside of the closed area bounded by the product's outline is associated with an attribute, and a recognition processing unit that identifies product identification information of a product shown in image information obtained by photographing a display shelf. The recognition processing unit uses the image information and the first learning model to identify the outer shape of the product shown in the image information as a face area, and identifies the product identification information of the product shown in the identified face area. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing system that keeps recognition accuracy from degrading when products are identified from image information.

BACKGROUND ART: In stores such as convenience stores and supermarkets, products are generally placed on display shelves for sale. Products may be displayed several abreast in the horizontal direction so that they catch the buyer's eye, or lined up in the vertical direction so that, even after one unit is purchased, other customers can still buy the same product. Managing where on the display shelves each product is displayed, and in what quantity, is important for product sales strategy.

To grasp how products are actually displayed in a store, one approach is to photograph the display shelves with a photographing device and automatically identify the displayed products from the photographed image information. For example, image recognition can be applied to an image of a store's display shelves on the basis of sample images of each product. Prior art of this kind includes Patent Document 1 and Patent Document 2 below.

Patent Document 1: Japanese Unexamined Patent Application Publication No. H05-342230
Patent Document 2: Japanese Unexamined Patent Application Publication No. H05-334409

The invention of Patent Document 1 is a system that helps even people without specialist knowledge decide which display shelf a product should be placed on. It can therefore tell where a product is to be displayed, but it does not identify the products that are actually on display. Patent Document 2 describes a system that supports the input of product images within a planogram support system, which assists with product display. The system of Patent Document 2 only supports the input of product images when the planogram support system is used, so even with this system the actual display status of products cannot be grasped.

Apart from Patent Document 1 and Patent Document 2, there are also techniques that use image recognition processing to identify displayed products from image information obtained by photographing display shelves. These are useful in that they make it possible to grasp the actual display status in a store.

In the prior art, when a product is identified by image recognition processing, a rectangular area presumed to contain a product is detected in the image information of a photographed display shelf. The product is then identified either by matching that rectangular area against sample images of products or by running deep learning processing with the rectangular area as the input.

However, the shape (outline) of a product is not necessarily rectangular. When the area presumed to contain a product is detected as a rectangle, part of another product may appear inside that rectangle, and so may the background. If such a rectangular area is used as training data for deep learning, as an image to be processed, or as the input to image matching, it lowers the accuracy of product identification.
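The effect described above can be made concrete with a small calculation. The sketch below is illustrative only: the 5x5 silhouette and the function names are assumptions introduced here, not part of the patent. It measures what fraction of a product's bounding rectangle is occupied by pixels that do not belong to the product.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the product pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def non_product_ratio(mask):
    """Fraction of the bounding rectangle that is NOT covered by the product."""
    top, left, bottom, right = bounding_box(mask)
    box_area = (bottom - top + 1) * (right - left + 1)
    product_area = sum(1 for row in mask for value in row if value)
    return 1 - product_area / box_area


# Hypothetical silhouette of a bottle-like product (1 = product pixel).
bottle = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]

print(bounding_box(bottle))
print(round(non_product_ratio(bottle), 3))
```

For this toy shape, 4 of the 15 pixels in the rectangle lie outside the product. In a shelf photograph those pixels would hold neighbouring products or background, which is the contamination the paragraph above refers to.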

In view of the above problem, the present inventor has invented an information processing system that improves identification accuracy when displayed products are identified from image information.

A first invention is an information processing system for identifying a product shown in image information, comprising: a first learning processing unit that creates a first learning model by machine learning using first annotation data, in which data obtained by taking the outline of a product as its outer shape and masking the inside of the resulting closed area is associated with an attribute; and a recognition processing unit that identifies product identification information of a product shown in image information obtained by photographing a display shelf, wherein the recognition processing unit uses the image information and the first learning model to identify the outer shape of the product shown in the image information as a face area, and identifies the product identification information of the product shown in the identified face area.

As in the present invention, when product identification information is identified using a learning model trained on first annotation data, which is based on data masked along the outline of the product, the face area cut out of the image information of a photographed display shelf can follow the outer shape of the product instead of being a conventional rectangular area. This reduces the amount of other products and background that appear in the face area and improves the accuracy of product identification.

In the above invention, the information processing system may be configured such that the first learning processing unit creates the first learning model by machine learning based on image segmentation using the first annotation data.

When machine learning is performed, learning by an image segmentation method is preferable.

In the above invention, the information processing system may be configured to further comprise a second learning processing unit that creates a second learning model by machine learning using second annotation data, in which image data of a product is associated with product identification information, wherein the recognition processing unit uses the image information of the identified face area and the second learning model to identify the product identification information of the product shown in the face area.

In the above invention, the information processing system may be configured such that the recognition processing unit identifies the product identification information of the product shown in the face area by performing image matching between the image information of the identified face area and sample information of products stored in a sample information storage unit.

Because the face area is an area whose outer shape is the outline of the product, it is preferable to identify the product identification information by processing such as that of these inventions.
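The two-stage flow described above (identify face areas with the first learning model, then identify the product in each face area) can be sketched as follows. This is a minimal illustration under stated assumptions: `FaceRegion`, `cut_out`, and `recognize` are hypothetical names, images are plain nested lists, and both models are passed in as ordinary callables rather than real trained networks.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[int]]   # assumed: grayscale image as rows of pixel values
Mask = List[List[bool]]   # assumed: True where a pixel belongs to the product


@dataclass
class FaceRegion:
    mask: Mask       # product outline filled in, same size as the image
    attribute: str   # e.g. "can" or "pouch container"


def cut_out(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Keep pixels inside the mask; replace everything else with `fill`."""
    return [
        [pixel if inside else fill for pixel, inside in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


def recognize(
    image: Image,
    first_model: Callable[[Image], List[FaceRegion]],
    second_model: Callable[[Image], str],
) -> List[str]:
    """Stage 1 finds face areas; stage 2 identifies the product in each."""
    product_ids = []
    for face in first_model(image):
        face_image = cut_out(image, face.mask)
        product_ids.append(second_model(face_image))
    return product_ids
```

The second stage could equally be an image matching step against stored sample information; only the `second_model` callable would change. Filling the pixels outside the mask is one possible way to suppress neighbouring products, chosen here for simplicity.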

A fifth invention is an information processing system for identifying a product shown in image information, comprising a first learning processing unit that creates a first learning model by machine learning using first annotation data, in which data obtained by taking the outline of a product as its outer shape and masking the inside of the resulting closed area is associated with an attribute, wherein the first learning model and image information obtained by photographing a display shelf are used to cause the outer shape of the product shown in the image information to be identified as a face area, and to cause the product identification information of the product shown in the identified face area to be identified.

A sixth invention is an information processing system for identifying a product shown in image information, comprising: an image information input reception processing unit that receives input of image information obtained by photographing a display shelf; and a displayed product recognition processing unit that identifies product identification information of the products shown, from the received image information or from image information obtained by normalizing it, wherein the displayed product recognition processing unit uses a first learning model, created by machine learning using first annotation data in which data obtained by taking the outline of a product as its outer shape and masking the inside of the resulting closed area is associated with an attribute, together with the received image information or the normalized image information, to identify the outer shape of each product shown as a face area, and identifies the product identification information of the product shown in the identified face area.

These configurations provide the same technical effects as the first invention.

The first invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an information processing program that causes a computer to function as: a first learning processing unit that creates a first learning model by machine learning using first annotation data, in which data obtained by taking the outline of a product as its outer shape and masking the inside of the resulting closed area is associated with an attribute; and a recognition processing unit that identifies product identification information of a product shown in image information obtained by photographing a display shelf, wherein the recognition processing unit uses the image information and the first learning model to identify the outer shape of the product shown in the image information as a face area, and identifies the product identification information of the product shown in the identified face area.

The fifth invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an information processing program that causes a computer to function as a first learning processing unit that creates a first learning model by machine learning using first annotation data, in which data obtained by taking the outline of a product as its outer shape and masking the inside of the resulting closed area is associated with an attribute, wherein the first learning model and image information obtained by photographing a display shelf are used to cause the outer shape of the product shown in the image information to be identified as a face area, and to cause the product identification information of the product shown in the identified face area to be identified.

The sixth invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an information processing program that causes a computer to function as: an image information input reception processing unit that receives input of image information obtained by photographing a display shelf; and a displayed product recognition processing unit that identifies product identification information of the products shown, from the received image information or from image information obtained by normalizing it, wherein the displayed product recognition processing unit uses a first learning model, created by machine learning using first annotation data in which data obtained by taking the outline of a product as its outer shape and masking the inside of the resulting closed area is associated with an attribute, together with the received image information or the normalized image information, to identify the outer shape of each product shown as a face area, and identifies the product identification information of the product shown in the identified face area.

Using the information processing system of the present invention makes it possible to improve identification accuracy when displayed products are identified from image information.

FIG. 1 is a block diagram schematically showing an example of the configuration of the information processing system of the present invention.
FIG. 2 is a block diagram schematically showing an example of the configuration of the displayed product recognition processing unit in the information processing system of the present invention.
FIG. 3 is a block diagram schematically showing an example of the hardware configuration of a computer used in the information processing system of the present invention.
FIG. 4 is a flowchart showing an example of the learning process in the information processing system of the present invention.
FIG. 5 is a flowchart showing an example of the recognition process in the information processing system of the present invention.
FIG. 6 is a diagram schematically showing an example of first annotation data.
FIG. 7 is a diagram schematically showing another example of first annotation data.
FIG. 8 is a diagram schematically showing an example of second annotation data.
FIG. 9 is a diagram schematically showing another example of second annotation data.
FIG. 10 is a diagram showing an example of photographed image information.
FIG. 11 is a diagram showing another example of photographed image information.
FIG. 12 is a diagram showing an example of image information obtained by normalizing the photographed image information of FIG. 10.
FIG. 13 is a diagram showing an example of image information obtained by normalizing the photographed image information of FIG. 11.
FIG. 14 is a diagram schematically showing a state in which input specifying shelf tier areas has been received for normalized image information, obtained by normalizing image information of a photographed display shelf on which products are displayed.
FIG. 15 is a diagram schematically showing a state in which input specifying shelf tier areas has been received for normalized image information, obtained by normalizing image information of a photographed display shelf on which products are hung for display.
FIG. 16 is a diagram showing an example in which face areas have been identified from the image information of a shelf tier area.
FIG. 17 is a diagram showing an example of photographed image information.
FIG. 18 is a diagram showing an example of normalized image information obtained by applying normalization processing to the photographed image information of FIG. 17.
FIG. 19 is a block diagram schematically showing an example of the configuration of the information processing system in Example 2.
FIG. 20 is a diagram showing an example of sample information stored in the sample information storage unit.
FIG. 21 is a block diagram schematically showing an example of the displayed product recognition processing unit in Example 3.

FIGS. 1 and 2 are block diagrams showing an example of the processing functions of the information processing system 1 of the present invention. The information processing system 1 uses a management terminal 2 and an image information input terminal 3. FIG. 1 is a block diagram showing the overall functions of the information processing system 1, and FIG. 2 is a block diagram showing the functions of the displayed product recognition processing unit 214, which is described later.

The management terminal 2 is a computer used by an organization, such as a company, that operates the information processing system 1. The image information input terminal 3 is a terminal used to input image information obtained by photographing the display shelves of a store.

The management terminal 2 and the image information input terminal 3 of the information processing system 1 are realized using computers. FIG. 3 schematically shows an example of the hardware configuration of such a computer. The computer has an arithmetic device 70 such as a CPU that executes the arithmetic processing of programs, a storage device 71 such as RAM or a hard disk that stores information, a display device 72 such as a display that shows information, an input device 73 such as a keyboard or mouse through which information can be entered, and a communication device 74 that sends and receives the processing results of the arithmetic device 70 and the information stored in the storage device 71 over a network such as the Internet or a LAN.

When the computer has a touch panel display, the display device 72 and the input device 73 may be configured as one unit. Touch panel displays are often used in portable communication terminals such as tablet computers and smartphones, but are not limited to these.

A touch panel display integrates the functions of the display device 72 and the input device 73 in that input can be made directly on the display with a given input device (such as a touch panel pen) or a finger.

In addition to the devices above, the image information input terminal 3 may include a photographing device such as a camera. A portable communication terminal such as a mobile phone, smartphone, or tablet computer can also be used as the image information input terminal 3.

The means of the present invention are distinguished only logically by function, and may physically or in practice occupy the same area. The order of the processing performed by each means can be changed as appropriate, and part of the processing may be omitted. For example, the normalization processing described later can be omitted, in which case processing is performed on image information that has not been normalized.

The information processing system 1 has a learning processing unit 20 and a recognition processing unit 21. The learning processing unit 20 has a first learning processing unit 201 and a second learning processing unit 202.

The first learning processing unit 201 uses first annotation data to perform machine learning on image information of photographed display shelves, preferably learning by an image segmentation method. This learning is the training step of machine learning: for example, training by image segmentation is executed in order to create a learning model based on deep learning.

First annotation data is data in which the outline of a product that may be displayed on a display shelf is taken as its outer shape, and the data obtained by masking the inside of the closed area bounded by that outline is associated with a tag (label) that classifies the attribute of the outline. There need not be exactly one item of first annotation data per product; a single product may have several. That is, the outline of the product as seen from each of several directions may be masked and associated with an attribute, and each result used as first annotation data for that product. An attribute is, for example, the classification of the product's container or the product identification information (such as a JAN code) of the product. The container classification may be a type of container, such as a can, bottle, box, or pouch container, or a finer subdivision by use, such as a detergent container. In short, an attribute is anything that indicates how the closed area bounded by the product's outline is classified. Examples of first annotation data are shown in FIGS. 6 and 7. The product identification information is not limited to a JAN code and may be any information that uniquely identifies the product.

FIG. 6 shows first annotation data in which the outline of a can, with the inside of the closed area bounded by that outline masked, is associated with the attribute "can". FIG. 7 shows first annotation data in which the outline of a shampoo refill, with the inside of the closed area bounded by that outline masked, is associated with the attribute "pouch container". When the product can be identified from its outline alone, product identification information such as a JAN code may be used as the attribute in the first annotation data instead of the container classification.
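One way to picture first annotation data is as an outline polygon plus an attribute label, from which the masked closed area is derived. The sketch below is an assumption-laden illustration: the patent does not specify a storage format, and `FirstAnnotation`, `point_in_polygon`, and `fill_mask` are hypothetical names. The mask is filled with a standard even-odd ray casting test at each pixel centre.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class FirstAnnotation:
    outline: List[Point]  # vertices (x, y) of the product's closed outline
    attribute: str        # e.g. "can", "pouch container", or a JAN code


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Even-odd rule: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def fill_mask(annotation: FirstAnnotation, width: int, height: int):
    """Mask the inside of the closed area, sampling each pixel at its centre."""
    return [
        [point_in_polygon(c + 0.5, r + 0.5, annotation.outline) for c in range(width)]
        for r in range(height)
    ]


# Hypothetical annotation: a can seen from the front is roughly a rectangle.
can = FirstAnnotation(outline=[(1, 0), (4, 0), (4, 5), (1, 5)], attribute="can")
mask = fill_mask(can, width=5, height=5)
```

Several such annotations, one per viewing direction, could be attached to the same product, as the text above allows.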

By executing the machine learning training with the first annotation data, the first learning processing unit 201 creates a learning model (the first learning model) for identifying the areas of product outlines in image information of a photographed display shelf.

The second learning processing unit 202 executes machine learning training with second annotation data to create a learning model (the second learning model) for identifying, from given image information, preferably the image information of a face area described later, the product identification information of the product in that area. The training is preferably performed by an image classification method, but methods such as object detection or image classification and localization may also be used.

Second annotation data is data in which image information of a product that may be displayed on a display shelf is associated with the product identification information of that product as a tag (label). Examples of second annotation data are shown in FIGS. 8 and 9. As with first annotation data, there need not be exactly one item of second annotation data per product; there may be several. That is, the product may be photographed from several directions, and the image information from each direction associated with the product identification information to form second annotation data.

FIG. 8 shows second annotation data in which image information of a can is associated with product identification information, and FIG. 9 shows second annotation data in which image information of a shampoo refill is associated with product identification information. The second annotation data of FIG. 8 corresponds to the first annotation data of FIG. 6, and the second annotation data of FIG. 9 corresponds to the first annotation data of FIG. 7.
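Second annotation data can be pictured as (image, product identification information) pairs, with several views allowed per product. The following sketch only illustrates that shape; `SecondAnnotation`, the file paths, and the identifiers are hypothetical placeholders, not values from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SecondAnnotation:
    image_path: str  # hypothetical path to a product image
    product_id: str  # product identification information, e.g. a JAN code


def views_per_product(annotations: Iterable[SecondAnnotation]) -> Dict[str, List[str]]:
    """Group image paths by product so each product can have several views."""
    grouped = defaultdict(list)
    for annotation in annotations:
        grouped[annotation.product_id].append(annotation.image_path)
    return dict(grouped)


# Placeholder data: two views of one product, one view of another.
annotations = [
    SecondAnnotation("can_front.png", "PRODUCT-A"),
    SecondAnnotation("can_side.png", "PRODUCT-A"),
    SecondAnnotation("refill_front.png", "PRODUCT-B"),
]
print(views_per_product(annotations))
```

Grouping by product identifier makes it easy to check that every product has at least one view before training an image classifier on the pairs.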

The recognition processing unit 21 has an image information input reception processing unit 210, an image information storage unit 211, an image information normalization processing unit 212, a shelf tier identification processing unit 213, and a displayed product recognition processing unit 214.

The image information input reception processing unit 210 receives input of image information of a store's display shelves photographed with the image information input terminal 3 (photographed image information) and stores it in the image information storage unit 211 described later. Along with the photographed image information, it is preferable to also receive from the image information input terminal 3 the date and time of photographing, store identification information such as the store name, and image information identification information that identifies the image information. FIGS. 10 and 11 show examples of photographed image information, in each of which the display shelf has three shelf tiers with products displayed on them. Although the processing is not spelled out for each step in the present invention, display shelves and shelf tiers are often long in the horizontal direction. The image may therefore be divided at a fixed width, with each section treated as the target of each process.
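Dividing a horizontally long shelf image at a fixed width, as mentioned above, can be sketched as follows. The function name and the pixel values are assumptions for illustration; the patent does not state how the width is chosen.

```python
from typing import List, Tuple


def split_into_sections(image_width: int, section_width: int) -> List[Tuple[int, int]]:
    """Return (left, right) column ranges, right-exclusive, covering the image."""
    if section_width <= 0:
        raise ValueError("section_width must be positive")
    return [
        (left, min(left + section_width, image_width))
        for left in range(0, image_width, section_width)
    ]


# Hypothetical 2500-pixel-wide shelf image processed in 1000-pixel sections.
print(split_into_sections(2500, 1000))
```

A product that straddles a section boundary would be cut in two by this simple scheme. Overlapping sections or merging results across boundaries are possible refinements, but they go beyond what the text describes.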

画像情報記憶部２１１は、画像情報入力端末３から受け付けた撮影画像情報、撮影日時、店舗識別情報、画像情報識別情報などを対応づけて記憶する。撮影画像情報とは、本発明の処理対象となる画像情報であればよい。一般的には、単に撮影した場合、撮影対象物を正対した状態で撮影することが困難であることから、それを正対した状態に補正する補正処理、たとえば台形補正処理などを実行することがよい。一つの陳列棚を複数枚で撮影した場合に、それが一つの画像情報として合成された画像情報も含まれる。また、歪み補正処理が実行された後の画像情報も撮影画像情報に含まれる。 The image information storage section 211 stores the photographed image information, photographing date and time, store identification information, image information identification information, and the like received from the image information input terminal 3 in association with each other. The photographed image information may be any image information to be processed by the present invention. Generally, when a picture is simply taken, it is difficult to photograph the subject squarely from the front, so it is preferable to execute correction processing, such as trapezoidal correction processing, that corrects the image to a front-facing state. When one display shelf is photographed in multiple shots, image information obtained by combining them into a single image is also included. Image information after distortion correction processing has been executed is likewise included in the photographed image information.

画像情報正置化処理部２１２は、画像情報記憶部２１１に記憶した撮影画像情報に対して、撮影対象物が正対した状態になるように補正する処理（正置化処理）、たとえば台形補正処理を実行した正置画像情報を生成する。台形補正処理は、撮影画像情報に写っている陳列棚の棚段が水平になるように行う補正処理である。正置化とは、撮影装置のレンズの光軸を撮影対象である平面の垂線方向に沿って、十分に遠方から撮影した場合と同じになるように画像情報を変形させることであり、たとえば台形補正処理があるが、それに限定するものではない。 The image information uprighting processing section 212 generates upright image information by executing, on the photographed image information stored in the image information storage section 211, a process (uprighting process) that corrects the image so that the photographed object faces the camera squarely, for example trapezoidal correction processing. The trapezoidal correction process is performed so that the shelves of the display shelf shown in the photographed image information become horizontal. Uprighting means transforming the image information so that it matches an image photographed from a sufficiently distant position with the optical axis of the lens of the photographing device along the normal to the plane being photographed. Trapezoidal correction processing is one example, but uprighting is not limited to it.

画像情報正置化処理部２１２が実行する台形補正処理は、撮影画像情報において４頂点の指定の入力を受け付け、その各頂点を用いて台形補正処理を実行する。指定を受け付ける４頂点としては、陳列棚の棚段の４頂点であってもよいし、陳列棚の棚位置の４頂点であってもよい。また、２段、３段の棚段のまとまりの４頂点であってもよい。４頂点としては任意の４点を指定できる。図１２に図１０の撮影画像情報を、図１３に図１１の撮影画像情報をそれぞれ正置化した撮影画像情報（正置画像情報）の一例を示す。 In the trapezoidal correction process executed by the image information uprighting processing section 212, input designating four vertices in the photographed image information is received, and the trapezoidal correction process is executed using those vertices. The four designated vertices may be the four vertices of a shelf of the display shelf, or the four vertices of the shelf position of the display shelf. They may also be the four vertices of a group of two or three shelves. Any four points can be designated as the four vertices. FIG. 12 shows an example of upright image information obtained by uprighting the photographed image information of FIG. 10, and FIG. 13 shows an example obtained by uprighting the photographed image information of FIG. 11.
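One common way to realize a four-vertex trapezoidal correction is a projective transform (homography) that maps the four designated vertices onto the corners of an upright rectangle. The sketch below is an assumption about how such a step could be implemented, not the method fixed by the specification. The function names, the example vertex coordinates, and the 800 x 600 output size are all hypothetical, and only the coordinate mapping is shown, not the pixel resampling.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate vertices: no unique transform")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 projective transform that maps four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, point):
    """Map one (x, y) point through the transform."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical designated vertices (clockwise from upper left) and target rectangle.
corners = [(120, 80), (900, 40), (960, 700), (60, 640)]
upright = [(0, 0), (800, 0), (800, 600), (0, 600)]
H = homography(corners, upright)
```

To produce the upright image itself, each output pixel would be filled by sampling the photographed image at the position given by the inverse mapping. Image libraries usually provide this as a single warp operation.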

棚段特定処理部２１３は、画像情報正置化処理部２１２において撮影画像情報に対して台形補正処理を実行した正置画像情報のうち、商品が配置される可能性のある棚段の領域（棚段領域）を特定する。撮影画像情報および正置画像情報には陳列棚が写っているが、陳列棚には、商品が陳列される棚段領域がある。そのため、正置画像情報から棚段領域を特定する。棚段領域の特定としては、管理端末２の操作者が手動で棚段領域を指定し、それを棚段特定処理部２１３が受け付けてもよいし、初回に手動で入力を受け付けた棚段領域の情報に基づいて、二回目以降は自動で棚段領域を特定してもよい。 The shelf identification processing section 213 identifies, within the upright image information obtained when the image information uprighting processing section 212 executed trapezoidal correction processing on the photographed image information, the area of the shelves where products may be placed (shelf area). A display shelf appears in the photographed image information and the upright image information, and the display shelf has a shelf area where products are displayed. The shelf area is therefore identified from the upright image information. The shelf area may be identified by having the operator of the management terminal 2 manually designate it and the shelf identification processing section 213 accept that designation, or it may be identified automatically from the second time onward based on the shelf area information that was manually input the first time.

図１４に、飲料缶などの商品が陳列されている陳列棚を撮影した画像情報を正置化した正置画像情報に対して、棚段領域の指定の入力を受け付けた状態を模式的に示す。また、図１５に、歯ブラシなどの商品が吊り下げられて陳列されている陳列棚を撮影した画像情報を正置化した正置画像情報に対して、棚段領域の指定の入力を受け付けた状態を模式的に示す。 FIG. 14 schematically shows a state in which input designating the shelf area has been received for upright image information obtained by uprighting image information of a display shelf on which products such as beverage cans are displayed. FIG. 15 schematically shows a state in which input designating the shelf area has been received for upright image information obtained by uprighting image information of a display shelf on which products such as toothbrushes are hung and displayed.

なお、棚段特定処理部２１３は、棚段領域を特定する際に、深層学習（ディープラーニング）を用いて棚段領域を特定してもよい。この場合、中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデルに対して、上記正置画像情報を入力し、その出力値に基づいて、棚段領域を特定してもよい。また学習モデルとしては、さまざまな正置画像情報に棚段領域を正解データとして与えたものを用いることができる。 Note that the shelf identification processing section 213 may identify the shelf area using deep learning. In this case, the upright image information may be input to a learning model in which the weighting coefficients between neurons in each layer of a neural network with many intermediate layers have been optimized, and the shelf area may be identified based on the output value. As the learning model, one trained on various upright image information with shelf areas given as correct-answer data can be used.

棚段特定処理部２１３で特定した棚段領域は、その画像情報を棚段領域画像情報として特定する。棚段特定処理部２１３は、実際に、画像情報として切り出してもよいし、実際には画像情報としては切り出さずに、領域の画像情報を座標などで特定するなどによって、仮想的に切り出すのでもよい。なお、陳列棚に棚段が複数ある場合には、それぞれが棚段領域画像情報として切り出される。また棚段の領域を示す座標としては、その領域を特定するための頂点の座標であり、正置画像情報におけるたとえば４点、右上と左下、左上と右下の２点の座標などでよい。また、正置画像情報における陳列棚など、画像情報における所定箇所（たとえば陳列棚の左上の頂点）を基準とした相対座標である。なお、本明細書において画像情報を切り出すとは、棚段特定処理部２１３における切り出しと同様に、実際に、画像情報として切り出してもよいし、実際には画像情報としては切り出さずに、領域の画像情報を座標などで特定するなどによって、仮想的に切り出すのでもよい。 The image information of the shelf area identified by the shelf identification processing section 213 is specified as shelf area image information. The shelf identification processing section 213 may actually cut it out as image information, or may cut it out virtually, without actually cutting out image information, by specifying the image information of the area with coordinates or the like. If the display shelf has multiple shelves, each is cut out as shelf area image information. The coordinates indicating a shelf area are the coordinates of vertices that specify the area, and may be, for example, the coordinates of four points, or of two points such as the upper right and lower left or the upper left and lower right, in the upright image information. They are relative coordinates based on a predetermined location in the image information (for example, the upper left vertex of the display shelf), such as the display shelf in the upright image information. In this specification, cutting out image information means, as with the cutting out in the shelf identification processing section 213, either actually cutting it out as image information or cutting it out virtually by specifying the image information of the area with coordinates or the like.
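The difference between an actual cut-out and a virtual cut-out can be illustrated with a small sketch. The `Region` class, its field names, and the tiny nested-list "image" below are assumptions made for the example. The specification only requires that an area can be specified by vertex coordinates, optionally relative to a reference point such as the upper left vertex of the display shelf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular area given by two vertices (upper left, lower right)."""
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive

    def relative_to(self, origin):
        """Same area expressed relative to a reference point (ox, oy)."""
        ox, oy = origin
        return Region(self.left - ox, self.top - oy,
                      self.right - ox, self.bottom - oy)


def crop(image, region):
    """Actual cut-out: copy the pixels inside the region (image is a list of rows)."""
    return [row[region.left:region.right]
            for row in image[region.top:region.bottom]]


# Hypothetical 6 x 4 image whose pixel value encodes its position (10*y + x).
image = [[10 * y + x for x in range(6)] for y in range(4)]

# Virtual cut-out: only the coordinates are kept, no pixels are copied.
shelf = Region(left=1, top=1, right=5, bottom=3)

# The pixels are materialized only when a later step needs them.
shelf_pixels = crop(image, shelf)
```

Keeping only coordinates avoids duplicating pixel data for every shelf and face, at the cost of having to keep the source image available.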

陳列商品認識処理部２１４は、画像情報、好ましくは撮影画像情報若しくは正置画像情報に写っている陳列棚から、陳列されている商品を認識する処理を実行する。 The displayed product recognition processing section 214 executes a process of recognizing the displayed products from the display shelf shown in image information, preferably the photographed image information or the upright image information.

陳列商品認識処理部２１４は、フェイス特定処理部２１４１と商品同定処理部２１４２とを有する。 The displayed product recognition processing section 214 includes a face identification processing section 2141 and a product identification processing section 2142.

フェイス特定処理部２１４１は、正置画像情報における棚段領域における棚段ごとに、フェイスの領域（フェイス領域）を特定する。フェイスとは商品が置かれる領域であって、その商品が置かれているか否かは問わない。フェイス領域の大きさは、そこに置かれるべき商品と同一または略同一の大きさである。 The face identification processing section 2141 identifies a face area for each shelf in the shelf area of the upright image information. A face is an area where a product is placed, regardless of whether a product is actually placed there. The size of the face area is the same or approximately the same as that of the product to be placed there.

フェイス特定処理部２１４１は、第１の学習処理部２０１において学習させた学習モデル（第１の学習モデル）に、撮影画像情報、正置画像情報若しくは棚段領域の画像情報を入力値として入力し、入力した画像情報からフェイス領域を特定する。すなわち、フェイス特定処理部２１４１は、第１の学習処理部２０１において学習させた学習モデル（中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデル）に対して、処理対象とする領域、たとえば棚段領域の画像情報を入力し、その出力値に基づいて、フェイスの領域を特定する。特定したフェイスの領域については、フェイス領域を識別するフェイス識別情報を割り当てて、撮影画像情報、正置画像情報若しくは棚段領域の画像情報における位置情報（たとえば画像情報における座標）とともに商品識別情報記憶部２２に記憶させる。 The face identification processing section 2141 inputs the photographed image information, the upright image information, or the image information of the shelf area as an input value to the learning model trained by the first learning processing section 201 (first learning model), and identifies face areas from the input image information. That is, the face identification processing section 2141 inputs image information of the area to be processed, for example a shelf area, to the learning model trained by the first learning processing section 201 (a learning model in which the weighting coefficients between neurons in each layer of a neural network with many intermediate layers have been optimized), and identifies the face areas based on the output value. Each identified face area is assigned face identification information that identifies it, and is stored in the product identification information storage section 22 together with its position information (for example, coordinates in the image information) in the photographed image information, the upright image information, or the image information of the shelf area.

図１６に、棚段領域の画像情報からフェイス領域を特定した場合の一例を示す。図１６（ａ）は第１の学習処理部２０１により学習させた学習モデルに対して入力する棚段領域の画像情報の一例であり、図１６（ｂ）は図１６（ａ）で入力値とした棚段領域の画像情報において、上記学習モデルを用いてフェイス領域を特定した状態の一例を示す図である。図１６（ｂ）では棚段領域においてフェイス領域を特定した状態を重畳して示しているが、特定したフェイス領域の画像情報をそのまま切り出して出力をしてもよい。 FIG. 16 shows an example in which face areas are identified from the image information of a shelf area. FIG. 16(a) is an example of the image information of a shelf area input to the learning model trained by the first learning processing section 201, and FIG. 16(b) shows an example of a state in which face areas have been identified, using that learning model, in the image information of the shelf area used as the input value in FIG. 16(a). FIG. 16(b) shows the identified face areas superimposed on the shelf area, but the image information of the identified face areas may instead be cut out and output as is.

商品同定処理部２１４２は、フェイス領域に表示されている商品の商品識別情報を、第２の学習処理部２０２において学習させた学習モデル（第２の学習モデル）に、フェイス領域の画像情報を入力値として入力し、入力された画像情報からその領域にある商品の識別情報を同定する。すなわち、商品同定処理部２１４２は、第２の学習処理部２０２において学習させた学習モデル（中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデル）に対して、処理対象とするフェイス領域の画像情報を入力し、その出力値に基づいて、フェイス領域にある商品の商品識別情報を同定する。 The product identification processing section 2142 inputs the image information of a face area as an input value to the learning model trained by the second learning processing section 202 (second learning model), and identifies, from the input image information, the product identification information of the product displayed in that face area. That is, the product identification processing section 2142 inputs the image information of the face area to be processed to the learning model trained by the second learning processing section 202 (a learning model in which the weighting coefficients between neurons in each layer of a neural network with many intermediate layers have been optimized), and identifies the product identification information of the product in the face area based on the output value.

陳列商品認識処理部２１４は、フェイス特定処理部２１４１、商品同定処理部２１４２の処理をまとめて深層学習などによって実行してもよい。 The displayed product recognition processing section 214 may perform the processing of the face identification processing section 2141 and the product identification processing section 2142 together using deep learning or the like.

商品識別情報記憶部２２は、陳列棚の棚段の各フェイスに表示されている商品の商品識別情報を示す情報を記憶する。たとえば、商品識別情報に対応付けて、撮影日時情報、店舗情報、撮影画像情報の画像情報識別情報、正置画像情報の画像識別情報、フェイスを識別するためのフェイス識別情報に対応づけて商品識別情報記憶部２２に記憶する。 The product identification information storage section 22 stores information indicating the product identification information of the product displayed in each face of the shelves of the display shelf. For example, the product identification information is stored in the product identification information storage section 22 in association with photographing date and time information, store information, the image information identification information of the photographed image information, the image information identification information of the upright image information, and face identification information for identifying the face.
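The association described above can be pictured as one record per face. The sketch below is only an illustration of that association: the class names, field names, key choice, and all identifier values are hypothetical, and the specification does not define a storage schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FaceRecord:
    """One stored association for a single face (all field names are illustrative)."""
    product_id: Optional[str]   # None while the product has not been identified
    shot_at: str                # photographing date and time
    store_id: str               # store identification information
    captured_image_id: str      # ID of the photographed image information
    upright_image_id: str       # ID of the upright image information
    face_id: str                # face identification information


class ProductIdStore:
    """In-memory stand-in for the product identification information storage section."""

    def __init__(self):
        self._records = {}

    def put(self, record):
        # Assumed key: a face ID is unique within one upright image.
        self._records[(record.upright_image_id, record.face_id)] = record

    def get(self, upright_image_id, face_id):
        return self._records.get((upright_image_id, face_id))


store = ProductIdStore()
store.put(FaceRecord("P-001", "2022-05-19T10:00", "S-01", "IMG-1", "UP-1", "F001"))
```

Storing a later correction for the same face simply overwrites the record under the same key.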

つぎに本発明の情報処理システム１の処理プロセスの一例を図４および図５のフローチャートを用いて説明する。なお、以下の説明では、撮影画像情報から陳列している商品の商品識別情報を同定する場合を説明する。 Next, an example of the processing process of the information processing system 1 of the present invention will be explained using the flowcharts of FIGS. 4 and 5. In the following description, a case will be described in which product identification information of a displayed product is identified from photographed image information.

まず、本発明の情報処理システム１の認識処理部２１で用いる学習モデルを学習するための、学習処理を、図４のフローチャートを用いて説明する。 First, a learning process for learning a learning model used in the recognition processing unit 21 of the information processing system 1 of the present invention will be described using the flowchart of FIG. 4.

第１の学習処理部２０１における学習モデルの教師データとして、第１のアノテーションデータを作成する（Ｓ１００）。第１のアノテーションデータは、陳列棚に陳列される可能性のある商品の輪郭により外形を形成し、その外形の内側の閉領域をマスク処理した画像データとする。この画像データに、属性をタグとして対応づけて作成する（図６、図７参照）。 First annotation data is created as training data for the learning model in the first learning processing unit 201 (S100). The first annotation data is image data in which an outer shape is formed by the outline of a product that may be displayed on a display shelf, and a closed area inside the outer shape is masked. This image data is created by associating attributes with tags (see FIGS. 6 and 7).
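Creating first annotation data amounts to filling the closed area inside the product outline and attaching an attribute tag. The sketch below shows one way to do this, under the assumption that the outline is available as a polygon of (x, y) vertices. The function names, the dictionary layout, and the attribute value "can" are hypothetical, and the even-odd ray-casting test is a swapped-in standard technique, not something the specification names.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd ray casting: is the point inside the closed outline?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            # x position where this edge crosses the horizontal line through py
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def make_first_annotation(width, height, outline, attribute):
    """Mask the closed area inside the outline and tag it with an attribute."""
    mask = [[1 if point_in_polygon(x + 0.5, y + 0.5, outline) else 0
             for x in range(width)]
            for y in range(height)]
    return {"mask": mask, "attribute": attribute}


# Hypothetical outline of a product in a 6 x 4 image, tagged with the attribute "can".
ann = make_first_annotation(6, 4, [(1, 1), (5, 1), (5, 3), (1, 3)], "can")
```

Pixels are tested at their centers, so a pixel belongs to the mask only when its center lies inside the outline.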

同様に、第２の学習処理部２０２における学習モデルの教師データとして、第２のアノテーションデータを作成する（Ｓ１１０）。第２のアノテーションデータは、陳列棚に陳列される可能性のある商品の画像に、その商品の商品識別情報をタグとして対応づけて作成する。この際の商品の画像情報は、第１のアノテーションデータに対応しているとよく、商品の輪郭を外形とした画像情報であるとよい。 Similarly, second annotation data is created as training data for the learning model in the second learning processing unit 202 (S110). The second annotation data is created by associating product identification information of the product as a tag with an image of the product that may be displayed on the display shelf. The image information of the product at this time preferably corresponds to the first annotation data, and is preferably image information whose outer shape is the outline of the product.

そして、作成した第１のアノテーションデータを教師データとして入力し、第１の学習処理部２０１において機械学習用の学習処理を実行し、陳列棚を撮影した画像情報から、商品の輪郭の領域を特定するための学習モデル（第１の学習モデル）を作成する（Ｓ１２０）。 Then, the created first annotation data is input as training data, the first learning processing unit 201 executes learning processing for machine learning, and identifies the contour area of the product from the image information obtained by photographing the display shelf. A learning model (first learning model) is created (S120).

また、作成した第２のアノテーションデータを教師データとして入力し、第２の学習処理部２０２において機械学習用の学習処理を実行し、画像情報、好ましくはフェイス領域からその領域にある商品の商品識別情報を同定するための学習モデル（第２の学習モデル）を作成する（Ｓ１３０）。 In addition, the created second annotation data is input as training data, and the second learning processing unit 202 executes learning processing for machine learning to identify the product in the area from the image information, preferably the face area. A learning model (second learning model) for identifying information is created (S130).

以上のような処理を実行することで、各学習モデルを作成することができる。 By performing the above processing, each learning model can be created.

つぎに、陳列棚を撮影した画像情報から、陳列棚に陳列されている商品の商品識別情報を同定するための認識処理を、図５のフローチャートを用いて説明する。 Next, a recognition process for identifying product identification information of products displayed on a display shelf from image information obtained by photographing the display shelf will be explained using the flowchart of FIG.

店舗の陳列棚が撮影された撮影画像情報は、画像情報入力端末３から入力され、管理端末２の画像情報入力受付処理部２１０でその入力を受け付ける（Ｓ２００）。図１７に、撮影画像情報の一例を示す。また、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報の入力を受け付ける。そして、画像情報入力受付処理部２１０は、入力を受け付けた撮影画像情報、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報を対応づけて画像情報記憶部２１１に記憶させる。 Photographed image information of a store display shelf is input from the image information input terminal 3, and the input is accepted by the image information input reception processing unit 210 of the management terminal 2 (S200). FIG. 17 shows an example of captured image information. It also accepts input of photographing date and time, store identification information, and image information identification information of photographed image information. Then, the image information input reception processing section 210 associates and stores the input photographed image information, photographing date and time, store identification information, and image information identification information of the photographed image information in the image information storage section 211.

管理端末２において所定の操作入力を受け付けると、正置画像情報正置化処理部２１２は、画像情報記憶部２１１に記憶する撮影画像情報を抽出し、台形補正処理などの正置化処理を行うための頂点である棚位置（陳列棚の位置）の４点の入力を受け付け、正置化処理を実行する（Ｓ２１０）。このようにして正置化処理が実行された撮影画像情報（正置画像情報）の一例が、図１８である。 When a predetermined operation input is received on the management terminal 2, the image information uprighting processing section 212 extracts the photographed image information stored in the image information storage section 211, receives input of the four points of the shelf position (the position of the display shelf) that serve as the vertices for uprighting processing such as trapezoidal correction processing, and executes the uprighting process (S210). FIG. 18 shows an example of photographed image information on which the uprighting process has been executed in this manner (upright image information).

そして、正置画像情報に対して、管理端末２において所定の操作入力を受け付けることで、棚段特定処理部２１３は、棚段位置領域を特定する（Ｓ２２０）。すなわち、正置画像情報における棚段領域の入力を受け付ける。図１４、図１５が、正置画像情報から棚段領域が特定された状態を示す図である。 Then, when a predetermined operation input on the upright image information is received on the management terminal 2, the shelf identification processing section 213 identifies the shelf area (S220). That is, input of the shelf area in the upright image information is received. FIGS. 14 and 15 show states in which the shelf area has been identified from the upright image information.

以上のようにして、棚段領域を特定すると、正置画像情報から棚段領域の画像情報を切り出す。そして、棚段領域画像情報における棚段ごとに、フェイスを特定する処理を実行する（Ｓ２３０）。すなわち、フェイス特定処理部２１４１は、第１の学習処理部２０１において学習させた学習モデル（第１の学習モデル）に、棚段領域の画像情報を入力値として入力し、入力した画像情報からフェイス領域を特定する。 Once the shelf area is identified as described above, image information of the shelf area is extracted from the normal position image information. Then, a process of identifying a face is executed for each shelf in the shelf area image information (S230). That is, the face identification processing unit 2141 inputs the image information of the shelf area as an input value to the learning model trained in the first learning processing unit 201 (first learning model), and determines the face from the input image information. Identify the area.

以上のように正置画像情報の棚段位置領域画像情報における各棚段の各フェイス領域を特定すると、商品同定処理部２１４２は、第２の学習処理部２０２において学習させた学習モデル（第２の学習モデル）に、フェイス領域の画像情報を入力値として入力し、フェイス領域に写っている商品の商品識別情報を同定する（Ｓ２４０）。同定した商品識別情報は、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報、正置画像情報の画像情報識別情報、フェイス識別情報に対応づけて商品識別情報記憶部２２に記憶させる。 When each face area of each shelf in the shelf area image information of the upright image information has been identified as described above, the product identification processing section 2142 inputs the image information of the face area as an input value to the learning model trained by the second learning processing section 202 (second learning model), and identifies the product identification information of the product shown in the face area (S240). The identified product identification information is stored in the product identification information storage section 22 in association with the photographing date and time, the store identification information, the image information identification information of the photographed image information, the image information identification information of the upright image information, and the face identification information.

なお、すべてのフェイス領域の商品識別情報を同定できるとは限らない。そこで、同定できないフェイス領域については、商品識別情報の入力を受け付け、入力を受け付けた商品識別情報を、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報、正置画像情報の画像情報識別情報、フェイス識別情報に対応づけて商品識別情報記憶部２２に記憶する。また、同定した商品識別情報の修正処理についても同様に、入力を受け付けてもよい。 Note that it is not always possible to identify product identification information in all face areas. Therefore, for face areas that cannot be identified, input of product identification information is accepted, and the input product identification information is converted into the photographing date and time, store identification information, image information identification information of the photographed image information, and image information identification information of the upright image information. The product identification information is stored in the product identification information storage unit 22 in association with the face identification information. Further, input may be similarly accepted for the correction process of the identified product identification information.
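The identification step S240 together with the manual fallback described here can be sketched as a simple loop. Everything named in the block is an assumption for illustration: `identify_product` stands in for inference with the second learning model and is assumed to return `None` when it cannot identify a product, `ask_operator` stands in for the manual input on the management terminal, and the face and product identifiers are made up.

```python
def recognize_faces(face_images, identify_product, ask_operator):
    """Identify the product in every face, asking the operator when the model cannot.

    face_images:      mapping of face ID -> face area image
    identify_product: callable(image) -> product ID, or None when not identified
    ask_operator:     callable(face ID) -> product ID entered manually
    """
    results = {}
    for face_id, face_image in face_images.items():
        product_id = identify_product(face_image)   # corresponds to S240
        if product_id is None:
            product_id = ask_operator(face_id)      # manual input for unidentified faces
        results[face_id] = product_id
    return results


# Stub model: a lookup table that only "knows" one image.
known = {"img-cola": "P-001"}
faces = {"F001": "img-cola", "F002": "img-unknown"}
results = recognize_faces(faces,
                          identify_product=known.get,
                          ask_operator=lambda face_id: "P-002")
```

The same `ask_operator` hook could also be used to accept corrections of an identification that the model got wrong.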

以上のような処理を行うことで、撮影画像情報に写っている陳列棚の棚段に陳列されている商品の商品識別情報を同定することができる。また従来のシステムのように、フェイス領域を矩形領域とせず、商品の輪郭の外形に沿って商品の同定を行うので、フェイス領域に含まれる不要な情報、たとえば他の商品などのノイズが除外されるので、認識精度が向上することとなる。 By performing the above processing, it is possible to identify the product identification information of the product displayed on the shelf of the display shelf shown in the photographed image information. In addition, unlike conventional systems, the face area is not a rectangular area, but the product is identified along the outline of the product, so unnecessary information contained in the face area, such as noise from other products, is excluded. Therefore, recognition accuracy improves.

なお、第１のアノテーションデータにおける属性として商品識別情報を用いている場合（商品の外形から商品が同定できる場合）には、画像情報を第１の学習モデルに入力してフェイス領域を特定することで、当該商品の商品識別情報を同定できる。その場合には、第２の学習処理部２０２、商品同定処理部２１４２による処理を実行せずともよく、フェイス特定処理部２１４１でフェイス領域を特定すると、そのフェイス領域に写っている商品の商品識別情報を、第１の学習モデルによる出力結果としての属性の商品識別情報で同定してもよい。 Note that if product identification information is used as an attribute in the first annotation data (if the product can be identified from the product's external shape), the image information may be input to the first learning model to identify the face area. You can identify the product identification information of the product. In that case, it is not necessary to execute the processing by the second learning processing unit 202 and the product identification processing unit 2142, and once the face area is identified by the face identification processing unit 2141, the product identified in the face area can be identified. The information may be identified using attribute product identification information as an output result of the first learning model.

実施例１では、フェイス領域の特定と、フェイス領域から商品識別情報の同定の２つの処理で機械学習を用いる構成を説明したが、フェイス領域から商品識別情報を同定する処理については、画像マッチング処理を用いてもよい。この場合の情報処理システム１の構成の一例を図１９に示す。 In Example 1, we explained a configuration that uses machine learning in the two processes of identifying the face area and identifying product identification information from the face area.However, regarding the process of identifying product identification information from the face area, image matching processing may also be used. FIG. 19 shows an example of the configuration of the information processing system 1 in this case.

本実施例における情報処理システム１では、学習処理部２０では第２の学習処理部２０２は設ける必要はない。また、認識処理部２１では、画像マッチング処理に用いる標本情報を記憶する標本情報記憶部を備える。 In the information processing system 1 in this embodiment, the second learning processing section 202 does not need to be provided in the learning processing section 20. The recognition processing unit 21 also includes a specimen information storage unit that stores specimen information used in image matching processing.

標本情報記憶部は、画像情報に写っている陳列棚の棚段に陳列されている商品がどの商品であるかを識別するための標本情報を記憶する。標本情報は，陳列棚に陳列される可能性のある商品を，上下，左右，斜めなど複数の角度から撮影をした画像情報である。図２０に標本情報記憶部に記憶される標本情報の一例を示す。図２０では，標本情報として，缶ビールをさまざまな角度から撮影をした場合を示しているが，缶ビールに限られない。標本情報記憶部は，標本情報と，商品識別情報とを対応付けて記憶する。 The specimen information storage unit stores specimen information for identifying which product is displayed on the shelf of the display shelf shown in the image information. Specimen information is image information of products that may be displayed on display shelves, taken from multiple angles such as top and bottom, left and right, and diagonally. FIG. 20 shows an example of specimen information stored in the specimen information storage section. Although FIG. 20 shows the case where canned beer is photographed from various angles as sample information, it is not limited to canned beer. The specimen information storage unit stores specimen information and product identification information in association with each other.

なお，標本情報記憶部には，標本情報とともに，または標本情報に代えて，標本情報から抽出された，類似性の算出に必要となる情報，たとえば画像特徴量とその位置のペアの情報を記憶していてもよい。標本情報には，類似性の算出に必要となる情報も含むとする。この場合，陳列商品認識処理部２１４は，後述するフェイス領域の画像情報と，標本情報とのマッチング処理を行う際に，標本情報について毎回，画像特徴量を算出せずともよくなり，計算時間を短縮することができる。 Note that the specimen information storage section may store, together with or in place of the specimen information, information extracted from the specimen information that is needed to calculate similarity, for example pairs of image feature amounts and their positions. The specimen information is taken to include such information needed to calculate similarity. In this case, when the displayed product recognition processing section 214 performs the matching process, described later, between the image information of a face area and the specimen information, it does not need to calculate the image feature amounts of the specimen information every time, which shortens the calculation time.

また標本情報記憶部に記憶する標本情報は，第１の学習処理部２０１の学習処理の際に用いた第１のアノテーションデータにおける商品の輪郭の外形をマスク処理した商品の画像情報を用いてもよい。すなわち、第１のアノテーションデータを作成する際に、商品を一または複数の方向から撮影した商品の画像情報若しくはその画像特徴量を標本情報とする。そして、当該撮影した商品の画像情報のうち、輪郭を外形として、その閉領域の内側をマスク処理するとともに、属性をタグ付けして第１のアノテーションデータを作成する。このような処理によって、標本情報と第１のアノテーションデータをまとめて作成することができる。 Furthermore, the sample information stored in the sample information storage unit may be image information of the product obtained by masking the outline of the product in the first annotation data used during the learning process of the first learning processing unit 201. good. That is, when creating the first annotation data, image information of the product taken from one or more directions or its image feature amount is used as sample information. Then, among the image information of the photographed product, the inside of the closed area is masked using the outline as the external shape, and the attributes are tagged to create first annotation data. Through such processing, specimen information and first annotation data can be created together.

本実施例における商品同定処理部２１４２は、フェイス特定処理部２１４１で特定したフェイス領域の画像情報と、標本情報記憶部に記憶する標本情報とのマッチング処理を実行し、そのフェイス領域に表示されている商品の商品識別情報を同定する。すなわち、ある棚段のフェイス領域（このフェイスの領域のフェイス識別情報をＸとする）における画像情報と、標本情報記憶部に記憶する各標本情報とから、それぞれの画像特徴量を算出し、特徴点のペアを求めることで、類似性を判定する。そして、もっとも類似性の高い標本情報を特定し、そのときの類似性があらかじめ定められた閾値以上であれば、その標本情報に対応する商品識別情報を標本情報記憶部に基づいて同定する。そして、同定した商品識別情報を、そのフェイス識別情報Ｘのフェイスに表示されている商品の商品識別情報とする。なお、いずれの標本情報とも類似ではないと判定したフェイスについては、そのフェイス識別情報について「空」であることを示す情報（商品がないことを示す情報）を付する。商品同定処理部２１４２は、同定した商品識別情報または「空」であることを示す情報を、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報、正置画像情報の画像情報識別情報、フェイス識別情報に対応づけて商品識別情報記憶部２２に記憶する。 The product identification processing section 2142 in this embodiment executes a matching process between the image information of the face area identified by the face identification processing section 2141 and the specimen information stored in the specimen information storage section, and identifies the product identification information of the product displayed in that face area. That is, image feature amounts are calculated from the image information in the face area of a certain shelf (let the face identification information of this face area be X) and from each piece of specimen information stored in the specimen information storage section, and similarity is determined by finding pairs of feature points. The specimen information with the highest similarity is then identified, and if that similarity is equal to or greater than a predetermined threshold, the product identification information corresponding to that specimen information is identified based on the specimen information storage section. The identified product identification information is taken as the product identification information of the product displayed in the face with face identification information X.
Note that for a face determined not to be similar to any specimen information, information indicating "empty" (information indicating that there is no product) is attached to its face identification information. The product identification processing section 2142 stores the identified product identification information, or the information indicating "empty", in the product identification information storage section 22 in association with the photographing date and time, the store identification information, the image information identification information of the photographed image information, the image information identification information of the upright image information, and the face identification information.
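The decision rule in this paragraph (take the most similar specimen, accept it only at or above a threshold, otherwise mark the face "empty") can be sketched as follows. The similarity function is passed in as a parameter because the specification describes it separately. The toy `shared_count` similarity, the `EMPTY` marker, the set-valued "features", and the product IDs are all assumptions made for the example.

```python
EMPTY = "EMPTY"  # illustrative marker for "no product in this face"


def identify_face(face_features, specimens, similarity, threshold):
    """Return the product ID of the most similar specimen, or EMPTY.

    specimens:  iterable of (product ID, specimen features)
    similarity: callable(face features, specimen features) -> number, higher is closer
    """
    best_product, best_score = EMPTY, None
    for product_id, specimen_features in specimens:
        score = similarity(face_features, specimen_features)
        if best_score is None or score > best_score:
            best_product, best_score = product_id, score
    if best_score is None or best_score < threshold:
        return EMPTY
    return best_product


def shared_count(a, b):
    """Toy similarity: number of feature labels the two sets have in common."""
    return len(set(a) & set(b))


specimens = [("P-001", {"a", "b", "c", "d"}), ("P-002", {"c", "d", "e"})]
match = identify_face({"a", "b", "c"}, specimens, shared_count, threshold=3)
no_match = identify_face({"x", "y"}, specimens, shared_count, threshold=3)
```

Here `match` resolves to the first specimen because it shares three features with the face, while `no_match` falls below the threshold for every specimen and is marked empty.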

商品同定処理部２１４２は、一例として、具体的には以下のような処理を実行する。まず、処理対象となるフェイス領域の座標で構成される画像情報と、標本情報記憶部に記憶する標本情報との類似性を判定し、その類似性がもっとも高い標本情報に対応する商品識別情報を特定し、特定した類似性があらかじめ定めた閾値以上であれば、上記座標で構成されるフェイス領域に表示されている商品の商品識別情報として同定をする。 As an example, the product identification processing unit 2142 specifically executes the following processing. First, the similarity between the image information consisting of the coordinates of the face area to be processed and the specimen information stored in the specimen information storage unit is determined, and the product identification information corresponding to the specimen information with the highest similarity is determined. If the identified similarity is greater than or equal to a predetermined threshold, the product is identified as product identification information of the product displayed in the face area formed by the coordinates.

ここでフェイスの画像情報と標本情報との類似性を判定するには、以下のような処理を行う。まず、商品同定処理部２１４２における商品識別情報の同定処理の前までの処理において、正置画像情報の棚段におけるフェイスの領域の画像情報と、標本情報との方向が同じ（横転や倒立していない）となっており、また、それぞれの画像情報の大きさが概略同じとなっている（所定範囲以上で画像情報の大きさが異なる場合には、類似性の判定の前にそれぞれの画像情報の大きさが所定範囲内となるようにサイズ合わせをしておく）。 To determine the similarity between the image information of a face and the specimen information, the following processing is performed. As a precondition, through the processing that precedes the identification of product identification information in the product identification processing section 2142, the image information of the face area on the shelf in the upright image information and the specimen information have the same orientation (neither is turned sideways or upside down), and their sizes are roughly the same (if the sizes differ by more than a predetermined range, they are adjusted to fall within the predetermined range before the similarity is determined).

To determine the similarity between the image information of the face area and the sample information, the product identification processing unit 2142 extracts feature points based on image feature amounts (for example, local feature amounts) from the image information of the face and from the sample information, respectively. It then detects the pairs of feature points, one from the face image information and one from the sample information, that are most similar to each other, and obtains the coordinate difference between the corresponding points of each pair. Next, it calculates the average of these differences, which represents the overall average displacement between the image information of the face area and the sample information. The coordinate difference of every feature point pair is then compared with the average coordinate difference, and pairs that deviate greatly are excluded. Finally, similarity is ranked by the number of corresponding points that remain.
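The pairing, averaging, and outlier-exclusion steps above can be sketched in plain Python. The representation of a feature as a `(coordinates, descriptor)` tuple, the Euclidean descriptor distance, and the `max_deviation` parameter are assumptions made for illustration; the specification does not fix a particular feature type or deviation criterion.

```python
from math import hypot
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Feature = Tuple[Point, Sequence[float]]  # (coordinates, descriptor); assumed layout


def _descriptor_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptors (an assumed choice)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def count_consistent_matches(
    face_feats: List[Feature],
    sample_feats: List[Feature],
    max_deviation: float,
) -> int:
    """Count matched pairs whose displacement stays close to the mean displacement."""
    if not face_feats or not sample_feats:
        return 0
    # 1. Pair each face feature with its most similar sample feature
    #    and record the coordinate difference of the pair.
    diffs = []
    for (fx, fy), fdesc in face_feats:
        (sx, sy), _ = min(
            sample_feats, key=lambda s: _descriptor_distance(fdesc, s[1])
        )
        diffs.append((sx - fx, sy - fy))
    # 2. The mean difference approximates the overall shift between the images.
    mean_dx = sum(d[0] for d in diffs) / len(diffs)
    mean_dy = sum(d[1] for d in diffs) / len(diffs)
    # 3. Exclude pairs that deviate strongly from the mean and count the rest.
    return sum(
        1 for dx, dy in diffs
        if hypot(dx - mean_dx, dy - mean_dy) <= max_deviation
    )
```

Candidate samples could then be ranked by this count, for example with `sorted(samples, key=lambda s: count_consistent_matches(face, s, 20.0), reverse=True)`.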

The similarity between the image information of the face area and the sample information can be calculated by the method described above. To further improve accuracy, the EMD (Earth Mover's Distance) between color histograms may additionally be obtained and used as a measure of similarity. This allows a similarity comparison that is relatively robust to environmental changes, such as changes in the brightness of the photographed image information, and enables highly accurate identification.
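As a simplified illustration of EMD between color histograms, the sketch below handles only the one-dimensional case, where bins are treated as linearly ordered with unit distance between neighbors. Under that assumption, the EMD of two histograms normalized to the same total mass equals the sum of the absolute differences of their cumulative distributions. The function names are hypothetical, and a real system would likely use multi-dimensional color histograms with a general EMD solver.

```python
from itertools import accumulate
from typing import List, Sequence


def _normalize(hist: Sequence[float]) -> List[float]:
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def emd_1d(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Earth Mover's Distance between two 1-D histograms with equal bin layout."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    a, b = _normalize(hist_a), _normalize(hist_b)
    # The running sum of per-bin differences is the mass that must still be
    # carried past each bin boundary; its absolute values add up to the EMD.
    carried = accumulate(x - y for x, y in zip(a, b))
    return sum(abs(c) for c in carried)
```

A smaller EMD means more similar histograms, so it would need to be inverted or negated before being used where a larger score means greater similarity.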

As another way of determining similarity, the EMD between the signatures (sets of image feature amounts and weights) of the image information of the face areas may be obtained and used as a measure of similarity. As the image feature amounts of a signature, for example, the frequency distribution of the face area's image information in the HSV color space is obtained and grouped by hue and saturation, and the number of features and their regions in the HSV color space serve as the image feature amounts. Grouping is performed on hue and saturation in order to reduce dependence on brightness (value), so that the result is not greatly affected by shooting conditions.
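One possible way to build such a hue-and-saturation signature is sketched below using the standard-library `colorsys` module. The bin counts, the function name `hs_signature`, and the representation of the signature as a dictionary from `(hue_bin, saturation_bin)` to weight are assumptions chosen for the example.

```python
import colorsys
from collections import Counter
from typing import Dict, Iterable, Tuple

RGB = Tuple[int, int, int]


def hs_signature(
    pixels: Iterable[RGB], hue_bins: int = 8, sat_bins: int = 4
) -> Dict[Tuple[int, int], float]:
    """Group pixels by hue and saturation only, ignoring value (brightness).

    Returns a mapping from (hue_bin, sat_bin) to the fraction of pixels in it.
    """
    counts: Counter = Counter()
    for r, g, b in pixels:
        h, s, _value = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that h == 1.0 or s == 1.0 falls into the last bin.
        hue_bin = min(int(h * hue_bins), hue_bins - 1)
        sat_bin = min(int(s * sat_bins), sat_bins - 1)
        counts[(hue_bin, sat_bin)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}
```

Because value is discarded, a brightly lit red pixel such as `(255, 0, 0)` and a dimmer one such as `(128, 0, 0)` land in the same bin, which illustrates the reduced dependence on shooting conditions.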

Further, to speed up processing, a similarity measure such as the L2 distance between image feature amounts, for example color correlograms or color histograms of the image information in an appropriate color space, may be used instead of signatures and EMD.
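For comparison with EMD, the L2 (Euclidean) distance between two histograms needs only one pass over the bins. The sketch below assumes both histograms use the same bin layout; the function name is hypothetical.

```python
from math import sqrt
from typing import Sequence


def l2_distance(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Euclidean distance between two histograms with identical bin layouts."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return sqrt(sum((a - b) ** 2 for a, b in zip(hist_a, hist_b)))
```

Unlike EMD, this measure compares bins independently and does not account for how far apart two bins are in color space, which is part of why it is cheaper to compute.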

The similarity determination is not limited to the methods described above. The identified product identification information is stored in the product identification information storage unit 22 in association with the photographing date and time information, the store information, the image information identification information of the photographed image information, the image identification information of the normalized image information, and the face identification information.

For a face whose product identification information could not be identified, information indicating that the face area is "empty" (that is, that no product is present, for example because the product is out of stock) is stored in the product identification information storage unit 22.
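The record stored for each face can be pictured as a small data structure. The class name, field names, and the use of `None` to represent "empty" are assumptions for illustration only; the specification states which items are associated with one another but not how they are laid out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class FaceRecognitionRecord:
    """One entry of the product identification information storage (assumed layout)."""

    shot_at: datetime          # photographing date and time
    store_id: str              # store identification information
    captured_image_id: str     # ID of the photographed image information
    normalized_image_id: str   # ID of the normalized image information
    face_id: str               # face identification information
    product_id: Optional[str]  # identified product, or None for "empty"

    @property
    def is_empty(self) -> bool:
        """True when no product could be identified for this face."""
        return self.product_id is None
```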

As described above, even when image matching processing is used to identify a product from the image information of a face area, the image matching processing can be executed accurately because the face area is not a rectangular area.

As a modification of Embodiments 1 and 2 described above, a shelf comparison processing unit 2143 that detects changes on a per-shelf basis may be provided, so that when a shelf has not changed, the previous recognition result is used as is. FIG. 21 shows an example of the displayed product recognition processing unit 214 in this case.

The shelf comparison processing unit 2143 compares the image information of a shelf area in the previous ((N-1)th) normalized image information with the image information of the same shelf area in the current (Nth) normalized image information and, if their similarity is high, determines that the product identification information or "empty" status of each face on that shelf is unchanged. As described above, this similarity determination may be based on the image feature amounts of the image information of the shelf area in the previous and current normalized image information, or may use the EMD between color histograms; it is not limited to these. In that case, instead of performing the per-face identification processing, the product identification processing unit 2142 is caused to store, in the product identification information storage unit 22, the product identification information of each face on that shelf in the Nth normalized image information as identical to that of each face on the same shelf in the (N-1)th normalized image information. This makes it possible to omit processing for shelves where almost no change occurs, such as shelves whose products seldom move or, conversely, shelves that are managed on an extremely short cycle.
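The shelf-level shortcut can be sketched as a thin wrapper around the per-face recognition. The function and parameter names are hypothetical, and the similarity function and threshold are left to the caller, since the specification does not fix them.

```python
from typing import Callable, List, Optional, TypeVar

Image = TypeVar("Image")
FaceResults = List[Optional[str]]  # one product ID (or None for "empty") per face


def recognize_shelf(
    prev_image: Optional[Image],
    curr_image: Image,
    prev_result: Optional[FaceResults],
    shelf_similarity: Callable[[Image, Image], float],
    threshold: float,
    recognize_faces: Callable[[Image], FaceResults],
) -> FaceResults:
    """Reuse the (N-1)th result when the shelf image has barely changed."""
    if (
        prev_image is not None
        and prev_result is not None
        and shelf_similarity(prev_image, curr_image) >= threshold
    ):
        # Shelf judged unchanged: copy the previous per-face results as is.
        return list(prev_result)
    # Otherwise fall back to the full per-face identification.
    return recognize_faces(curr_image)
```

The saving comes from skipping `recognize_faces`, which is assumed to be far more expensive than a single shelf-level comparison.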

The processes of Embodiments 1 to 3 described above may be combined as appropriate. The order of the processes is not limited to that described in this specification and may be changed as appropriate to the extent that the purpose is achieved. Although the processing in the displayed product recognition processing unit 214 has been described as being performed on normalized image information obtained by applying normalization processing to the photographed image information, it may instead be performed on the photographed image information itself. In that case, "normalized image information" should be read as "photographed image information".

Further, instead of having the recognition processing unit 21 specify a shelf area and then specify the face areas within it, the system may be configured to specify the face areas from the whole of the photographed image information, the normalized image information, or the image information of the shelf area, without specifying a shelf area. In that case, the shelf identification processing unit 213 need not be provided, or the system may be configured not to execute its processing.

Embodiments 1 to 4 above have been described using display shelves in convenience stores, supermarkets, and the like as examples, but the invention is not limited to these and can be applied to any category of goods as long as some kind of product is displayed on a display shelf. For example, it can be applied to medicines (products) displayed on the display shelves (medicine shelves) of a dispensing pharmacy. Likewise, it can be applied to products displayed on display shelves in a warehouse.

By using the information processing system 1 of the present invention, the accuracy of identifying displayed products from image information can be improved.

1: Information processing system
2: Management terminal
3: Image information input terminal
20: Learning processing unit
21: Recognition processing unit
22: Product identification information storage unit
70: Arithmetic device
71: Storage device
72: Display device
73: Input device
74: Communication device
201: First learning processing unit
202: Second learning processing unit
210: Image information input reception processing unit
211: Image information storage unit
212: Image information normalization processing unit
213: Shelf identification processing unit
214: Displayed product recognition processing unit
2141: Face identification processing unit
2142: Product identification processing unit
2143: Shelf comparison processing unit

Claims

An information processing system that identifies a product shown in image information, comprising:
a first learning processing unit that creates a first learning model by performing machine learning using first annotation data in which an attribute is associated with data obtained by masking the inside of a closed region whose outline is the external shape of the product; and
a recognition processing unit that identifies product identification information of a product shown in image information obtained by photographing a display shelf,
wherein the recognition processing unit uses the image information and the first learning model to specify the external shape of the product shown in the image information as a face area, and identifies the product identification information of the product shown in the specified face area.

The information processing system according to claim 1,
wherein the first learning processing unit creates the first learning model by performing machine learning based on image segmentation using the first annotation data.

The information processing system according to claim 1 or claim 2, further comprising:
a second learning processing unit that creates a second learning model by performing machine learning using second annotation data in which image data of a product is associated with product identification information,
wherein the recognition processing unit identifies the product identification information of the product shown in the face area using the image information of the specified face area and the second learning model.

The information processing system according to claim 1 or claim 2,
wherein the recognition processing unit identifies the product identification information of the product shown in the face area by performing image matching processing between the image information of the specified face area and sample information of products stored in a sample information storage unit.

An information processing system that identifies a product shown in image information, comprising:
a first learning processing unit that creates a first learning model by performing machine learning using first annotation data in which an attribute is associated with data obtained by masking the inside of a closed region whose outline is the external shape of the product,
wherein the first learning model and image information obtained by photographing a display shelf are used to specify the external shape of the product shown in the image information as a face area, and product identification information of the product shown in the specified face area is identified.

An information processing system that identifies a product shown in image information, comprising:
an image information input reception processing unit that accepts input of image information obtained by photographing a display shelf; and
a displayed product recognition processing unit that identifies product identification information of a product shown in the image information whose input was accepted or in image information obtained by normalizing that image information,
wherein the displayed product recognition processing unit uses a first learning model, created by machine learning using first annotation data in which an attribute is associated with data obtained by masking the inside of a closed region whose outline is the external shape of the product, together with the image information whose input was accepted or the image information obtained by normalizing that image information, to specify the external shape of the product shown as a face area, and identifies the product identification information of the product shown in the specified face area.

An information processing program that causes a computer to function as:
a first learning processing unit that creates a first learning model by performing machine learning using first annotation data in which an attribute is associated with data obtained by masking the inside of a closed region whose outline is the external shape of a product; and
a recognition processing unit that identifies product identification information of a product shown in image information obtained by photographing a display shelf,
wherein the recognition processing unit uses the image information and the first learning model to specify the external shape of the product shown in the image information as a face area, and identifies the product identification information of the product shown in the specified face area.

An information processing program that causes a computer to function as:
a first learning processing unit that creates a first learning model by performing machine learning using first annotation data in which an attribute is associated with data obtained by masking the inside of a closed region whose outline is the external shape of a product,
wherein the first learning model and image information obtained by photographing a display shelf are used to specify the external shape of the product shown in the image information as a face area, and product identification information of the product shown in the specified face area is identified.

An information processing program that causes a computer to function as:
an image information input reception processing unit that accepts input of image information obtained by photographing a display shelf; and
a displayed product recognition processing unit that identifies product identification information of a product shown in the image information whose input was accepted or in image information obtained by normalizing that image information,
wherein the displayed product recognition processing unit uses a first learning model, created by machine learning using first annotation data in which an attribute is associated with data obtained by masking the inside of a closed region whose outline is the external shape of the product, together with the image information whose input was accepted or the image information obtained by normalizing that image information, to specify the external shape of the product shown as a face area, and identifies the product identification information of the product shown in the specified face area.