JP6966749B2 - Image processing system

Info

Publication number: JP6966749B2
Application number: JP2019100629A
Authority: JP
Inventors: 成吉谷井
Original assignee: Marketvision Co Ltd
Current assignee: Marketvision Co Ltd
Priority date: 2019-05-29
Filing date: 2019-05-29
Publication date: 2021-11-17
Anticipated expiration: 2039-05-29
Also published as: JP2020194443A

Description

The present invention relates to an image processing system.

It is sometimes necessary to determine whether a target object appears in given image information. One example is determining whether a product was on display by determining whether that product appears in image information obtained by photographing a product display shelf.

When determining whether an object appears in image information in this way, it is common to perform image matching between the image information to be processed and image information of the object. For example, Patent Document 1 below discloses a system that identifies the products handled by a vending machine by applying image recognition technology, based on registered images of each product, to image information obtained by photographing the vending machine.

Patent Document 1: Japanese Unexamined Patent Publication No. 2014-191423

Specifically, Patent Document 1 photographs a vending machine from multiple directions, aligns the positional relationships of the captured images, and then superimposes them to generate a composite image. It then identifies the products handled by the vending machine by collating the composite image against registered images representing products that may be displayed in the vending machine.

Collation (matching) between the composite image and the registered images representing products is generally performed using the feature values of each. However, as the number of products to be compared grows, the number of registered images grows with it, and the accuracy of the collation (matching) processing cannot be improved. There is also the problem that the computation time becomes enormous.

In view of the above problems, the present inventor has invented an image processing system that improves the accuracy of matching processing, without requiring more computation time than before, when identifying an object such as a product appearing in image information obtained by photographing a display shelf or the like.

The first invention is an image processing system that identifies a product appearing in image information obtained by photographing a display shelf, the image processing system comprising: a candidate processing unit that determines the similarity of block layouts, using layout information of blocks, which are areas satisfying a predetermined condition in a face area of the image information, and layout information of the blocks of a sample product, and thereby identifies products to be candidates for comparison processing; and a comparison processing unit that identifies the product appearing in the face area by performing comparison processing using the face area of the image information and the sample information of the products identified as candidates by the candidate processing unit, wherein the candidate processing unit calculates a predetermined evaluation value using the area of the common part and/or the difference between the layout information of the blocks in the face area of the image information and the layout information of the blocks of the sample product, and identifies the sample product as a candidate when the calculated evaluation value is equal to or greater than a threshold.

With the configuration of the present invention, the number of sample objects to be compared in the comparison processing (matching processing) can be reduced, so the computation time can be reduced compared with conventional approaches.

To narrow down the objects that are candidates for comparison processing, it is preferable to execute processing as in the present invention. Moreover, because the layout information is coordinate information rather than image information, high speed can be achieved even with this additional processing.
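As an illustrative sketch only, the candidate narrowing described above can be expressed with block layouts held as rectangle coordinates. The text specifies only "a predetermined evaluation value" based on the common and/or differing area; the Jaccard-style ratio, the `(left, top, right, bottom)` rectangle format, the function names, and the threshold of 0.5 below are all assumptions made for the example. It also assumes that blocks within one layout do not overlap each other.

```python
from typing import Dict, List, Tuple

# Assumed layout format: one (left, top, right, bottom) rectangle per block.
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle; degenerate rectangles count as zero."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def intersection_area(a: Rect, b: Rect) -> float:
    """Area of the common part of two rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def layout_score(face_blocks: List[Rect], sample_blocks: List[Rect]) -> float:
    """Hypothetical evaluation value: common area divided by combined area.

    Assumes blocks inside a single layout do not overlap one another.
    """
    common = sum(intersection_area(f, s) for f in face_blocks for s in sample_blocks)
    total = sum(map(area, face_blocks)) + sum(map(area, sample_blocks))
    union = total - common
    return common / union if union > 0 else 0.0


def select_candidates(face_blocks: List[Rect],
                      samples: Dict[str, List[Rect]],
                      threshold: float = 0.5) -> List[str]:
    """Keep only the sample products whose score reaches the threshold."""
    return [name for name, blocks in samples.items()
            if layout_score(face_blocks, blocks) >= threshold]


face = [(0, 0, 10, 4), (0, 6, 10, 10)]
samples = {
    "product_a": [(0, 0, 10, 4), (0, 6, 10, 10)],  # same layout as the face area
    "product_b": [(0, 4, 10, 6)],                  # blocks in a different place
}
candidates = select_candidates(face, samples)
```

Because only a handful of coordinates are compared per sample, this step is cheap relative to image matching, which is the point made in the preceding paragraph.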

In the above invention, the image processing system can be configured such that the candidate processing unit identifies blocks satisfying a predetermined condition in the face area of the image information, and determines layout similarity using the layout information of the blocks within a predetermined range of the identified face area and the layout information of the blocks in a region of the sample product's image information whose size corresponds to the predetermined range of the face area.

The surface of an object is not necessarily flat; it may be curved, as with a cylinder. When the surface is curved, distortion in the image information of the photographed object increases with horizontal distance from the center. Therefore, by cutting out the vicinity of the center of the first region, and performing layout matching on the image information of the sample object at a size corresponding to that cut-out central portion, the influence of distortion caused by the curved surface can be reduced.

In the above invention, the image processing system can be configured such that the candidate processing unit performs the layout similarity determination while shifting, within the image information of the sample product, the region whose size corresponds to the predetermined range of the face area, and the comparison processing unit identifies the product appearing in the face area by performing comparison processing using the face area of the image information and the sample information of the region, within the image information of the product, that the candidate processing unit determined to have the highest similarity.

When the vicinity of the center is cut out, the accuracy of the comparison processing can be improved by performing the determination while shifting the region and then performing comparison processing using the sample information of the region determined to have the highest similarity.
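The shifting described above can be sketched as a sliding window over the sample's block layout. This is a minimal illustration under stated assumptions: rectangles are `(left, top, right, bottom)` in integer pixels, the window slides only horizontally, and the overlap ratio used as the similarity measure is a stand-in for the unspecified evaluation value. All function names are hypothetical.

```python
def overlap_score(blocks_a, blocks_b):
    """Assumed similarity: common area divided by combined area of two layouts."""
    def area(r):
        return max(0, r[2] - r[0]) * max(0, r[3] - r[1])

    def inter(a, b):
        w = min(a[2], b[2]) - max(a[0], b[0])
        h = min(a[3], b[3]) - max(a[1], b[1])
        return max(0, w) * max(0, h)

    common = sum(inter(a, b) for a in blocks_a for b in blocks_b)
    union = sum(map(area, blocks_a)) + sum(map(area, blocks_b)) - common
    return common / union if union > 0 else 0.0


def clip_to_window(blocks, x0, width):
    """Cut blocks to the window [x0, x0 + width) and shift them to window coordinates."""
    clipped = []
    for left, top, right, bottom in blocks:
        cl, cr = max(left, x0), min(right, x0 + width)
        if cr > cl:
            clipped.append((cl - x0, top, cr - x0, bottom))
    return clipped


def best_window(face_blocks, sample_blocks, sample_width, window_width, step=1):
    """Slide the window across the sample layout and return (best score, offset)."""
    best_score, best_offset = -1.0, 0
    for x0 in range(0, sample_width - window_width + 1, step):
        score = overlap_score(face_blocks, clip_to_window(sample_blocks, x0, window_width))
        if score > best_score:
            best_score, best_offset = score, x0
    return best_score, best_offset


# Central 10-pixel-wide strip of a face area, matched against a 30-pixel-wide sample.
face_centre = [(2, 2, 8, 6)]
sample = [(12, 2, 18, 6)]
score, offset = best_window(face_centre, sample, sample_width=30, window_width=10)
```

The offset returned here identifies the sample region that the later comparison processing would use.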

In the above invention, the image processing system can be configured such that the comparison processing unit associates the blocks of the image information with the blocks of the products identified as candidates by the candidate processing unit, and executes the comparison processing by determining the similarity of image information and/or character information using the associated blocks.

In the comparison processing, blocks may be compared with each other as in the present invention.
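A block-to-block comparison of character information might look like the following sketch. It is not the disclosed method: it assumes each block already carries recognized text (for example from an OCR step outside this sketch), associates blocks by largest overlapping area, and uses Python's `difflib.SequenceMatcher` as a stand-in string similarity. The dictionary keys `rect` and `text` and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def overlap(a, b):
    """Overlapping area of two (left, top, right, bottom) rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def associate(face_blocks, sample_blocks):
    """Pair each face-area block with the sample block it overlaps most."""
    pairs = []
    for f in face_blocks:
        best = max(sample_blocks, key=lambda s: overlap(f["rect"], s["rect"]), default=None)
        if best is not None and overlap(f["rect"], best["rect"]) > 0:
            pairs.append((f, best))
    return pairs


def text_similarity(face_blocks, sample_blocks):
    """Mean string similarity over associated blocks; 0.0 if nothing is associated."""
    pairs = associate(face_blocks, sample_blocks)
    if not pairs:
        return 0.0
    ratios = [SequenceMatcher(None, f["text"], s["text"]).ratio() for f, s in pairs]
    return sum(ratios) / len(ratios)


face = [{"rect": (0, 0, 10, 4), "text": "GREEN TEA"}]
candidate_a = [{"rect": (0, 0, 10, 4), "text": "GREEN TEA"}]
candidate_b = [{"rect": (0, 0, 10, 4), "text": "COLA"}]
score_a = text_similarity(face, candidate_a)
score_b = text_similarity(face, candidate_b)
```

The candidate with the higher score would be taken as the product appearing in the face area; an image-based similarity could be combined in the same per-block manner.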

The fifth invention is an image processing system that identifies a product appearing in image information obtained by photographing a display shelf, the image processing system comprising: an image information processing unit that generates plane-developed image information by developing onto a plane part or all of a three-dimensional model generated using image information obtained by photographing the display shelf and the corresponding depth information, and takes part or all of the generated plane-developed image information as the image information to be processed; a candidate processing unit that determines the similarity of block layouts, using layout information of blocks, which are areas satisfying a predetermined condition in a face area of the image information to be processed, and layout information of the blocks of a sample object, and thereby identifies products to be candidates for comparison processing; and a comparison processing unit that identifies the product appearing in the face area by performing comparison processing using the face area of the image information to be processed and the sample information of the products identified as candidates by the candidate processing unit.

The image information to be processed may be image information obtained by photographing a display shelf or the like and handled as two-dimensional, or it may be image information obtained by first generating a three-dimensional model and then developing it onto a plane. The latter allows the processing to be executed with higher accuracy.

The first invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an image processing program that causes a computer to function as: a candidate processing unit that determines the similarity of block layouts, using layout information of blocks, which are areas satisfying a predetermined condition in a face area of image information obtained by photographing a display shelf, and layout information of the blocks of a sample product, and thereby identifies products to be candidates for comparison processing; and a comparison processing unit that identifies the product appearing in the face area by performing comparison processing using the face area of the image information and the sample information of the products identified as candidates by the candidate processing unit, wherein the candidate processing unit calculates a predetermined evaluation value using the area of the common part and/or the difference between the layout information of the blocks in the face area of the image information to be processed and the layout information of the blocks of the sample product, and identifies the sample product as a candidate when the calculated evaluation value is equal to or greater than a threshold.

The fifth invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an image processing program that causes a computer to function as: an image information processing unit that generates plane-developed image information by developing onto a plane part or all of a three-dimensional model generated using image information obtained by photographing a display shelf and the corresponding depth information, and takes part or all of the generated plane-developed image information as the image information to be processed; a candidate processing unit that determines the similarity of block layouts, using layout information of blocks, which are areas satisfying a predetermined condition in a face area of the image information to be processed, and layout information of the blocks of a sample object, and thereby identifies products to be candidates for comparison processing; and a comparison processing unit that identifies the product appearing in the face area by performing comparison processing using the face area of the image information to be processed and the sample information of the products identified as candidates by the candidate processing unit.

By using the image processing system of the present invention, the accuracy of matching processing can be improved, without requiring more computation time than before, when identifying an object such as a product appearing in image information obtained by photographing a display shelf or the like.

FIG. 1 is a block diagram schematically showing an example of the overall processing functions of the image processing system of the present invention.
FIG. 2 is a block diagram schematically showing an example of the hardware configuration of a computer used in the image processing system of the present invention.
FIG. 3 is a flowchart showing an example of the overall processing in the image processing system of the present invention.
FIG. 4 is a flowchart showing an example of the block identification processing in the image processing system of the present invention.
FIG. 5 is a flowchart showing an example of the candidate identification processing in the image processing system of the present invention.
FIG. 6 is a diagram showing an example of captured image information.
FIG. 7 is a diagram showing an example of image information obtained by executing rectification processing on the captured image information of FIG. 6.
FIG. 8 is a diagram showing an example of the image information of a face area.
FIG. 9 is a diagram showing an example of a state in which blocks have been identified in the image information of a face area.
FIG. 10 is a diagram schematically showing the layout information of the blocks identified in the image information of a face area, and how that information is interpreted on the image information.
FIG. 11 is a diagram schematically showing image information corresponding to sample information, the layout information of the blocks in that image information, and how that information is interpreted on the image information.
FIG. 12 is a diagram schematically showing an example of layout matching processing.
FIG. 13 is a diagram schematically showing the combinations of locations that have a corresponding block area and locations that do not.
FIG. 14 is a diagram showing an example relating to FIG. 13.
FIG. 15 is a diagram schematically showing another example of layout matching processing.
FIG. 16 is a diagram showing an example of tag information indicating block types.
FIG. 17 is a diagram showing an example of the correspondence between blocks.
FIG. 18 is a diagram schematically showing the processing in Example 3.
FIG. 19 is a diagram schematically showing an example of blocks, tag information types, and a narrowing-down table when appearance frequency is used as the overall index value.
FIG. 20 is a diagram schematically showing the correspondence between captured image information and a depth map based on depth information.
FIG. 21 is a diagram showing an example of three-dimensional modeling processing based on captured image information and a depth map.
FIG. 22 is a diagram schematically showing an example of plane development processing.
FIG. 23 is a diagram schematically showing the stitching processing executed after three-dimensional modeling.
FIG. 24 is a diagram schematically showing the stitching processing executed after three-dimensional modeling.
FIG. 25 is a diagram schematically showing the processing that identifies a reference in the viewpoint direction determination processing.
FIG. 26 is a diagram schematically showing an example of the processing that labels each surface.
FIG. 27 is an illustration of identifying a face when the package type is a can.
FIG. 28 is an illustration of identifying a face when the package type is a bottle.
FIG. 29 is an illustration of identifying a face when the package type is a box.
FIG. 30 is an illustration of identifying a face when the package type is a hanging product.

FIG. 1 shows a block diagram of an example of the overall processing functions of the image processing system 1 of the present invention. The image processing system 1 uses a management terminal 2 and an input terminal 3.

The management terminal 2 is a computer that implements the core processing functions of the image processing system 1. The input terminal 3 is a terminal that acquires image information obtained by photographing a store display shelf or the like. It may also photograph and acquire objects, such as products, to be used as samples in the comparison processing (matching processing) described later.

The management terminal 2 and the input terminal 3 of the image processing system 1 are implemented using computers. FIG. 2 schematically shows an example of the hardware configuration of such a computer. The computer has an arithmetic unit 70 such as a CPU that executes the arithmetic processing of programs, a storage device 71 such as RAM or a hard disk that stores information, a display device 72 such as a display that shows information, an input device 73 such as a keyboard or mouse through which information can be input, and a communication device 74 that transmits and receives the processing results of the arithmetic unit 70 and the information stored in the storage device 71 via a network such as the Internet or a LAN.

When the computer has a touch panel display, the display device 72 and the input device 73 may be integrated. Touch panel displays are often used in portable communication terminals such as tablet computers and smartphones, but are not limited to these.

A touch panel display integrates the functions of the display device 72 and the input device 73 in that input can be made directly on the display with a predetermined input device (such as a touch panel pen) or a finger.

In addition to the devices above, the input terminal 3 may include an imaging device such as a camera. A portable communication terminal such as a mobile phone, smartphone, or tablet computer can also be used as the input terminal 3. The input terminal 3 uses the imaging device to capture image information in visible light or the like (the captured image information or object image information described later).

The means of the present invention are distinguished only logically by function, and may physically or in practice occupy the same area. The order of the processing in each means may be changed as appropriate, and part of the processing may be omitted. For example, the processing that determines the viewpoint direction, described later, may be omitted; in that case, processing is executed on image information for which the viewpoint direction has not been determined. Some or all of the functions of the management terminal 2 may also be executed on the input terminal 3.

The image processing system 1 has an image information input reception processing unit 20, an image information storage unit 21, an image information processing unit 22, a candidate processing unit 23, a comparison processing unit 24, a sample information processing unit 25, and a sample information storage unit 26.

The image information input reception processing unit 20 accepts input of image information captured by the input terminal 3 (captured image information) and stores it in the image information storage unit 21 described later. For example, it accepts captured image information of a store display shelf and stores it in the image information storage unit 21. Along with the captured image information, it is preferable to accept from the input terminal 3 the capture date and time, information indicating what was photographed (for example, store identification information such as the store name), image information identification information that identifies the image information, and the like. FIG. 6 shows an example of captured image information. FIG. 6 schematically shows captured image information of a display shelf with three shelf tiers on which products are displayed.

The image information storage unit 21 stores the captured image information received from the input terminal 3 in association with the capture date and time, the image information identification information, and the like.

For the captured image information received by the image information input reception processing unit 20, the image information processing unit 22 executes rectification processing, which brings the captured image information into a front-facing state; face processing, which identifies from the captured image information the areas (face areas) on which comparison processing (matching processing) against the sample information is executed; and block identification processing, which identifies the feature areas (blocks) that satisfy a predetermined condition within a face area.

When a subject is simply photographed, it is generally difficult to capture it squarely from the front. Rectification processing corrects the image to a front-facing state: it transforms the image information so that it matches what would be captured from sufficiently far away with the optical axis of the imaging device's lens along the normal to the plane being photographed. One example of such correction is trapezoidal (keystone) correction. If the image information is distorted, distortion correction may be added. FIG. 7 shows the image information after rectification processing has been executed on the captured image information of FIG. 6.
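Trapezoidal correction is commonly implemented as a projective transform (homography) determined from four corner correspondences. The sketch below is one such formulation in plain Python, offered as an assumption about how the correction could be done rather than as the disclosed procedure; the corner coordinates and the function names are hypothetical, and a practical system would also resample the pixels, which is omitted here.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Nine coefficients (row-major 3x3, last fixed to 1) mapping 4 src points to 4 dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]


def warp_point(h, x, y):
    """Apply the projective transform to one point."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# A shelf photographed from below appears as a trapezoid (narrower at the top).
trapezoid = [(10, 0), (90, 0), (100, 100), (0, 100)]
rectangle = [(0, 0), (100, 0), (100, 100), (0, 100)]
h = homography(trapezoid, rectangle)
corrected = [warp_point(h, x, y) for x, y in trapezoid]
```

After the transform, the four trapezoid corners land on the rectangle corners (up to floating-point error), which is the front-facing state the paragraph above describes.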

If the captured image information was already captured from a front-facing position, or if it has no distortion, the rectification processing and the distortion correction processing need not be executed.

Captured image information means any image information to be processed by the present invention. It includes image information after correction processing, such as rectification processing, has been executed on it.

When a subject has been photographed in multiple shots, the image information processing unit 22 may combine them into a single piece of image information and execute the rectification processing, face processing, and block identification processing on the combined image information. A known technique can be used to combine multiple pieces of image information into one.

Face processing identifies, in the captured image information, the areas (face areas) on which comparison processing against the sample information described later is executed. For example, when the captured image information is of a product display shelf and the sample information relates to products, the areas of the products displayed on the shelf, or the areas of the product labels, are identified as face areas. The face area can be set as appropriate for the product: for a PET bottle beverage, the product's label area may be the face area, and for a boxed product (for example, confectionery), the whole package may be the face area. For a PET bottle beverage, the area of the whole product may also be used as the face area.

There are various ways to identify a face area, and the method can be chosen freely to suit the characteristics of the sample information. When the sample information relates to a product (for example, a PET bottle beverage) and the product's label area is to be identified as the face area from captured image information of a display shelf, the product areas are identified, within the area between one shelf tier and the next (the shelf tier area), by, for example, identifying the thin, narrow vertical shadows that appear between adjacent products, identifying repeating patterns in the image, identifying steps along the upper edges of packages, or identifying separator positions from constraints such as uniform product width. A predetermined rectangular area within each product area is then identified as the label area, and that area is identified as the face area.
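One of the cues listed above, the thin vertical shadows between adjacent products, can be sketched as a search for narrow runs of dark columns in a shelf tier area. This is a simplified illustration: it assumes a grayscale image given as a list of rows of 0 to 255 values, and the darkness threshold, maximum shadow width, and function names are hypothetical choices.

```python
def find_separators(gray, dark_threshold=50, max_width=3):
    """Return the centre column of each narrow dark vertical run (assumed shadow)."""
    height, width = len(gray), len(gray[0])
    col_mean = [sum(row[x] for row in gray) / height for x in range(width)]
    separators = []
    x = 0
    while x < width:
        if col_mean[x] < dark_threshold:
            start = x
            while x < width and col_mean[x] < dark_threshold:
                x += 1
            # Wide dark runs are more likely dark packaging than a gap, so skip them.
            if x - start <= max_width:
                separators.append((start + x - 1) // 2)
        else:
            x += 1
    return separators


def product_spans(width, separators):
    """Split [0, width) into half-open column spans between separators."""
    edges = [-1] + separators + [width]
    return [(a + 1, b) for a, b in zip(edges, edges[1:]) if b > a + 1]


# Synthetic 4 x 11 shelf tier: bright products with dark gaps at columns 3 and 7.
row = [200] * 11
row[3] = row[7] = 10
tier = [row[:] for _ in range(4)]
gaps = find_separators(tier)
spans = product_spans(len(tier[0]), gaps)
```

Each resulting span is a candidate product area, from which a predetermined rectangle would then be taken as the label (face) area.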

Any method may be adopted for identifying the face area depending on the product category and product form, and the method is not limited to the above. Corrections entered by an operator may be accepted for an automatically identified face area. Alternatively, the position of the face area may be accepted as input from an operator.

The image information processing unit 22 may also identify face areas using deep learning. In this case, the captured image information is input to a learning model in which the weighting coefficients between the neurons of each layer of a neural network with many intermediate layers have been optimized, and the face areas are identified from its output values. As the learning model, one built from a variety of captured image information to which face areas have been given as ground-truth data can be used.

画像情報処理部２２は，以上のように特定したフェイス領域を切り出す。フェイス領域を切り出すとは，撮影画像情報から特定したフェイス領域を実際に切り出してもよいし，後述の処理の処理対象としてその領域を設定することも含まれる。 The image information processing unit 22 cuts out the face region specified as described above. Cutting out the face area may actually cut out the face area specified from the captured image information, and also includes setting the area as the processing target of the processing described later.

フェイス領域を切り出す場合には，上述の各処理で特定したフェイス領域をそのまま切り出してもよいし，複数の方法によりフェイス領域の特定を行い，各方法で特定したフェイス領域の結果を用いて切り出す対象とするフェイス領域を特定してもよい。たとえば，陳列棚の棚段と棚段の間の領域（棚段領域）における商品と商品との間に生じる縦の細く狭い陰影を特定することで商品の領域を特定し，その領域から所定の矩形領域を特定する方法と，深層学習によりフェイス領域を特定する方法とを行い，それらの各方法で特定したフェイス領域の結果同士について，あらかじめ定めた演算によって，最終的に切り出す対象とするフェイス領域を特定してもよい。切り出すフェイス領域の特定方法は，上記に限定するものではなく，任意に設定することができる。 When cutting out a face area, the face area specified by any of the processes described above may be cut out as it is, or the face area may be specified by a plurality of methods and the face area to be cut out may be determined from the results of those methods. For example, one method specifies the product area by identifying the thin, narrow vertical shadows that appear between adjacent products in the area between shelf boards (the shelf area) and then specifies a predetermined rectangular area within it, while another method specifies the face area by deep learning; the face area that is finally cut out may then be determined by applying a predetermined calculation to the face areas obtained by the two methods. The method for determining the face area to be cut out is not limited to the above and can be set freely.

図８に，撮影画像情報からフェイス領域として，商品（ペットボトル飲料のラベル部分）を切り出した場合の一例を示す。 FIG. 8 shows an example of a case where a product (label portion of a PET bottle beverage) is cut out as a face area from the photographed image information.

画像情報処理部２２がフェイス領域を特定すると，画像情報処理部２２は，そのフェイス領域のうち，商品を特定し得る情報がある矩形領域をブロックとして特定する。ブロックとして特定する領域は，フェイス領域の画像情報のうち，その内部に集中して共通の特徴を有する領域であり，フェイス領域の画像情報を特徴付ける特徴的な領域である。ブロックとして特定する領域には，商品名や商品のロゴなどが表示された領域となる。 When the image information processing unit 22 specifies the face area, the image information processing unit 22 specifies a rectangular area having information that can identify the product as a block in the face area. The area specified as a block is an area of the image information of the face area that is concentrated inside and has common features, and is a characteristic area that characterizes the image information of the face area. The area specified as a block is the area where the product name, product logo, etc. are displayed.

画像情報処理部２２がフェイス領域からブロックを特定するための方法としては，たとえば３つの方法があるが，それに限定するものではないし，それらを複数組み合わせてもよい。 The image information processing unit 22 has, for example, three methods for specifying a block from the face area, but the method is not limited to the three methods, and a plurality of methods may be combined.

第１の方法としては，対象物のブロックに共通する画像的な特徴を抽出して判定する方法である。すなわち，フェイス領域の画像情報内で，互いに明確に弁別可能な限定された個数の色で構成され，かつ色と色の境界が明確である（エッジが明瞭）という特徴を有する矩形領域を，ブロックとして特定する方法である。 The first method is a method of extracting and determining image features common to blocks of an object. That is, in the image information of the face region, a rectangular region having a feature of being composed of a limited number of colors that can be clearly distinguished from each other and having a clear color-to-color boundary (clear edges) is blocked. It is a method to specify as.

まずフェイス領域の画像情報を構成する各画素の色について，所定の色数のパレット色に色数を減色して検出する。またフェイス領域の画像情報を，所定の大きさのメッシュに分割し，パレット色ごとの分布マップを生成する。そして各パレットで，一定の密度以上で分布しており，かつ縦横の大きさが所定の閾値内に収まる局所的なグループを検出することで，矩形領域を特定する。 First, the color of each pixel constituting the image information of the face area is reduced to a palette of a predetermined number of colors. The image information of the face area is also divided into meshes of a predetermined size, and a distribution map is generated for each palette color. Then, for each palette color, a rectangular area is specified by detecting a local group that is distributed at or above a certain density and whose vertical and horizontal sizes fall within predetermined thresholds.

そして各グループについて，全パレット色の分布マップを参照し，その矩形領域に含まれるパレットの数が所定数，たとえば３以内であるかを判定し，所定数以内である場合には，そのグループ領域内にエッジが所定の閾値以上あるかを判定する。エッジの判定については，フェイス領域の画像情報に対して，エッジ検出処理を行っておき，それに基づいて閾値以上あるかを判定できる。そして，これらの条件を充足するグループを囲む矩形領域をブロックとして特定をする。 Then, for each group, the distribution maps of all palette colors are referred to, and it is determined whether the number of palette colors contained in its rectangular area is within a predetermined number, for example 3. If it is, it is then determined whether the amount of edges within the group area is at or above a predetermined threshold. For this determination, edge detection is performed beforehand on the image information of the face area, and the result is used to judge whether the threshold is reached. The rectangular area enclosing a group that satisfies these conditions is specified as a block.

以上のような方法によって，フェイス領域からブロックを特定することができる。 By the above method, the block can be specified from the face area.

フェイス領域からブロックを特定するための第２の方法としては，フェイス領域の画像情報から局所特徴量等の特徴量を抽出する。そして，広い範囲（所定以上の広さの範囲）にわたって共通の特徴が分布している領域を検出し，その領域をフェイス領域の「地」の部分として消し込む。そして残った領域について，互いに共通の特徴量が分布している領域を特定し，それらを囲む矩形領域をブロックとして特定をする。 As a second method for specifying a block from the face area, a feature amount such as a local feature amount is extracted from the image information of the face area. Then, a region in which common features are distributed over a wide range (a range of a predetermined size or more) is detected, and that region is erased as a "ground" part of the face region. Then, for the remaining areas, the areas where the features common to each other are distributed are specified, and the rectangular area surrounding them is specified as a block.

フェイス領域からブロックを特定するための第３の方法としては，機械学習あるいは深層学習（ディープラーニング）などのＡＩ技術によってフェイス領域の画像情報からブロックを検出することもできる。この場合，中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデルに対して，フェイス領域の画像情報を入力し，その出力値に基づいて，ブロックを特定してもよい。また学習モデルとしては，さまざまなフェイス領域の画像情報にブロックを正解データとして与えたものを用いることができる。なお，撮影画像情報からフェイス領域の特定とブロックの特定とを同時に行っても良い。 As a third method for identifying a block from the face region, the block can be detected from the image information of the face region by AI technology such as machine learning or deep learning. In this case, the image information of the face region is input to the learning model in which the weighting coefficient between the neurons of each layer of the neural network consisting of many layers is optimized, and the block is created based on the output value. It may be specified. In addition, as a learning model, it is possible to use image information of various face regions in which blocks are given as correct answer data. The face area and the block may be specified at the same time from the captured image information.

なお，第３の方法における機械学習あるいは深層学習などのＡＩ技術を用いるほかの例として，次のようなものがある。あらかじめ，上述の学習モデルとして与える正解データ（さまざまなフェイス領域の画像情報にブロックを与えたもの）におけるブロック領域内の特徴量（この特徴量は色，局所特徴量など複数でも良い）と，ブロック領域外の特徴量とを抽出しておき，それぞれを正例，負例として，ＳＶＭ（サポートベクターマシン support vector machine）などの判定モジュール（判定機）を構成する。そしてこのＳＶＭなどの判定モジュールに対して，フェイス領域の画像情報の各点，たとえばフェイス領域の画像情報を所定の大きさのメッシュに分割してそのメッシュ内の点を入力し，各点についてブロックの領域に属しているか否かを判定させる。この判定結果において，ブロックの領域に属していると判定したメッシュの領域を囲む領域をブロックとして特定をする。 Another example of using AI technology such as machine learning or deep learning in the third method is as follows. From the correct answer data given for the learning model described above (image information of various face areas annotated with blocks), feature amounts inside the block areas (these may be of several kinds, such as color and local feature amounts) and feature amounts outside the block areas are extracted in advance, and a determination module (classifier) such as an SVM (support vector machine) is built using them as positive and negative examples, respectively. Each point of the image information of the face area, for example a point in each mesh obtained by dividing the image information of the face area into meshes of a predetermined size, is then input to the determination module, which judges for each point whether it belongs to a block area. The area enclosing the meshes judged to belong to a block area is specified as a block.

フェイス領域からブロックを特定する処理としては上記の３つの方法に限定されず，ほかの方法を用いてもよい。また上記の３つあるいはほかの方法のうち複数を組み合わせて特定しても良い。 The process of specifying the block from the face area is not limited to the above three methods, and other methods may be used. Further, a plurality of the above three methods or a plurality of other methods may be combined and specified.

図９にフェイス領域からブロックを特定した場合の一例を模式的に示す。図９では，図８のフェイス領域からブロックを特定した場合を示している。図９における矩形領域が特定したブロックである。 FIG. 9 schematically shows an example when a block is specified from the face area. FIG. 9 shows a case where a block is specified from the face area of FIG. The rectangular area in FIG. 9 is the specified block.

以上のように画像情報処理部２２でフェイス領域の画像情報からブロックを特定すると，候補処理部２３は，当該特定したブロックと，後述する標本情報記憶部２６に記憶する標本情報に対応するブロックとの間のレイアウトの類似性の判定処理であるレイアウトマッチング処理を実行する。レイアウトマッチング処理としては，たとえば，フェイス領域の画像情報からブロックを抽出する。ここでブロックを抽出するとは，たとえば，フェイス領域の画像情報におけるブロックのレイアウト情報（位置情報）を抽出する。これを模式的に示すのが図１０である。図１０（ａ）がフェイス領域の画像情報におけるブロックを示すものであり，図１０（ｂ）がレイアウト情報を座標で示した場合であり，図１０（ｃ）がレイアウト情報を画像情報上で解釈する場合を模式的に示したものである。図１０（ａ）におけるブロックはそれぞれ「region」との名称を付して各ブロックを識別しており，図１０（ａ）では，「region1」〜「region5」まで５つのブロックがある場合を示している。また，それぞれのブロックのレイアウト情報は，矩形領域の２点，たとえば左上と右下の座標によって表されており，「ａ」がｘ座標，「ｂ」がｙ座標を示している。なお，レイアウト情報を２点ではなく４点で示してもよいし，１点からの距離で示すなど，各ブロックの位置情報が特定できるのであればさまざまな方法を用いることができる。図１０（ｃ）では図１０（ｂ）のレイアウト情報をフェイス領域の画像情報で示したものであり，フェイス領域の画像情報の左上を原点とした場合の相対座標で各ブロックの領域を示している。 When the image information processing unit 22 has specified blocks from the image information of the face area as described above, the candidate processing unit 23 executes a layout matching process, which determines the similarity of layout between the specified blocks and the blocks corresponding to the sample information stored in the sample information storage unit 26 described later. In the layout matching process, for example, the blocks are extracted from the image information of the face area. Extracting a block here means, for example, extracting the layout information (position information) of the block in the image information of the face area. FIG. 10 shows this schematically. FIG. 10 (a) shows the blocks in the image information of the face area, FIG. 10 (b) shows the layout information as coordinates, and FIG. 10 (c) schematically shows how the layout information is interpreted on the image information. Each block in FIG. 10 (a) is identified by a name beginning with "region", and FIG. 10 (a) shows a case with five blocks, "region1" to "region5". The layout information of each block is represented by the coordinates of two points of the rectangular area, for example the upper left and the lower right, where "a" denotes an x coordinate and "b" a y coordinate. The layout information may instead be given by four points, or by a distance from one point; various methods can be used as long as the position information of each block can be specified. FIG. 10 (c) shows the layout information of FIG. 10 (b) on the image information of the face area, with the area of each block expressed in relative coordinates whose origin is the upper left of the image information of the face area.
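The two-point layout information described above can be sketched as a small data structure. This is only an illustrative sketch: the class name `Block`, its field names, and the `normalized` helper are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A rectangular block given by its upper-left and lower-right corners.

    Coordinates are relative to the upper-left corner of the face area.
    """
    name: str
    x1: float  # upper-left x
    y1: float  # upper-left y
    x2: float  # lower-right x
    y2: float  # lower-right y

    def normalized(self, width: float, height: float) -> "Block":
        """Express the block as fractions of the face-area width and height."""
        return Block(self.name,
                     self.x1 / width, self.y1 / height,
                     self.x2 / width, self.y2 / height)

    def area(self) -> float:
        """Area of the rectangle (zero if the corners are degenerate)."""
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


# Hypothetical block in a 200 x 100 pixel face area.
region1 = Block("region1", 10, 20, 110, 70)
print(region1.normalized(200, 100))
```

Normalizing by the face-area size is one way to make layouts from images of different sizes comparable before the similarity calculation.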

候補処理部２３は，同様に，標本情報記憶部２６に記憶する対象物の標本情報に対応するブロックを抽出する。抽出した対象物に対応するブロックを模式的に示すのが図１１である。図１１（ａ）は抽出した対象物またはそれに対応する画像情報であり，図１１（ｂ）は抽出した対象物に対応する画像情報におけるブロックを示すものであり，図１１（ｃ）がレイアウト情報を座標で示した場合であり，図１１（ｄ）がレイアウト情報を画像情報上で解釈する場合を模式的に示したものである。図１１（ａ）は対象物の標本情報またはそれに対応する画像情報であるので，後述するように，撮影画像情報に写っている対象物を特定するための比較処理の際に用いるものである。そのため，撮影画像情報から切り出したフェイス領域よりもノイズなどが少なく，鮮明であることが多い。また，シールなど不要なものも存在していないことが一般的である。図１１（ｂ）の画像情報におけるブロックは，後述するように，オペレータが特定したものであってもよいし，上述の画像情報処理部２２と同様の処理で自動的に特定したものであってもよいし，自動的に特定したブロックに対してオペレータが適宜，取捨選択や付加等の修正したものであってもよい。図１１（ｂ）におけるブロックはそれぞれ「REGION」との名称を付して各ブロックを識別しており，図１１（ｂ）では，「REGION1」〜「REGION6」まで６つのブロックがある場合を示している。また，それぞれのブロックのレイアウト情報は，矩形領域の２点，たとえば左上と右下の座標によって表されており，「ｘ」がｘ座標，「ｙ」がｙ座標を示している。なお，レイアウト情報を２点ではなく４点で示してもよいし，１点からの距離で示すなど，各ブロックの位置情報が特定できるのであればさまざまな方法を用いることができる。図１１（ｄ）では図１１（ｃ）のレイアウト情報をフェイス領域の画像情報で示したものであり，フェイス領域の画像情報の左上を原点とした場合の相対座標で各ブロックの領域を示している。 Similarly, the candidate processing unit 23 extracts the blocks corresponding to the sample information of an object stored in the sample information storage unit 26. FIG. 11 schematically shows the blocks corresponding to the extracted object. FIG. 11 (a) is the extracted object or the image information corresponding to it, FIG. 11 (b) shows the blocks in the image information corresponding to the extracted object, FIG. 11 (c) shows the layout information as coordinates, and FIG. 11 (d) schematically shows how the layout information is interpreted on the image information. Because FIG. 11 (a) is the sample information of the object or the image information corresponding to it, it is used, as described later, in the comparison process for identifying the object shown in the captured image information. It therefore usually contains less noise and is sharper than a face area cut out from captured image information, and it generally contains no extraneous items such as stickers. As described later, the blocks in the image information of FIG. 11 (b) may be specified by an operator, may be specified automatically by the same processing as the image information processing unit 22 described above, or may be automatically specified blocks that the operator has corrected as appropriate by selecting among them or adding to them. Each block in FIG. 11 (b) is identified by a name beginning with "REGION", and FIG. 11 (b) shows a case with six blocks, "REGION1" to "REGION6". The layout information of each block is represented by the coordinates of two points of the rectangular area, for example the upper left and the lower right, where "x" denotes an x coordinate and "y" a y coordinate. The layout information may instead be given by four points, or by a distance from one point; various methods can be used as long as the position information of each block can be specified. FIG. 11 (d) shows the layout information of FIG. 11 (c) on the image information of the face area, with the area of each block expressed in relative coordinates whose origin is the upper left of the image information of the face area.

図１０（ａ）におけるブロックと図１１（ｂ）におけるブロックとは必ずしも一致しない。たとえば図１０（ａ）において図１１（ｂ）の「REGION3」と「REGION4」は一つのブロック「region2」として認識されており，また図１０（ａ）では図１１（ｂ）の「REGION2」は認識されていない。さらに図１０（ａ）ではラベル部分にシールが付されていることで，「region5」が過剰に認識されている。 The block in FIG. 10 (a) and the block in FIG. 11 (b) do not always match. For example, in FIG. 10 (a), “REGION 3” and “REGION 4” in FIG. 11 (b) are recognized as one block “region 2”, and in FIG. 10 (a), “REGION 2” in FIG. 11 (b) is recognized. Not recognized. Further, in FIG. 10A, “region 5” is excessively recognized because the label portion has a sticker.

候補処理部２３は，フェイス領域の画像情報から抽出したブロックのレイアウト情報と，標本情報記憶部２６に記憶する対象物に対応するブロックのレイアウト情報とを用いて，それらのレイアウト情報の類似性を示す評価値，たとえば類似度を算出する。レイアウト情報の類似性を示す評価値は，レイアウト情報の類似度に限られず，その類似性を示す情報であればいかなるものであってもよい。たとえば評価値として類似度を用いる場合，類似度を算出するためには，さまざまな方法を用いることができるが，たとえば以下のような処理を用いることができる。 The candidate processing unit 23 uses the layout information of the block extracted from the image information of the face area and the layout information of the block corresponding to the object stored in the sample information storage unit 26 to determine the similarity of the layout information. Calculate the indicated evaluation value, for example, the degree of similarity. The evaluation value indicating the similarity of the layout information is not limited to the similarity of the layout information, and may be any information indicating the similarity. For example, when similarity is used as an evaluation value, various methods can be used to calculate the similarity. For example, the following processing can be used.

まずフェイス領域の画像情報と標本情報記憶部２６に記憶する対象物に対応する標本情報またはそれに対応する画像情報のサイズを合わせる（この場合の面積はいずれも共通Ｓとなる）。そして，フェイス領域の画像情報におけるブロックのレイアウト情報と，標本情報記憶部２６に記憶する対象物の標本情報またはそれに対応する画像情報に対応するブロックのレイアウト情報とに基づき，それらのインターセクション（共通部分）や差分などを算出する。そしてブロック間のＡＮＤ，ＯＲ，ＮＯＴなどの論理演算を行い，その面積をそれぞれ求める。この処理の一例を模式的に図１２に示す。図１２では，対応するブロックの領域がある箇所を黒，対応するブロックの領域がない箇所を白で示している。この組み合わせを模式的に示すのが図１３である。図１３（ａ）は，各ブロック間の演算による面積の組み合わせを示すものであり，図１３（ｂ）は，図１３（ａ）の４つの組み合わせについて，ＴＰ（True Positive），ＦＮ（False Negative），ＦＰ（False Positive），ＴＮ（True Negative）と，予測結果と真の値の関係と同様の名称を付したものである。ここで，標本とする対象物に対応する画像情報のレイアウト情報を真（True）の値，フェイス領域の画像情報のレイアウト情報をそれに対する予測結果と見なしている。たとえば，ＴＰは，フェイス領域の画像情報のレイアウト情報があると見なした箇所が，標本とする対象物に対応する画像情報では実際にレイアウト情報が合った，という意味に解釈する。 First, the image information of the face area and the sample information of the object stored in the sample information storage unit 26 (or the image information corresponding to it) are scaled to the same size, so that both have a common area S. Then, based on the layout information of the blocks in the image information of the face area and the layout information of the blocks corresponding to the sample information of the object (or the image information corresponding to it), their intersection (common part), differences, and so on are calculated. Logical operations such as AND, OR, and NOT are then performed between the blocks, and the area of each result is obtained. FIG. 12 schematically shows an example of this process. In FIG. 12, locations covered by a corresponding block are shown in black and locations not covered are shown in white. FIG. 13 schematically shows the combinations. FIG. 13 (a) shows the combinations of areas obtained by the operations between blocks, and FIG. 13 (b) labels the four combinations of FIG. 13 (a) as TP (True Positive), FN (False Negative), FP (False Positive), and TN (True Negative), borrowing the names used for the relationship between a prediction result and a true value. Here, the layout information of the image information corresponding to the sample object is regarded as the true value, and the layout information of the image information of the face area is regarded as the prediction for it. For example, TP means that a location regarded as covered by layout information in the image information of the face area was in fact also covered by layout information in the image information corresponding to the sample object.
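The area calculation described above can be sketched by rasterizing both layouts onto a common grid and counting cells. This is a minimal sketch under assumptions: the function names, the grid resolution, and the `(x1, y1, x2, y2)` tuple form are illustrative and not taken from the patent, which does not say how the areas are computed.

```python
def rasterize(blocks, width, height, grid=100):
    """Mark the grid cells covered by any block.

    blocks: iterable of (x1, y1, x2, y2) in pixels of an image of the
    given width and height. Scaling to a grid x grid mask plays the role
    of matching the two images to the common area S.
    """
    mask = [[False] * grid for _ in range(grid)]
    for x1, y1, x2, y2 in blocks:
        c1 = max(0, round(x1 / width * grid))
        c2 = min(grid, round(x2 / width * grid))
        r1 = max(0, round(y1 / height * grid))
        r2 = min(grid, round(y2 / height * grid))
        for r in range(r1, r2):
            for c in range(c1, c2):
                mask[r][c] = True
    return mask


def confusion_areas(face_mask, sample_mask):
    """Return TP, FN, FP, TN as fractions of the common area.

    The sample layout is treated as the true value and the face-area
    layout as the prediction, as in the text.
    """
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    total = 0
    for face_row, sample_row in zip(face_mask, sample_mask):
        for predicted, actual in zip(face_row, sample_row):
            if predicted and actual:
                counts["TP"] += 1      # face AND sample
            elif actual:
                counts["FN"] += 1      # sample AND NOT face
            elif predicted:
                counts["FP"] += 1      # face AND NOT sample
            else:
                counts["TN"] += 1      # NOT (face OR sample)
            total += 1
    return {key: value / total for key, value in counts.items()}


# Hypothetical layouts: a 100 x 100 face area and a 200 x 200 sample image.
face = rasterize([(0, 0, 50, 100)], 100, 100, grid=10)
sample = rasterize([(0, 0, 200, 100)], 200, 200, grid=10)
print(confusion_areas(face, sample))
```

A finer grid gives a closer approximation of the exact rectangle areas at the cost of more cells to visit.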

そして候補処理部２３は，図１３（ａ）のように算出した各面積を用いて，類似度を算出する。たとえば類似度としてＦ値を用いる場合，以下の数１のように算出する。 Then, the candidate processing unit 23 calculates the similarity using the areas calculated as shown in FIG. 13 (a). For example, when the F value is used as the similarity, it is calculated as in Equation 1 below.

（数１） (Equation 1)
\[
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}}
\]

ここで，Precisionは適合率（正と予測したデータのうち，実際に正であるものの割合），Recallは再現率（実際に正であるもののうち，正であると予測されたものの割合）である。 Here, Precision is the precision (the proportion of data predicted to be positive that is actually positive), and Recall is the recall (the proportion of data that is actually positive that was predicted to be positive).

たとえば，あるフェイス領域の画像情報から抽出したブロックのレイアウト情報と，標本情報記憶部２６に記憶するある対象物に対応するブロックのレイアウト情報とを用いて，図１２に示すように，それらのインターセクションや差分などを算出し，ブロック間の論理演算を行った結果の面積が図１４であったとする。このとき，適合率（Precision）は０．８６（＝０．３／（０．３＋０．０５）），再現率（Recall）は０．７５（＝０．３／（０．３＋０．１））であるので，Ｆ値は０．８０（＝２×０．７５×０．８６／（０．７５＋０．８６））として算出できる。 For example, using the layout information of the block extracted from the image information of a certain face area and the layout information of the block corresponding to a certain object stored in the sample information storage unit 26, as shown in FIG. It is assumed that the area of the result of calculating the section, the difference, etc. and performing the logical operation between the blocks is shown in FIG. At this time, the precision is 0.86 (= 0.3 / (0.3 + 0.05)) and the recall is 0.75 (= 0.3 / (0.3 + 0.1)). Therefore, the F value can be calculated as 0.80 (= 2 × 0.75 × 0.86 / (0.75 + 0.86)).
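The arithmetic in the example above can be checked with a few lines of code. The areas TP = 0.3, FP = 0.05, and FN = 0.1 are read off the fractions quoted in the text; the function name `f_value` is only illustrative.

```python
def f_value(tp, fp, fn):
    """Return (precision, recall, F value) for the given areas."""
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * recall * precision / (recall + precision)


precision, recall, f = f_value(tp=0.3, fp=0.05, fn=0.1)
print(round(precision, 2), round(recall, 2), round(f, 2))
```

The zero checks are an added safeguard for layouts with no blocks at all; the text does not discuss that case.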

そしてここで算出したＦ値が所定の閾値ＴＨ以上であれば，候補処理部２３は，その対象物の識別情報を候補リストに追加する。 If the F value calculated here is equal to or higher than the predetermined threshold value TH, the candidate processing unit 23 adds the identification information of the object to the candidate list.

なお，候補処理部２３は類似度を算出するためには，Ｆ値以外の指標を用いてもよいし，また，正解率（Accuracy。正や負と予測したデータのうち，実際にそうであるものの割合）や，特異度（Specificity。実際に負であるもののうち，負であると予測されたものの割合）を用いてもよい。正解率，特異度は，数２を用いて算出できる。 The candidate processing unit 23 may use an index other than the F value to calculate the similarity. For example, it may use the accuracy (the proportion of data predicted to be positive or negative for which the prediction is actually correct) or the specificity (the proportion of data that is actually negative that was predicted to be negative). The accuracy and the specificity can be calculated using Equation 2.

（数２） (Equation 2)
\[
\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}, \qquad
\mathrm{Specificity} = \frac{TN}{FP + TN}
\]
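Accuracy and specificity, as defined in the text, can be computed from the same four areas. In the sketch below the function names are illustrative, and the value TN = 0.55 is an assumption made only for this example (it supposes the four areas are fractions of the common area S and sum to 1); the source does not state TN.

```python
def accuracy(tp, fn, fp, tn):
    """Proportion of the common area where prediction and truth agree."""
    return (tp + tn) / (tp + fn + fp + tn)


def specificity(fp, tn):
    """Proportion of the truly block-free area that was predicted block-free."""
    return tn / (fp + tn)


# TP, FP, FN follow the worked F-value example; TN = 0.55 is assumed here.
print(round(accuracy(tp=0.3, fn=0.1, fp=0.05, tn=0.55), 3))
print(round(specificity(fp=0.05, tn=0.55), 3))
```

Because TN is usually large for labels with few blocks, accuracy and specificity tend to be less discriminating than the F value, which ignores TN.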

候補処理部２３は，この処理を，標本情報記憶部２６に記憶する比較対象とする対象物について実行する。比較対象とする対象物は，標本情報記憶部２６に記憶するすべての対象物であってもよいし，何らかの方法で絞り込みをした一部の対象物であってもよい。 The candidate processing unit 23 executes this processing for the object to be compared stored in the sample information storage unit 26. The objects to be compared may be all objects stored in the sample information storage unit 26, or may be some objects narrowed down by some method.

そして候補処理部２３は，候補リストに追加された類似度が高い順にソートし，そのうち上位所定数，たとえば上位１０件の対象物を候補として特定する。 Then, the candidate processing unit 23 sorts the objects added to the candidate list in descending order of similarity, and identifies the upper predetermined number, for example, the upper 10 objects as candidates.
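The thresholding and ranking steps can be sketched as follows. The function name, the dictionary input, and the sample identifiers are hypothetical; the threshold corresponds to TH in the text and `top_n` to the predetermined number of candidates (10 in the example given).

```python
def select_candidates(similarities, threshold, top_n=10):
    """Pick candidate objects from layout similarities.

    similarities: dict mapping an object's identification information
    to its similarity (for example, the F value).
    """
    # Keep only objects whose similarity reaches the threshold TH.
    candidate_list = [(obj, score) for obj, score in similarities.items()
                      if score >= threshold]
    # Sort in descending order of similarity and keep the top entries.
    candidate_list.sort(key=lambda item: item[1], reverse=True)
    return [obj for obj, _ in candidate_list[:top_n]]


scores = {"tea_500ml": 0.80, "cola_500ml": 0.40,
          "water_2l": 0.90, "juice_350ml": 0.60}
print(select_candidates(scores, threshold=0.5, top_n=2))
```

Only the objects returned here go on to the more expensive comparison process, which is the point of the layout pre-filter.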

比較処理部２４は，候補処理部２３で特定した候補とした対象物の標本情報と，フェイス領域の画像情報との比較処理（マッチング処理）を実行する。たとえばフェイス領域の画像情報の特徴量を抽出し，抽出した特徴量と，候補とした対象物の特徴量などの標本情報との比較処理を実行することで，フェイス領域の画像情報と標本情報との類似性を判定し，そのフェイス領域に含まれる対象物の識別情報，たとえば商品名などを判定する。 The comparison processing unit 24 executes a comparison process (matching process) between the sample information of the target object specified by the candidate processing unit 23 and the image information of the face region. For example, by extracting the feature amount of the image information in the face area and executing the comparison processing between the extracted feature amount and the sample information such as the feature amount of the candidate object, the image information and the sample information in the face area can be obtained. The similarity is determined, and the identification information of the object contained in the face area, for example, the product name, is determined.

なお比較処理としては，フェイス領域の画像情報の全体と標本情報とをマッチングするのみならず，フェイス領域の画像情報の一部，たとえば特徴領域（ブロック）の画像情報の特徴量を抽出し，標本情報の特徴量と比較処理を実行してもよい。 As the comparison process, not only the whole image information in the face area and the sample information are matched, but also a part of the image information in the face area, for example, the feature amount of the image information in the feature area (block) is extracted and sampled. Information feature quantities and comparison processing may be executed.

また，特徴量以外の方法により比較処理を行ってもよい。 Further, the comparison process may be performed by a method other than the feature amount.

さらに，比較処理部２４は，深層学習（ディープラーニング）を用いて標本情報との比較処理を実行してもよい。この場合，中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデルに対して，上記フェイス領域の画像情報の全部または一部を入力し，その出力値に基づいて，類似する標本情報を特定してもよい。学習モデルとしては，さまざまなフェイス領域の画像情報に標本情報を正解データとして与えたものを用いることができる。 Further, the comparison processing unit 24 may execute the comparison processing with the sample information by using deep learning. In this case, all or part of the image information in the face region is input to the learning model in which the weighting coefficient between neurons in each layer of the neural network consisting of many layers is optimized, and the output value thereof. Similar sample information may be identified based on. As the learning model, it is possible to use image information of various face regions with sample information given as correct answer data.

標本情報処理部２５は，撮影画像情報に写っている対象物を特定するための比較処理の際に用いる対象物の標本情報を生成し，後述する標本情報記憶部２６に記憶させる。標本情報とは対象物を撮影した画像情報（対象物画像情報）および／またはその特徴量の情報である。標本情報には，対象物の識別情報，例えば商品名，型番，ＪＡＮコードなどの商品コード，対象物画像情報におけるブロックおよびそのレイアウト情報を対応づけて記憶している。対象物画像情報としては，対象物全体の画像情報であってもよいし，対象物の一部分，たとえば対象物がペットボトル飲料の場合にはラベル部分の画像情報であってもよい。なお，標本情報として，対象物画像情報の特徴量の情報を用いる場合には，処理の都度，特徴量を抽出する必要がなくなる。ブロックとしては，当該対象物画像情報を特徴付ける特徴的な領域であって，たとえば商品名，ロゴなどを囲む矩形領域が該当する。 The sample information processing unit 25 generates sample information of the object used in the comparison process for identifying the object reflected in the photographed image information, and stores it in the sample information storage unit 26 described later. The sample information is image information (object image information) obtained by photographing an object and / or information on a feature amount thereof. The sample information stores identification information of the object, for example, a product code such as a product name, a model number, and a JAN code, a block in the image information of the object, and its layout information. The image information of the object may be the image information of the entire object, or may be the image information of a part of the object, for example, the label portion when the object is a PET bottle beverage. When the feature amount information of the object image information is used as the sample information, it is not necessary to extract the feature amount each time the processing is performed. The block is a characteristic area that characterizes the image information of the object, and corresponds to, for example, a rectangular area surrounding a product name, a logo, or the like.

標本情報処理部２５は，一つの対象物について一つの標本情報を生成してもよいし，一つの対象物について複数の角度，たとえば商品を正対化して撮影する場合に，写らない位置にある表面を写すため，正面，背面，上面，下面，両側面程度の角度からの標本情報を生成してもよい。また，一つの対象物に複数の外装（パッケージなど）がある場合には，一つの対象物にそれぞれの外装の場合の標本情報やブロックを対応づけても良い。また，標本情報に対応づけるブロックとしては，一つであってもよいが，複数のブロックが対応づけられていると良い。 The sample information processing unit 25 may generate one sample information for one object, or is in a position where it cannot be photographed when the product is photographed at a plurality of angles for one object, for example, when the product is face-to-face. In order to capture the surface, sample information may be generated from angles such as front, back, top, bottom, and both sides. Further, when one object has a plurality of exteriors (packages, etc.), the sample information and blocks for each exterior may be associated with one object. Moreover, although one block may be associated with the sample information, it is preferable that a plurality of blocks are associated with each other.

標本情報は，標本とする対象物を撮影した対象物画像情報に対して，上述の画像情報処理部２２と同様の処理を実行することで，ブロックを抽出してもよいし，抽出したブロックからオペレータが商品の特定のために必要なブロックのみを取捨選択や付加などをしても良い。また画像情報処理部２２のように自動的に抽出処理を実行するのではなく，読み込んだ対象物の画像情報からオペレータがブロックを指定しても良い。 For the sample information, a block may be extracted by executing the same processing as the above-mentioned image information processing unit 22 on the object image information obtained by photographing the object to be a sample, or the sample information may be extracted from the extracted block. The operator may select or add only the blocks necessary for specifying the product. Further, instead of automatically executing the extraction process as in the image information processing unit 22, the operator may specify a block from the image information of the read object.

標本情報記憶部２６は，標本情報処理部２５において生成した標本情報を，対象物の識別情報，例えば商品名，型番，ＪＡＮコードなどの商品コード，対象物画像情報におけるブロックおよびそのレイアウト情報などに対応づけて記憶する。 The sample information storage unit 26 uses the sample information generated by the sample information processing unit 25 as object identification information, for example, a product code such as a product name, a model number, and a JAN code, a block in the object image information, and its layout information. Correspond and memorize.

つぎに本発明の画像処理システム１を用いた処理プロセスの一例を図３乃至図５のフローチャートを用いて説明する。本実施例では，店舗の陳列棚を撮影し，陳列棚にある商品を特定する場合を説明する。そのため，撮影画像情報としては商品が陳列されている店舗の陳列棚であり，標本情報における対象物としては陳列される可能性のある商品となる。 Next, an example of the processing process using the image processing system 1 of the present invention will be described with reference to the flowcharts of FIGS. 3 to 5. In this embodiment, a case where a display shelf of a store is photographed and a product on the display shelf is specified will be described. Therefore, the photographed image information is the display shelf of the store where the product is displayed, and the product may be displayed as the object in the sample information.

まずオペレータは，標本情報とする商品を正対位置から撮影し，撮影した対象物画像情報，識別情報，商品名，商品コードを対応づけて標本情報記憶部２６に記憶させておく。また，対象物画像情報に対して画像情報処理部２２における処理と同様の処理を標本情報処理部２５が実行し，ブロックを抽出する。そして抽出したブロックのうち，標本情報として用いるブロック，たとえば商品名，ロゴなどの含まれたブロックのみを標本情報に対応づけるブロックとして，ブロックとその位置情報（レイアウト情報）を標本情報記憶部２６に記憶させておく。また，対象物画像情報から所定の処理を実行することで，局所特徴量などの特徴量を抽出し，対象物画像情報に対応づけて記憶させておく。 First, the operator photographs each product to be used as sample information from directly in front, and stores the captured object image information, identification information, product name, and product code in the sample information storage unit 26 in association with one another. The sample information processing unit 25 also executes, on the object image information, the same processing as the image information processing unit 22 and extracts blocks. Of the extracted blocks, only those to be used as sample information, for example blocks containing the product name or logo, are associated with the sample information, and these blocks and their position information (layout information) are stored in the sample information storage unit 26. In addition, feature amounts such as local feature amounts are extracted from the object image information by predetermined processing and stored in association with the object image information.

このような標本情報に対する処理を，複数の商品，好ましくは陳列棚に陳列される可能性のあるすべての商品についてあらかじめ実行しておく。 Processing for such sample information is performed in advance for a plurality of products, preferably all products that may be displayed on a display shelf.

そして店舗の陳列棚を撮影した撮影画像情報を入力端末３から管理端末２の画像情報入力受付処理部２０が受け付けると（Ｓ１００），画像情報記憶部２１に，撮影日時，撮影画像情報の識別情報などと対応づけて記憶させる。 Then, when the image information input reception processing unit 20 of the management terminal 2 receives the photographed image information obtained by photographing the display shelf of the store from the input terminal 3 (S100), the image information storage unit 21 receives the photographed date and time and the identification information of the photographed image information. And memorize it in association with.

受け付けた撮影画像情報に対して，画像情報処理部２２は，撮影画像情報が正対した状態となるように補正する正置化処理を実行する（Ｓ１１０）。そして，正置化した撮影画像情報からフェイス領域を特定するフェイス処理を実行し（Ｓ１２０），特定したフェイス領域からブロックを特定する（Ｓ１３０）。 The image information processing unit 22 executes, on the received captured image information, a rectification process that corrects the captured image information so that it appears as if photographed from directly in front (S110). It then executes a face process that specifies face areas from the rectified captured image information (S120), and specifies blocks from each specified face area (S130).

上述のように，フェイス領域からブロックを特定するためには，上述のようにたとえば第１の方法から第３の方法のような各種の方法があるが，たとえば第１の方法による場合には，以下のような処理を実行する。この場合の処理プロセスの一例を図４のフローチャートを用いて説明する。 As described above, in order to identify the block from the face area, there are various methods such as the first method to the third method as described above, but in the case of the first method, for example, Execute the following processing. An example of the processing process in this case will be described with reference to the flowchart of FIG.

まずフェイス領域の画像情報を構成する各画素の色について，たとえば全体を代表する１６色など，所定の色数のパレット色に色数を減色して，パレット色を検出する（Ｓ２００）。また，フェイス領域の短辺をたとえば１６分割する正方形のメッシュで分割し，各メッシュがどのパレット色となるかを定め，パレット色ごとのその色の有無の分布マップを生成する（Ｓ２１０）。 First, the color of each pixel constituting the image information of the face area is reduced to a palette of a predetermined number of colors, for example 16 colors representative of the whole image, and the palette colors are detected (S200). The face area is also divided into square meshes whose side is, for example, 1/16 of the short side of the face area, the palette color of each mesh is determined, and a distribution map indicating the presence or absence of each palette color is generated for every palette color (S210).
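Steps S200 and S210 can be sketched as below. This is a simplified sketch under stated assumptions, not the patent's procedure: the palette is assumed to be given already (choosing 16 representative colors, for example by clustering, is omitted), each mesh is assigned the palette color held by most of its pixels, and all function names are hypothetical.

```python
from collections import Counter


def nearest_palette_index(pixel, palette):
    """Index of the palette color closest to pixel (squared RGB distance)."""
    return min(range(len(palette)),
               key=lambda i: sum((p - q) ** 2 for p, q in zip(pixel, palette[i])))


def mesh_palette_grid(image, palette, divisions=16):
    """Assign one palette index to each square mesh by majority vote.

    image: list of rows, each a list of (r, g, b) tuples.
    The mesh side is the short side of the image divided by `divisions`.
    """
    height, width = len(image), len(image[0])
    cell = max(1, min(height, width) // divisions)
    grid = []
    for top in range(0, height, cell):
        row = []
        for left in range(0, width, cell):
            votes = Counter(
                nearest_palette_index(image[y][x], palette)
                for y in range(top, min(top + cell, height))
                for x in range(left, min(left + cell, width)))
            row.append(votes.most_common(1)[0][0])
        grid.append(row)
    return grid


def distribution_maps(grid, palette_size):
    """One boolean presence map per palette color."""
    return [[[cell == k for cell in row] for row in grid]
            for k in range(palette_size)]


# Tiny hypothetical image: left half reddish, right half bluish.
red, blue = (250, 10, 10), (5, 5, 240)
image = [[red, red, blue, blue] for _ in range(4)]
grid = mesh_palette_grid(image, [(255, 0, 0), (0, 0, 255)], divisions=2)
print(grid)
```

Each map returned by `distribution_maps` is a binary image at mesh resolution, which is what the later dilation and grouping steps operate on.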

そして，フェイス領域の画像情報の各メッシュに対して，一定の距離にある飛び地を候補とするよう膨張処理を分布マップに対して行い，その分布マップが，所定の範囲に収まるグループを検出する（Ｓ２２０）。所定の範囲としては，たとえば，横幅が全体の１／１０〜２／３，縦の高さが全体の１／２０〜１／２，面積が全体の１／１００〜１／１０の閾値の条件に収まる分布マップ（膨張処理を施した分布マップ）のグループを検出する。 Then, for each mesh cell of the image information of the face area, a dilation process is applied to the distribution map so that enclaves (isolated cells) within a certain distance become candidates, and groups whose distribution map falls within a predetermined range are detected (S220). As the predetermined range, for example, groups of the dilated distribution map are detected that satisfy the threshold conditions of a width of 1/10 to 2/3 of the whole, a height of 1/20 to 1/2 of the whole, and an area of 1/100 to 1/10 of the whole.
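
The dilation and size filtering of S220 might look like the following sketch. The square dilation neighborhood, the 4-connected grouping, and measuring the area as the number of cells in the group are assumptions of this example. The specification only gives the example threshold ranges.

```python
def dilate(grid, radius=1):
    """Dilate a boolean grid so that cells within `radius` of a set cell become set."""
    rows, cols = len(grid), len(grid[0])
    out = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if grid[y][x]:
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < rows and 0 <= xx < cols:
                            out[yy][xx] = True
    return out

def groups(grid):
    """Yield (x0, y0, x1, y1, cell_count) for each 4-connected group of set cells."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    for y in range(rows):
        for x in range(cols):
            if grid[y][x] and (y, x) not in seen:
                stack, cells = [(y, x)], []
                seen.add((y, x))
                while stack:
                    cy, cx = stack.pop()
                    cells.append((cy, cx))
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                ys = [c[0] for c in cells]
                xs = [c[1] for c in cells]
                yield (min(xs), min(ys), max(xs) + 1, max(ys) + 1, len(cells))

def in_range(box, rows, cols):
    """Apply the example thresholds: width 1/10 to 2/3, height 1/20 to 1/2, area 1/100 to 1/10."""
    x0, y0, x1, y1, n = box
    width, height = (x1 - x0) / cols, (y1 - y0) / rows
    area = n / (rows * cols)
    return 1 / 10 <= width <= 2 / 3 and 1 / 20 <= height <= 1 / 2 and 1 / 100 <= area <= 1 / 10
```

Two set cells separated by a one-cell gap form two tiny groups before dilation and a single group afterwards, which is the effect the enclave handling is meant to achieve.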

そして各グループについて，全パレット色の分布マップを参照し，その領域に含まれるパレットの数が所定数，たとえば３以内であるかを判定する（Ｓ２３０）。すなわち，グループの領域における全パレット色の分布マップを参照し，グループの領域内に多くの色数を含む場合にはイメージイラストである可能性が高く，商品を特定するような特徴的な領域であるロゴなどではないと考えられるので，そのようなグループを除外する。 Then, for each group, the distribution maps of all palette colors are referred to, and it is determined whether the number of palette colors contained in the group's area is within a predetermined number, for example 3 (S230). That is, a group whose area contains many colors is likely to be a pictorial illustration rather than a distinctive area that identifies the product, such as a logo, so such a group is excluded.

Ｓ２３０において，グループの領域内に含まれるパレットの数が所定数以内である場合には，そのグループ領域内にエッジが所定の閾値以上あるかを判定する（Ｓ２４０）。エッジの判定については，フェイス領域の画像情報に対して，あらかじめエッジ検出処理を行っておき，それに基づいて閾値以上あるかを判定できる。グループ領域内にエッジが閾値以上に存在しない場合には，ロゴではない可能性が高いので，そのグループを除外する。 When the number of palette colors contained in the group's area is within the predetermined number in S230, it is determined whether the amount of edges in the group's area is equal to or greater than a predetermined threshold (S240). For this determination, edge detection can be performed in advance on the image information of the face area, and its result used to determine whether the threshold is met. If the group's area does not contain edges at or above the threshold, it is unlikely to be a logo, so the group is excluded.

そして，これらの条件を充足するグループを囲む矩形領域をブロックとして特定をする（Ｓ２５０）。 Then, the rectangular area surrounding the group satisfying these conditions is specified as a block (S250).

この処理を，Ｓ２２０でグルーピングしたグループに対して実行する（Ｓ２６０）。 This process is executed for the group grouped in S220 (S260).

画像情報処理部２２でフェイス領域の画像情報からブロックを特定すると，候補処理部２３は，当該特定したブロックと，後述する標本情報記憶部２６に記憶する対象物に対応するブロックとの間のレイアウトマッチング処理を実行することで，候補となる商品を特定する（Ｓ１４０）。候補となる商品の特定処理の処理プロセスの一例を図５のフローチャートを用いて説明する。 When the image information processing unit 22 has specified blocks from the image information of the face area, the candidate processing unit 23 specifies candidate products by executing a layout matching process between the specified blocks and the blocks corresponding to the objects stored in the sample information storage unit 26, described later (S140). An example of the process of specifying candidate products will be described with reference to the flowchart of FIG. 5.

具体的には，候補処理部２３は，比較対象とするすべての商品について，フェイス領域の画像情報から抽出したブロックのレイアウト情報と，商品に対応するブロックのレイアウト情報とをに基づいて，それらのインターセクション，差分などを算出し，またブロック間のＡＮＤ，ＯＲ，ＮＯＴなどの論理演算を行い，その面積をそれぞれ求めてレイアウトマッチングを行う（Ｓ３１０）。そして，たとえばＦ値などの類似度を算出し，それが所定の閾値ＴＨ以上であれば，その商品を候補リストに追加する（Ｓ３２０）。そして，つぎの商品について，同様の処理を実行する（Ｓ３３０）。 Specifically, for every product to be compared, the candidate processing unit 23 calculates the intersection, difference, and the like based on the layout information of the blocks extracted from the image information of the face area and the layout information of the blocks corresponding to the product, performs logical operations such as AND, OR, and NOT between the blocks, and obtains the respective areas to perform layout matching (S310). It then calculates a similarity such as the F-measure, and if the similarity is equal to or greater than a predetermined threshold TH, adds the product to the candidate list (S320). The same processing is then executed for the next product (S330).

そして比較対象とするすべての商品についてレイアウトマッチング処理を行うと，候補処理部２３は，候補リストに追加された類似度を高い順にソートし（Ｓ３４０），そのうち上位所定数，たとえば上位１０件の商品を候補として特定する。 When the layout matching process has been performed for all the products to be compared, the candidate processing unit 23 sorts the entries added to the candidate list in descending order of similarity (S340) and specifies a predetermined number of the top entries, for example the top 10 products, as candidates.
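
The flow of S310 to S340 can be sketched as follows. The block representation (integer rectangles `(x0, y0, x1, y1)` on a common grid), the product names, and the concrete threshold are illustrative assumptions. The F-measure is computed here as 2|A∩B| / (|A| + |B|), the harmonic mean of precision and recall over covered area, which is one common reading of "F-measure" and not necessarily the exact formula used in the specification.

```python
def cells(blocks):
    """Rasterize rectangles (x0, y0, x1, y1) into the set of unit cells they cover (OR of blocks)."""
    return {(x, y) for x0, y0, x1, y1 in blocks
            for x in range(x0, x1) for y in range(y0, y1)}

def f_measure(face_blocks, sample_blocks):
    """F-measure of the overlap between two block layouts, based on covered area."""
    a, b = cells(face_blocks), cells(sample_blocks)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))  # a & b is the intersection (AND)

def candidates(face_blocks, samples, th=0.5, top_n=10):
    """samples: dict mapping a product name to its block layout.
    Returns the names of the top_n products whose similarity is at least th."""
    scored = [(f_measure(face_blocks, blocks), name) for name, blocks in samples.items()]
    kept = [s for s in scored if s[0] >= th]           # S320: threshold TH
    kept.sort(key=lambda s: s[0], reverse=True)        # S340: descending similarity
    return [name for _, name in kept[:top_n]]
```

Because only rectangle layouts are compared at this stage, the candidate list can be built cheaply before the more expensive feature comparison of S150.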

Ｓ１４０において候補処理部２３が，比較処理の候補となる商品を特定すると，比較処理部２４は，フェイス領域の画像情報の特徴量を抽出し，抽出した特徴量と，候補として特定した商品の標本情報の特徴量とを比較することで比較処理を行う（Ｓ１５０）。この比較の際には，フェイス領域の画像情報の大きさと，標本情報における画像情報の大きさ，縮尺，回転なども極力近づけた上で比較処理を行う。この比較処理を候補として特定した各商品について行う。 When the candidate processing unit 23 has specified the candidate products for the comparison process in S140, the comparison processing unit 24 extracts feature amounts from the image information of the face area and performs the comparison process by comparing the extracted feature amounts with the feature amounts of the sample information of the products specified as candidates (S150). For this comparison, the size, scale, rotation, and the like of the image information of the face area and of the image information in the sample information are brought as close together as possible before the comparison is performed. This comparison process is performed for each product specified as a candidate.

そしてこの比較処理の結果，候補として特定した商品のうち，もっとも類似度が高いと判定した商品を，当該フェイス領域の商品として特定し，当該フェイス領域と，商品識別情報，商品名，商品コードなどと対応づける（Ｓ１６０）。 As a result of this comparison process, the product determined to have the highest similarity among the candidate products is specified as the product of the face area, and the face area is associated with the product identification information, product name, product code, and the like (S160).

撮影画像情報に写っている陳列棚には，複数の商品が陳列されていることが一般的であるから，陳列されている商品に応じたフェイス領域が切り出されていることが一般的である。そのため，それぞれのフェイス領域について，上述の各処理を実行することで，陳列棚に陳列されている商品や陳列位置を特定することができる。商品の陳列位置は，フェイス領域の縦，横方向の順番などで特定することができるほか，撮影画像情報における座標として特定することもできる。 Since a plurality of products are generally displayed on the display shelf shown in the photographed image information, it is common that a face area corresponding to the displayed products is cut out. Therefore, by executing each of the above-mentioned processes for each face area, it is possible to specify the products displayed on the display shelf and the display position. The display position of the product can be specified by the order of the vertical and horizontal directions of the face area, and can also be specified as the coordinates in the photographed image information.

対象物の表面が曲面の場合，左右方向にずれるにつれて，画像の歪みが発生する。そこで，以下のような処理を実行することがよい。 When the surface of the object is a curved surface, image distortion occurs as it shifts in the left-right direction. Therefore, it is advisable to execute the following processing.

すなわち，候補処理部２３は，レイアウト情報の類似度を算出する際に，図１２に示すように，フェイス領域の画像情報の全体におけるブロックのレイアウト情報，対象物に対応する画像情報の全体におけるブロックのレイアウト情報を用いて類似度を算出する処理を実行するほか，フェイス領域の画像情報の一部におけるブロックのレイアウト情報と，対象物に対応する画像情報の一部におけるブロックのレイアウト情報とを用いて類似度を算出してもよい。この処理の一例を模式的に図１５に示す。 That is, when calculating the similarity of the layout information, the candidate processing unit 23 may, in addition to calculating the similarity using the layout information of the blocks in the whole image information of the face area and the layout information of the blocks in the whole image information corresponding to the object as shown in FIG. 12, calculate the similarity using the layout information of the blocks in a part of the image information of the face area and the layout information of the blocks in a part of the image information corresponding to the object. An example of this process is schematically shown in FIG. 15.

図１５に示すように，本実施例の場合，フェイス領域の画像情報の左右の一定割合，たとえば１／４程度を除外して，フェイス領域の画像情報の中央線から１／４程度ずつ（中央部１／２程度）を切り出した画像情報におけるブロックのレイアウト情報と，対象物に対応する画像情報について，上述のフェイス領域の画像情報から切り出した幅と同一またはほぼ同じ幅を切り出し，切り出した後のレイアウト情報を用いて，インターセクションや差分などを算出する。そしてブロック間のＡＮＤ，ＯＲ，ＮＯＴなどの論理演算を行い，その面積をそれぞれ求める。そして，上述と同様に類似度を算出する。 As shown in FIG. 15, in this embodiment a certain proportion on each of the left and right sides of the image information of the face area, for example about 1/4, is excluded, so that about 1/4 on each side of the center line (about the central 1/2) is cut out. From the image information corresponding to the object, a width equal or nearly equal to the width cut out of the image information of the face area is cut out. The intersection, difference, and the like are then calculated using the layout information of the blocks in the two cut-out images. Logical operations such as AND, OR, and NOT between the blocks are performed and the respective areas are obtained. The similarity is then calculated in the same manner as described above.

つぎに，対象物に対応する画像情報の切り出し位置を，所定量だけずらして同様の処理を実行する。なお，この切り出し位置をずらした状態のブロックのレイアウト情報をあらかじめ用意しておいてもよい。 Next, the same process is executed by shifting the cutout position of the image information corresponding to the object by a predetermined amount. The layout information of the block in which the cutout position is shifted may be prepared in advance.

そして，もっとも高い類似度と閾値ＴＨとを比較し，それが閾値ＴＨ以上であれば，候補リストに追加する。また，閾値ＴＨ以上となったもっとも高い類似度のときの，標本情報に対応する画像情報の切り出し位置を記憶しておいてもよい。この場合，比較処理部２４におけるフェイス領域の画像情報の特徴量と標本情報に対応する画像情報の特徴量との比較処理の際に，記憶した切り出し位置で標本情報に対応する画像情報を切り出して，その特徴量を比較することが好ましい。 Then, the highest similarity is compared with the threshold TH, and if it is equal to or greater than TH, the product is added to the candidate list. The cutout position of the image information corresponding to the sample information at which the highest similarity, equal to or greater than TH, was obtained may also be stored. In this case, when the comparison processing unit 24 compares the feature amounts of the image information of the face area with those of the image information corresponding to the sample information, it is preferable to cut out the image information corresponding to the sample information at the stored cutout position and compare its feature amounts.

候補処理部２３は，もっとも高い類似度と閾値ＴＨとを比較するのではなく，類似度を算出するたびに閾値ＴＨと比較し，それが閾値ＴＨ以上であれば，候補リストに追加して，切り出し位置をずらす処理を終了し，つぎの標本情報に対応する画像情報のブロックの処理を行うようにしてもよい。これによって，対象物に対応する画像情報を所定量ずつずらして切り出す処理を，類似度が閾値ＴＨ以上となった時点で終了できるので，処理速度を高速化することができる。 Instead of comparing the highest similarity with the threshold TH, the candidate processing unit 23 may compare each similarity with TH as soon as it is calculated and, if it is equal to or greater than TH, add the product to the candidate list, end the process of shifting the cutout position, and proceed to the blocks of the image information corresponding to the next sample information. Because the process of cutting out the image information corresponding to the object while shifting it by a predetermined amount can then end as soon as the similarity reaches TH, processing is faster.
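
The partial matching with a shifted cutout position, including the early-termination variant, can be sketched as below. Rectangles on an integer grid, the F-measure as the similarity, and the names `crop`, `best_offset`, `step`, and `early_stop` are all assumptions of this illustration.

```python
def cells(blocks):
    """Rasterize rectangles (x0, y0, x1, y1) into the set of unit cells they cover."""
    return {(x, y) for x0, y0, x1, y1 in blocks
            for x in range(x0, x1) for y in range(y0, y1)}

def f_measure(a_blocks, b_blocks):
    a, b = cells(a_blocks), cells(b_blocks)
    return 2 * len(a & b) / (len(a) + len(b)) if a and b else 0.0

def crop(blocks, left, width):
    """Clip blocks to the window [left, left + width) and shift them to window coordinates."""
    out = []
    for x0, y0, x1, y1 in blocks:
        nx0, nx1 = max(x0, left), min(x1, left + width)
        if nx0 < nx1:
            out.append((nx0 - left, y0, nx1 - left, y1))
    return out

def best_offset(face_blocks, face_width, sample_blocks, sample_width,
                th, step=1, early_stop=False):
    """Match the central half of the face area against windows of the same width
    slid across the sample image. Returns (similarity, cutout_position)."""
    width = face_width // 2
    center = crop(face_blocks, face_width // 4, width)  # drop about 1/4 on each side
    best = (0.0, None)
    for left in range(0, sample_width - width + 1, step):
        score = f_measure(center, crop(sample_blocks, left, width))
        if early_stop and score >= th:
            return score, left            # stop at the first window reaching TH
        if score > best[0]:
            best = (score, left)
    return best
```

The caller compares the returned similarity with TH and, as the text suggests, can keep the returned cutout position for the later feature comparison. With `early_stop=True` the result may not be the global best, which is the trade-off for the shorter search.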

なお，上述では，フェイス領域の画像情報の中央部１／２程度を切り出した画像情報におけるブロックのレイアウト情報を，対象物の画像情報において同範囲で，所定量だけずらしながらレイアウトマッチング処理を行うことを示したが，その反対の処理を行ってもよい。すなわち，対象物の画像情報の中央部１／２程度を切り出した画像情報におけるブロックのレイアウト情報を，フェイス領域の画像情報において同一またはほぼ同じ幅を切り出してレイアウトマッチング処理を，フェイス領域の画像情報で所定量だけずらして繰り返すようにしてもよい。 In the above description, the layout matching process is performed while shifting the layout information of the block in the image information obtained by cutting out about 1/2 of the central part of the image information in the face area in the same range in the image information of the object by a predetermined amount. However, the opposite processing may be performed. That is, the layout matching process is performed by cutting out the layout information of the block in the image information obtained by cutting out about 1/2 of the central portion of the image information of the object and cutting out the same or almost the same width in the image information of the face area. It may be repeated by shifting by a predetermined amount.

実施例１および実施例２においては，比較処理部２４は，候補処理部２３で候補として特定した対象物の標本情報の対応する画像情報の一部または全部の画像情報の特徴量と，フェイス領域の一部または全部の画像情報の特徴量とを比較することで，標本情報とフェイス領域の画像情報との類似性を判定し，そのフェイス領域に含まれる対象物の識別情報，たとえば商品名などを判定していた。 In Examples 1 and 2, the comparison processing unit 24 determined the similarity between the sample information and the image information of the face area by comparing the feature amounts of part or all of the image information corresponding to the sample information of the objects specified as candidates by the candidate processing unit 23 with the feature amounts of part or all of the image information of the face area, and thereby determined the identification information, for example the product name, of the object contained in the face area.

本実施例では，候補処理部２３で候補として特定した対象物のブロックと，フェイス領域の一部または全部のブロックとを比較することで，標本情報とフェイス領域の画像情報との類似性を判定し，そのフェイス領域に含まれる対象物の識別情報，たとえば商品名などを判定する場合を説明する。 In this embodiment, the similarity between the sample information and the image information in the face area is determined by comparing the block of the object specified as a candidate by the candidate processing unit 23 with a part or all of the blocks in the face area. Then, the case of determining the identification information of the object included in the face area, for example, the product name, etc. will be described.

この場合，標本情報記憶部２６には，当該ブロックに対応づけて，そのブロックの種別を示すタグ情報を記憶しておくことよい。タグ情報の一例を図１６に示す。そして各タグについて，それぞれあらかじめ定めた前処理をし，その処理結果を標本情報記憶部２６に対応づけて記憶させておく。たとえばタグ情報がイメージ画像であればその画像情報を切り出して登録，タグ情報が商品名ロゴやコピー，定格であればＯＣＲ処理をしておき，テキスト情報として記憶させておく。 In this case, the sample information storage unit 26 preferably stores, in association with each block, tag information indicating the type of the block. An example of tag information is shown in FIG. 16. Preprocessing defined in advance for each tag is applied, and the result is stored in the sample information storage unit 26 in association with the block. For example, if the tag information indicates an image, the image information is cut out and registered. If the tag information indicates a product name logo, advertising copy, or product specifications, OCR is applied and the result is stored as text information.

そして比較処理部２４は，候補処理部２３で候補として特定した対象物のブロックと，フェイス領域の一部または全部のブロックとの比較の際に，まずブロック同士の対応関係を特定する。この場合，フェイス領域の画像情報と標本情報の画像情報との位置と縮尺とを合わせた上で，標本情報に対応するブロックからみてフェイス領域の画像情報のインターセクションがもっとも大きいブロックを求める，あるいは標本情報に対応するブロックのレイアウト情報の中心を特定し，その中心がもっとも近い，フェイス領域の画像情報におけるブロックを求める，などの処理によって，ブロック同士の対応関係を特定できる。ブロック同士の対応関係の一例を図１７に示す。 Then, the comparison processing unit 24 first identifies the correspondence between the blocks when comparing the block of the object specified as a candidate by the candidate processing unit 23 with a part or all of the blocks in the face area. In this case, after matching the position and scale of the image information of the face area and the image information of the sample information, the block having the largest intersection of the image information of the face area when viewed from the block corresponding to the sample information is obtained, or The correspondence between blocks can be specified by specifying the center of the layout information of the block corresponding to the sample information, finding the block in the image information of the face area, which is the closest to the center, and so on. FIG. 17 shows an example of the correspondence between blocks.
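
The two ways of pairing blocks described above (largest intersection, or nearest center) can be sketched as follows. The text presents them as alternatives. Using the nearest center only as a fallback when no face-area block overlaps is a design choice of this sketch, as are the rectangle representation and the function names. Both layouts are assumed to have already been aligned in position and scale.

```python
def intersection(a, b):
    """Overlap area of two rectangles given as (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def center(r):
    return ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)

def match_blocks(sample_blocks, face_blocks):
    """Map each sample block index to the index of its corresponding face-area block."""
    pairs = {}
    for i, s in enumerate(sample_blocks):
        overlaps = [intersection(s, f) for f in face_blocks]
        if max(overlaps, default=0) > 0:
            # Face-area block with the largest intersection, seen from the sample block.
            pairs[i] = overlaps.index(max(overlaps))
        elif face_blocks:
            # No overlap at all: fall back to the block whose center is nearest.
            sx, sy = center(s)
            dists = [(center(f)[0] - sx) ** 2 + (center(f)[1] - sy) ** 2
                     for f in face_blocks]
            pairs[i] = dists.index(min(dists))
    return pairs
```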

そしてブロック同士の対応関係を特定すると，標本情報に対応するブロックに対する前処理を，対応するフェイス領域の画像情報のブロックに対して実行し，その結果を算出する。そして，前処理の結果同士の比較を行う。たとえば画像情報であれば，特徴量を用いた類似性の評価や，テキスト情報であればテキストマッチングなどの比較を行う。この処理の一例を図１８に示す。 Once the correspondence between blocks has been specified, the preprocessing applied to the block corresponding to the sample information is executed on the corresponding block of the image information of the face area, and its result is calculated. The preprocessing results are then compared with each other: for image information, similarity is evaluated using feature amounts, and for text information, a comparison such as text matching is performed. An example of this process is shown in FIG. 18.

比較処理部２４は，以上のようにブロック同士の比較を行った結果を用いて，その標本情報との一致度の総合的な指標値を算出する。たとえば，商品名の一致度は高評価，メーカー名の一致度は低評価となるような設定をあらかじめ行っておき，それに基づく総合的な指標値を算出する。この一致度の評価は，たとえばブロックに対応するタグ情報の種別に応じて，一致したときの重みを決定する数値が関係づけられる。重み付けの数値はブロックに対応するタグ情報の種別で定まっていてもよいし，たとえば対象物が商品の場合には商品カテゴリ内の出現頻度などでもよい。出現頻度を用いる場合には，何と何が一致した場合にはどこまで絞り込めるかを表にしておき，商品などの対象物を１つにまで絞り込めたら，対象物を特定できたとする。図１９（ａ）はブロックとタグ情報の種別の対応関係の一例を示しており，図１９（ｂ）は絞り込みの表の一例を示している。 The comparison processing unit 24 uses the results of the block-to-block comparisons described above to calculate an overall index value of the degree of match with the sample information. For example, settings are made in advance so that a match of the product name is weighted heavily and a match of the manufacturer name is weighted lightly, and the overall index value is calculated on that basis. For this evaluation, a numerical value that determines the weight of a match is associated with, for example, the type of tag information of the block. The weight may be fixed by the type of tag information, or, when the object is a product, may be based on, for example, the frequency of appearance within the product category. When the frequency of appearance is used, a table is prepared showing how far the candidates can be narrowed down when given items match, and the object is regarded as identified once the candidates have been narrowed down to a single object such as a product. FIG. 19(a) shows an example of the correspondence between blocks and tag information types, and FIG. 19(b) shows an example of the narrowing-down table.
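
A weighted overall index of this kind might be computed as in the following sketch. The tag names, the weight values in `WEIGHTS`, and the normalization to a weighted average are hypothetical. The specification only states that a product-name match should count for more than a manufacturer-name match.

```python
# Hypothetical weights per tag type; the values are illustrative only.
WEIGHTS = {"product_name": 5.0, "copy": 2.0, "image": 2.0, "maker_name": 1.0}

def overall_index(block_scores, weights=WEIGHTS, default_weight=1.0):
    """block_scores: list of (tag, score) pairs with each score in [0, 1],
    one pair per matched block. Returns the weighted average score."""
    total = sum(weights.get(tag, default_weight) for tag, _ in block_scores)
    if total == 0:
        return 0.0
    weighted = sum(weights.get(tag, default_weight) * s for tag, s in block_scores)
    return weighted / total
```

With these weights, a candidate that matches only on the product name scores higher than one that matches only on the manufacturer name, which is the ordering the text asks for.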

比較処理部２４は，以上のような比較処理を，候補として特定した商品などの対象物について行い，総合的な指標値がもっとも高い対象物を，当該フェイス領域の商品として特定する。 The comparison processing unit 24 performs the above comparison processing on an object such as a product specified as a candidate, and identifies the object having the highest overall index value as a product in the face region.

実施例１乃至実施例３では撮影画像情報として，２次元の画像情報のまま撮影をした場合を説明したが，撮影対象と撮影装置との距離情報を考慮して，撮影画像情報から３次元モデル化をし，さらに，そこから２次元の画像情報に投影した場合の画像情報を，本発明の画像処理システム１で処理対象とする撮影画像情報としてもよい。なお，この場合，標本情報における対象物の画像情報についても同様の処理が実行されている。この場合の処理を以下に説明する。 Examples 1 to 3 described the case where the captured image information is used as two-dimensional image information as captured. However, a three-dimensional model may be built from the captured image information, taking into account distance information between the photographed object and the photographing device, and the image information obtained by projecting that model onto two-dimensional image information may be used as the captured image information processed by the image processing system 1 of the present invention. In this case, the same processing is also applied to the image information of the objects in the sample information. The processing in this case is described below.

本実施例の入力端末３では，カメラなどの撮影装置で可視光などによる画像情報を撮影するほか，撮影対象と撮影装置との距離情報（深さ情報）を２次元情報として取得する。深さ情報がマッピングされた情報を，本明細書では深さマップとよぶ。深さマップは，撮影装置で撮影した画像情報に対応しており，撮影した画像情報に写っている物までの深さ情報が，撮影した画像情報と対応する位置にマッピングされている。深さマップは，少なくとも撮影した画像情報に対応する範囲をメッシュ状に区切り，そのメッシュごとに深さ情報が与えられている。メッシュの縦，横の大きさは，取得できる深さ情報の精度に依存するが，メッシュの縦，横の大きさが小さいほど精度が上げられる。通常は，１ｍｍから数ｍｍの範囲であるが，１ｍｍ未満，あるいは１ｃｍ以上であってもよい。 In the input terminal 3 of the present embodiment, in addition to photographing image information by visible light or the like with a photographing device such as a camera, distance information (depth information) between the imaged object and the photographing device is acquired as two-dimensional information. The information to which the depth information is mapped is referred to as a depth map in this specification. The depth map corresponds to the image information taken by the photographing device, and the depth information up to the object reflected in the photographed image information is mapped to the position corresponding to the photographed image information. In the depth map, at least the range corresponding to the captured image information is divided into meshes, and the depth information is given to each mesh. The vertical and horizontal sizes of the mesh depend on the accuracy of the depth information that can be acquired, but the smaller the vertical and horizontal sizes of the mesh, the higher the accuracy. Usually, it is in the range of 1 mm to several mm, but it may be less than 1 mm or 1 cm or more.

深さ情報を取得する場合，撮影対象の撮影の際に，特定の波長の光線，たとえば赤外線を照射してそれぞれの方向からの反射光の量や反射光が到達するまでの時間を計測することで撮影対象までの距離情報（深さ情報）を取得する，あるいは特定の波長の光線，たとえば赤外線のドットパターンを照射し，反射のパターンから撮影対象までの距離情報（深さ情報）を計測するもののほか，ステレオカメラの視差を利用する方法があるが，これらに限定されない。なお，深さ情報が撮影装置（三次元上の一点）からの距離で与えられる場合には，かかる深さ情報を，撮影装置の撮影面（平面）からの深さ情報に変換をしておく。このように深さ情報を取得するための装置（深さ検出装置）を入力端末３は備えていてもよい。 Depth information may be acquired at the time of photographing by irradiating the object with light of a specific wavelength, for example infrared light, and measuring the amount of reflected light from each direction or the time until the reflected light arrives; by projecting a dot pattern of light of a specific wavelength, for example infrared light, and measuring the distance to the object from the reflected pattern; or by using the parallax of a stereo camera. The methods are not limited to these. When the depth information is given as the distance from the photographing device (a single point in three-dimensional space), it is converted into depth information from the imaging plane of the photographing device. The input terminal 3 may include a device for acquiring depth information in this way (a depth detection device).
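
The conversion from a distance measured from a single point to a depth measured from the imaging plane can be illustrated as below. The specification does not say how the conversion is done. This sketch assumes an ideal pinhole camera with known intrinsics (`fx`, `fy` focal lengths in pixels and `cx`, `cy` principal point), all of which are hypothetical parameters.

```python
import math

def planar_depth(distance, u, v, fx, fy, cx, cy):
    """Convert the distance measured along the ray through pixel (u, v) into the
    depth measured perpendicular to the imaging plane (pinhole model assumed)."""
    x = (u - cx) / fx   # ray direction is (x, y, 1) in camera coordinates
    y = (v - cy) / fy
    return distance / math.sqrt(1.0 + x * x + y * y)
```

At the principal point the two values coincide. Toward the image edges the planar depth becomes smaller than the measured distance, so a flat shelf front ends up with a constant depth after conversion.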

画像情報と深さ情報とはその位置関係が対応をしている。図２０では，陳列棚を撮影対象として撮影した画像情報と深さ情報による深さマップとの対応関係を示している。図２０（ａ）が撮影した画像情報，図２０（ｂ）がそれに対応する深さマップを模式的に示した図である。 The positional relationship between the image information and the depth information corresponds to each other. FIG. 20 shows the correspondence between the image information taken with the display shelf as the shooting target and the depth map based on the depth information. FIG. 20 (a) is a diagram schematically showing image information taken, and FIG. 20 (b) is a diagram schematically showing a corresponding depth map.

画像情報入力受付処理部２０は，入力端末３で撮影した，陳列棚などの撮影対象の画像情報の入力を受け付け，画像情報記憶部２１に記憶させる。また，画像情報入力受付処理部２０は，入力端末３で撮影した画像情報に対する深さ情報や深さマップの入力を受け付け，撮影した画像情報に対応づけて，画像情報記憶部２１に記憶させる。なお，入力端末３から深さマップではなく，深さ情報を受け付けた場合には，それをマッピングして深さマップを生成する。 The image information input reception processing unit 20 receives input of image information to be photographed such as a display shelf taken by the input terminal 3 and stores it in the image information storage unit 21. Further, the image information input reception processing unit 20 receives the input of the depth information and the depth map for the image information captured by the input terminal 3, and stores the captured image information in the image information storage unit 21 in association with the captured image information. When the depth information is received from the input terminal 3 instead of the depth map, the depth map is generated by mapping it.

撮影対象を撮影した画像情報と深さ情報，深さマップは，画像情報を撮影する際に同時に撮影，情報が取得され，その情報の入力を受け付けるようにすることが好ましい。 It is preferable that the image information, the depth information, and the depth map of the photographed object are photographed and the information is acquired at the same time when the image information is photographed, and the input of the information is accepted.

画像情報処理部２２は，陳列棚などを撮影した画像情報と深さマップに基づいて３次元モデルに変換をする。すなわち，撮影した画像情報と深さマップとは対応しているので，まず深さマップにおける深さ情報に基づいて３次元モデルに変換をし，そこに撮影した画像情報を対応づけてテクスチャマッピングをする。図２１に撮影した画像情報と深さマップに基づく３次元モデル化の処理の一例を示す。なお，図２１では飲料のペットボトルを３次元モデル化する場合を示している。図２１（ａ）は撮影した画像情報である。画像情報処理部２２は，同時に深さマップを取得しているので，深さマップにおける各メッシュの深さ情報に基づいて，３次元モデルに変換をする。そして深さマップにおける各メッシュの位置に，撮影した画像情報の対応する画素の色情報などを貼り付けるテクスチャマッピングをすることで，図２１（ｂ）乃至（ｄ）のように３次元モデル化をすることができる。図２１（ｂ）は左方向からの視点，図２１（ｃ）は正面からの視点，図２１（ｄ）は右方向からの視点を示している。 The image information processing unit 22 converts the image information obtained by photographing the display shelf or the like into a three-dimensional model based on that image information and the depth map. That is, since the captured image information and the depth map correspond to each other, the depth information in the depth map is first converted into a three-dimensional model, and the captured image information is then associated with it by texture mapping. FIG. 21 shows an example of three-dimensional modeling based on captured image information and a depth map, for the case of a PET beverage bottle. FIG. 21(a) is the captured image information. Since the image information processing unit 22 acquires the depth map at the same time, it converts the depth information of each mesh cell in the depth map into a three-dimensional model. Texture mapping then pastes the color information and the like of the corresponding pixels of the captured image information at the position of each mesh cell in the depth map, producing a three-dimensional model as shown in FIGS. 21(b) to 21(d). FIG. 21(b) shows the view from the left, FIG. 21(c) the view from the front, and FIG. 21(d) the view from the right.

画像情報処理部２２は，陳列棚などを撮影した画像情報と深さマップに基づいて，３次元モデルを生成すると，それを平面展開する平面展開処理を実行する。平面展開とは，３次元モデルを平面（２次元）に展開することを意味する。３次元モデルを平面展開する方法はさまざまな方法があり，限定するものではない。 When the image information processing unit 22 generates a three-dimensional model based on the image information obtained by photographing the display shelf or the like and the depth map, the image information processing unit 22 executes a plane expansion process for expanding the three-dimensional model. Plane expansion means expanding a three-dimensional model into a plane (two-dimensional). There are various methods for expanding the three-dimensional model in a plane, and the method is not limited.

平面展開の方法の一例としては，３次元モデルの内側に投影の中心を定め，そこから投影をすることで，平面に展開する方法がある。画像情報処理部２２における平面展開の処理の一例を図２２を用いて説明する。図２２（ａ）はペットボトルを撮影した画像情報であり，その上面図は，略六角形状になっている。なお実際の形状はより細かく凹凸があるが，ここでは説明のため，六角形で示す。そして，図２２（ａ）の画像情報をその深さマップを用いて３次元モデル化した場合に，上方から見た場合の位置関係が図２２（ｂ）である。この場合，撮影装置（カメラ）の陰になっている箇所の深さ情報は存在しないので，図２２（ｂ）に示すように，上方から見た場合，空白がある。そして，平面展開する対象物の任意の箇所に投影の中心軸を特定する。たとえば陳列棚に陳列される商品であって，その形状が箱形，袋型ではなく，円柱型，ボトル型などの表面が平面ではなく規則的な曲面の場合には，その横幅とその表面の形状から，回転軸の中心点を求め，投影の中心軸として特定する。なお，投影の中心軸を特定するためには，上記のように限らず，任意の方法であってよい。投影の中心軸を特定した状態が図２２（ｃ）である。そして，特定した投影の中心軸から，対象物の表面を，任意に設定した投影面に投影する。これを模式的に示すのが図２２（ｄ）である。なお，図２２（ｄ）のように，３次元モデルの面と投影面との位置が一致しない場合，その離間する距離に応じた投影結果の画像情報の縦横の大きさを，適宜の縮尺率によって調整する。一方，図２（ｅ）のように，３次元モデルの面と投影面との位置を一致させる場合には，その縮尺率の調整処理は設けなくてもよい。なお，図２２（ｂ）乃至（ｅ）は，いずれも上方からの図である。 One example of a plane expansion method is to set a center of projection inside the three-dimensional model and project from it onto a plane. An example of the plane expansion process in the image information processing unit 22 will be described with reference to FIG. 22. FIG. 22(a) is image information obtained by photographing a PET bottle whose top view is substantially hexagonal. The actual shape has finer irregularities, but it is shown here as a hexagon for the sake of explanation. FIG. 22(b) shows the positional relationship, viewed from above, when the image information of FIG. 22(a) is converted into a three-dimensional model using its depth map. Since there is no depth information for the parts hidden from the photographing device (camera), there is a blank area when viewed from above, as shown in FIG. 22(b). A central axis of projection is then specified at an arbitrary location in the object to be expanded.
For example, for a product displayed on a display shelf whose shape is not box-like or bag-like but cylindrical, bottle-like, or otherwise has a regularly curved rather than flat surface, the center point of the axis of rotation is obtained from the width and the shape of the surface and is specified as the central axis of projection. The central axis of projection is not limited to being specified in this way, and any method may be used. FIG. 22(c) shows the state in which the central axis of projection has been specified. The surface of the object is then projected from the specified central axis onto an arbitrarily set projection plane, as shown schematically in FIG. 22(d). When the surface of the three-dimensional model and the projection plane do not coincide, as in FIG. 22(d), the vertical and horizontal sizes of the projected image information are adjusted by an appropriate scale factor according to the distance between them. When the surface of the three-dimensional model and the projection plane are made to coincide, as in FIG. 22(e), this scale adjustment may be omitted. FIGS. 22(b) to 22(e) are all views from above.
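
As one illustration of projecting from a central axis, the following sketch unrolls surface points, seen from above, onto a cylinder around the axis. Choosing a cylindrical projection surface, the coordinate convention (x to the right, z as depth away from the camera), and the names `unroll` and `proj_radius` are assumptions of this example. The specification leaves the projection surface and the method open.

```python
import math

def unroll(points, axis_x, axis_z, proj_radius=None):
    """points: (x, z) surface samples in a top view, z being the depth from the camera.
    Returns the horizontal coordinate of each point after unrolling around the axis.
    If proj_radius is None, each point is unrolled at its own radius, so the
    projection surface coincides with the model surface and no rescaling is needed."""
    out = []
    for x, z in points:
        r = math.hypot(x - axis_x, z - axis_z)
        # Angle around the axis; 0 for the point directly facing the camera.
        theta = math.atan2(x - axis_x, axis_z - z)
        radius = proj_radius if proj_radius is not None else r
        out.append(radius * theta)   # arc length on the projection cylinder
    return out
```

When `proj_radius` differs from the radius of the model surface, arc lengths scale in proportion to `proj_radius / r`, which corresponds to the scale adjustment described for a projection surface that does not coincide with the model surface.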

画像情報処理部２２は，このような平面展開の処理を行うことで，３次元モデルを２次元（平面）に展開し，２次元の画像情報（平面展開画像情報）に変換することができる。なお，上述では，投影による処理で３次元モデルから平面展開画像情報に変換する処理を行ったが，平面展開処理において，３次元モデルから平面展開画像情報に変換する処理はほかの処理によってもよい。 By performing such a plane expansion process, the image information processing unit 22 can expand the three-dimensional model into two dimensions (a plane) and convert it into two-dimensional image information (plane-expanded image information). Although the conversion from the three-dimensional model to plane-expanded image information was described above as being performed by projection, other processing may be used for this conversion in the plane expansion process.

画像情報処理部２２は，このように生成した平面展開画像情報を，上述の実施例１および実施例２における撮影画像情報として取り扱い，フェイス処理以降の処理を実行すれば，実施例１および実施例２と同様の処理を実行できる。 If the image information processing unit 22 handles the plane-expanded image information generated in this way as the captured image information in the above-mentioned Examples 1 and 2, and executes the processing after the face processing, the first and second embodiments are performed. The same process as in 2 can be executed.

また標本情報処理部２５は，上述の画像情報処理部２２と同様に，対象物を撮影した画像情報と深さマップとから３次元モデルを生成し，その３次元モデルに基づいて平面展開画像情報を生成する。そして平面展開画像情報の全部または一部（たとえばラベル部分）を標本情報として標本情報記憶部２６に記憶させることで，実施例１および実施例２と同様の処理を実行できる。 Like the image information processing unit 22 described above, the sample information processing unit 25 generates a three-dimensional model from the image information obtained by photographing an object and its depth map, and generates plane-expanded image information based on that three-dimensional model. By storing all or part of the plane-expanded image information (for example, the label portion) in the sample information storage unit 26 as sample information, the same processing as in Examples 1 and 2 can be executed.

実施例４の処理の変形例として，スティッチング処理を実行する場合を説明する。 As a modification of the process of the fourth embodiment, a case where the stitching process is executed will be described.

画像情報処理部２２は，撮影対象を撮影する際に，複数枚で撮影した場合，それを一つの画像情報に合成するスティッチング処理を実行してもよい。スティッチング処理は，３次元モデル化の前であってもよいし，３次元モデル化の後であってもよい。３次元モデル化の前のスティッチング処理，３次元モデル化の後のスティッチング処理のいずれも公知の手法を用いることもできる。 When a plurality of images are photographed, the image information processing unit 22 may execute a stitching process for synthesizing the images into one image information. The stitching process may be performed before the 3D modeling or after the 3D modeling. A known method can be used for both the stitching process before the 3D modeling and the stitching process after the 3D modeling.

３次元モデル化の前に行う場合には，各撮影画像情報間で対応点を検出することでスティッチング処理を実行する。この際に，撮影画像情報同士で対応点が特定できるので，特定した対応点の位置に対応する深さマップ同士も同様に対応づける。これによって，３次元モデル化の前にスティッチング処理を実行できる。 If it is performed before the three-dimensional modeling, the stitching process is executed by detecting the corresponding points between the captured image information. At this time, since the corresponding points can be specified between the captured image information, the depth maps corresponding to the positions of the specified corresponding points are also associated with each other in the same manner. This makes it possible to execute stitching processing before 3D modeling.

また，３次元モデル化の後に行う場合には，２つの３次元モデルにおける凹凸形状の類似性を深さ情報から特定し，形状が類似（深さ情報に基づく凹凸のパターンが所定範囲内に含まれている）している領域内において，その領域に対応する，当該３次元モデルの元となった撮影画像情報同士で対応点を検索することで，３次元モデル化の後にスティッチング処理を実行できる。これを模式的に示すのが図２３である。図２３（ａ）および（ｂ）は横に並ぶ棚段を撮影した撮影画像情報であり，図２３（ｃ）は図２３（ａ）に基づく３次元モデル，図２３（ｄ）は図２３（ｂ）に基づく３次元モデルである。この場合，図２４に示すように，たとえば撮影画像情報において，スティッチング処理のための基準１０１，たとえば棚段を写り込むように撮影をしておけば，その基準１０１が同一の視点方向になるように３次元モデルを回転する。そして深さ情報に基づく凹凸形状の類似を判定する。そして特定した類似する領域１００内における対応点を検索することで，スティッチング処理を実行できる。図２４（ａ）は図１９（ａ）の３次元モデルを，図２４（ｂ）と同一の視点方向となるように回転させた状態である。そして図２４（ａ）と図２４（ｂ）の破線領域１００が，深さ情報に基づいて判定した凹凸形状の類似の領域１００である。また，一点鎖線領域１０１が，同一の視点方向になるように３次元モデルを回転させるための基準とする棚段である。なお基準１０１とするものは撮影対象によって任意に特定することができるが、好ましくは２以上の直線で構成されるものであることがよい。 When the stitching process is performed after the 3D modeling, the similarity of the uneven shapes in the two 3D models is determined from the depth information. Within a region whose shapes are similar (that is, where the unevenness patterns derived from the depth information agree within a predetermined range), corresponding points are searched for between the pieces of captured image information from which the respective 3D models were generated, which makes it possible to execute the stitching process after the 3D modeling. FIG. 23 schematically shows this. FIGS. 23(a) and 23(b) are captured image information of shelves arranged side by side, FIG. 23(c) is the 3D model based on FIG. 23(a), and FIG. 23(d) is the 3D model based on FIG. 23(b). In this case, as shown in FIG. 24, if the images are captured so that a reference 101 for the stitching process, for example a shelf stage, appears in the captured image information, the 3D models are rotated so that the reference 101 has the same viewpoint direction. The similarity of the uneven shapes is then determined based on the depth information, and the stitching process can be executed by searching for corresponding points within the similar region 100 thus identified. FIG. 24(a) shows the 3D model of FIG. 19(a) rotated so as to have the same viewpoint direction as FIG. 24(b). The broken-line region 100 in FIGS. 24(a) and 24(b) is the region 100 whose uneven shapes were judged to be similar based on the depth information. The dash-dot-line region 101 is the shelf stage used as the reference for rotating the 3D models into the same viewpoint direction. The reference 101 can be chosen freely according to the object being photographed, but it preferably consists of two or more straight lines.
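The depth-based similarity test can be illustrated with a one-dimensional simplification. The sketch below assumes each model has already been rotated to the common viewpoint and reduced to a horizontal depth profile; the names `relative_profile` and `find_similar_region`, the window width, and the tolerance are illustrative assumptions rather than values from the specification.

```python
def relative_profile(depths):
    """Subtract the mean so that only the unevenness pattern remains,
    independent of how far the camera stood from the shelf."""
    mean = sum(depths) / len(depths)
    return [d - mean for d in depths]

def find_similar_region(profile_a, profile_b, width, tolerance):
    """Return (start_a, start_b) of the first pair of windows whose unevenness
    patterns differ by at most `tolerance` at every sample, or None."""
    for i in range(len(profile_a) - width + 1):
        ra = relative_profile(profile_a[i:i + width])
        for j in range(len(profile_b) - width + 1):
            rb = relative_profile(profile_b[j:j + width])
            if all(abs(p - q) <= tolerance for p, q in zip(ra, rb)):
                return i, j
    return None

# Hypothetical depth profiles in cm; model B was shot from 1 cm further away.
profile_a = [50, 48, 50, 40, 45, 40]
profile_b = [41, 46, 41, 55, 60, 52]
region = find_similar_region(profile_a, profile_b, width=3, tolerance=0.5)
```

Only inside the returned windows would corresponding points then be searched for in the underlying captured images, which narrows the search compared with matching the full images.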

以上のような処理を実行することで，３次元モデル化の前または後のいずれであってもスティッチング処理を実行することができる。また，複数枚の撮影画像情報を用いない場合にはスティッチング処理は実行しなくてもよい。 By executing the above processing, the stitching process can be executed either before or after the 3D modeling. When a plurality of pieces of captured image information are not used, the stitching process need not be executed.

実施例４の処理の変形例として，画像情報処理部２２は，３次元モデル化処理を実行した後，撮影対象とした３次元モデルが正対する位置となるように，その視点方向を決定する処理を実行し，その後，平面展開をする処理を実行してもよい。たとえば撮影対象が商品の陳列棚やそこにある商品である場合，陳列棚や商品が正対する位置（正面となる位置）になるように視点方向を決定する。 As a modification of the process of the fourth embodiment, after executing the three-dimensional modeling process, the image information processing unit 22 may execute a process of determining the viewpoint direction so that the three-dimensional model of the photographed object is viewed head-on, and then execute the plane development process. For example, when the photographed object is a product display shelf or the products on it, the viewpoint direction is determined so that the display shelf or the products are viewed head-on (from the front).

視点方向を決定する処理としては，３次元モデルにおける基準が，正対する位置となるためのあらかじめ定めた条件を充足するように視点方向を決定すればよい。基準および条件は撮影対象などに応じて任意に設定することができる。たとえば陳列棚を撮影する際には，陳列棚の棚段の前面１０２を基準とし，条件としては，陳列棚の複数の棚段の前面１０２がいずれも水平となり，かつ互いに垂直の位置となるように，視点を撮影範囲の上下左右で中央に移動させることで，実現できる。棚段の前面１０２は，通常，商品の陳列棚を撮影した場合，陳列棚の前面が一番手前に位置する（もっとも前に位置している）ので３次元モデルの深さ情報から棚段の前面１０２を特定することができる。この処理を模式的に示すのが図２５である。図２５（ａ）は陳列棚を撮影した撮影画像情報であり，図２５（ｂ）はその陳列棚の深さマップにおいて，深さ情報を白黒の濃度で表現した図であり（黒の場合が手前に位置する（深さ情報の値が小さい）），図２５（ｃ）は深さマップに基づく３次元モデルであり，深さ情報を色に変換したものである。そして画像情報処理部２２は，深さ情報に基づいて陳列棚の棚段の前面を特定し，その棚段が上述の条件を充足するように視点方向を決定する。また視点方向の決定処理で用いた基準と，上述の３次元モデル化後のスティッチング処理における基準とは同一の基準を用いてもよい。たとえばいずれの場合も陳列棚の棚段の前面を基準として用いることができる。 As the process of determining the viewpoint direction, the viewpoint direction may be determined so that a reference in the three-dimensional model satisfies a predetermined condition for being viewed head-on. The reference and the condition can be set freely according to the photographed object and the like. For example, when a display shelf is photographed, the front surfaces 102 of the shelf stages are used as the reference, and the condition is that the front surfaces 102 of the plurality of shelf stages are all horizontal and aligned vertically with one another; this can be achieved by moving the viewpoint to the center of the photographed range, both vertically and horizontally. When a product display shelf is photographed, the front surfaces 102 of the shelf stages are normally located nearest to the camera, so they can be identified from the depth information of the three-dimensional model. FIG. 25 schematically shows this process. FIG. 25(a) is captured image information of a display shelf, FIG. 25(b) shows the depth map of that display shelf with the depth information expressed as grayscale density (black is nearest, that is, the depth value is small), and FIG. 25(c) is the three-dimensional model based on the depth map, with the depth information converted into colors. The image information processing unit 22 then identifies the front surfaces of the shelf stages based on the depth information and determines the viewpoint direction so that those shelf stages satisfy the above condition. The reference used in the viewpoint direction determination process may be the same as the reference used in the stitching process after three-dimensional modeling described above. For example, in both cases the front surfaces of the shelf stages of the display shelf can be used as the reference.
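Identifying the shelf-stage fronts as the nearest surfaces in the depth map could look like the following sketch. It assumes the depth map is a list of rows with small values meaning near, and treats a row as part of a shelf front when most of its pixels lie at the nearest depth; the function name `shelf_front_rows` and both thresholds are hypothetical.

```python
def shelf_front_rows(depth_map, tolerance, min_fraction=0.8):
    """Return (first_row, last_row) bands in which at least `min_fraction` of the
    pixels lie within `tolerance` of the nearest depth found in the whole map."""
    nearest = min(min(row) for row in depth_map)
    is_front = [
        sum(1 for d in row if d - nearest <= tolerance) / len(row) >= min_fraction
        for row in depth_map
    ]
    bands, start = [], None
    for y, flag in enumerate(is_front):
        if flag and start is None:
            start = y                      # a new band of front rows begins
        elif not flag and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(is_front) - 1))
    return bands

# Hypothetical depth map in cm: rows 1 and 4 are shelf fronts at about 10 cm.
depth_map = [
    [30, 30, 30, 30],
    [10, 10, 10, 10],
    [25, 28, 30, 26],
    [25, 27, 29, 26],
    [10, 11, 10, 10],
]
bands = shelf_front_rows(depth_map, tolerance=1)
```

The resulting bands play the role of the reference 102: a viewpoint search would then adjust the rotation until every band is horizontal and the bands line up vertically.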

図２５（ｃ）ではわかりやすさのため，撮影画像情報をテクスチャマッピングする前の状態で処理をする場合を示しているが，テクスチャマッピングをした後の３次元モデルに対して視点方向の決定処理を実行してもよい。なお，視点方向の決定処理は，上述に限らず，任意の処理で実現することができる。 For clarity, FIG. 25(c) shows the case where the processing is performed before the captured image information is texture-mapped, but the viewpoint direction determination process may also be executed on the 3D model after texture mapping. The viewpoint direction determination process is not limited to the above and can be realized by any process.

なお，もともと撮影画像情報が正対した位置から撮影されているなどの場合には，視点方向の決定処理は実行しなくてもよい。 If the captured image information is originally captured from a position facing it, the viewpoint direction determination process does not have to be executed.

実施例４乃至実施例６の処理では，平面展開画像情報に対してフェイス処理を実行することで，フェイス領域の画像情報を特定していたが，３次元モデルからフェイス領域を特定し，特定したフェイス領域について平面展開処理を実行して平面展開画像情報としても良い。 In the processing of Examples 4 to 6, the image information of the face area was identified by executing face processing on the plane development image information. Alternatively, the face area may be identified from the three-dimensional model, and the plane development process may be executed on the identified face area to obtain the plane development image information.

３次元モデルからフェイス領域の特定方法はさまざまな方法があり，標本情報の特性に合わせて任意に設定することができる。たとえば，３次元モデルにおいて所定条件を充足する面の位置と範囲を特定し，その面と，あらかじめ定めた標本情報の表面モデルのタイプに応じた条件を充足するかをマッチングし，条件を充足する面をフェイス領域として特定する。たとえば上述と同様に，標本情報が商品であり，陳列棚を撮影した撮影画像情報から３次元モデルを生成し，商品のフェイス領域を特定する処理の場合には，以下のような処理が実行できる。 There are various methods for identifying the face area from the 3D model, and the method can be set freely according to the characteristics of the sample information. For example, the positions and extents of surfaces that satisfy a predetermined condition are identified in the 3D model, each surface is checked against conditions defined in advance for the type of surface model of the sample information, and the surfaces that satisfy the conditions are identified as face areas. For example, as described above, when the sample information is a product, a 3D model is generated from captured image information of a display shelf, and the face areas of the products are to be identified, the following processing can be executed.

まず，微細な凹凸を除外するために高周波をフィルタアウトした上で，垂直に近い，正対した一定以上の面積を持つ面，直立した円筒の面，全体に凸のゆがんだ多面体の領域の面を検出し，３次元空間内の位置と範囲を特定する。たとえば，深さ情報が一定の範囲内であり，法線が安定して正面を向いていれば平面，深さ情報について垂直方向の相関性が強く，水平方向に凸であれば円筒，水平方向および垂直方向のいずれも凸，または法線が安定していなければ凸のゆがんだ多面体の領域の面のように特定をする。そして，検出した各面をラベリングをする。図２６では５つの面を検出し，それぞれ１から５のラベリングをした状態を示している。そして検出した面について，たとえば深さ情報の平均値などに基づいて，最前面から，順に最前面からの距離（これをレベルとする）を算出する。そして算出したレベルに基づいて，各面の前後関係を特定する。図２６では，１，２の面のレベルは０ｃｍであり，３，４の面のレベルは３ｃｍ，５の面のレベルは２０ｃｍのように算出をする。そして，最前面が１，２の面，その次に奥に位置するのが３，４の面，最背面が５の面として，５つの面の前後関係を特定する。 First, after high frequencies are filtered out to exclude fine irregularities, the following are detected and their positions and extents in the three-dimensional space are identified: surfaces that are nearly vertical, face the viewer, and have at least a certain area; surfaces of upright cylinders; and surfaces of regions forming a distorted polyhedron that is convex as a whole. For example, a surface is identified as a plane if its depth information stays within a certain range and its normals consistently face the front; as a cylinder if its depth information is strongly correlated in the vertical direction and it is convex in the horizontal direction; and as a surface of a convex distorted polyhedron region if it is convex in both the horizontal and vertical directions or its normals are not stable. Each detected surface is then labeled. FIG. 26 shows a state in which five surfaces have been detected and labeled 1 to 5. Then, for each detected surface, the distance from the frontmost surface (called the level) is calculated, starting from the frontmost surface, based on, for example, the average value of the depth information. Based on the calculated levels, the front-to-back relationship of the surfaces is identified. In FIG. 26, the level of surfaces 1 and 2 is calculated as 0 cm, that of surfaces 3 and 4 as 3 cm, and that of surface 5 as 20 cm. The front-to-back relationship of the five surfaces is thus identified, with surfaces 1 and 2 frontmost, surfaces 3 and 4 next behind them, and surface 5 rearmost.
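The level calculation can be sketched as below, reproducing the 0 cm, 3 cm, and 20 cm values of the FIG. 26 example. The depth samples are invented for illustration, the function names are hypothetical, and rounding the level to whole centimetres is an assumption made so that surfaces at nearly the same depth fall into one group.

```python
from statistics import mean

def surface_levels(surfaces):
    """surfaces: {label: list of depth samples in cm}.
    Level = mean depth of the surface minus the mean depth of the frontmost surface,
    rounded to whole centimetres (assumption)."""
    avg = {label: mean(samples) for label, samples in surfaces.items()}
    front = min(avg.values())
    return {label: round(a - front) for label, a in avg.items()}

def front_to_back(levels):
    """Group labels sharing a level, ordered from frontmost to rearmost."""
    groups = {}
    for label, level in levels.items():
        groups.setdefault(level, []).append(label)
    return [sorted(groups[level]) for level in sorted(groups)]

# Hypothetical depth samples for the five labeled surfaces of the example.
surfaces = {
    1: [50, 50],
    2: [50, 50],
    3: [53, 53],
    4: [52.8, 53.2],
    5: [70, 70],
}
levels = surface_levels(surfaces)
order = front_to_back(levels)
```

With these samples, surfaces 1 and 2 form the frontmost group, 3 and 4 the middle group, and 5 the rearmost, matching the ordering described for FIG. 26.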

そして画像情報処理部２２は，このように特定した面の前後関係を用いて，面の種別を特定する。すなわち，最前面であって，深さマップにある狭い横長の平面を，棚板の前面および商品タグ面，最背面を棚段の後板面，また，棚段の前面と深さ情報が同一または近接（一定範囲，たとえば数ｃｍ以内）しており，中空にある垂直の長方形は商品タグの面として特定をする。そして，好ましくは，それ以外の面を商品面（商品が陳列される可能性のある面）として特定をする。したがって，図２６の例では，１，２の面を棚板の前面および商品タグ面，３，４の面を商品面，５の面を棚体の後板面として特定をする。 The image information processing unit 22 then identifies the type of each surface using the front-to-back relationship identified in this way. That is, a narrow, horizontally long plane in the depth map that is frontmost is identified as the front surface of a shelf board or a product tag surface, and the rearmost surface is identified as the back panel of the shelf. In addition, a vertical rectangle that floats in mid-air and whose depth information is the same as or close to that of the shelf front (within a certain range, for example a few centimeters) is identified as a product tag surface. Preferably, the remaining surfaces are identified as product surfaces (surfaces on which products may be displayed). Therefore, in the example of FIG. 26, surfaces 1 and 2 are identified as the front surface of the shelf board and the product tag surface, surfaces 3 and 4 as product surfaces, and surface 5 as the back panel of the shelf.
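A reduced version of this type assignment, using only the level of each surface, might look as follows. It deliberately omits the shape tests in the text (narrow horizontal plane, floating vertical rectangle), so it is a simplification under that stated assumption; `classify_surfaces` and the type labels are illustrative names.

```python
def classify_surfaces(levels):
    """levels: {label: distance in cm from the frontmost surface}.
    Simplification: only the front-to-back position is used, not the surface shape."""
    front, back = min(levels.values()), max(levels.values())
    kinds = {}
    for label, level in levels.items():
        if level == front:
            kinds[label] = "shelf front / product tag"
        elif level == back:
            kinds[label] = "back panel"
        else:
            kinds[label] = "product surface"
    return kinds

# Levels taken from the FIG. 26 example in the text.
kinds = classify_surfaces({1: 0, 2: 0, 3: 3, 4: 3, 5: 20})
```

A fuller implementation would add the shape conditions before accepting a frontmost surface as a shelf front, and would treat floating rectangles near the front level as product tags.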

このように特定した商品面において，いずれかのタイミングにおいて入力を受け付けた商品のパッケージタイプに基づいて，対応する処理を実行する。パッケージタイプとしては，缶，瓶（ペットボトルを含む。以下同様），缶と瓶の併存，箱物，吊るし商品などがあるが，それらに限定をするものではない。 On the product surfaces identified in this way, the processing corresponding to the package type of the product is executed, the package type having been received as input at some point. Package types include cans, bottles (including PET bottles; the same applies hereinafter), a mixture of cans and bottles, boxes, and hanging products, but are not limited to these.

パッケージタイプが缶または瓶であった場合，一定の大きさの範囲内にある円筒の領域を缶または瓶のフェイス領域として特定する。これを模式的に示すのが，図２７および図２８である。図２７はパッケージタイプが缶の場合を示すイメージ図であり，図２８はパッケージタイプが瓶の場合を示すイメージ図である。 If the package type is a can or bottle, identify the area of the cylinder within a certain size as the face area of the can or bottle. This is schematically shown in FIGS. 27 and 28. FIG. 27 is an image diagram showing the case where the package type is a can, and FIG. 28 is an image diagram showing the case where the package type is a bottle.

パッケージタイプが缶と瓶の併存であった場合，一定の大きさの範囲内に円筒があり，上方に小さな円筒がない領域を缶のフェイス領域，一定の大きさの範囲内に円筒があり，上方に小さな円筒がある領域を瓶のフェイス領域として特定をする。 When the package type is a coexistence of a can and a bottle, there is a cylinder within a certain size range, the area where there is no small cylinder above is the face area of the can, and there is a cylinder within a certain size range. The area with a small cylinder above is specified as the face area of the bottle.
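The can-versus-bottle rule (a body cylinder of acceptable size, with or without a smaller cylinder above it) can be sketched as follows. The `Cylinder` fields, the size limits, and the `neck_ratio` threshold are assumptions chosen for illustration; the specification only speaks of "a certain size range" and "a small cylinder above".

```python
from dataclasses import dataclass

@dataclass
class Cylinder:
    x: float         # horizontal center in cm
    bottom: float    # height of the lower edge above the shelf in cm
    top: float       # height of the upper edge above the shelf in cm
    diameter: float  # in cm

def classify_can_or_bottle(cylinders, min_d=5.0, max_d=10.0, neck_ratio=0.6):
    """Label each body-sized cylinder as 'bottle' if a smaller cylinder sits
    directly above it, otherwise as 'can'. All thresholds are assumed values."""
    results = []
    for body in cylinders:
        if not (min_d <= body.diameter <= max_d):
            continue  # outside the accepted size range, not a product body
        has_neck = any(
            c is not body
            and c.diameter <= body.diameter * neck_ratio
            and abs(c.x - body.x) <= body.diameter / 2
            and c.bottom >= body.top
            for c in cylinders
        )
        results.append((body, "bottle" if has_neck else "can"))
    return results

# Hypothetical detections: one can, and one bottle made of a body plus a neck.
can = Cylinder(x=10, bottom=0, top=12, diameter=6.6)
bottle_body = Cylinder(x=20, bottom=0, top=18, diameter=6.5)
neck = Cylinder(x=20, bottom=18, top=22, diameter=3.0)
labels = [kind for _, kind in classify_can_or_bottle([can, bottle_body, neck])]
```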

パッケージタイプが箱物であった場合，一定の大きさの範囲内にある垂直の長方形の平面の領域を箱物のフェイス領域として特定をする。これを模式的に示すのが図２９である。 If the package type is a box, the area of the vertical rectangular plane within a certain size range is specified as the face area of the box. FIG. 29 schematically shows this.

また，パッケージタイプが吊るし商品であった場合，一定のサイズの範囲内にあり，中空に浮いた長方形の平面または凹凸面の領域を，吊るし商品のフェイス領域として特定をする。これを模式的に示すのが図３０である。 If the package type is a hanging product, the area of a rectangular flat surface or uneven surface that is within a certain size range and floats in the air is specified as the face area of the hanging product. FIG. 30 schematically shows this.

なお，画像情報処理部２２は，パッケージタイプの選択を受け付けずフェイス領域を特定してもよい。この場合，商品面の領域と，パッケージタイプごとの判定処理をそれぞれ実行することで，フェイス領域としての条件が成立するかを判定する。そして，フェイス領域を構成する面が存在する場合には，その面が構成するフェイス領域を特定し，あらかじめ定められたパッケージタイプの優先条件に基づいて，フェイス領域がどのようなパッケージタイプであるかを特定する。 The image information processing unit 22 may also identify the face area without accepting a selection of the package type. In this case, the determination process for each package type is executed on the product surface regions to determine whether the conditions for a face area are satisfied. If surfaces constituting a face area exist, the face area formed by those surfaces is identified, and the package type of that face area is determined based on predetermined priority conditions among the package types.
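One way to read the priority condition is as an ordered list of package types that is consulted when a region passes more than one check. The sketch below assumes such an ordering; the order itself, the region representation, and the per-type checks are all hypothetical, since the specification does not state them.

```python
# Assumed priority order; the specification only says the priority is predetermined.
PRIORITY = ["bottle", "can", "box", "hanging"]

def resolve_package_type(region, checks, priority=PRIORITY):
    """Run the per-type checks in priority order and return the first type
    whose face-area condition the region satisfies, or None."""
    for package_type in priority:
        if checks[package_type](region):
            return package_type
    return None

# Hypothetical checks operating on a simple dictionary description of a region.
checks = {
    "bottle": lambda r: r["shape"] == "cylinder" and r.get("neck", False),
    "can": lambda r: r["shape"] == "cylinder",
    "box": lambda r: r["shape"] == "rectangle" and not r.get("floating", False),
    "hanging": lambda r: r.get("floating", False),
}

necked = resolve_package_type({"shape": "cylinder", "neck": True}, checks)
plain = resolve_package_type({"shape": "cylinder"}, checks)
hung = resolve_package_type({"shape": "rectangle", "floating": True}, checks)
```

Here a necked cylinder satisfies both the bottle and the can check, and the priority order decides in favor of the bottle.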

３次元モデルからフェイス領域を特定する処理は，商品などのカテゴリや形状などによって任意の方法を採用可能であり，上記に限定するものではない。また，自動的に特定したフェイス領域に対して，オペレータによる修正入力を受け付けてもよい。さらに，オペレータからフェイス領域の位置の入力を受け付けるのでもよい。 The process of specifying the face area from the three-dimensional model can adopt any method depending on the category and shape of the product, and is not limited to the above. Further, the correction input by the operator may be accepted for the automatically specified face area. Further, the input of the position of the face area may be accepted from the operator.

さらに画像情報処理部２２は，深層学習（ディープラーニング）を用いてフェイス領域を特定してもよい。この場合，中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデルに対して，上記３次元モデルを入力し，その出力値に基づいて，フェイス領域を特定してもよい。また学習モデルとしては，さまざまな３次元モデルにフェイス領域を与えたもの正解データとして用いることができる。 Further, the image information processing unit 22 may identify the face area by using deep learning. In this case, the above 3D model may be input to a learning model, that is, a neural network with many intermediate layers in which the weighting coefficients between the neurons of each layer have been optimized, and the face area may be identified based on its output values. For the learning model, various 3D models annotated with face areas can be used as correct-answer data.

画像情報処理部２２は，以上のように特定したフェイス領域を切り出す。フェイス領域を切り出すとは，３次元モデルから特定したフェイス領域を実際に切り出してもよいし，後述の処理の処理対象としてその領域を設定することも含まれる。 The image information processing unit 22 cuts out the face area identified as described above. Cutting out the face area may mean actually extracting the identified face area from the 3D model, and it also includes setting that area as the target of the processing described later.

フェイス領域を切り出す場合には，上述の各処理で特定したフェイス領域をそのまま切り出してもよいし，複数の方法によりフェイス領域の特定を行い，各方法で特定したフェイス領域の結果を用いて切り出す対象とするフェイス領域を特定してもよい。たとえば，３次元モデルから所定条件を充足する面の位置と範囲を特定し，表面モデルのタイプに応じた条件を充足するかをマッチングしてフェイス領域を特定する方法と，深層学習によりフェイス領域を特定する方法とを行い，あらかじめ定めた演算によって，最終的に切り出す対象とするフェイス領域を特定してもよい。切り出すフェイス領域の特定方法は，上記に限定するものではなく，任意に設定することができる。 When cutting out the face area, the face area identified by any of the above processes may be cut out as it is, or the face area may be identified by a plurality of methods and the results of the respective methods used to decide which face area to cut out. For example, both the method of identifying the positions and extents of surfaces that satisfy a predetermined condition in the 3D model and matching them against the conditions for the type of surface model, and the method of identifying the face area by deep learning, may be performed, and the face area finally to be cut out may be determined by a predetermined operation. The method of determining the face area to be cut out is not limited to the above and can be set freely.
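The "predetermined operation" that merges the two results is left open in the text. One possible choice, shown here purely as an assumption, is to keep a rule-based region only when a learned region overlaps it strongly (measured by intersection over union) and to cut out their common part. Regions are simplified to axis-aligned boxes `(left, top, right, bottom)`; all function names are hypothetical.

```python
def intersection(a, b):
    """Overlapping box of a and b, or None if they do not overlap."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    return (left, top, right, bottom) if left < right and top < bottom else None

def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])

def iou(a, b):
    """Intersection over union of two boxes, 0.0 when they are disjoint."""
    inter = intersection(a, b)
    if inter is None:
        return 0.0
    return area(inter) / (area(a) + area(b) - area(inter))

def combine_regions(rule_based, learned, min_iou=0.5):
    """Keep the common part of each rule-based box and its best-matching learned
    box, provided the two agree with at least `min_iou` (an assumed threshold)."""
    final = []
    for r in rule_based:
        best = max(learned, key=lambda box: iou(r, box), default=None)
        if best is not None and iou(r, best) >= min_iou:
            final.append(intersection(r, best))
    return final

# Hypothetical results: the second rule-based box has no learned counterpart.
rule_based = [(0, 0, 10, 20), (30, 0, 40, 20)]
learned = [(1, 0, 11, 20)]
final = combine_regions(rule_based, learned)
```

Other operations, such as taking the union or weighting the two methods by confidence, would fit the text equally well.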

画像情報処理部２２は，画像情報処理部２２において切り出したフェイス領域について，実施例３と同様の平面展開処理を実行し，フェイス領域についての平面展開画像情報を生成する。そして，このように生成した平面展開画像情報を，上述の実施例１および実施例２における撮影画像情報として取り扱い，ブロック特定処理以降の処理を実行すれば，実施例１および実施例２と同様の処理を実行できる。 The image information processing unit 22 executes the same plane development process as in the third embodiment on the face area it has cut out, and generates plane development image information for the face area. If the plane development image information generated in this way is then treated as the captured image information of Example 1 and Example 2 described above and the block identification process and subsequent processes are executed, the same processing as in Example 1 and Example 2 can be executed.

また標本情報処理部２５は，上述の画像情報処理部２２と同様に，対象物を撮影した画像情報と深さマップとから３次元モデルを生成し，その３次元モデルに基づいてフェイス領域を切り出す。そして切り出したフェイス領域から平面展開画像情報を生成し，平面展開画像情報を標本情報として標本情報記憶部２６に記憶させることで，実施例１および実施例２と同様の処理を実行できる。 Further, like the image information processing unit 22 described above, the sample information processing unit 25 generates a three-dimensional model from the image information obtained by photographing the object and its depth map, and cuts out the face area based on that three-dimensional model. Then, by generating plane development image information from the cut-out face area and storing it as sample information in the sample information storage unit 26, the same processing as in Example 1 and Example 2 can be executed.

上述の各実施例における各処理については，本発明の明細書に記載した順序に限定するものではなく，その目的を達成する限度において適宜，変更することが可能である。 The processing in each of the above-mentioned examples is not limited to the order described in the specification of the present invention, and can be appropriately changed as long as the object is achieved.

また，本発明の画像処理システム１は，店舗の陳列棚を撮影した撮影画像情報から，陳列棚に陳列した商品を対象物として，その商品の陳列状況を特定する場合に有効であるが，それに限定するものではない。すなわち，ある撮影対象物を撮影した場合に，その所望の対象物が写っている領域を撮影した画像情報から特定する際に，広く用いることができる。 Further, the image processing system 1 of the present invention is effective for identifying the display status of products displayed on a store's display shelf, taking those products as the objects, from captured image information of the display shelf, but it is not limited to that use. That is, it can be widely used whenever the area in which a desired object appears is to be identified from image information obtained by photographing some subject.

本発明の画像処理システム１を用いることによって，陳列棚などを撮影した画像情報に写っている商品などの対象物を特定する際に，計算時間を従来よりも要することなく，比較処理の精度を向上させることが可能となる。 By using the image processing system 1 of the present invention, when identifying an object such as a product shown in image information obtained by photographing a display shelf or the like, the accuracy of the comparison processing can be improved without requiring more calculation time than before.

１：画像処理システム 1: Image processing system
２：管理端末 2: Management terminal
３：入力端末 3: Input terminal
２０：画像情報入力受付処理部 20: Image information input reception processing unit
２１：画像情報記憶部 21: Image information storage unit
２２：画像情報処理部 22: Image information processing unit
２３：候補処理部 23: Candidate processing unit
２４：比較処理部 24: Comparison processing unit
２５：標本情報処理部 25: Sample information processing unit
２６：標本情報記憶部 26: Sample information storage unit
７０：演算装置 70: Arithmetic device
７１：記憶装置 71: Storage device
７２：表示装置 72: Display device
７３：入力装置 73: Input device
７４：通信装置 74: Communication device

Claims

An image processing system that identifies products shown in image information obtained by photographing a display shelf,
the image processing system comprising:
a candidate processing unit that determines the similarity of layout between blocks, using layout information of blocks, each being an area satisfying a predetermined condition within a face area in the image information, and layout information of blocks of a sample product, and thereby identifies products that are candidates for comparison processing; and
a comparison processing unit that identifies the product shown in the face area by performing comparison processing using the face area of the image information and the sample information of the products identified as candidates by the candidate processing unit,
wherein the candidate processing unit
calculates a predetermined evaluation value using the area of the intersection and/or the difference between the layout information of the blocks in the face area of the image information and the layout information of the blocks of the sample product, and
identifies the sample product as a candidate if the calculated evaluation value is equal to or greater than a threshold value.

The image processing system according to claim 1, wherein the candidate processing unit
identifies blocks that satisfy a predetermined condition in the face area in the image information, and
determines the similarity of layout using the layout information of the identified blocks within a predetermined range of the face area and the layout information of the blocks in a region, within the image information of the sample product, whose size corresponds to the predetermined range of the face area.

The image processing system according to claim 2, wherein
the candidate processing unit performs the process of determining the similarity of layout while shifting the region, within the image information of the sample product, whose size corresponds to the predetermined range of the face area, and
the comparison processing unit identifies the product shown in the face area by comparing the face area of the image information with the sample information of the region, within the image information of the product, that the candidate processing unit determined to have the highest similarity.

The image processing system according to any one of claims 1 to 3, wherein the comparison processing unit
associates the blocks of the image information with the blocks of the product identified as a candidate by the candidate processing unit, and
executes the comparison processing by determining the similarity of image information and/or character information using the associated blocks.

An image processing system that identifies products shown in image information obtained by photographing a display shelf,
the image processing system comprising:
an image information processing unit that generates plane development image information by developing onto a plane part or all of a 3D model generated using the image information obtained by photographing the display shelf and the corresponding depth information, and uses part or all of the generated plane development image information as the image information to be processed;
a candidate processing unit that determines the similarity of layout between blocks, using layout information of blocks, each being an area satisfying a predetermined condition within a face area of the image information to be processed, and layout information of blocks of a sample object, and thereby identifies products that are candidates for comparison processing; and
a comparison processing unit that identifies the product shown in the face area by performing comparison processing using the face area of the image information to be processed and the sample information of the products identified as candidates by the candidate processing unit.

An image processing program that causes a computer to function as:
a candidate processing unit that determines the similarity of layout between blocks, using layout information of blocks, each being an area satisfying a predetermined condition within a face area in image information obtained by photographing a display shelf, and layout information of blocks of a sample product, and thereby identifies products that are candidates for comparison processing; and
a comparison processing unit that identifies the product shown in the face area by performing comparison processing using the face area of the image information and the sample information of the products identified as candidates by the candidate processing unit,
wherein the candidate processing unit
calculates a predetermined evaluation value using the area of the intersection and/or the difference between the layout information of the blocks in the face area of the image information to be processed and the layout information of the blocks of the sample product, and
identifies the sample product as a candidate if the calculated evaluation value is equal to or greater than a threshold value.

An image processing program that causes a computer to function as:
an image information processing unit that generates plane development image information by developing onto a plane part or all of a 3D model generated using image information obtained by photographing a display shelf and the corresponding depth information, and uses part or all of the generated plane development image information as the image information to be processed;
a candidate processing unit that determines the similarity of layout between blocks, using layout information of blocks, each being an area satisfying a predetermined condition within a face area of the image information to be processed, and layout information of blocks of a sample object, and thereby identifies products that are candidates for comparison processing; and
a comparison processing unit that identifies the product shown in the face area by performing comparison processing using the face area of the image information to be processed and the sample information of the products identified as candidates by the candidate processing unit.