JP2023161432A

JP2023161432A - Image processing device and image processing method

Info

Publication number: JP2023161432A
Application number: JP2022071824A
Authority: JP
Inventors: 和彦岩井; Kazuhiko Iwai
Original assignee: Panasonic Intellectual Property Management Co Ltd
Current assignee: Panasonic Intellectual Property Management Co Ltd
Priority date: 2022-04-25
Filing date: 2022-04-25
Publication date: 2023-11-07

Abstract

To enable a user to easily and visually confirm the collection state of teacher images prior to learning, and to efficiently create a highly accurate learning model. SOLUTION: A processor 13 is configured to: generate teacher images, each including a detection object and a background, from an area image associated with a monitoring area; set, for each teacher image, attributes of the detection object included in that teacher image; generate, for the teacher images having attributes that a user specifies, a visualized image that visualizes the collection state of the teacher images at respective positions in the area image; and output display information in which the visualized image is superimposed on the area image. The processor further generates the teacher images using, as the area image, a real area image captured by a camera or a virtual area image generated by CG. The processor further generates a visualized image that visualizes the collection state of the teacher images for each color type relating to a person as an attribute. SELECTED DRAWING: Figure 2

Description

本発明は、監視エリアに対応した画像認識モデルを構築するための教師画像の収集状況を可視化する画像処理装置および画像処理方法に関するものである。 The present invention relates to an image processing device and an image processing method that visualize the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area.

ディープラーニングなどの機械学習により構築された画像認識モデル（機械学習モデル）を用いて、カメラの撮影画像から人物の来店などの所定の事象を検知するシステムが利用されている。画像認識モデルは、収集された多数の教師画像（学習用画像）を用いた機械学習により構築されるが、教師画像に偏りがあると、安定した精度の画像認識モデルが構築できない。 Systems are in use that use image recognition models (machine learning models) built through machine learning such as deep learning to detect predetermined events, such as a person visiting a store, from images captured by a camera. An image recognition model is constructed by machine learning using a large number of collected teacher images (learning images), but if the teacher images are biased, an image recognition model with stable accuracy cannot be constructed.

To avoid a loss of accuracy in the image recognition model caused by such bias in the teacher images, a technique is conventionally known that reduces bias in the learning data (teacher images) by changing the probability distribution of entities, such as persons and stores (components), present in the monitoring area (application environment) that the image recognition model processes (see Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2021-111101

According to the conventional technology, the set of teacher images (learning dataset) is updated so as to reduce bias in the teacher images, which makes it more likely that a highly accurate machine learning model will be constructed; even so, the accuracy of the machine learning model may still be insufficient. When the evaluation of the constructed machine learning model finds its accuracy insufficient, the set of teacher images is updated, for example by adding the missing teacher images, machine learning is performed again, and the newly constructed model is evaluated. The conventional technology thus requires repeating the update of the teacher image set, machine learning, and evaluation of the machine learning model, and it can take considerable effort before a machine learning model of sufficient accuracy is completed.

On the other hand, if the collection status (annotation status) of the teacher images, that is, whether teacher images with the necessary attributes are available in sufficient number and in an appropriate distribution, is visualized and presented to the user, the user can immediately grasp the collection status and efficiently perform the annotation work of adding the missing teacher images.

Therefore, the main object of the present invention is to provide an image processing device and an image processing method that allow a user to easily and visually check the collection status of teacher images prior to learning, and that make it possible to efficiently create a highly accurate learning model.

The image processing device of the present invention is an image processing device in which a processor executes processing for visualizing the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area, wherein the processor is configured to: generate, from an area image relating to the monitoring area, teacher images each including a detection target and a background; set, for each teacher image, attributes relating to the characteristics of the detection target included in that teacher image; generate, for the teacher images having the attributes specified by a user, a visualized image that visualizes the collection status of the teacher images at each position in the area image; and output display information in which the visualized image is superimposed on the area image.

The image processing method of the present invention is an image processing method in which a processor executes processing for visualizing the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area, the method comprising: generating, from an area image relating to the monitoring area, teacher images each including a detection target and a background; setting, for each teacher image, attributes relating to the characteristics of the detection target included in that teacher image; generating, for the teacher images having the attributes specified by a user, a visualized image that visualizes the collection status of the teacher images at each position in the area image; and outputting display information in which the visualized image is superimposed on the area image.

本発明によれば、ユーザが指定した属性を有する教師画像の収集状況、すなわち、必要な属性の教師画像が偏りなく揃っているか否かを、ユーザが容易に確認することができる。特に、教師画像の収集状況に問題のあるエリア画像上の位置を、ユーザが容易に把握することができる。これにより、学習に先だって、教師画像のアノテーション状況をユーザが目視で容易に確認でき、効率よく高精度な学習モデルを作成することができる。 According to the present invention, the user can easily check the collection status of teacher images having the attributes specified by the user, that is, whether the teacher images having the necessary attributes are evenly collected. In particular, the user can easily grasp the position on the area image where there is a problem in the collection status of teacher images. Thereby, the user can easily visually check the annotation status of the teacher image prior to learning, and can efficiently create a highly accurate learning model.

FIG. 1: Overall configuration diagram of the image recognition model construction system according to this embodiment
FIG. 2: Block diagram showing a schematic configuration of the image processing device
FIG. 3: Explanatory diagram showing the detection area and pre-detection area set on the area image in the case of a people counting system
FIG. 4: Explanatory diagram showing an overview of the teacher image generation process
FIG. 5: Explanatory diagram showing information about area images registered in the database
FIG. 6: Explanatory diagram showing a person rectangle
FIG. 7: Explanatory diagram showing teacher images
FIG. 8: Explanatory diagram showing information about teacher images registered in the database
FIG. 9: Explanatory diagram showing deficiencies in the annotation status regarding the extraction positions of teacher images on the area image
FIG. 10: Explanatory diagram showing the extraction number measurement process
FIG. 11: Explanatory diagram showing the extraction number measurement process when people overlap
FIG. 12: Explanatory diagram showing the screen in annotation work mode
FIG. 13: Explanatory diagram showing the screen when a list is selected in annotation status confirmation mode
FIG. 14: Explanatory diagram showing the screen when a graph is selected in annotation status confirmation mode
FIG. 15: Explanatory diagram showing the screen when a graph is selected in annotation status confirmation mode
FIG. 16: Explanatory diagram showing the screen in annotation status detailed confirmation mode
FIG. 17: Explanatory diagram showing the screen in annotation status detailed confirmation mode
FIG. 18: Explanatory diagram showing the screen in annotation status detailed confirmation mode
FIG. 19: Explanatory diagram showing the screen in annotation status detailed confirmation mode
FIG. 20: Explanatory diagram showing another example of the screen in annotation status detailed confirmation mode

A first invention made to solve the above problem is an image processing device in which a processor executes processing for visualizing the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area, wherein the processor is configured to: generate, from an area image relating to the monitoring area, teacher images each including a detection target and a background; set, for each teacher image, attributes relating to the characteristics of the detection target included in that teacher image; generate, for the teacher images having the attributes specified by a user, a visualized image that visualizes the collection status of the teacher images at each position in the area image; and output display information in which the visualized image is superimposed on the area image.

これによると、ユーザが指定した属性を有する教師画像の収集状況、すなわち、必要な属性の教師画像が偏りなく揃っているか否かを、ユーザが容易に確認することができる。特に、教師画像の収集状況に問題のあるエリア画像上の位置を、ユーザが容易に把握することができる。これにより、学習に先だって、教師画像のアノテーション状況をユーザが目視で容易に確認でき、効率よく高精度な学習モデルを作成することができる。 According to this, the user can easily check the collection status of the teacher images having the attributes specified by the user, that is, whether the teacher images having the necessary attributes are evenly collected. In particular, the user can easily grasp the position on the area image where there is a problem in the collection status of teacher images. Thereby, the user can easily visually check the annotation status of the teacher image prior to learning, and can efficiently create a highly accurate learning model.

また、第２の発明は、前記プロセッサは、前記可視化画像として、前記エリア画像の各位置における前記教師画像の収集状況を表すヒートマップ画像を生成する構成とする。 In a second aspect of the invention, the processor generates, as the visualized image, a heat map image representing a collection status of the teacher images at each position of the area image.

これによると、エリア画像の各位置における教師画像の収集状況をユーザが容易に把握することができる。 According to this, the user can easily grasp the collection status of teacher images at each position of the area image.

また、第３の発明は、前記プロセッサは、前記可視化画像として、前記教師画像の収集状況に問題のある前記エリア画像上の範囲を表すマーク画像を生成する構成とする。 Further, in a third invention, the processor is configured to generate, as the visualized image, a mark image representing a range on the area image where there is a problem in the collection status of the teacher image.

これによると、教師画像の収集状況に問題のあるエリア画像上の領域をユーザが容易に把握することができる。 According to this, the user can easily grasp the area on the area image where there is a problem in the collection status of the teacher images.

また、第４の発明は、前記プロセッサは、前記エリア画像として、カメラで撮影された現実エリア画像、またはＣＧで作成された仮想エリア画像から前記教師画像を生成する構成とする。 Further, in a fourth invention, the processor generates the teacher image from a real area image photographed by a camera or a virtual area image created by CG as the area image.

これによると、教師画像を効率よく生成することができる。 According to this, a teacher image can be efficiently generated.

また、第５の発明は、前記プロセッサは、前記属性としての人物に関する色種別ごとの前記教師画像の収集状況を可視化した前記可視化画像を生成する構成とする。 Further, in a fifth invention, the processor is configured to generate the visualized image that visualizes the collection status of the teacher images for each color type regarding the person as the attribute.

According to this, because the accuracy of the image recognition model may vary greatly depending on the color type relating to a person, presenting the user with the collection status of teacher images for each such color type makes it easy to create a highly accurate image recognition model (machine learning model).

また、第６の発明は、前記プロセッサは、前記属性ごとの前記教師画像の収集状況を可視化した統計グラフを生成し、この統計グラフを含む前記表示情報を出力する構成とする。 Further, in a sixth invention, the processor generates a statistical graph that visualizes the collection status of the teacher images for each attribute, and outputs the display information including this statistical graph.

これによると、属性ごとの教師画像の収集状況をユーザが容易に把握することができる。この場合、複数の属性の組み合わせごとの教師画像の収集状況を可視化した３次元統計グラフを生成してもよい。 According to this, the user can easily understand the collection status of teacher images for each attribute. In this case, a three-dimensional statistical graph may be generated that visualizes the collection status of teacher images for each combination of multiple attributes.
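
The tally behind such a statistical graph can be sketched as a simple count of teacher images per attribute value or per combination of attributes. This is only an illustration under assumptions: the attribute names (`clothing_color`, `direction`) and the dictionary record layout are hypothetical and do not come from the source.

```python
from collections import Counter


def count_by_attributes(teacher_images, keys):
    """Count teacher images per combination of the given attribute keys."""
    return Counter(tuple(img[k] for k in keys) for img in teacher_images)


# Hypothetical teacher-image records; the attribute names are illustrative only.
teacher_images = [
    {"clothing_color": "black", "direction": "front"},
    {"clothing_color": "black", "direction": "back"},
    {"clothing_color": "white", "direction": "front"},
    {"clothing_color": "black", "direction": "front"},
]

# One attribute: data for an ordinary bar graph.
per_color = count_by_attributes(teacher_images, ["clothing_color"])

# Two attributes: data for a graph over attribute combinations.
per_pair = count_by_attributes(teacher_images, ["clothing_color", "direction"])
```

A combination that never occurs simply has a count of zero, which is exactly the kind of gap such a graph is meant to expose.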

In a seventh invention, the processor, in response to a user operation, outputs the display information including a first screen for generating the teacher images and setting attributes for the teacher images, and outputs the display information including a second screen on which the visualized image is displayed superimposed on the area image and which is provided with an operation section for returning to the first screen.

According to this, when there is a problem with the collection status of the teacher images, the user can promptly proceed to the work of adding the missing teacher images on the first screen, where teacher images are generated and attributes are set for them.

An eighth invention is an image processing method in which a processor executes processing for visualizing the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area, the method comprising: generating, from an area image relating to the monitoring area, teacher images each including a detection target and a background; setting, for each teacher image, attributes relating to the characteristics of the detection target included in that teacher image; generating, for the teacher images having the attributes specified by a user, a visualized image that visualizes the collection status of the teacher images at each position in the area image; and outputting display information in which the visualized image is superimposed on the area image.

これによると、第１の発明と同様に、学習に先だって、教師画像のアノテーション状況をユーザが目視で容易に確認でき、効率よく高精度な学習モデルを作成することができる。 According to this, as in the first invention, the user can easily visually check the annotation status of the teacher image prior to learning, and can efficiently create a highly accurate learning model.

以下、本発明の実施の形態を、図面を参照しながら説明する。 Embodiments of the present invention will be described below with reference to the drawings.

図１は、本実施形態に係る画像認識モデル構築システムの全体構成図である。 FIG. 1 is an overall configuration diagram of an image recognition model construction system according to this embodiment.

本システムは、画像処理装置１（情報処理装置）と、カメラ２と、レコーダー３とを備えている。 This system includes an image processing device 1 (information processing device), a camera 2, and a recorder 3.

The camera 2 photographs the monitoring area. The recorder 3 stores the images captured by the camera 2. The captured images stored in the recorder 3 are input to the image processing device 1.

画像処理装置１は、ＰＣなどで構成される。画像処理装置１には、ディスプレイ４と、キーボードやマウスなどの入力デバイス５とが接続されている。なお、ディスプレイ４と入力デバイス５とが一体化されたタッチパネルディスプレイでもよい。 The image processing device 1 is composed of a PC or the like. The image processing device 1 is connected to a display 4 and an input device 5 such as a keyboard and a mouse. Note that a touch panel display in which the display 4 and the input device 5 are integrated may be used.

画像処理装置１は、カメラ２の撮影画像から所定の事象を検知する画像認識モデル（機械学習モデル）を、ディープラーニングなどの機械学習により構築する。また、画像処理装置１は、画像認識モデルを構築するための学習に用いられる教師画像（学習用画像）を生成する。また、画像処理装置１は、教師画像と異なる評価用画像を用いて機械学習モデルの出来具合を評価する。 The image processing device 1 constructs an image recognition model (machine learning model) for detecting a predetermined event from images captured by the camera 2 using machine learning such as deep learning. The image processing device 1 also generates a teacher image (learning image) used for learning to construct an image recognition model. The image processing device 1 also evaluates the performance of the machine learning model using an evaluation image different from the teacher image.

Furthermore, prior to learning, the image processing device 1 visualizes the collection status (annotation status) of the teacher images, that is, whether teacher images with the necessary attributes are available in sufficient number and in an appropriate distribution, and presents it to the user.

Note that in this embodiment the image processing device 1 generates the teacher images and performs the learning process of constructing an image recognition model (machine learning model) using them, but the learning process may instead be performed by a device other than the image processing device 1.

Next, a schematic configuration of the image processing device 1 will be described. FIG. 2 is a block diagram showing a schematic configuration of the image processing device 1.

画像処理装置１は、通信部１１と、記憶部１２と、プロセッサ１３と、を備えている。 The image processing device 1 includes a communication section 11, a storage section 12, and a processor 13.

通信部１１は、レコーダー３との間で通信を行う。 The communication unit 11 communicates with the recorder 3.

記憶部１２は、プロセッサ１３で実行されるプログラムなどを記憶する。また、プロセッサ１３で生成した教師画像およびその属性を管理するデータベースの登録情報を記憶する。 The storage unit 12 stores programs executed by the processor 13 and the like. It also stores registration information of a database that manages the teacher images generated by the processor 13 and their attributes.

プロセッサ１３は、記憶部１２に記憶されたプログラムを実行することで各種の処理を行う。本実施形態では、プロセッサ１３が、教師画像生成処理、抽出枚数計測処理、可視化処理、出力処理、および学習処理などを行う。 The processor 13 performs various processes by executing programs stored in the storage unit 12. In this embodiment, the processor 13 performs teacher image generation processing, extraction number measurement processing, visualization processing, output processing, learning processing, and the like.

教師画像生成処理（アノテーション処理）では、プロセッサ１３が、画像認識モデルを構築するための学習に用いられる教師画像（学習用画像）を生成する。また、教師画像生成処理では、プロセッサ１３が、ユーザの入力操作に応じて、教師画像に含まれる検知対象物およびその背景の特徴に関する属性を教師画像ごとに設定する。 In the teacher image generation process (annotation process), the processor 13 generates a teacher image (learning image) used for learning to construct an image recognition model. Further, in the teacher image generation process, the processor 13 sets attributes related to the characteristics of the detection target and its background included in the teacher image for each teacher image in accordance with the user's input operation.

In the extraction number measurement process, the processor 13 measures the number of teacher images extracted at each position on the area image for each attribute of the teacher images, for example each attribute of the person included in a teacher image (such as the color of the person's clothing).
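
A minimal sketch of such a per-attribute, per-position count is shown below. It assumes that each teacher image is reduced to one representative point and that positions are binned on a fixed grid; the source does not specify the discretization at this point, so the grid, the `cell_size` parameter, and the record fields `point` and `color` are all hypothetical.

```python
from collections import defaultdict


def count_extractions(teacher_images, cell_size, grid_w, grid_h):
    """Return {attribute: grid}, where grid[row][col] is the number of
    teacher images whose representative point falls in that grid cell."""
    counts = defaultdict(lambda: [[0] * grid_w for _ in range(grid_h)])
    for img in teacher_images:
        x, y = img["point"]
        # Clamp to the last cell so points on the image border stay in range.
        col = min(int(x // cell_size), grid_w - 1)
        row = min(int(y // cell_size), grid_h - 1)
        counts[img["color"]][row][col] += 1
    return dict(counts)


# Hypothetical records: extraction position on the area image and clothing color.
samples = [
    {"point": (10, 10), "color": "black"},
    {"point": (15, 12), "color": "black"},
    {"point": (90, 40), "color": "red"},
]
counts = count_extractions(samples, cell_size=32, grid_w=4, grid_h=2)
```

Keeping one grid per attribute value makes it straightforward to later visualize only the attribute the user selects.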

可視化処理では、プロセッサ１３が、教師画像の収集状況（アノテーション状況）を可視化する。可視化処理では、ユーザが指定した属性を有する教師画像を対象にして、エリア画像の各位置における教師画像の収集状況を可視化した可視化画像を生成する。具体的には、エリア画像の各位置における教師画像の収集状況を表すヒートマップ画像や、教師画像の収集状況に問題のあるエリア画像上の範囲を表す枠画像（マーク画像）を生成する。 In the visualization process, the processor 13 visualizes the collection status (annotation status) of teacher images. In the visualization process, a visualized image is generated that visualizes the collection status of teacher images at each position of the area image, targeting teacher images having attributes specified by the user. Specifically, a heat map image representing the teacher image collection status at each position of the area image and a frame image (mark image) representing a range on the area image where there is a problem with the teacher image collection status are generated.
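
The idea of a heat map superimposed on the area image, together with marking positions where the collection status is a problem, can be sketched in plain Python as follows. The red-to-green color scale, the `target` and `minimum` thresholds, and the grid-of-counts input are assumptions made for illustration; the source does not prescribe them.

```python
def count_to_color(count, target):
    """Map a count to RGB: red when no images exist, green once the target is met."""
    ratio = min(count / target, 1.0)
    return (round(255 * (1 - ratio)), round(255 * ratio), 0)


def blend(base, overlay, alpha):
    """Alpha-blend one RGB pixel over another."""
    return tuple(round((1 - alpha) * b + alpha * o) for b, o in zip(base, overlay))


def overlay_heatmap(area_image, counts, cell_size, target, alpha=0.4):
    """Superimpose a heat map on the area image.

    area_image: list of rows of RGB tuples.
    counts: grid of teacher-image counts, one value per cell_size x cell_size cell.
    """
    return [
        [
            blend(pixel, count_to_color(counts[y // cell_size][x // cell_size], target), alpha)
            for x, pixel in enumerate(row)
        ]
        for y, row in enumerate(area_image)
    ]


def insufficient_cells(counts, minimum):
    """List (row, col) cells with fewer teacher images than the minimum;
    these are the cells a frame (mark) image would enclose."""
    return [(r, c) for r, row in enumerate(counts) for c, n in enumerate(row) if n < minimum]


# Hypothetical 4x4 gray area image and a 2x2 grid of counts.
area_image = [[(100, 100, 100)] * 4 for _ in range(4)]
counts = [[0, 10], [5, 10]]
heat = overlay_heatmap(area_image, counts, cell_size=2, target=10)
problem_cells = insufficient_cells(counts, minimum=10)
```

Because `counts` would be computed only from teacher images having the user-selected attribute, switching the attribute simply swaps the grid passed in.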

In the output process, the processor 13 outputs to the display 4 the annotation work mode screen (see FIG. 12), the annotation status confirmation mode screens (see FIGS. 13 to 15), the annotation status detailed confirmation mode screens (see FIGS. 16 to 19), and the like.

学習処理では、プロセッサ１３が、カメラ２の撮影画像から所定の事象を検知する画像認識モデル（機械学習モデル）を機械学習により構築する。学習処理では、教師画像生成処理で生成した教師画像が用いられる。 In the learning process, the processor 13 uses machine learning to construct an image recognition model (machine learning model) that detects a predetermined event from images taken by the camera 2. In the learning process, a teacher image generated in the teacher image generation process is used.

次に、画像認識モデルが利用される事象検知システムについて説明する。図３は、人数計測システムの場合にエリア画像上に設定される検知領域および検知前領域を示す説明図である。 Next, an event detection system using an image recognition model will be described. FIG. 3 is an explanatory diagram showing a detection area and a pre-detection area set on an area image in the case of the people counting system.

本実施形態では、監視エリアを通行する人物の数を計測する人数計測システムに用いられる画像認識モデル（機械学習モデル）を構築する。具体的には、店舗に来店する人物の数（来店客数）を計測するために、画像認識モデルを用いて、監視エリアとしての店舗の入口をカメラ２により撮影したエリア画像から対象事象として人物の来店を検知する。 In this embodiment, an image recognition model (machine learning model) used in a people counting system that counts the number of people passing through a monitoring area is constructed. Specifically, in order to measure the number of people visiting the store (number of customers visiting the store), an image recognition model is used to identify people as a target event from an area image taken by camera 2 of the entrance of the store as a monitoring area. Detect store visits.

この場合、検知対象物が、店舗に来店する人物（来店客）である。また、エリア画像に検知領域と検知前領域とが予め設定される。検知領域は、エリア画像における店舗の入口の位置に設定される。検知前領域は、人物が検知領域に進入する前に通過する領域であり、検知領域に隣接した通路の位置に設定される。 In this case, the object to be detected is a person (customer) visiting the store. Further, a detection area and a pre-detection area are set in advance in the area image. The detection area is set at the location of the store entrance in the area image. The pre-detection area is an area through which a person passes before entering the detection area, and is set at a position in a passage adjacent to the detection area.

The image recognition model determines that a person (detection target) has visited the store when the person moves from the pre-detection area to the detection area, and the measurement result (number of store visitors) is incremented by one. At this time, whether the person has passed through the pre-detection area and whether the person has entered the detection area are determined based on the position of the person's representative point. The representative point is the center point of the person rectangle or the center point of the feet.

検知領域および検知前領域は、エリア画像上に多角形で設定される。エリア画像上に設定された検知領域および検知前領域の位置に関する情報として、多角形の頂点の座標が登録される。 The detection area and the pre-detection area are set as polygons on the area image. The coordinates of the vertices of the polygon are registered as information regarding the positions of the detection area and the pre-detection area set on the area image.
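
The visit decision described above can be sketched with polygon areas given by their vertex coordinates. The sketch below is not the source's implementation: it assumes person rectangles in `(x, y, w, h)` form, picks the foot center as the representative point, and uses a standard ray-casting test for point-in-polygon, none of which the source mandates.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon (list of vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def foot_center(rect):
    """Representative point: center of the bottom edge of a person rectangle (x, y, w, h)."""
    x, y, w, h = rect
    return (x + w / 2, y + h)


def count_visits(track, pre_area, det_area):
    """Count moves from the pre-detection area into the detection area
    along one person's track (a time-ordered list of rectangles)."""
    visits = 0
    passed_pre = False
    for rect in track:
        p = foot_center(rect)
        if point_in_polygon(p, det_area):
            if passed_pre:
                visits += 1
            passed_pre = False  # require a new pass through the pre-detection area
        elif point_in_polygon(p, pre_area):
            passed_pre = True
    return visits


# Hypothetical layout: the detection area (entrance) sits above the pre-detection area (passage).
det_area = [(0, 0), (100, 0), (100, 100), (0, 100)]
pre_area = [(0, 100), (100, 100), (100, 200), (0, 200)]
track = [(40, 150, 20, 40), (40, 100, 20, 40), (40, 30, 20, 40)]
visits = count_visits(track, pre_area, det_area)
```

A person who appears directly inside the detection area without first passing through the pre-detection area is not counted, which matches the two-area condition in the text.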

本実施形態では、監視エリアを通行する人物の数を計測する人数計測システムに用いられる画像認識モデルについて説明するが、種々の事象を検知する事象検知システムに用いられる画像認識モデルであってもよい。 In this embodiment, an image recognition model used in a people counting system that measures the number of people passing through a surveillance area will be described, but the image recognition model may also be used in an event detection system that detects various events. .

For example, the image recognition model may be one used in an intrusion detection system that detects that a person (detection target) has entered a no-entry area. In this case, in an area image of a monitoring area including the no-entry area captured by the camera 2, a detection area is set at the position of the no-entry area, and the person is determined to have intruded into the no-entry area when the person moves from the pre-detection area to the detection area.

The image recognition model may also be one used in an abandonment detection system that detects that a person has left luggage behind. In this case, in an area image captured by the camera 2 of a monitoring area that includes a no-abandonment area, such as an emergency entrance (fire brigade entrance), a detection area is set at the position of the no-abandonment area, and the person is determined to have left the luggage behind when a person carrying luggage moves from the pre-detection area to the detection area and then leaves the detection area with the luggage still in it.

The image recognition model may also be one used in a stay detection system that detects that a person has stayed in a specific place for a long time. For example, in the case of customers waiting at the checkout in a retail store, in an area image captured by the camera 2 of a monitoring area including the checkout waiting area, a detection area is set at the position of the checkout waiting area, and the person is determined to have stayed in the checkout waiting area for a long time when, after moving from the pre-detection area to the detection area, the person remains in the detection area for a predetermined time or longer.
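
The stay condition can be sketched as a dwell-time check over a time-ordered sequence of zone observations for one person. The input format (timestamp in seconds plus a zone label of `"pre"`, `"det"`, or `None`) and the requirement that the stay be continuous are assumptions for illustration.

```python
def detect_long_stay(samples, threshold_s):
    """Return True if the person enters the detection area after passing the
    pre-detection area and then remains there for at least threshold_s seconds.

    samples: time-ordered list of (timestamp_s, zone), zone in {"pre", "det", None}.
    """
    passed_pre = False
    entered_at = None
    for t, zone in samples:
        if zone == "det":
            if entered_at is None and passed_pre:
                entered_at = t  # start of a continuous stay
            if entered_at is not None and t - entered_at >= threshold_s:
                return True
        else:
            entered_at = None  # leaving the detection area resets the stay
            if zone == "pre":
                passed_pre = True
    return False


# Hypothetical observations for two people with a 60-second threshold.
stays = detect_long_stay([(0, "pre"), (1, "det"), (30, "det"), (61, "det")], 60)
leaves = detect_long_stay([(0, "pre"), (1, "det"), (30, None), (31, "det"), (80, "det")], 60)
```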

次に、画像処理装置１で行われる教師画像生成処理の概要について説明する。図４は、教師画像生成処理の概要を示す説明図である。 Next, an overview of the teacher image generation process performed by the image processing device 1 will be explained. FIG. 4 is an explanatory diagram showing an overview of the teacher image generation process.

画像処理装置１では、画像認識モデル（機械学習モデル）を構築するための機械学習に用いられる教師画像（学習用画像）を生成する処理が行われる（教師画像生成処理）。 The image processing device 1 performs a process of generating a teacher image (learning image) used in machine learning for constructing an image recognition model (machine learning model) (teacher image generation process).

Here, as shown in FIG. 4(A), when a live-action area image (real area image) of the target monitoring area captured by the camera 2 includes a person (detection target), a teacher image is generated by cutting out the region of the live-action area image that contains the person.

Alternatively, as shown in FIG. 4(B), a composite area image including a person is created by superimposing a person image (live-action person image or CG person image) on an area image (live-action area image or CG area image), and a teacher image is generated by cutting out the region containing the person from the composite area image. Here, the live-action area image (real area image) is an image of the monitoring area captured by the camera 2. The CG area image (virtual area image) is an image that simulates the live-action area image using CG (computer graphics). The live-action person image (real person image) is an image generated by cutting out the person region from an image captured by the camera 2 or the like. The CG person image (virtual person image) is an image created using CG.

A teacher image is created from the area image for each frame. Therefore, as shown in FIG. 4(A), when teacher images are generated from live-action area images that include a person, the person moves from frame to frame on the live-action area image as the person walks through the monitoring area, so the cut-out position of the teacher image only needs to be shifted gradually to follow the person's movement. As shown in FIG. 4(B), when teacher images are generated from composite area images in which a person image is superimposed on an area image, the teacher image may be cut out of the area image frame by frame while the person image is moved across the area image so as to reproduce a person walking through the monitoring area.
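
The composite-and-crop variant of FIG. 4(B) can be sketched with images represented as plain 2D lists. This is a toy illustration under assumptions: `None` marks transparent pixels of the person image, `path` is a hypothetical list of per-frame paste positions, and `margin` controls how much background is kept around the person.

```python
def paste(area, person, top, left):
    """Return a copy of the area image with the person image pasted at (top, left).
    Pixels that are None in the person image are treated as transparent."""
    out = [row[:] for row in area]
    for dy, prow in enumerate(person):
        for dx, px in enumerate(prow):
            y, x = top + dy, left + dx
            if px is not None and 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = px
    return out


def generate_teacher_images(area, person, path, margin=1):
    """For each frame position on the path, composite the person onto the area
    image and cut out the person plus a margin of background."""
    height, width = len(area), len(area[0])
    ph, pw = len(person), len(person[0])
    teachers = []
    for top, left in path:
        frame = paste(area, person, top, left)
        t0, l0 = max(top - margin, 0), max(left - margin, 0)
        t1, l1 = min(top + ph + margin, height), min(left + pw + margin, width)
        teachers.append([row[l0:l1] for row in frame[t0:t1]])
    return teachers


# Hypothetical 6x6 area image ("." is background) and a 2x1 person image ("P").
area = [["."] * 6 for _ in range(6)]
person = [["P"], ["P"]]
teachers = generate_teacher_images(area, person, path=[(1, 1), (2, 3)])
```

Moving the paste position along `path` plays the role of the person walking through the monitoring area, and each frame yields one teacher image that contains both the person and its local background.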

また、エリア画像（実写エリア画像、ＣＧエリア画像）に人物画像（実写人物画像、ＣＧ人物画像）が重畳された合成エリア画像から教師画像が生成される場合には、エリア画像における人物が出現する可能性がある位置に人物画像が重畳される。 Furthermore, when a teacher image is generated from a composite area image in which a person image (a live-action person image, a CG person image) is superimposed on an area image (a live-action area image, a CG area image), the person image is superimposed at a position in the area image where a person may appear.

なお、対象となる監視エリアが同じでも、別々のカメラ２で異なる方向から撮影された場合、人物（検知対象物）の向きが異なるため、教師画像が別に用意され、別の画像認識モデル（機械学習モデル）が構築される。 Note that even when the target monitoring area is the same, if it is photographed from different directions by separate cameras 2, the orientation of the person (detection target) differs, so separate teacher images are prepared and a separate image recognition model (machine learning model) is constructed.

次に、エリア画像に関する情報について説明する。図５は、データベースに登録されるエリア画像に関する情報を示す説明図である。 Next, information regarding area images will be explained. FIG. 5 is an explanatory diagram showing information regarding area images registered in the database.

エリア画像（実写背景画像、ＣＧ背景画像）は、監視エリアを表す画像である。エリア画像には、監視エリアに存在する柱、壁、シャッターなどの構造物が含まれる。また、エリア画像には、監視エリアに滞在する人物が含まれる。 The area image (actual background image, CG background image) is an image representing a monitoring area. The area image includes structures such as pillars, walls, and shutters that exist in the monitoring area. Furthermore, the area image includes people staying in the monitoring area.

画像処理装置１は、エリア画像に関する情報をデータベースに登録して管理する。データベースに登録されるエリア画像に関する情報には、構造物情報と、人物情報とが含まれる。構造物情報は、エリア画像に含まれる構造物に関する情報である。人物情報は、エリア画像に背景として含まれる人物に関する情報である。 The image processing device 1 registers and manages information regarding area images in a database. Information related to area images registered in the database includes structure information and person information. The structure information is information regarding structures included in the area image. The person information is information regarding a person included in the area image as a background.

構造物情報には、構造物の種別に関する情報と、構造物の属性に関する情報とが含まれる。構造物の種別に関する情報は、固定構造物および移動構造物のいずれかであるかを示す情報である。例えば、柱や壁は固定構造物であり、店舗の入口などに設置されたシャッターは移動構造物である。構造物の属性に関する情報は、例えば、移動構造物としてのシャッターの開閉時刻などである。 The structure information includes information regarding the type of structure and information regarding the attributes of the structure. Information regarding the type of structure is information indicating whether the structure is a fixed structure or a mobile structure. For example, pillars and walls are fixed structures, and shutters installed at store entrances are movable structures. The information regarding the attributes of the structure is, for example, the opening/closing time of a shutter as a moving structure.

人物情報には、人物の服装の色種別に関する情報（上半身の服装の色、下半身の服装の色）と、人物の持ち物の有無に関する情報とが含まれる。なお、持ち物とは、荷物の他に、ベビーカーや台車などのように人物が動かす物体も含まれる。 The person information includes information regarding the color type of the person's clothing (the color of the upper body clothing, the color of the lower body clothing), and information regarding the presence or absence of the person's belongings. Note that belongings include not only luggage but also objects that the person moves, such as strollers and trolleys.

次に、画像処理装置１で設定される人物矩形について説明する。図６は、人物矩形を示す説明図である。なお、人物画像（実写人物画像、ＣＧ人物画像）が重畳された合成エリア画像から教師画像が切り出される場合には、人物領域は人物画像に相当する。 Next, a person rectangle set by the image processing device 1 will be explained. FIG. 6 is an explanatory diagram showing a person rectangle. Note that when a teacher image is cut out from a composite area image on which a person image (a real person image, a CG person image) is superimposed, the person area corresponds to the person image.

図６（Ａ）に示すように、本実施形態では、検知対象物としての人物を取り囲む人物矩形が設定され、人物矩形に関する情報として、人物矩形の高さＨおよび幅Ｗがデータベースに登録される。図６（Ｂ）に示す例は、検知対象物としての人物がベビーカーを動かしている場合である。この場合も、人物のみを取り囲むように人物矩形が設定される。 As shown in FIG. 6(A), in this embodiment, a person rectangle surrounding a person as a detection target is set, and the height H and width W of the person rectangle are registered in the database as information regarding the person rectangle. The example shown in FIG. 6(B) is a case where a person as a detection target is moving a stroller. In this case as well, the person rectangle is set so as to surround only the person.

また、図６（Ｃ）に示すように、ＣＧ人物画像の場合には、人物の輪郭に関する情報として、輪郭を構成する点（輪郭点）の座標がデータベースに登録される。 Further, as shown in FIG. 6C, in the case of a CG person image, the coordinates of points (contour points) forming the contour are registered in the database as information regarding the contour of the person.

また、図６（Ｄ），（Ｅ）に示すように、人物の位置に関する情報として、基準点（人物矩形の左上の点）の座標と、中心点（人物矩形の中心点）の座標と、足元の中心点（人物矩形の中心点を通る垂線と人物矩形の底辺との交点）の座標とがデータベースに登録される。 In addition, as shown in FIGS. 6(D) and (E), the coordinates of the reference point (the upper-left point of the person rectangle), the coordinates of the center point (the center point of the person rectangle), and the coordinates of the foot center point (the intersection of the vertical line passing through the center point of the person rectangle with the base of the person rectangle) are registered in the database as information regarding the position of the person.
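
As an illustration, the three registered points can be derived from the person rectangle as sketched below. The class name `PersonRect` and its field names are hypothetical, and the sketch assumes ordinary image coordinates with the origin at the upper left and the y axis pointing down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonRect:
    """Person rectangle in area-image coordinates (origin upper left, y down)."""
    x: int       # x of the upper-left corner
    y: int       # y of the upper-left corner
    width: int   # W
    height: int  # H

    @property
    def reference_point(self):
        # Upper-left point of the person rectangle.
        return (self.x, self.y)

    @property
    def center_point(self):
        # Center of the person rectangle.
        return (self.x + self.width / 2, self.y + self.height / 2)

    @property
    def foot_center_point(self):
        # Where the vertical line through the center meets the base.
        return (self.x + self.width / 2, self.y + self.height)


rect = PersonRect(x=10, y=20, width=40, height=100)
```

For this example rectangle the reference point is (10, 20), the center point is (30, 70), and the foot center point is (30, 120).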

次に、画像処理装置１で生成される教師画像について説明する。図７は、教師画像を示す説明図である。 Next, the teacher image generated by the image processing device 1 will be explained. FIG. 7 is an explanatory diagram showing a teacher image.

教師画像は、人物を含むエリア画像から人物を含む領域を切り出すことで生成される。教師画像には、検知対象物としての人物を表す人物領域と、その人物の背景となる背景領域とが含まれる。背景領域には柱や床などの建築構造物が含まれる。 The teacher image is generated by cutting out a region including a person from an area image including the person. The teacher image includes a person area representing a person as a detection target and a background area serving as the background of the person. The background area includes architectural structures such as columns and floors.

図７（Ａ－１），（Ａ－２）に示すように、教師画像は、検知対象物としての人物を取り囲む人物矩形を基準にしてエリア画像から切り出される。すなわち、教師画像は、人物矩形の周囲に所定の幅で拡大された矩形の範囲をエリア画像から切り出すことで作成される。具体的には、教師画像は、所定の横方向の拡大幅αで人物矩形の領域が左右に拡大されると共に、所定の縦方向の拡大幅βで人物矩形の領域が上下に拡大された大きさを有する。教師画像には、人物矩形に含まれる人物領域と、人物矩形に含まれる背景領域と、人物矩形の周囲の背景領域とで構成される。なお、拡大幅α，βは、例えば０～１０ピクセルの範囲で適宜に設定されてもよい。 As shown in FIGS. 7(A-1) and (A-2), the teacher image is cut out from the area image based on the person rectangle surrounding the person as the detection target. That is, the teacher image is created by cutting out from the area image a rectangular range obtained by expanding the person rectangle by a predetermined width on each side. Specifically, the teacher image has a size obtained by enlarging the person rectangle to the left and right by a predetermined horizontal enlargement width α and upward and downward by a predetermined vertical enlargement width β. The teacher image consists of the person area included in the person rectangle, the background area included in the person rectangle, and the background area around the person rectangle. Note that the enlargement widths α and β may be set as appropriate, for example in the range of 0 to 10 pixels.
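
The cut-out range can be written down as a short calculation. The sketch below is a minimal illustration: the function name `teacher_image_bounds` is hypothetical, and clamping the range to the edges of the area image is an assumption, since the text does not say what happens when the person rectangle is close to the image border.

```python
def teacher_image_bounds(x, y, w, h, alpha, beta, area_w, area_h):
    """Return (left, top, width, height) of the teacher image.

    (x, y) is the upper-left corner of the person rectangle and (w, h) its
    size. The rectangle is widened by `alpha` on the left and right and by
    `beta` at the top and bottom, then clamped to the area image (assumed).
    """
    left = max(0, x - alpha)
    top = max(0, y - beta)
    right = min(area_w, x + w + alpha)
    bottom = min(area_h, y + h + beta)
    return left, top, right - left, bottom - top


# Person rectangle well inside a hypothetical 640x480 area image.
inner = teacher_image_bounds(10, 20, 40, 100, alpha=5, beta=8, area_w=640, area_h=480)
# Person rectangle near the upper-left border: the range is clamped.
edge = teacher_image_bounds(2, 3, 40, 100, alpha=5, beta=8, area_w=640, area_h=480)
```

Away from the border the teacher image is W + 2α wide and H + 2β high, which is 50 by 116 pixels in the first example.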

図７（Ｂ－１），（Ｂ－２）に示す例は、検知対象物としての人物がベビーカーを動かしている場合である。この場合も、人物のみを取り囲む人物矩形の周囲に所定の幅で拡大された矩形の範囲をエリア画像から切り出すことで教師画像が作成される。 The example shown in FIGS. 7(B-1) and (B-2) is a case where a person as a detection target is moving a stroller. In this case as well, a teacher image is created by cutting out from the area image a rectangular range that is expanded by a predetermined width around a person rectangle that surrounds only the person.

また、実運用時の監視エリアでは、カメラ２から見て複数の人物が前後に重なり合う状況（人物の重なり）が発生する場合がある。このような状況でも画像認識モデル（機械学習モデル）の性能を確保するため、人物の重なりが発生している状態の教師画像を用いて、画像認識モデル（機械学習モデル）を構築するための機械学習が行われる。 Furthermore, in the monitoring area during actual operation, a situation may occur in which a plurality of people overlap one another front to back as viewed from the camera 2 (overlapping people). To ensure the performance of the image recognition model (machine learning model) even in such a situation, the machine learning for constructing the image recognition model (machine learning model) is performed using teacher images in which people overlap.

また、人物の重なりが発生する場合、図７（Ｃ－１），（Ｃ－２）に示すように、検知対象となる人物の後側に他の人物が現れる状態と、図７（Ｄ－１），（Ｄ－２）に示すように、検知対象となる人物の前側に他の人物が現れる状態とがある。この場合、検知対象となる人物の後側に他の人物が現れた状態で教師画像が抽出されると、教師画像には背景としての人物領域が含まれる。また、検知対象となる人物の前側に他の人物が現れた状態で教師画像が抽出されると、教師画像には前景としての人物領域が含まれる。 In addition, when people overlap, there is a state in which another person appears behind the person to be detected, as shown in FIGS. 7(C-1) and (C-2), and a state in which another person appears in front of the person to be detected, as shown in FIGS. 7(D-1) and (D-2). In this case, if a teacher image is extracted with another person appearing behind the person to be detected, the teacher image includes a person area as background. Further, if a teacher image is extracted with another person appearing in front of the person to be detected, the teacher image includes a person area as foreground.

なお、人物の重なりが発生している状態の教師画像は、人物の重なりが発生している実写エリア画像から教師画像が抽出されることで生成される。また、人物を含むエリア画像に、そのエリア画像に含まれる人物に重なるように人物画像（実写人物画像、ＣＧ人物画像）が重畳された合成エリア画像を生成することでも、人物の重なりが発生している状態の教師画像が生成される。また、複数の人物画像（実写人物画像、ＣＧ人物画像）が重なるようにエリア画像に重畳された合成エリア画像を生成することでも、人物の重なりが発生している状態の教師画像が生成される。 Note that a teacher image in which people overlap is generated by extracting it from a live-action area image in which people overlap. A teacher image in which people overlap is also generated by creating a composite area image in which a person image (live-action person image, CG person image) is superimposed on an area image including a person so as to overlap that person. Likewise, a teacher image in which people overlap is generated by creating a composite area image in which multiple person images (live-action person images, CG person images) are superimposed on an area image so as to overlap one another.

また、教師画像がエリア画像から生成されると、エリア画像上における教師画像の位置に関する情報（座標）と、教師画像のサイズに関する情報（高さ、幅）とがデータベースに登録される。 Further, when a teacher image is generated from an area image, information regarding the position of the teacher image on the area image (coordinates) and information regarding the size of the teacher image (height, width) are registered in the database.

次に、画像処理装置１で管理される教師画像に関する情報について説明する。図８は、データベースに登録される教師画像に関する情報を示す説明図である。 Next, information regarding teacher images managed by the image processing device 1 will be explained. FIG. 8 is an explanatory diagram showing information regarding teacher images registered in the database.

画像処理装置１は、教師画像に関する情報をデータベースに登録して管理する。データベースに登録される教師画像に関する情報には、画像番号（画像識別情報）と、属性情報と、画像情報とが含まれる。 The image processing device 1 registers and manages information regarding teacher images in a database. Information regarding teacher images registered in the database includes an image number (image identification information), attribute information, and image information.

属性情報には、教師画像に含まれる人物の服装に関する情報（上半身の服装の色、下半身の服装の色）と、人物の持ち物の有無に関する情報と、隠蔽に関する情報とが含まれる。この他に、人物の性別、身長、体形などが属性情報に含まれてもよい。なお、持ち物とは、荷物の他に、ベビーカーや台車などのように人物が動かす物体も含まれる。 The attribute information includes information regarding the clothing of the person included in the teacher image (color of upper body clothing, color of lower body clothing), information regarding presence/absence of belongings of the person, and information regarding concealment. In addition to this, the attribute information may include the person's gender, height, body shape, and the like. Note that belongings include not only luggage but also objects that the person moves, such as strollers and trolleys.

隠蔽に関する情報には、人物の重なりに関する情報と、人物以外の物体による隠蔽に関する情報とが含まれる。人物の重なりに関する情報は、検知対象物としての人物の背景に他の人物が存在する状態や、検知対象物としての人物の前景に他の人物が存在する状態が発生しているか否かを表す。人物以外の物体による隠蔽に関する情報は、検知対象物としての人物が、人物以外の物体により部分的に隠蔽された状態が発生しているか否かを表す。 Information regarding concealment includes information regarding overlapping of persons and information regarding concealment by objects other than persons. The information regarding overlapping of persons indicates whether a state has occurred in which another person exists in the background of the person serving as the detection target, or in which another person exists in the foreground of the person serving as the detection target. The information regarding concealment by an object other than a person indicates whether a state has occurred in which the person serving as the detection target is partially hidden by an object other than a person.

画像情報には、教師画像に含まれる人物を取り囲む人物矩形に関する情報（高さＨ、幅Ｗ）と、ＣＧ人物画像の場合における人物の輪郭に関する情報（輪郭を構成する点の座標）と、人物領域の位置に関する情報とが含まれる。 The image information includes information about the person rectangle surrounding the person included in the teacher image (height H, width W), information about the contour of the person in the case of a CG person image (coordinates of the points making up the contour), and information regarding the position of the person area.

人物領域の位置に関する情報には、基準点（人物矩形の左上の点）の座標と、中心点（人物矩形の中心点）の座標と、足元の中心点（人物矩形の中心点を通る垂線と人物矩形の底辺との交点）の座標とが含まれる。なお、足元の中心点は、検知領域および検知前領域に人物が進入したか否かの判定に用いられる。また、各点の座標はエリア画像上での座標である。 Information regarding the position of the person area includes the coordinates of the reference point (the upper-left point of the person rectangle), the coordinates of the center point (the center point of the person rectangle), and the coordinates of the foot center point (the intersection of the vertical line passing through the center point of the person rectangle with the base of the person rectangle). Note that the foot center point is used to determine whether a person has entered the detection area and the pre-detection area. Further, the coordinates of each point are coordinates on the area image.
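
The registered items listed above could be held in a record such as the following. This is a sketch only: the class name, field names, and the helper `select_by_upper_color` are hypothetical, and the actual database schema is not given in the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # coordinates on the area image


@dataclass
class TeacherImageRecord:
    image_number: int                 # image identification information
    # Attribute information
    upper_color: str                  # color of upper-body clothing
    lower_color: str                  # color of lower-body clothing
    has_belongings: bool              # luggage, stroller, trolley, ...
    person_overlap: bool              # another person behind or in front
    occluded_by_object: bool          # partially hidden by a non-person object
    # Image information
    rect_height: int                  # H of the person rectangle
    rect_width: int                   # W of the person rectangle
    reference_point: Point
    center_point: Point
    foot_center_point: Point
    contour_points: Optional[List[Point]] = None  # CG person images only


def select_by_upper_color(records, color):
    """Return the records whose upper-body clothing matches `color`."""
    return [r for r in records if r.upper_color == color]


records = [
    TeacherImageRecord(1, "yellow", "yellow", False, False, False,
                       100, 40, (10, 20), (30, 70), (30, 120)),
    TeacherImageRecord(2, "red", "red", True, False, False,
                       90, 36, (200, 50), (218, 95), (218, 140)),
]
yellow = select_by_upper_color(records, "yellow")
```

Selecting records by attribute in this way is the kind of filtering needed before teacher images of a user-specified attribute are counted per position.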

次に、アノテーション状況の不備を改善する手順について説明する。図９は、エリア画像上における教師画像の抽出位置に関するアノテーション状況の不備を示す説明図である。 Next, we will explain the procedure for improving deficiencies in the annotation status. FIG. 9 is an explanatory diagram showing deficiencies in the annotation status regarding the extraction position of the teacher image on the area image.

本実施形態では、人物を含むエリア画像から人物を含む領域を切り出すことで教師画像が生成される。一方、教師画像には人物領域と背景領域とが含まれ、教師画像内の人物領域と背景領域との特徴の差異に応じて、画像認識モデル（機械学習モデル）における人物の認識精度が変化する。すなわち、同様の特徴の人物でも、教師画像が抽出されたエリア画像上の位置が異なると、人物の認識精度が変化する。また、教師画像が抽出されたエリア画像上の位置が同じでも、人物の特徴、例えば人物の服装の色が異なると、人物の認識精度が変化する。このため、人物の特徴（例えば人物の服装の色）ごとに、エリア画像から教師画像が切り出された位置に偏りがあると、安定した精度の画像認識モデル（機械学習モデル）が構築できない。 In this embodiment, a teacher image is generated by cutting out a region including a person from an area image including the person. The teacher image includes a person area and a background area, and the person recognition accuracy of the image recognition model (machine learning model) changes depending on the difference in characteristics between the person area and the background area in the teacher image. That is, even for people with similar characteristics, the recognition accuracy changes if the position on the area image from which the teacher image is extracted differs. Further, even if the position on the area image from which the teacher image is extracted is the same, the recognition accuracy changes if the characteristics of the person, for example the color of the person's clothing, differ. Therefore, if, for a given characteristic of the person (for example, clothing color), the positions at which teacher images are cut out from the area image are biased, an image recognition model (machine learning model) with stable accuracy cannot be constructed.

そこで、本実施形態では、画像処理装置１において、教師画像に含まれる人物の属性ごとに、エリア画像上の各位置における教師画像の抽出枚数が計測される（抽出枚数計測処理）。次に、教師画像に含まれる人物の属性ごとに、エリア画像上の各位置における教師画像の抽出枚数が比較され、エリア画像内の人物が出現する可能性がある領域において、満遍なく教師画像が抽出されているかに関するアノテーション状況を表す情報がユーザに提示される。具体的には、エリア画像における他の位置に比較して教師画像の抽出枚数が顕著に少ない位置が検知されると、その教師画像の抽出枚数が顕著に少ない位置がユーザに提示される。 Therefore, in the present embodiment, the image processing device 1 measures, for each attribute of the person included in the teacher image, the number of teacher images extracted at each position on the area image (extracted number measurement process). Next, for each attribute of the person included in the teacher image, the number of teacher images extracted at each position on the area image is compared, and information representing the annotation status, namely whether teacher images have been extracted evenly across the region of the area image where a person may appear, is presented to the user. Specifically, when a position is detected at which the number of extracted teacher images is markedly smaller than at other positions in the area image, that position is presented to the user.
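
One possible way to detect positions with markedly few teacher images is sketched below. The criterion is an assumption made for illustration: a position is flagged when its count falls below a fraction (`ratio`) of the median count over the region where a person may appear. The text does not state which criterion is used, and the names `sparse_positions`, `walkable`, and `ratio` are hypothetical.

```python
from statistics import median


def sparse_positions(counts, walkable, ratio=0.5):
    """Return (row, col) positions whose count is markedly small.

    `counts` holds the number of extracted teacher images per position and
    `walkable` marks positions where a person may appear. A position is
    flagged when its count is below `ratio` times the median count of the
    walkable positions (assumed criterion).
    """
    values = [c for c_row, w_row in zip(counts, walkable)
              for c, w in zip(c_row, w_row) if w]
    if not values:
        return []
    threshold = ratio * median(values)
    return [(y, x)
            for y, (c_row, w_row) in enumerate(zip(counts, walkable))
            for x, (c, w) in enumerate(zip(c_row, w_row))
            if w and c < threshold]


counts = [[10, 12, 1],
          [11, 0, 9]]
walkable = [[1, 1, 1],
            [1, 0, 1]]  # the middle of the second row is, say, a pillar
flagged = sparse_positions(counts, walkable)
```

Running the same check once per attribute (for example, once per clothing color) would give the per-attribute view of where teacher images are lacking.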

これに応じて、ユーザは、教師画像の抽出枚数が顕著に少ない位置を対象にして、教師画像を追加するアノテーション作業を行う。これにより、不足する教師画像が補充されて、教師画像に位置的な偏りがあるアノテーション状況の不備が改善され、監視エリア内で検知対象となる人物が出現した位置に応じて、画像認識モデル（機械学習モデル）の認識精度が大きく変化する不具合を避けることができる。 In response, the user performs annotation work to add teacher images at the positions where the number of extracted teacher images is markedly small. This replenishes the missing teacher images and corrects the deficiency in the annotation status, namely the positional bias of the teacher images, so that the problem of the recognition accuracy of the image recognition model (machine learning model) varying greatly depending on where in the monitoring area the person to be detected appears can be avoided.

図９に示す例では、教師画像に含まれる人物の属性として、人物の服装の色に注目して、エリア画像上の各位置における教師画像の抽出枚数が計測されている。本例では、一例として黄色系、水色系、緑色系、および赤色系の４系統の色に注目している。また、図９では、分析結果として、指定された属性（人物の服装の色が黄色）の教師画像が少ない領域、属性を限定せずに教師画像が少ない領域、全ての属性で教師画像が十分な領域がエリア画像上に示されている。 In the example shown in FIG. 9, the number of extracted teacher images at each position on the area image is measured, focusing on the color of the person's clothing as an attribute of the person included in the teacher image. In this example, attention is focused on four colors: yellow, light blue, green, and red. In addition, in Figure 9, the analysis results show areas where there are few teacher images for the specified attribute (the color of the person's clothing is yellow), areas where there are few teacher images without limiting the attribute, and areas where there are sufficient teacher images for all attributes. area is shown on the area image.

なお、黒色系、白色系などの他の系統の色に注目して教師画像の抽出枚数が計測されてもよく、また、必要に応じて色の系統が変更されてもよい。また、人物の服装の色を暖色系と寒色系と白黒系とに分けて教師画像の抽出枚数が計測されてもよい。 Note that the number of teacher images to be extracted may be measured by focusing on other colors such as black and white, or the color system may be changed as necessary. Furthermore, the number of extracted teacher images may be measured by dividing the color of a person's clothing into warm colors, cool colors, and black and white.

また、本例では、人物の全身の服装の色に注目しており、人物の服装の色が上半身と下半身とで同じ色となっているが、人物の服装の色が上半身と下半身とが異なる場合もあり、この場合、上半身の色と下半身の色との組み合わせに注目して教師画像の抽出枚数が計測されてもよい。 In addition, in this example, we are focusing on the color of the person's entire body, and the color of the person's clothes is the same color for the upper and lower body, but the color of the person's clothes is different for the upper and lower body. In this case, the number of extracted teacher images may be measured by paying attention to the combination of the color of the upper body and the color of the lower body.

次に、画像処理装置１で行われる抽出枚数計測処理について説明する。図１０は、抽出枚数計測処理を示す説明図である。 Next, a process for measuring the number of extracted images performed by the image processing device 1 will be explained. FIG. 10 is an explanatory diagram showing the extraction number measurement process.

抽出枚数計測処理では、まず、教師画像上の各位置（画素）に領域属性値を割り振り、さらに、教師画像上の各位置（画素）に割り振られた領域属性値を、エリア画像上の対応する位置（画素）に割り振る処理（マッピング処理）が行われる。具体的には、例えば、人物領域に「１」の領域属性値が割り振られ、背景領域に「０」の領域属性値が割り振られる。マッピング処理は、対象とする教師画像の全てに対して行われる。 In the extraction number measurement process, first, an area attribute value is assigned to each position (pixel) on the teacher image, and then the area attribute value allocated to each position (pixel) on the teacher image is applied to the corresponding area image. Processing (mapping processing) of allocating to positions (pixels) is performed. Specifically, for example, a region attribute value of "1" is assigned to the person region, and a region attribute value of "0" is assigned to the background region. The mapping process is performed on all target teacher images.

次に、エリア画像上の各位置（画素）において、領域属性値の各々（例えば１、０）が割り振られた回数をカウントする処理（カウント処理）が行われる。これにより得られたカウント値は、エリア画像上の各位置（画素）における属性ごと教師画像の抽出枚数を表す。 Next, at each position (pixel) on the area image, a process (counting process) is performed to count the number of times each area attribute value (for example, 1, 0) is assigned. The count value thus obtained represents the number of extracted teacher images for each attribute at each position (pixel) on the area image.

ここで、特定の属性の教師画像に注目して、エリア画像上の各位置（画素）における教師画像の抽出枚数を計測することができる。具体的には、注目する属性として人物の服装が注目色（例えば黄色）である教師画像を対象にして、エリア画像上の各位置（画素）において「１」の領域属性値が割り振られた回数をカウントする。このカウント値は、人物の服装の色が注目色（例えば黄色）である教師画像の抽出枚数を表す。また、この処理を人物の服装の各色で同様に行うことで、人物の服装の色ごとに教師画像の抽出枚数を取得することができる。これにより、人物の服装の色ごとに教師画像の抽出枚数が少ないエリア画像上の位置を可視化することができる。 Here, the number of extracted teacher images at each position (pixel) on the area image can be measured by focusing on teacher images with a specific attribute. Specifically, taking as the target the teacher images in which the person's clothing is the color of interest (for example, yellow), the number of times an area attribute value of "1" is assigned at each position (pixel) on the area image is counted. This count value represents the number of extracted teacher images in which the color of the person's clothing is the color of interest (for example, yellow). Further, by performing this process in the same way for each clothing color, the number of extracted teacher images can be obtained for each clothing color. This makes it possible to visualize, for each clothing color, the positions on the area image where the number of extracted teacher images is small.

また、エリア画像上の各位置（画素）において「０」の領域属性値が割り振られた回数をカウントする。このカウント値は、エリア画像上の各位置（画素）におれる教師画像の抽出枚数を表す。これにより、教師画像の抽出枚数が少ないエリア画像上の位置を可視化することができる。 Furthermore, the number of times an area attribute value of "0" is assigned to each position (pixel) on the area image is counted. This count value represents the number of teacher images extracted at each position (pixel) on the area image. Thereby, it is possible to visualize the position on the area image where the number of extracted teacher images is small.
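
The mapping and counting processes described above can be sketched as follows. The sketch assumes that each teacher image is given as its upper-left position on the area image together with a 2D grid of area attribute values (1 for the person area, 0 for the background area). The function name `count_region_values` and this input layout are hypothetical.

```python
def count_region_values(area_h, area_w, teacher_images, values=(0, 1)):
    """Count, per area-image pixel, how often each area attribute value occurs.

    `teacher_images` is a list of (top, left, labels), where `labels` is a
    2D list of area attribute values for one teacher image.
    Returns {value: 2D list of counts}.
    """
    counts = {v: [[0] * area_w for _ in range(area_h)] for v in values}
    for top, left, labels in teacher_images:
        for dy, row in enumerate(labels):
            for dx, value in enumerate(row):
                # Mapping: teacher-image pixel -> area-image pixel.
                y, x = top + dy, left + dx
                if value in counts and 0 <= y < area_h and 0 <= x < area_w:
                    counts[value][y][x] += 1  # counting
    return counts


# Two hypothetical 2x2 teacher images cut from a 3x4 area image.
teacher_images = [
    (0, 0, [[0, 1], [0, 1]]),
    (0, 1, [[1, 0], [1, 0]]),
]
counts = count_region_values(3, 4, teacher_images)
```

To obtain the count for one attribute, such as yellow clothing, only the teacher images carrying that attribute would be passed in. `counts[1]` then corresponds to the count of value "1" described above, and `counts[0]` to the count of value "0".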

次に、人物の重なりが発生している場合の抽出枚数計測処理について説明する。図１１は、人物の重なりが発生している場合の抽出枚数計測処理を示す説明図である。 Next, a process for measuring the number of extracted images when overlapping people occur will be described. FIG. 11 is an explanatory diagram showing a process for measuring the number of extracted images when overlapping people occur.

人物の重なりが発生している場合、教師画像上の各位置（画素）に領域属性値を割り振るマッピング処理において、背景となる人物領域と前景となる人物領域とに異なる領域属性値を割り振る。具体的には、例えば、図１１（Ａ）に示すように、背景となる人物が存在する場合には、背景となる人物領域に「２」の領域属性値を割り振る。図１１（Ｂ）に示すように、前景となる人物が存在する場合には、前景となる人物領域に「３」の領域属性値を割り振る。なお、人物領域に「１」の領域属性値を割り振り、人物を含まない背景領域に「０」の領域属性値を割り振る点は、図１０に示した例と同様である。 When overlapping people occur, different area attribute values are assigned to the background person area and the foreground person area in a mapping process that assigns area attribute values to each position (pixel) on the teacher image. Specifically, for example, as shown in FIG. 11A, if there is a person serving as the background, an area attribute value of "2" is assigned to the person area serving as the background. As shown in FIG. 11B, if there is a person in the foreground, an area attribute value of "3" is assigned to the foreground person area. Note that this is similar to the example shown in FIG. 10 in that a region attribute value of "1" is assigned to a person region, and a region attribute value of "0" is assigned to a background region that does not include a person.

また、領域属性値が割り振られた回数をカウントするカウント処理では、エリア画像上の各位置（画素）において、領域属性値の各々（例えば１、０、２、３）が割り振られた回数をカウントする。 In addition, in the counting process, the number of times each area attribute value (for example, 1, 0, 2, 3) is assigned is counted at each position (pixel) on the area image.

ここで、注目する属性の教師画像として、人物の重なりを含む教師画像に関して、エリア画像の各位置における教師画像の抽出枚数を計測する。すなわち、エリア画像上の各位置（画素）において「２」または「３」の領域属性値が割り振られた回数をカウントする。このカウント値は、人物の重なりを含む教師画像の抽出枚数を表す。これにより、人物の重なりを含む教師画像の抽出枚数が少ないエリア画像上の位置を可視化することができる。 Here, as a teacher image of the attribute of interest, the number of extracted teacher images at each position of the area image is measured regarding a teacher image including overlapping people. That is, the number of times that an area attribute value of "2" or "3" is assigned to each position (pixel) on the area image is counted. This count value represents the number of extracted teacher images that include overlapping people. This makes it possible to visualize positions on area images where a small number of extracted teacher images including overlapping people are extracted.
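
Counting the overlap-related values can be sketched in the same manner. As before, the input layout (upper-left position plus a grid of area attribute values per teacher image) and the function name `count_overlap` are assumptions made for illustration; the values 2 (background person) and 3 (foreground person) follow the example in the text.

```python
OVERLAP_VALUES = {2, 3}  # 2: background person, 3: foreground person


def count_overlap(area_h, area_w, teacher_images):
    """Count, per area-image pixel, how often value 2 or 3 was assigned."""
    counts = [[0] * area_w for _ in range(area_h)]
    for top, left, labels in teacher_images:
        for dy, row in enumerate(labels):
            for dx, value in enumerate(row):
                y, x = top + dy, left + dx
                if value in OVERLAP_VALUES and 0 <= y < area_h and 0 <= x < area_w:
                    counts[y][x] += 1
    return counts


# Hypothetical teacher images: one with a person behind, one with a person in front.
teacher_images = [
    (0, 0, [[1, 2], [1, 0]]),
    (0, 1, [[3, 1], [0, 1]]),
]
overlap_counts = count_overlap(2, 3, teacher_images)
```

Positions where this count stays low are the ones where teacher images with overlapping people are lacking.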

このように人物の重なりが発生した場合、特に、検知対象となる人物の前側に他の人物が現れた状態では、検知対象物としての人物が他の人物で部分的に隠蔽された状態になる。一方、検知対象物としての人物の前側に、人物以外の物体が存在する場合がある。この場合、検知対象物としての人物が人物以外の物体で隠蔽された状態になる。このような状況でも画像認識モデル（機械学習モデル）の性能を確保するため、隠蔽が発生している状態の教師画像、すなわち、検知対象としての人物が前景の物体で隠蔽された状態の教師画像を用いて、画像認識モデル（機械学習モデル）を構築するための機械学習が行われるとよい。 When people overlap in this way, and in particular when another person appears in front of the person to be detected, the person to be detected is partially hidden by the other person. On the other hand, an object other than a person may exist in front of the person serving as the detection target. In this case, the person serving as the detection target is hidden by an object other than a person. To ensure the performance of the image recognition model (machine learning model) even in such situations, the machine learning for constructing the image recognition model (machine learning model) is preferably performed using teacher images in which concealment has occurred, that is, teacher images in which the person to be detected is hidden by a foreground object.

この場合、隠蔽が発生している実写エリア画像や、人物を含むエリア画像に物体画像（実写物体画像、ＣＧ物体画像）が重畳された合成エリア画像や、人物画像（実写人物画像、ＣＧ人物画像）に物体画像が重畳された合成エリア画像を用いることで、隠蔽が発生している状態の教師画像を取得することができる。 In this case, a teacher image in which concealment has occurred can be obtained by using a live-action area image in which concealment has occurred, a composite area image in which an object image (live-action object image, CG object image) is superimposed on an area image including a person, or a composite area image in which an object image is superimposed on a person image (live-action person image, CG person image).

次に、ディスプレイ４に表示される画面について説明する。図１２は、アノテーション作業モードの画面を示す説明図である。図１３は、アノテーション状況確認モードにおけるリスト選択時の画面を示す説明図である。図１４，図１５は、アノテーション状況確認モードにおけるグラフ選択時の画面を示す説明図である。図１６，図１７，図１８，図１９は、アノテーション状況詳細確認モードの画面を示す説明図である。 Next, the screens displayed on the display 4 will be explained. FIG. 12 is an explanatory diagram showing a screen in the annotation work mode. FIG. 13 is an explanatory diagram showing a screen when a list is selected in the annotation status confirmation mode. FIGS. 14 and 15 are explanatory diagrams showing screens when a graph is selected in the annotation status confirmation mode. FIGS. 16, 17, 18, and 19 are explanatory diagrams showing screens in the annotation status detailed confirmation mode.

図１２～図１９に示すように、ディスプレイ４に表示される画面２１，６１，７１，８１には、アノテーション作業、アノテーション状況確認、およびアノテーション状況詳細確認の各タブ２２（操作部）が設けられている。ユーザが、アノテーション作業のタブ２２を操作すると、図１２に示すアノテーション作業モードの画面が表示される。ユーザが、アノテーション状況確認のタブ２２を操作すると、図１３～図１５に示すアノテーション状況確認モードの画面に遷移する。ユーザが、アノテーション状況詳細確認のタブ２２を操作すると、図１６～図１９に示すアノテーション状況詳細確認モードの画面に遷移する。 As shown in FIGS. 12 to 19, the screens 21, 61, 71, and 81 displayed on the display 4 are provided with tabs 22 (operation sections) for annotation work, annotation status confirmation, and annotation status detailed confirmation. When the user operates the annotation work tab 22, the annotation work mode screen shown in FIG. 12 is displayed. When the user operates the annotation status confirmation tab 22, the screen transitions to the annotation status confirmation mode screen shown in FIGS. 13 to 15. When the user operates the annotation status detailed confirmation tab 22, the screen transitions to the annotation status detailed confirmation mode screen shown in FIGS. 16 to 19.

図１２に示すアノテーション作業モード（教師画像作成モード）の画面２１（第１の画面）には、画像入力部３１が設けられている。画像入力部３１には、ＣＧおよび実写の各タブ３２が設けられている。ユーザがＣＧのタブ３２を操作すると、ＣＧ画像入力モードになる。ユーザが実写のタブ３２を操作すると、実写画像入力モードになる。また、画像入力部３１には、人物画像入力部３３と、エリア画像入力部３４とが設けられている。 An image input section 31 is provided on the screen 21 (first screen) of the annotation work mode (teacher image creation mode) shown in FIG. 12. The image input section 31 is provided with tabs 32 for CG and live action. When the user operates the CG tab 32, the CG image input mode is entered. When the user operates the live-action tab 32, the live-action image input mode is entered. Further, the image input section 31 is provided with a person image input section 33 and an area image input section 34.

人物画像入力部３３では、ユーザが、入力ボタン３５を操作することで、人物画像のリストが表示され、ここで人物画像を選択することで、人物画像を入力することができる。このとき、ＣＧ画像入力モードではＣＧ人物画像が入力され、実写画像入力モードでは実写人物画像が入力される。 In the person image input section 33, when the user operates the input button 35, a list of person images is displayed, and by selecting a person image here, the user can input a person image. At this time, a CG person image is input in the CG image input mode, and a real person image is input in the real image input mode.

エリア画像入力部３４では、ユーザが、入力ボタン３６を操作することで、エリア画像のリストが表示され、ここでエリア画像を選択することで、エリア画像を入力することができる。このとき、ＣＧ画像入力モードではＣＧエリア画像が入力され、実写画像入力モードでは実写エリア画像が入力される。 In the area image input unit 34, when the user operates the input button 36, a list of area images is displayed, and by selecting an area image here, the user can input an area image. At this time, a CG area image is input in the CG image input mode, and a real area image is input in the real image input mode.

また、アノテーション作業モードの画面に２１は、エリア画像表示部４１が設けられている。エリア画像表示部４１では、入力されたエリア画像（ＣＧエリア画像、実写エリア画像）が表示される。また、エリア画像表示部４１では、入力された人物画像（ＣＧ人物画像、実写人物画像）がエリア画像上に重畳表示される。また、エリア画像表示部４１では、ユーザが、マウスのドラッグ操作などにより、エリア画像上に重畳表示された人物画像の位置および大きさを調整することができる。 Furthermore, an area image display section 41 is provided on the screen 21 in the annotation work mode. The area image display section 41 displays the input area image (CG area image, real area image). Further, in the area image display section 41, the input person image (CG person image, real photographed person image) is displayed in a superimposed manner on the area image. Furthermore, in the area image display section 41, the user can adjust the position and size of the person image superimposed on the area image by dragging the mouse or the like.

また、エリア画像表示部４１では、ユーザが、エリア画像上で対象とする人物を指定する、すなわち、エリア画像上で教師画像を切り出す位置を指定することができる。具体的には、ユーザが、マウスのドラッグ操作などにより、エリア画像（ＣＧエリア画像または実写エリア画像）上の人物画像（ＣＧ人物画像または実写人物画像）を取り囲む人物枠４２を入力することができる。この人物枠４２は、教師画像の範囲の候補となる。すなわち、人物枠４２の位置に基づいて人物矩形が設定され、その人物矩形に基づいて教師画像が生成される。このとき、エリア画像と人物画像とが合成された合成エリア画像から教師画像が切り出される。 Further, in the area image display section 41, the user can specify a target person on the area image, that is, specify the position on the area image from which the teacher image is to be cut out. Specifically, by dragging the mouse or the like, the user can input a person frame 42 that surrounds the person image (CG person image or live-action person image) on the area image (CG area image or live-action area image). This person frame 42 becomes a candidate for the range of the teacher image. That is, a person rectangle is set based on the position of the person frame 42, and a teacher image is generated based on the person rectangle. At this time, the teacher image is cut out from the composite area image in which the area image and the person image are combined.

Note that when a teacher image is created for a person who already appears in the real area image displayed in the area image display section 41, the user only needs to input a person frame 42 surrounding that person. In this case, no input operation in the person image input section 33 is required.

Alternatively, person detection processing may be performed so that the user's input of the person frame 42 can be omitted. That is, person detection processing may be performed on a real area image containing a person, or on a composite area image in which an area image and a person image are combined, and the teacher image may be cut out based on the person detection frame obtained by that processing.

The screen 21 in the annotation work mode is also provided with a frame operation section 45. The frame operation section 45 has a button 46 that moves the area image back by one frame and a button 47 that advances the area image by one frame. By operating the frame operation section 45, a teacher image can be created for each frame of the area image.

The screen 21 in the annotation work mode is also provided with a detection area input section 51. When the user operates the input button 52 in the detection area input section 51, the screen transitions to a detection area input mode, in which the user can input the ranges of the detection area and the pre-detection area on the area image displayed in the area image display section 41 by a mouse drag operation or the like.

The screen 21 in the annotation work mode is also provided with an attribute input section 53. In the attribute input section 53, the user can input the color of the upper-body clothing and the color of the lower-body clothing as attributes of the person included in the teacher image. When a CG person image is selected, the clothing colors are already known, so no input operation by the user is required.

The screen 21 in the annotation work mode is also provided with a title input section 55. In the title input section 55, the user can input a title for the teacher images, specifically, the name given to a group of teacher images. The title (group name) identifies the set of teacher images when the annotation status of the teacher images is checked.

The screen 21 in the annotation work mode is also provided with a save button 56. When the user operates the save button 56, a teacher image is generated based on the contents input in each section, the teacher image is saved in the storage section 12, and attribute information and the like relating to the teacher image are registered in the database.

The annotation status confirmation mode screen 61 shown in FIGS. 13, 14, and 15 is provided with a teacher image selection section 62. In the teacher image selection section 62, the user can select a group of teacher images as the analysis target. For example, when the user operates the teacher image selection section 62, a list of teacher image groups is displayed, from which the user can select a group. In this example, the group of teacher images whose monitoring area is entrance A is selected.

The annotation status confirmation mode screen 61 is also provided with a list button 63 and a graph button 64. When the user operates the list button 63, analysis processing is executed on the group of teacher images selected in the teacher image selection section 62, and the list-selection screen 61 shown in FIG. 13 is displayed as the analysis result. Likewise, when the user operates the graph button 64, analysis processing is executed on the selected group of teacher images, and the graph-selection screen 71 shown in FIG. 14 is displayed as the analysis result.

On the list-selection screen 61 of the annotation status confirmation mode shown in FIG. 13, a list 66 is displayed in the visualization result display section 65. The list 66 shows the attributes of each teacher image registered in the database. In this example, the attributes of the teacher images for entrance A are listed. The displayed attributes are the characteristics of the person included in the teacher image, in particular the color of the person's clothing, and the time period of the area image that forms the background of the teacher image.

The list 66 visualizes the annotation status of the teacher images as text, and by viewing the list 66 the user can identify deficiencies in the annotation status, such as a shortage of teacher images with a specific attribute. Specifically, the user can confirm that, for area images of a specific time period (for example, the 8 o'clock hour), teacher images containing a person wearing clothing of a specific color (for example, yellow) are fewer than teacher images with other attributes.

In the example shown in FIG. 13, the list 66 displays the color of the person's clothing and the time period of the area image as attributes of the teacher image, but other attributes may also be displayed. For example, the presence or absence of overlapping people, that is, whether the teacher image includes a person region in the background or in the foreground, may be displayed. The image quality (resolution, presence or absence of blur, and so on) of the person region and background region of the teacher image may be displayed. The season of the area image (spring, summer, autumn, or winter) may also be displayed.

On the graph-selection screen 71 of the annotation status confirmation mode shown in FIGS. 14 and 15, a statistical graph 72 is displayed in the visualization result display section 65. The statistical graph 72 is a three-dimensional bar graph. The first axis, in the horizontal direction, represents the first attribute of the teacher image, namely the time period of the area image that forms its background. The second axis, in the depth direction, represents the second attribute of the teacher image, namely the characteristics of the person included in it, in particular the color of the person's clothing. The third axis, in the vertical direction, represents the number of teacher images. In other words, a bar is drawn for each combination of the first attribute (time period of the area image) and the second attribute (color of the person's clothing), and the height of each bar represents the number of teacher images having both that first attribute and that second attribute.
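The bar heights of such a graph amount to a count per attribute pair. The following is a minimal sketch of that tally, assuming each teacher image is held as a dictionary; the function name `count_by_attributes` and the keys `time_period` and `clothing_color` are hypothetical and do not come from the patent.

```python
from collections import Counter


def count_by_attributes(teacher_images, first_attr, second_attr):
    """Count teacher images for each (first attribute, second attribute) pair.

    Each count corresponds to the height of one bar in a 3D bar graph.
    """
    return Counter((img[first_attr], img[second_attr]) for img in teacher_images)


# Hypothetical attribute records for four teacher images.
records = [
    {"time_period": "08:00", "clothing_color": "yellow"},
    {"time_period": "08:00", "clothing_color": "red"},
    {"time_period": "08:00", "clothing_color": "red"},
    {"time_period": "09:00", "clothing_color": "green"},
]
counts = count_by_attributes(records, "time_period", "clothing_color")
```

Because the two attribute names are parameters, the same function also covers the variation in which the user chooses which attribute each axis represents.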

The statistical graph 72 visualizes the annotation status of the teacher images as a bar graph, and by viewing the statistical graph 72 the user can identify deficiencies in the annotation status, such as a shortage of teacher images with a specific attribute. Specifically, the user can confirm that, for area images of a specific time period (for example, the 8 o'clock hour), teacher images containing a person wearing clothing of a specific color (for example, yellow) are fewer than teacher images with other attributes.

On the screen 71 shown in FIG. 14, teacher images in which the person's clothing is yellow are fewer than those in which the clothing is light blue, green, or red, so teacher images with this attribute need to be added. On the screen 71 shown in FIG. 15, by contrast, the number of teacher images is uniform across all of the clothing colors yellow, light blue, green, and red, showing that the annotation status of the teacher images has improved.

Note that the screen may be provided with an operation section for selecting the first attribute, represented by the horizontal first axis of the statistical graph 72, and the second attribute, represented by the second axis in the depth direction, so that the user can specify on the screen which teacher image attribute each axis represents.

The annotation status detailed confirmation mode screen 81 (second screen) shown in FIGS. 16 to 19 is provided with a teacher image selection section 82. In the teacher image selection section 82, the user can select a group of teacher images as the analysis target. For example, when the user operates the teacher image selection section 82, a list of teacher image groups is displayed, from which the user can select a group. In this example, the group of teacher images whose monitoring area is entrance A and whose time period is 8 o'clock is selected.

The annotation status detailed confirmation mode screen 81 is also provided with an analysis button 83 and a visualization result display section 84. When the user operates the analysis button 83, analysis processing is executed on the group of teacher images selected in the teacher image selection section 82, and a heat map image 85 is displayed in the visualization result display section 84 as the analysis result. The heat map image 85 is displayed superimposed on the area image in a transparent state, so that the area image remains visible through it.

The heat map image 85 visualizes the annotation status of the teacher images at each position on the area image. Specifically, the heat map image 85 is divided into a mesh of cells, and in each cell the number of teacher images whose representative point lies within that cell is expressed by a change in gradation.

Note that the number of teacher images whose representative point lies within a cell may instead be expressed by a change in hue. In this case, as the number of teacher images increases, the color of the cell may change in the order of, for example, blue, yellow, orange, and red. The number of teacher images whose representative point lies within a cell may also be expressed by a change in pattern (pattern image) or the like.
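One way to realize the hue variation is a simple binned lookup. The sketch below is only illustrative: the function name `cell_color`, the equal-width binning against the largest cell count, and the choice to show an empty cell in the lowest color are assumptions. Only the color order blue, yellow, orange, red comes from the text.

```python
def cell_color(count, max_count):
    """Map a cell's teacher image count to a named color.

    The palette order follows the text; equal-width binning is an assumption.
    """
    palette = ["blue", "yellow", "orange", "red"]
    if max_count <= 0:
        return palette[0]
    # Clamp the ratio to [0, 1] so out-of-range counts stay inside the palette.
    ratio = min(max(count / max_count, 0.0), 1.0)
    index = min(int(ratio * len(palette)), len(palette) - 1)
    return palette[index]


# Colors for cells holding 0, 3, 5, and 10 teacher images when the maximum is 10.
colors = [cell_color(n, 10) for n in (0, 3, 5, 10)]
```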

The annotation status detailed confirmation mode screen 81 is also provided with an attribute selection section 87. In the attribute selection section 87, the user can select the color of the person's clothing as an attribute of the teacher image, in particular as an attribute of the person included in the teacher image. Specifically, the attribute selection section 87 has buttons 88 labeled "All", "Yellow", "Light blue", "Green", and "Red", and the user selects a clothing color by operating a button 88. The user can also select more than one clothing color in the attribute selection section 87.

When a clothing color is selected in the attribute selection section 87 as an attribute of the teacher image, in particular as an attribute of the person included in the teacher image, each cell of the heat map image 85 expresses, by a change in gradation, the number of teacher images whose representative point lies within that cell and that have the specified attribute.
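The per-cell counting with an attribute filter can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `heatmap_counts`, the dictionary keys `point` and `color`, and the uniform mesh are hypothetical, and how the representative point of a teacher image is chosen is left open here.

```python
def heatmap_counts(teacher_images, image_w, image_h, cols, rows, colors=None):
    """Count teacher images per mesh cell by representative point.

    colors: optional set of selected clothing colors; None means no restriction.
    Returns a rows x cols grid (list of lists) of counts.
    """
    grid = [[0] * cols for _ in range(rows)]
    for img in teacher_images:
        if colors is not None and img["color"] not in colors:
            continue  # not one of the attributes selected by the user
        x, y = img["point"]
        if not (0 <= x < image_w and 0 <= y < image_h):
            continue  # representative point lies outside the area image
        col = int(x * cols / image_w)
        row = int(y * rows / image_h)
        grid[row][col] += 1
    return grid


# Hypothetical data on a 300x100 area image divided into three cells in a row.
images = [
    {"point": (50, 50), "color": "yellow"},
    {"point": (150, 50), "color": "yellow"},
    {"point": (160, 40), "color": "red"},
    {"point": (170, 60), "color": "red"},
]
all_colors = heatmap_counts(images, 300, 100, cols=3, rows=1)
yellow_only = heatmap_counts(images, 300, 100, cols=3, rows=1, colors={"yellow"})
```

Passing a set for `colors` also covers the case where several clothing colors are selected at once.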

In this example, the displayed heat map image 85 visualizes the annotation status with a focus on attributes relating to the characteristics of the person included in the teacher image, in particular the color of the clothing, but the heat map image 85 may focus on other attributes of the teacher image. For example, the heat map image 85 may express the presence or absence of overlapping people, that is, whether the teacher image includes a person region in the background or in the foreground. The heat map image 85 may also express the image quality (for example, resolution or the presence or absence of blur) of the person region and background region of the teacher image.

First, as shown in FIG. 16, the user checks the annotation status of the teacher images without restricting their attributes. Specifically, the user operates the button 88 in the attribute selection section 87 that selects all colors (yellow, light blue, green, and red), and checks the annotation status without restricting the clothing color of the person included in the teacher image.

In this example, in the heat map image 85, the cells on the left side of the area image contain few teacher images with the specified attribute, the cells in the center contain many, and the cells on the right side contain none at all.

Next, as shown in FIG. 17, the user checks the annotation status for teacher images with a specific attribute only. In this example, the user operates the button 88 that selects yellow in the attribute selection section 87, and checks the annotation status of the teacher images in which the person's clothing is yellow.

In this example, as in the state shown in FIG. 16, the cells on the left side of the area image in the heat map image 85 contain few teacher images with the specified attribute, the cells in the center contain many, and the cells on the right side contain none at all.

The user therefore operates the annotation work tab 22 (operation section) to return to the annotation work mode screen 21 shown in FIG. 12, and performs annotation work to add the teacher images that are lacking. In this example, the user adds teacher images whose representative point lies in the left or right region of the area image and in which the person's clothing is yellow. That is, the user adds teacher images that contain a person in yellow clothing against the left and right regions of the area image as background. This alleviates the shortage of teacher images whose representative point lies in the left or right region of the area image and whose clothing color attribute is yellow.

Next, as shown in FIG. 18, the user selects, in the teacher image selection section 82, the group to which teacher images have been added (Entrance A_8:00_Teacher image_Addition 1) as the analysis target and checks the annotation status. The user also operates the buttons 88 in the attribute selection section 87 that select the colors other than yellow (light blue, green, and red), and checks the annotation status of the teacher images whose attribute is one of those colors.

In this example, in the heat map image 85, the cells on the left side and in the center of the area image contain many teacher images with the specified attributes, that is, teacher images containing a person whose clothing is light blue, green, or red, but the cells on the right side contain none at all.

The user therefore operates the annotation work tab 22 to return to the annotation work mode screen 21 shown in FIG. 12, and performs annotation work to add the teacher images that are lacking. In this example, the user adds teacher images whose representative point lies in the right region of the area image and in which the person's clothing is light blue, green, or red. That is, the user adds teacher images that contain a person in light blue, green, or red clothing against the right region of the area image as background. This alleviates the shortage of teacher images whose representative point lies in the right region of the area image and in which the person's clothing is light blue, green, or red.

Next, as shown in FIG. 19, the user selects, in the teacher image selection section 82, the group to which teacher images have been added a second time (Entrance A_8:00_Teacher image_Addition 2) as the analysis target and checks the annotation status. The user also operates the button 88 in the attribute selection section 87 that selects all colors (yellow, light blue, green, and red), and checks the annotation status of the teacher images across all of those colors.

In this example, in the heat map image 85, the number of teacher images is uniform across all cells within the range where a person can be present, and the user can confirm that the annotation status has improved.

On the annotation status detailed confirmation mode screen 81, when the user specifies a measurement target area 91 on the heat map image 85 by a mouse drag operation or the like, a list 92 (frequency distribution table) is displayed. For the teacher images whose representative point lies in the specified measurement target area 91, the list 92 shows the distribution of the person's clothing color as an attribute of the teacher images. Specifically, the number of teacher images is displayed for each clothing color (yellow, light blue, green, and red), which lets the user check the annotation status in detail. In this example, the number of teacher images is equal for all colors, and the user can confirm that teacher images with the required attributes are available without bias.
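The frequency distribution table for a measurement target area can be sketched as a filtered tally. The names below (`color_distribution`, the `point` and `color` keys, and the rectangle given as left, top, right, bottom) are assumptions for illustration, as is treating the area as an axis-aligned rectangle.

```python
from collections import Counter

CLOTHING_COLORS = ("yellow", "light blue", "green", "red")


def color_distribution(teacher_images, area, colors=CLOTHING_COLORS):
    """Count teacher images per clothing color inside a rectangular area.

    area: (left, top, right, bottom) in area image coordinates.
    """
    left, top, right, bottom = area
    counts = Counter(
        img["color"]
        for img in teacher_images
        if left <= img["point"][0] < right and top <= img["point"][1] < bottom
    )
    # List every color, including those with zero images, so gaps are visible.
    return {color: counts[color] for color in colors}


# Hypothetical teacher images; the last one lies outside the measured area.
images = [
    {"point": (10, 10), "color": "yellow"},
    {"point": (20, 30), "color": "red"},
    {"point": (25, 35), "color": "red"},
    {"point": (90, 90), "color": "green"},
]
table = color_distribution(images, (0, 0, 50, 50))
```

Reporting zero-count colors explicitly matters here, since a missing color is exactly the kind of bias the table is meant to expose.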

The annotation status detailed confirmation mode screen 81 is also provided with a learning button 88. Once the user has confirmed that teacher images with the required attributes are available without bias, the user operates the learning button 88. Learning processing is then executed based on the generated teacher images, and an image recognition model (machine learning model) is created.

Next, another example of the screen in the annotation status detailed confirmation mode will be described. FIG. 20 is an explanatory diagram showing this other example of the screen.

On the screen 81 shown in FIGS. 16 to 19, a heat map image 85 (visualized image), in which the number of extracted teacher images is expressed by a change in gradation, is displayed in the visualization result display section 84 superimposed on the area image. On the screen 101 shown in FIG. 20, by contrast, a frame image 102 (mark image) representing the range on the area image where the collection status of teacher images is problematic is displayed in the visualization result display section 84 superimposed on the area image.

In this example, the frame image 102 is superimposed on the area image so as to surround the range in which few teacher images with the specified attribute (for example, a person in yellow clothing) have been extracted. A frame image 102 is also superimposed on the area image so as to surround the range in which few teacher images have been extracted when the attributes are not restricted.
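One simple way to derive such a frame is to take the bounding rectangle of the cells whose count falls below a threshold. This is only a sketch of one possible rule: the function name `sparse_frame`, the threshold test, and the single bounding rectangle are assumptions, and the patent does not state how the range is determined.

```python
def sparse_frame(grid, cell_w, cell_h, threshold):
    """Return the bounding rectangle of all cells with fewer than `threshold` images.

    grid: rows x cols list of per-cell teacher image counts.
    Returns (left, top, right, bottom) in pixels, or None if no cell is sparse.
    """
    sparse = [
        (r, c)
        for r, row in enumerate(grid)
        for c, count in enumerate(row)
        if count < threshold
    ]
    if not sparse:
        return None
    rows = [r for r, _ in sparse]
    cols = [c for _, c in sparse]
    return (
        min(cols) * cell_w,
        min(rows) * cell_h,
        (max(cols) + 1) * cell_w,
        (max(rows) + 1) * cell_h,
    )


# Hypothetical 2x3 grid of counts with 100x50 pixel cells; the right column is sparse.
frame = sparse_frame([[5, 5, 0], [5, 5, 1]], cell_w=100, cell_h=50, threshold=2)
```

A single bounding rectangle can also enclose well-covered cells when the sparse cells are scattered, so a fuller version would group adjacent sparse cells and draw one frame per group.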

Note that the frame image 102 may instead indicate a range on the area image where the collection status of teacher images is not problematic. For example, the frame image 102 may be drawn so as to surround a range in which many teacher images with a specific attribute have been extracted. The frame image 102 may also be drawn in different colors according to the number of extracted teacher images.

In this example, the color of the person's clothing is specified as the attribute of the teacher image, and the frame image 102 indicates the region in which few teacher images with clothing of a specific color have been extracted, but an attribute other than clothing color may be specified. For example, overlapping of people may be specified as the attribute, and the frame image 102 may indicate the region in which few teacher images with overlapping people have been extracted.

In this example, the frame image 102 (mark image) representing the range on the area image where the collection status of teacher images is problematic is superimposed on the area image, but the mark image is not limited to the frame image 102. For example, a semi-transparent image with a pattern drawn on it may be superimposed as the mark image on the range where the collection status is problematic. In addition to the frame image 102 (mark image), a comment (not shown) regarding the addition of teacher images may be presented.

The embodiments have been described above as examples of the technology disclosed in the present application. However, the technology of the present disclosure is not limited to them and can also be applied to embodiments in which changes, replacements, additions, omissions, and the like have been made. The components described in the above embodiments may also be combined to form a new embodiment.

The image processing device and image processing method according to the present invention allow a user to easily check the collection status of teacher images visually prior to learning, so that a highly accurate learning model can be created efficiently. They are useful as an image processing device and an image processing method that visualize the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area.

1 Image processing device
13 Processor
21 Annotation work mode screen
22 Tab
61 Annotation status confirmation mode screen
66 List
71 Annotation status confirmation mode screen
72 Statistical graph
81 Annotation status detailed confirmation mode screen
85 Heat map image
101 Annotation status detailed confirmation mode screen
102 Frame image (mark image)

Claims

An image processing device in which a processor executes processing for visualizing the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area, wherein
the processor:
generates a teacher image including a detection target and a background from an area image relating to the monitoring area;
sets, for each teacher image, an attribute relating to a characteristic of the detection target included in the teacher image;
generates a visualized image that visualizes the collection status of the teacher images at each position of the area image, targeting the teacher images having the attribute specified by a user; and
outputs display information in which the visualized image is superimposed on the area image.

The image processing device according to claim 1, wherein the processor generates, as the visualized image, a heat map image representing the collection status of the teacher images at each position of the area image.

The image processing device according to claim 1, wherein the processor generates, as the visualized image, a mark image representing a range on the area image where the collection status of the teacher images is problematic.

The image processing device according to claim 1, wherein the processor generates the teacher image from, as the area image, a real area image captured by a camera or a virtual area image created by CG.

The image processing device according to claim 1, wherein the processor generates the visualized image that visualizes the collection status of the teacher images for each color type relating to a person as the attribute.

The image processing device according to claim 1, wherein the processor generates a statistical graph that visualizes the collection status of the teacher images for each attribute, and outputs the display information including the statistical graph.

The image processing device according to claim 1, wherein the processor:
outputs the display information including a first screen for generating the teacher image and setting the attribute of the teacher image in response to a user's operation; and
outputs the display information including a second screen on which the visualized image is displayed superimposed on the area image and which is provided with an operation section for returning to the first screen.

An image processing method in which a processor executes processing for visualizing the collection status of teacher images for constructing an image recognition model corresponding to a monitoring area, the method comprising:
generating a teacher image including a detection target and a background from an area image relating to the monitoring area;
setting, for each teacher image, an attribute relating to a characteristic of the detection target included in the teacher image;
generating a visualized image that visualizes the collection status of the teacher images at each position of the area image, targeting the teacher images having the attribute specified by a user; and
outputting display information in which the visualized image is superimposed on the area image.