JP2020149565A - Image processing system, image processing method and program

Info

Publication number: JP2020149565A
Application number: JP2019048472A
Authority: JP
Inventors: 哲広船城; Tetsuhiro Funashiro
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-03-15
Filing date: 2019-03-15
Publication date: 2020-09-17
Anticipated expiration: 2039-03-15
Also published as: JP7309392B2

Abstract

To enable highly accurate estimation of the number of persons in an image. SOLUTION: An image processing system includes: division means (202) configured to determine a plurality of division areas by dividing an image; and estimation means (204) configured to estimate the number of persons in each of the plurality of division areas determined by the division means. The division means determines the position and size of each of the plurality of division areas according to the human body size assumed at each location in the image. At least two of the plurality of division areas partially overlap. SELECTED DRAWING: Figure 2

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

Systems are known that measure the number of people in an image by capturing a predetermined area with an imaging device and analyzing the captured image. By detecting congestion in public spaces and grasping the flow of people during congestion, such a system is expected to help relieve congestion at events and guide evacuation during disasters.

As a method for measuring the number of people in an image, Patent Document 1 discloses a method of counting the people detected by a human body detecting means. Patent Document 2 discloses a method of directly estimating the number of people present in a predetermined region of an image by using a recognition model obtained by machine learning.

Patent Document 1: Japanese Unexamined Patent Publication No. 2015-070359
Patent Document 2: Japanese Unexamined Patent Publication No. 2018-22340

The method of Patent Document 1 can estimate the number of people with high accuracy when people are sparsely distributed and each person appears at a predetermined size or larger. However, when people are densely packed and most of each human body is hidden, or when a human body is smaller than the predetermined size, the accuracy of the human body detecting means decreases.

The method of Patent Document 2 can estimate the number of people with high accuracy, even when people are densely packed, provided that the people in the image appear at the size assumed during learning. However, when the size of a person deviates greatly from the assumed size, the accuracy of the estimate decreases.

An object of the present invention is to enable the number of people in an image to be estimated with high accuracy.

According to one aspect of the present invention, there is provided an image processing apparatus comprising: dividing means for determining a plurality of divided regions by dividing an image; and estimating means for estimating the number of people in each of the plurality of divided regions determined by the dividing means, wherein the dividing means determines the position and size of each of the plurality of divided regions based on the human body size assumed at each position in the image, and at least two of the plurality of divided regions partially overlap.

According to the present invention, the number of people in an image can be estimated with high accuracy.

FIG. 1 is a block diagram showing an example of the hardware configuration of the image processing apparatus.
FIG. 2 is a block diagram showing an example of the functional configuration of the image processing apparatus.
FIG. 3 is a diagram showing an example of an image.
FIG. 4 is a diagram showing an example of divided regions.
FIG. 5 is a diagram showing the distribution of the accuracy rate of the estimation.
FIG. 6 is a flowchart showing the processing method of the region division unit.
FIG. 7 is a flowchart showing the processing method of the region division unit.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. However, the present invention is not limited to the following embodiments. Identical or equivalent components, members, and processes shown in the drawings are given the same reference numerals, and redundant description is omitted as appropriate. In each drawing, some members that are not important for the explanation are omitted.

FIG. 1 is a block diagram showing an example of the hardware configuration of the image processing apparatus 100 according to an embodiment of the present invention. As its hardware configuration, the image processing apparatus 100 has a CPU 101, a memory 102, a network interface 103, a display device 104, and an input device 105.

The CPU 101 controls the entire image processing apparatus 100. The memory 102 stores data and programs processed by the CPU 101. The input device 105 is a mouse, buttons, or the like, and accepts user operations. The display device 104 is a liquid crystal display or the like, and displays the results of processing by the CPU 101. The network interface 103 connects the image processing apparatus 100 to a network. The processes of FIGS. 2 to 5 described later are realized by the CPU 101 executing the program stored in the memory 102.

FIG. 2 is a block diagram showing an example of the functional configuration of the image processing apparatus 100. As its functional configuration, the image processing apparatus 100 has an image acquisition unit 201, a distribution acquisition unit 202, a region division unit 203, a number estimation unit 204, a number integration unit 205, and a display unit 206.

The image acquisition unit 201 acquires the image in which the number of people is to be estimated.

The distribution acquisition unit 202 acquires the distribution of the human body size assumed at each pixel position in the image acquired by the image acquisition unit 201. The distribution acquisition unit 202 may acquire this distribution by having the user specify the human body size at several positions in the image and estimating, by interpolation, the average human body size at any position in the image. Alternatively, the distribution acquisition unit 202 may detect human bodies, estimate the average human body size at any position in the image by interpolation from the detection results, and thereby acquire the distribution.

In the interpolation-based estimation, for example, let s be the size of the human body frame at coordinates (x, y) in the image, and assume that s can be expressed by x, y, and one or more unknown parameters. For example, assume s = ax + by + c. In this example, the unknown parameters are a, b, and c. Using the set of human body positions and sizes specified by the user, or the set of human body positions and sizes found by human body detection, the distribution acquisition unit 202 can determine the unknown parameters by statistical processing such as the least squares method.
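By way of illustration only, the least squares fit of s = ax + by + c could be sketched as follows. The function names `fit_size_model` and `body_size_at` and the tuple layout of the samples are assumptions made for this sketch and do not appear in the embodiment; the normal equations are solved with plain Gauss-Jordan elimination so that the example needs no external library.

```python
def fit_size_model(samples):
    """Fit s = a*x + b*y + c to (x, y, s) samples by ordinary least squares.

    `samples` is an iterable of (x, y, s) tuples (hypothetical layout).
    """
    # Accumulate the 3x3 normal equations (A^T A) p = A^T s, with rows [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    ats = [0.0] * 3
    for x, y, s in samples:
        row = (float(x), float(y), 1.0)
        for i in range(3):
            ats[i] += row[i] * s
            for j in range(3):
                ata[i][j] += row[i] * row[j]

    # Solve the augmented system by Gauss-Jordan elimination with partial pivoting.
    m = [ata[i] + [ats[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("samples do not determine a, b and c uniquely")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def body_size_at(params, x, y):
    """Evaluate the fitted model at pixel (x, y)."""
    a, b, c = params
    return a * x + b * y + c
```

At least three samples that do not lie on a single line in the image are needed for a, b, and c to be determined uniquely; otherwise the sketch raises an error rather than returning an arbitrary plane.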

FIG. 3 is a diagram showing an example of an image 300 acquired by the image acquisition unit 201. The image 300 is captured by an imaging device installed at a high place and includes human bodies 301, 302, and 303. When the imaging device is installed at an angle, as in FIG. 3, the human body 303 in the foreground, near the imaging device, appears large, and the human body 301 in the background, far from the imaging device, appears small. The distribution acquisition unit 202 acquires the distribution of the human body size assumed at each pixel position in the image 300 acquired by the image acquisition unit 201.

The region division unit 203 of FIG. 2 determines (generates) a plurality of divided regions by dividing the image acquired by the image acquisition unit 201.

FIG. 4 is a diagram showing an example of divided regions 401 to 403 determined by the region division unit 203. As in FIG. 3, the image 300 includes the human bodies 301 to 303. The human body 301, located in the background far from the imaging device, appears small in the image 300. The region division unit 203 therefore determines the divided region 401 corresponding to the position of the human body 301 based on the size of the human body 301 in the image 300. Similarly, the region division unit 203 determines the divided region 402 corresponding to the position of the human body 302 based on the size of the human body 302, and the divided region 403 corresponding to the position of the human body 303 based on the size of the human body 303. The divided regions 401 to 403 are rectangular, for example square.

Using a regressor obtained by machine learning, the number estimation unit 204 of FIG. 2 sequentially estimates the number of human bodies (number of people) in each of the plurality of divided regions determined by the region division unit 203 in the image acquired by the image acquisition unit 201. As an example of estimating the number of objects in an image, a method is described that uses a regressor taking a small image of a fixed size S as input and outputting the number of objects in that small image. When the objects are human bodies, the regressor is trained in advance with a known machine learning technique such as a support vector machine or deep learning, using as training data a large number of small images in which the positions of human bodies (for example, heads) are known. To improve the accuracy of the regressor, the ratio between the size of the small image and the size of the people shown in it should be nearly constant across the training data.

For each of the plurality of divided regions determined by the region division unit 203, the number estimation unit 204 resizes the image within the divided region to the fixed size S to obtain a small image, inputs the small image to the regressor, and obtains "the positions of the human bodies in the divided region" as the output of the regressor. The number estimation unit 204 takes the number of such positions output by the regressor (that is, the number of human bodies) as the estimated number of people in the divided region. The number of objects estimated by the number estimation unit 204 is not necessarily an integer and may be a real number. The number estimation unit 204 may round the real number to the nearest integer or use the real number as is. By constraining the ratio between the size of the divided region and the size of the objects to be estimated so that it is nearly equal to the corresponding ratio in the training data, the number estimation unit 204 can improve the accuracy rate of the estimation.
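The per-region estimation described above might be sketched as follows. This is only an illustrative outline under several assumptions: the image is a list of pixel rows, a divided region is a `(left, top, side)` square, the resize uses nearest-neighbour sampling (the embodiment does not name a resampling method), and `regressor` is a hypothetical callable that returns a count for a small image of size S.

```python
import math


def crop_and_resize(image, region, fixed_size):
    """Crop a square region and resize it to fixed_size x fixed_size.

    Nearest-neighbour sampling is an assumption of this sketch. Source
    coordinates are clamped so a region may extend past the image border.
    """
    left, top, side = region
    height, width = len(image), len(image[0])
    patch = []
    for j in range(fixed_size):
        src_y = min(height - 1, max(0, top + (j * side) // fixed_size))
        row = []
        for i in range(fixed_size):
            src_x = min(width - 1, max(0, left + (i * side) // fixed_size))
            row.append(image[src_y][src_x])
        patch.append(row)
    return patch


def estimate_region_counts(image, regions, regressor, fixed_size, as_integer=False):
    """Apply the (hypothetical) regressor to each divided region in turn."""
    counts = []
    for region in regions:
        patch = crop_and_resize(image, region, fixed_size)
        count = float(regressor(patch))
        if as_integer:
            # Round half up; Python's built-in round() rounds halves to even.
            count = math.floor(count + 0.5)
        counts.append(count)
    return counts
```

Because every region is resized to the same size S, a region sized in proportion to the local human body size presents people to the regressor at roughly the scale seen during training.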

FIG. 5 is a graph showing an example of the accuracy rate of the estimation by the number estimation unit 204. The horizontal axis shows the ratio of the width in pixels of the human bodies 301 to 303 to the width in pixels of the divided regions 401 to 403. The vertical axis shows the accuracy rate of the estimation by the number estimation unit 204. For example, when the divided region is 100 × 100 pixels and the human body is 50 pixels wide, the ratio on the horizontal axis is 50 pixels / 100 pixels = 50%. The accuracy rate of the estimation generally follows a normal distribution centered on the ratio between the size of the people shown and the size of the small images used as training data for the regressor. The ratio range 502 is the range of this ratio over which the accuracy rate of the estimation is at or above the target value 501 (hereinafter, the ideal ratio range). The number estimation unit 204 can estimate the number of people whose human body size falls within the ratio range 502 with an accuracy rate at or above the target value 501. The width of the ratio range 502, the ideal ratio range, changes according to the target value 501.
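To make the dependence of the ratio range 502 on the target value 501 concrete, the following sketch assumes that the accuracy curve of FIG. 5 is exactly a scaled Gaussian, accuracy(r) = peak_accuracy · exp(−(r − peak_ratio)² / (2σ²)). The embodiment only says the curve is generally normally distributed, so this closed form, and the parameter names `peak_ratio`, `sigma`, and `peak_accuracy`, are assumptions of the example.

```python
import math


def width_ratio(body_width_px, region_width_px):
    """Horizontal-axis value of FIG. 5: body width divided by region width."""
    return body_width_px / region_width_px


def accuracy_model(ratio, peak_ratio, sigma, peak_accuracy):
    """Assumed Gaussian-shaped accuracy curve centred on the training ratio."""
    return peak_accuracy * math.exp(-((ratio - peak_ratio) ** 2) / (2.0 * sigma ** 2))


def ideal_ratio_range(peak_ratio, sigma, peak_accuracy, target):
    """Ratios whose modelled accuracy is at or above `target`."""
    if not 0.0 < target <= peak_accuracy:
        raise ValueError("target must be positive and no higher than the peak accuracy")
    # Solve peak_accuracy * exp(-d^2 / (2 sigma^2)) = target for the half-width d.
    half_width = sigma * math.sqrt(2.0 * math.log(peak_accuracy / target))
    return peak_ratio - half_width, peak_ratio + half_width
```

Under this model a lower target value yields a wider range, and a target equal to the peak accuracy collapses the range to the single training ratio.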

The region division unit 203 shown in FIG. 2 needs to determine divided regions whose size corresponds to the ratio range 502, the ideal ratio range. When the region division unit 203 determines divided regions appropriate to the human body size at each pixel position in the image, the number estimation unit 204 can achieve a higher accuracy rate.

The number integration unit 205 integrates the numbers of people estimated for the divided regions by the number estimation unit 204. For example, the number integration unit 205 sums the numbers of people estimated for the divided regions. The number integration unit 205 may instead sum only the numbers of people for divided regions inside an area of the image set by the user. The number integration unit 205 may also superimpose the number of people on the image acquired by the image acquisition unit 201.
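A minimal sketch of the summation, assuming `(left, top, side)` squares for divided regions and an `(x0, y0, x1, y1)` rectangle for the user-set area, is given below. Treating a region as "inside" only when it lies entirely within the user area is an assumption of this example, and the sketch simply adds the per-region values without any special handling of overlapping regions, which the description does not detail.

```python
def integrate_counts(regions, counts, user_area=None):
    """Sum per-region estimates, optionally restricted to a user-set area.

    regions   : list of (left, top, side) squares (hypothetical layout)
    counts    : estimated number of people for each region
    user_area : optional (x0, y0, x1, y1) rectangle, exclusive at x1 and y1
    """
    total = 0.0
    for (left, top, side), count in zip(regions, counts):
        if user_area is not None:
            x0, y0, x1, y1 = user_area
            inside = (left >= x0 and top >= y0
                      and left + side <= x1 and top + side <= y1)
            if not inside:
                continue
        total += count
    return total
```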

The display unit 206 displays the number of people integrated by the number integration unit 205 on the display device 104. The display unit 206 may display the number of people as a numerical value or may display an image on which the number of people is superimposed. The display unit 206 may also output the number of people to a file or transmit it using a network protocol.

FIG. 6 is a flowchart showing an example of the processing method of the region division unit 203 of FIG. 2. In step S601, based on the human body size distribution acquired by the distribution acquisition unit 202 and the divided regions already determined, the region division unit 203 determines the position at which the assumed human body size is largest, excluding pixels whose human body size falls within the ideal ratio range of an already-determined divided region. The details of step S601 are described with reference to FIG. 7.

FIG. 7 is a flowchart showing the details of step S601 of FIG. 6. In step S701, the region division unit 203 initializes the candidate position of a person in the image and the human body size assumed at the candidate position. In the present embodiment, in step S701 the region division unit 203 initializes the candidate position to the leftmost, uppermost pixel of the image and initializes the human body size assumed at the candidate position to zero.

In step S702, the region division unit 203 repeats steps S703 to S708, described later, for each pixel in the image acquired by the image acquisition unit 201. In the present embodiment, the region division unit 203 first takes the leftmost, uppermost pixel of the image as the target pixel, and then repeats steps S703 to S708 while changing the target pixel in raster-scan order from the upper left to the lower right of the image.

In step S703, based on the human body size distribution acquired by the distribution acquisition unit 202, the region division unit 203 acquires the human body size assumed at the position of the target pixel selected in step S702.

In step S704, if any divided regions have already been determined in step S602 of FIG. 6, the region division unit 203 repeats steps S705 and S706, described later, for each such divided region. If no divided region has been determined yet, the process proceeds to step S707.

In step S705, the region division unit 203 determines whether the target pixel selected in step S702 is contained in the divided region selected in step S704 (hereinafter, the target divided region). If the target pixel is not contained in the target divided region, the process returns to the loop of step S704; if it is contained, the process proceeds to step S706.

In step S706, based on the size of the target divided region of step S705 and the ratio range 502, the ideal ratio range, the region division unit 203 calculates the ideal size range, that is, the range of human body sizes for which the number estimation unit 204 can estimate the number of people in the divided region with an accuracy rate at or above the target value 501.

The region division unit 203 then determines whether the human body size at the target pixel acquired in step S703 falls within the ideal size range calculated in step S706. If it does not, the process returns to the loop of step S704. If it does, the process returns to step S702 and the next target pixel is selected.
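The checks of steps S705 and S706 can be illustrated with the short sketch below. It assumes square divided regions given as `(left, top, side)` and a ratio range given as a `(low, high)` pair of body-width to region-width ratios; multiplying the region side by those ratios is one straightforward reading of how the ideal size range follows from the ratio range 502, and the function names are hypothetical.

```python
def ideal_size_range(region_side, ratio_range):
    """Body sizes (in pixels) a region of this side length can handle (S706)."""
    low, high = ratio_range
    return region_side * low, region_side * high


def pixel_is_covered(x, y, body_size, regions, ratio_range):
    """True if some determined region contains (x, y) and suits its body size."""
    for left, top, side in regions:
        # S705: is the target pixel inside the target divided region?
        if not (left <= x < left + side and top <= y < top + side):
            continue
        # S706: does the assumed body size fall within the ideal size range?
        smallest, largest = ideal_size_range(side, ratio_range)
        if smallest <= body_size <= largest:
            return True
    return False
```

For a 100-pixel region and a ratio range of 40% to 60%, the ideal size range is 40 to 60 pixels, so a pixel inside the region with an assumed body size of 30 pixels still counts as not yet covered.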

After processing all the divided regions in step S704, the region division unit 203 proceeds to step S707. In step S707, the region division unit 203 determines whether the human body size assumed at the target pixel, acquired in step S703, is larger than the human body size at the current candidate position. If it is larger, the process proceeds to step S708. If it is not larger, the process returns to step S702 and the next target pixel is selected.

In step S708, the region division unit 203 sets the current target pixel as the candidate position and sets the human body size at the current target pixel as the human body size assumed at the candidate position.

After the loop of steps S702 to S708 has finished for all pixels in the image, the region division unit 203 determines the resulting candidate position as the position at which the human body size is largest among the pixels other than those whose human body size falls within the ideal size range of an already-determined divided region. The position so determined is referred to as the maximum size position.
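The search of FIG. 7 may be sketched as follows, with comments tying each line to the step it illustrates. The callable `size_at(x, y)` stands in for the human body size distribution, and the `(left, top, side)` region tuples and `(low, high)` ratio range are conventions assumed for the example rather than taken from the embodiment.

```python
def find_max_size_position(width, height, size_at, regions, ratio_range):
    """Return ((x, y), size) of the largest assumed body size not yet covered."""
    low, high = ratio_range
    candidate, candidate_size = (0, 0), 0.0              # S701: initialise
    for y in range(height):                               # S702: raster scan
        for x in range(width):
            size = size_at(x, y)                          # S703: assumed size here
            covered = any(                                # S704 to S706
                left <= x < left + side
                and top <= y < top + side
                and side * low <= size <= side * high
                for left, top, side in regions
            )
            if covered:
                continue                                  # back to S702
            if size > candidate_size:                     # S707
                candidate, candidate_size = (x, y), size  # S708
    return candidate, candidate_size
```

Because the comparison in S707 is strict, the first pixel in raster order wins among equally sized candidates, and a returned size of zero indicates that no uncovered pixel with a positive assumed size remained.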

Next, in step S602 of FIG. 6, the region division unit 203 determines the position and size of a divided region based on the ratio range 502 of FIG. 5, the maximum size position determined in step S601, and the human body size assumed at the maximum size position. Based on the determined position and size, the region division unit 203 then determines a divided region that contains the position determined in step S601. In the present embodiment, the region division unit 203 determines the divided region so that the ratio between the human body size assumed at the maximum size position and the size of the divided region falls within the ideal ratio range and the maximum size position lies at the lower-left corner of the divided region. The divided region may instead be centered on the position determined in step S601, or may have that position at its lower-right, upper-left, or upper-right corner.
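One possible way to carry out step S602 is sketched below. The embodiment requires only that the ratio fall within the ideal ratio range; choosing the midpoint of that range, rounding the side length to whole pixels, and the function name `place_region` are assumptions of this example. Image coordinates are taken with y increasing downward, so the lower-left corner of a region is its leftmost column and bottom row.

```python
def place_region(position, body_size, ratio_range, anchor="lower_left"):
    """Return a square region (left, top, side) for the maximum size position."""
    low, high = ratio_range
    ratio = (low + high) / 2.0            # assumption: aim for the middle of the range
    side = max(1, round(body_size / ratio))
    x, y = position
    if anchor == "lower_left":
        # The given pixel becomes the bottom-left pixel of the region.
        return x, y - side + 1, side
    if anchor == "center":
        return x - side // 2, y - side // 2, side
    raise ValueError("this sketch only implements 'lower_left' and 'center'")
```

The returned square may extend beyond the image border; clipping it to the image, if desired, is left to the caller.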

Note that the maximum size position determined in step S601 may lie inside an already-determined divided region. Specifically, among the pixels in an already-determined divided region A, a pixel whose human body size lies outside the ideal size range of divided region A may become the maximum size position. In this case, in step S602 the region division unit 203 determines a divided region B such that the ratio between the human body size assumed at the maximum size position within divided region A and the size of divided region B falls within the ideal ratio range and the maximum size position lies at the lower-left corner of divided region B. Divided regions A and B determined in this way partially overlap.

Next, in step S603 of FIG. 6, the region division unit 203 determines whether every pixel in the image acquired by the image acquisition unit 201 falls within the ideal size range of one of the divided regions determined in step S602. If every pixel does, the process of FIG. 6 ends; if at least one pixel falls within the ideal size range of none of the divided regions, the process returns to step S601.

As described above, the region division unit 203 determines the position and size of each of the plurality of divided regions based on the human body size acquired in step S703 and the ratio range 502 of FIG. 5. A single divided region may contain both pixels that fall within its ideal size range and pixels that do not. In the present embodiment, however, by making the divided regions partially overlap, the region division unit 203 can ensure that every pixel in the image falls within the ideal size range of one of the divided regions. This improves the accuracy of estimating the number of people from the image.
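Putting steps S601 to S603 together, the overall loop of FIG. 6 could look like the compact sketch below. It is not a definitive implementation: it assumes square regions `(left, top, side)`, a positive body size model `size_at(x, y)`, a region side chosen from the midpoint of the ratio range, and a `max_regions` safety limit that is not part of the embodiment.

```python
def divide_image(width, height, size_at, ratio_range, max_regions=1000):
    """Determine overlapping divided regions until every pixel is covered."""
    low, high = ratio_range
    ratio = (low + high) / 2.0            # assumption: target the middle of the range
    regions = []

    def covered(x, y):
        size = size_at(x, y)
        return any(
            left <= x < left + side and top <= y < top + side
            and side * low <= size <= side * high
            for left, top, side in regions
        )

    while len(regions) < max_regions:
        # S601: uncovered pixel with the largest assumed body size (raster order).
        best = None
        for y in range(height):
            for x in range(width):
                if not covered(x, y) and (best is None or size_at(x, y) > best[0]):
                    best = (size_at(x, y), x, y)
        if best is None:                  # S603: every pixel is covered
            return regions
        size, x, y = best
        side = max(1, round(size / ratio))
        regions.append((x, y - side + 1, side))  # S602: pixel at the lower-left corner
    raise RuntimeError("coverage not reached within max_regions")
```

With a size model that grows toward the bottom of the image, the first regions are large and anchored near the bottom, and later, smaller regions overlap them to pick up the rows whose body sizes the larger regions cannot handle.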

The ratio range 502 is the range of the ratio of the width of a human body to the width of a divided region over which the number estimation unit 204 can estimate with an accuracy rate at or above the target value. The ratio range 502, the ideal ratio range, corresponds to a range of human body sizes that depends on the estimation characteristics of the number estimation unit 204, or to a range of human body sizes for which the number estimation unit 204 can estimate with an accuracy rate at or above the target value.

The region division unit 203 determines the plurality of divided regions so that the human body size acquired in step S703 falls within the ideal size range, that is, the range of human body sizes for which the number estimation unit 204 can estimate the number of people with an accuracy rate at or above the target value.

The region division unit 203 determines appropriate divided regions according to the ratio range 502 over which the number estimation unit 204 can estimate with a high accuracy rate. The number estimation unit 204 estimates the number of people in each divided region determined by the region division unit 203. This allows the number estimation unit 204 to raise the accuracy rate of its estimates.

The user may set an estimation target area in the image. The region division unit 203 then determines the plurality of divided regions by dividing the estimation target area set by the user. In that case, in step S601 the region division unit 203 determines the above position within the estimation target area set by the user. In step S603, the region division unit 203 determines whether every pixel in the estimation target area set by the user is included in one of the divided regions.

The user may also set the target value 501 of FIG. 5. In that case, in steps S706 and S806, the region division unit 203 uses the ratio range 502 corresponding to the target value 501 set by the user.

（その他の実施形態） (Other embodiments)
本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサがプログラムを読み出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。 The present invention can also be realized by processing in which a program that realizes one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that realizes one or more functions.

以上、本発明の好ましい実施形態について説明したが、本発明はこれらの実施形態に限定されず、その要旨の範囲内で種々の変形及び変更が可能である。 Although the preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications and changes can be made within the scope of the gist thereof.

１００画像処理装置、２０１画像取得部、２０２分布取得部、２０３領域分割部、２０４人数推定部、２０５人数統合部、２０６表示部 100 image processing device, 201 image acquisition unit, 202 distribution acquisition unit, 203 area division unit, 204 number of people estimation unit, 205 number of people integration unit, 206 display unit

Claims

An image processing apparatus comprising:
dividing means for determining a plurality of divided regions by dividing an image; and
estimating means for estimating the number of people in each of the plurality of divided regions determined by the dividing means,
wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of the human body assumed at each position in the image, and at least two of the plurality of divided regions partially overlap.

The image processing apparatus according to claim 1, wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of the human body assumed at each position in the image and a range of human body sizes according to the estimation characteristics of the estimating means.

The image processing apparatus according to claim 1 or 2, wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of the human body assumed at the position of each pixel in the image and the range of human body sizes for which the estimating means can estimate with a correct answer rate equal to or higher than a target value.

The image processing apparatus according to any one of claims 1 to 3, wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of the human body assumed at the position of each pixel in the image and the range of the ratio of the human body size to the divided region size for which the estimating means can estimate with a correct answer rate equal to or higher than a target value.

The image processing apparatus according to any one of claims 1 to 4, wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of the human body assumed at the position of each pixel in the image and the range of the ratio of the human body size to the width of the divided region for which the estimating means can estimate with a correct answer rate equal to or higher than a target value.

The image processing apparatus according to any one of claims 1 to 5, wherein the dividing means determines the plurality of divided regions so that the size of the human body assumed at the position of each pixel in the image is included in the range of human body sizes for which the estimating means can estimate the number of people with a correct answer rate equal to or higher than a target value.

The image processing apparatus according to any one of claims 1 to 6, wherein all the pixels in the image are included in any one of the plurality of divided regions.

The image processing apparatus according to any one of claims 1 to 7, wherein the dividing means determines a plurality of divided regions by dividing an estimation target region in the image.

The image processing apparatus according to any one of claims 1 to 8, wherein the estimating means estimates the number of people by using a regressor obtained by machine learning.

An image processing method of an image processing apparatus, comprising:
a division step of determining a plurality of divided regions by dividing an image; and
an estimation step of estimating the number of people in each of the plurality of divided regions determined in the division step,
wherein, in the division step, the position and size of each of the plurality of divided regions are determined based on the size of the human body assumed at each position in the image, and at least two of the plurality of divided regions partially overlap.

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 9.