JP2019101664A - Estimating program, estimating system, and estimating method

Info

Publication number: JP2019101664A
Application number: JP2017230761A
Authority: JP
Inventors: Toshiki Hanya; Yuki Kabayama; Tomomi Suzuki
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-11-30
Filing date: 2017-11-30
Publication date: 2019-06-24
Anticipated expiration: 2037-11-30
Also published as: JP7069667B2

Abstract

To estimate, at low cost or with high accuracy, the congestion status of staying areas of objects in a space.

SOLUTION: For objects detected from each of a plurality of first images capturing a space, objects that are mutually related across the plurality of first images are grouped on the basis of position information of each detected object and the detection status of each object in each of the plurality of first images, and a staying area where each object stays in the space is estimated on the basis of the grouping result.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an estimation program, an estimation system, and an estimation method.

As a method of grasping how crowded a store is, there is a known method that, for example, analyzes how customers use the ticket machine in a store that uses a meal-ticket system or the like, and thereby grasps how many customers are waiting for service.

The above method can grasp how crowded the store is, but it is difficult to grasp the congestion status of the seats, for example, which seats are occupied and which are vacant.

Known methods of grasping the congestion status of seats include installing a pressure sensor in each seat and installing an infrared sensor in the store. In the method using pressure sensors, for example, attaching one pressure sensor to each seat makes it possible to detect whether each seat is occupied and thus to grasp how crowded the store is. In the method using an infrared sensor, for example, an infrared sensor installed in the store detects infrared rays emitted from human bodies, which makes it possible to grasp where people are and how many there are.

However, these methods can incur introduction costs, such as attaching one pressure sensor to each seat or configuring in the analysis software the seat areas where infrared rays from human bodies are difficult to detect, as well as operating costs, such as the operation and maintenance of the analysis software.

Therefore, for example, in the case of a store with 100 or more seats, such as a restaurant, the costs described above can arise for each store, which makes these methods difficult to introduce.

Meanwhile, a method of grasping the congestion status of a store by analyzing video captured by a surveillance camera installed in the store is also known.

Japanese Laid-open Patent Publication No. 2010-86300
Japanese Laid-open Patent Publication No. 2017-156956

However, in the method of analyzing surveillance camera video, the angle of the surveillance camera (for example, the shooting direction within the store and the angle of view), the video resolution, the shooting conditions, and the like greatly affect recognition accuracy. It is also possible, for example, to designate an area to be monitored for congestion, or to monitor by comparing the captured video with an image of vacant seats. To perform such monitoring, however, various conditions, such as information on the seat positions in the store and the settings and installation conditions of the surveillance camera, must be configured in the software.

In the above example, the information on seat positions is, for instance, a seat layout diagram. A seat position (area) is an example of a "staying area" in which an object such as a person stays in a space.

A company with many stores has to configure such software for each store, which can incur a large cost.

In one aspect, an object of the present invention is to estimate, at low cost or with high accuracy, the congestion status of staying areas of objects in a space.

In one aspect, an estimation program may cause a computer to execute the following process. For objects detected from each of a plurality of first images capturing a space, the process may group objects that are mutually related across the plurality of first images, based on position information of each detected object and on the detection status of each object in each of the plurality of first images. The process may also estimate, based on the grouping result, a staying area in which each object stays in the space.

In one aspect, the congestion status of staying areas of objects in a space can be estimated at low cost or with high accuracy.

BRIEF DESCRIPTION OF THE DRAWINGS
A block diagram illustrating a configuration example of a congestion degree estimation system according to one embodiment.
A diagram illustrating an example of operation phases according to one embodiment.
A diagram illustrating an example of object detection by a server according to one embodiment.
A diagram illustrating an example of detection information.
A diagram illustrating an example of static boxes and non-static boxes.
A diagram illustrating an example of similarity determination processing for box coordinates.
A diagram illustrating an example of similarity determination processing for the boxes themselves.
A diagram illustrating an example of seat position estimation processing.
A diagram illustrating an example of a distance metric for hierarchical clustering.
A diagram illustrating an example of a seat position estimation procedure.
(a) to (c) are diagrams illustrating examples of estimated congestion degrees.
A diagram illustrating an example of how the congestion degree is presented.
A flowchart illustrating an operation example of the seat position estimation phase according to one embodiment.
A flowchart illustrating an operation example of the seat position estimation phase according to one embodiment.
A flowchart illustrating an operation example of the congestion status estimation phase according to one embodiment.
A diagram illustrating a hardware configuration example of a computer according to one embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The embodiments described below are merely examples, however, and are not intended to exclude various modifications or applications of techniques not explicitly described below. For example, the embodiments can be implemented with various modifications without departing from their spirit. In the drawings used in the following embodiments, parts given the same reference numerals represent the same or similar parts unless otherwise specified.

[1] One Embodiment
As described above, grasping the congestion status of seats by analyzing video captured by a surveillance camera requires configuring in the software the seat layout information, the settings and installation conditions of the surveillance camera, and so on, and the cost can grow as the number of stores increases.

Therefore, one embodiment describes an estimation system that can estimate seat positions based on video captured by a surveillance camera.

For example, the estimation system according to one embodiment may perform the following processes.

- For objects detected from each of a plurality of first images capturing a space, group objects that are mutually related across the plurality of first images, based on position information of each detected object and on the detection status of each object in each of the plurality of first images.

- Based on the grouping result, estimate a staying area in which each object stays in the space.
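
The two processes above can be pictured with a minimal Python sketch. It is not the grouping method of the embodiment: as a simplified stand-in, it groups boxes greedily by intersection-over-union (IoU) with the running mean box of each group, and treats the number of frames in which a group was detected as its "detection status". The function names, the IoU threshold, and the `min_frames` parameter are all assumptions introduced for illustration.

```python
# Box layout is an assumption: (x_min, y_min, x_max, y_max) in pixels.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mean_box(boxes: list[Box]) -> Box:
    """Coordinate-wise mean of a non-empty list of boxes."""
    n = len(boxes)
    return tuple(sum(b[i] for b in boxes) / n for i in range(4))


def estimate_staying_areas(frames: list[list[Box]],
                           iou_threshold: float = 0.5,
                           min_frames: int = 2) -> list[Box]:
    """Group boxes across frames, then keep groups seen often enough.

    Hypothetical simplification: a box joins the first group whose mean
    box overlaps it by at least `iou_threshold`; otherwise it starts a
    new group. Groups with fewer than `min_frames` boxes are discarded.
    """
    groups: list[list[Box]] = []
    for boxes in frames:
        for box in boxes:
            for group in groups:
                if iou(mean_box(group), box) >= iou_threshold:
                    group.append(box)
                    break
            else:
                groups.append([box])
    return [mean_box(g) for g in groups if len(g) >= min_frames]
```

With three frames in which one head stays in roughly the same place and another appears only once, the sketch returns a single staying area averaged over the three overlapping boxes.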

Through the above processes, the estimation system according to one embodiment can estimate, based on the plurality of first images, the staying area in which each object stays, for example, the seat position where a person is present. A staying area means a specific area where a person stays and is not limited to a seating area such as a chair or a floor cushion; it may include an area (space) where a person stands. An example of a standing area is the standing spot that a store allocates to a customer in a standing-style eatery or in a facility where customers receive service while standing.

Estimating the staying areas in a space based on the first images capturing that space makes it easy to estimate the congestion status of the staying areas, for example, from an image capturing the space and the estimated staying areas.

For example, when estimating the congestion status of seats from surveillance camera video, the work of configuring (or updating) conditions such as seat position information in the software becomes unnecessary, which reduces cost.

Furthermore, when the congestion status of seats is grasped using seat layout information prepared in advance, the captured image is corrected based on the settings and installation conditions of the surveillance camera and then compared with the seat layout diagram, so the congestion status may not be estimated accurately. In contrast, the estimation system according to one embodiment uses the captured images both for estimating the staying areas and for estimating the congestion status. This reduces the influence of the settings and installation conditions of the surveillance camera, so the congestion status can be estimated with high accuracy.

From the above, the estimation system according to one embodiment can estimate, for example, the usage status of the staying areas in a space at low cost or with high accuracy, based on the information on the estimated staying areas.

Surveillance cameras that capture a space are likely to have already been installed and to be in use in stores and the like for purposes such as crime prevention. Using surveillance cameras to estimate the congestion status of seats therefore makes use of existing equipment and keeps costs lower than newly introducing pressure sensors, infrared sensors, ticket machines, or the like for that purpose.

[1-1] Configuration Example of One Embodiment
A configuration example of one embodiment is described below. FIG. 1 is a block diagram illustrating a configuration example of a congestion degree estimation system 1 according to one embodiment. The congestion degree estimation system 1 estimates the congestion degree of staying areas in a facility to be estimated, and is an example of the estimation system described above. As illustrated in FIG. 1, the congestion degree estimation system 1 may include, by way of example, one or more surveillance cameras 2 (a plurality in the example of FIG. 1) and a server 4.

The surveillance cameras 2 and the server 4 may be connected so as to communicate with each other via, for example, a network 5.

The network 5 may be, for example, at least one of the Internet and an intranet, including a LAN (Local Area Network), a WAN (Wide Area Network), or a combination of these. The network 5 may also include a virtual network such as a VPN (Virtual Private Network). The network 5 may be formed of a wired network, a wireless network, or both.

The surveillance camera 2 may capture the space in its shooting direction and acquire an image sequence, for example, a plurality of images arranged in time series such as a moving image (each of which may be referred to as a "frame").

The surveillance camera 2 may be installed, for example, in a store 3 and, as one example, may be arranged so that its imaging range includes the customer seats (preferably all customer seats) located inside and outside the store 3.

To include all seats inside and outside the store 3 in the imaging range, for example, a plurality of surveillance cameras 2 may be installed at positions where they cover each other's blind spots (or where parts of their imaging ranges overlap), or one or more movable surveillance cameras 2 may be installed. A combination of these may also be adopted.

The surveillance camera 2 may transmit the acquired video to the server 4 via the network 5. For example, the surveillance camera 2 may accumulate the acquired video in a recorder or the like (not illustrated) and transmit the data in the recorder to the server 4 at a predetermined timing. Various conditions may be used as the predetermined timing, such as the arrival of a predetermined time of day, the elapse of a predetermined period, the amount of data accumulated in the recorder, or the number of accumulated frames. Alternatively, the surveillance camera 2 may transmit the captured video to the server 4 without going through a recorder. In that case, for example, the surveillance camera 2 may transmit video every one to several frames (in substantially real time).

Examples of the surveillance camera 2 include a box camera, a dome camera, and a network camera. An example of a network camera is an IP (Internet Protocol) camera. In a store 3 where the lighting inside or outside is dim, a night-vision camera such as an infrared camera may be used as the surveillance camera 2. Cameras for various purposes, such as security cameras, monitoring cameras, street cameras, and fixed-point cameras, may be used as the surveillance camera 2.

The server 4 may estimate the seat positions in the store 3 based on the video captured by the surveillance camera 2, and may estimate the congestion degree of the seats in the store 3 based on the estimated seat positions and the video captured by the surveillance camera 2. Details of the server 4 are described later.

The information on the seat congestion degree estimated by the server 4 may be provided to a terminal device 6, for example, as illustrated in FIG. 1. As one example, the server 4 may have a Web server function (for example, an information presentation unit 14 described later) and may use that function to display, on the terminal device 6 via the network 5, a Web page showing the congestion degree of the store 3.

The terminal device 6 is a computer that receives the information on the congestion degree of the store 3 estimated by the server 4. For example, the terminal device 6 may be a computer owned by a prospective customer of the store 3 who is interested in the congestion status of its seats. Examples of the terminal device 6 include various information processing devices such as a PC (Personal Computer), for example a desktop, laptop, or mobile PC, a tablet, a smartphone, and a mobile phone.

The terminal device 6 may perform various kinds of communication with the server 4 via, for example, the network 5, such as transmitting a request to acquire the congestion status of the seats in the store 3 and receiving a response containing the estimation result of the congestion status.

As illustrated in FIG. 1, when the terminal device 6 is a PC, tablet, smartphone, mobile phone, or the like that performs wireless communication, it may connect to the network 5 through a mobile network via a base station 7.

The terminal device 6 may include, by way of example, a means for inputting information (operation requests) from the user, a means for outputting information to the user, and a means for communicating with the server 4.

[1-2] Operation Phases of the Server 4
Next, the operation phases of the server 4 are described.

As described above, the server 4 may estimate the seat positions based on the video captured by the surveillance camera 2. The server 4 may then estimate the (for example, latest) congestion status of the seats based on the estimated seat positions and the (for example, latest) video captured by the surveillance camera 2.

Various image recognition techniques may be used to estimate the seat positions from video. In one embodiment, by way of example, a detection model based on deep learning using a neural network (NN) is assumed to be used.

Although it depends on the services the store 3 provides, the time of day, and so on, a customer is considered to use a seat in the store 3 for ten-odd minutes to several tens of minutes (or for an hour or more). The surveillance camera 2 can capture image data at one frame per second (1 FPS) or more, but analyzing the data of every frame to detect seat changes such as customers sitting down, temporarily leaving, and departing is inefficient.

For this reason, the machine learning only needs to use the image data of some of the frames in the captured image data. For example, in one embodiment, frames taken from the video at a predetermined sampling interval, such as every five minutes, may be input to the neural network.
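
As a rough illustration of this sampling, the following Python sketch picks frame indices from a list of capture timestamps so that consecutive selected frames are at least one sampling interval apart. The function name `sample_frames` and its parameters are assumptions introduced here; the five-minute default only mirrors the example interval in the text.

```python
from datetime import datetime, timedelta


def sample_frames(timestamps: list[datetime],
                  interval: timedelta = timedelta(minutes=5)) -> list[int]:
    """Return indices of frames spaced at least `interval` apart.

    `timestamps` is assumed to be sorted in ascending capture order.
    The first frame is always selected.
    """
    selected: list[int] = []
    next_due = None  # earliest capture time at which the next frame is taken
    for index, captured_at in enumerate(timestamps):
        if next_due is None or captured_at >= next_due:
            selected.append(index)
            next_due = captured_at + interval
    return selected
```

For a camera recording at 1 FPS, this reduces eleven minutes of video (661 frames) to three frames, which is the kind of reduction the passage motivates.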

In one embodiment, an estimation period of, for example, several hours to several days may be provided for estimating the seat positions. The following description assumes, by way of example, an estimation period (predetermined period) of three days.

In view of the above, the method according to one embodiment may be carried out in the phases illustrated in FIG. 2. For example, as illustrated in FIG. 2, a store 3 in which surveillance cameras 2 are already in operation may be configured to transmit the video of the surveillance cameras 2 to the server 4, and a seat position estimation phase of three days may be provided as the predetermined period.

Once the seat positions have been estimated in the seat position estimation phase, the server 4 may start a congestion status estimation phase. In the congestion status estimation phase, the server 4 may estimate the congestion status at a predetermined timing based on the estimated seat positions and the image data sent from the surveillance camera 2.

In this way, the seat position estimation phase may be regarded as the initial setup phase of the method according to one embodiment, and the congestion status estimation phase may be regarded as its normal operation phase.

In the store 3, the layout may be changed, or seats may be moved by customers or employees, for example. Therefore, even during the congestion status estimation phase after the seat position estimation phase has ended, a seat position estimation (update) phase for updating the estimated seat positions may be carried out. Because the seat position estimation (update) phase can be carried out by the same processing as the seat position estimation phase, the following description does not distinguish between them and simply refers to both as the seat position estimation phase.

[1-3] Configuration Example of the Server 4
Next, a configuration example of the server 4 is described. The server 4 is an example of one or more computers installed in the store 3, at the company that owns the store 3, in a data center, or the like. Examples of the server 4 include various physical server devices and/or virtual server devices. At least some of the functions of the server 4 may be implemented using, for example, resources, frameworks, or applications provided by a cloud service. At least some of the functions of the server 4 may also be distributed across, or made redundant on, a plurality of computers.

As illustrated in FIG. 1, the server 4 may include, by way of example, a memory unit 11, a control unit 12, an NN 13, and an information presentation unit 14.

The memory unit 11 stores various kinds of information used in the processing of the server 4. The information stored in the memory unit 11 is described later together with the functions of the server 4. The memory unit 11 may be one or both of a memory, for example a volatile memory such as a RAM (Random Access Memory), and a storage unit, for example a storage device such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive).

The control unit 12 performs control related to the estimation of the seat positions and the estimation of the congestion status. As illustrated in FIG. 1, the control unit 12 may include, by way of example, an information acquisition unit 121, a static box determination unit 122, a seat estimation unit 123, and a congestion degree calculation unit 124.

The information acquisition unit 121 may receive the video data transmitted from the surveillance camera 2 (store 3) and store it in the memory unit 11 as video data 111. In one embodiment, the control unit 12 and the NN 13 may use image data (frames) captured at five-minute intervals and arranged in time series. The image data arranged in time series is an example of the plurality of first images.

For example, the information acquisition unit 121 may extract frames at five-minute intervals from the received video data and store them in the memory unit 11 as the video data 111. Alternatively, the store 3 or the surveillance camera 2 may extract frames at five-minute intervals from the captured video data and transmit that group of frames to the server 4 as the video data.

The video data 111 may include image data of at least N frames (N is an integer, for example, 5). To save storage capacity in the memory unit 11, the information acquisition unit 121 may set an upper limit on the data size, the number of frames, or the like of the video data 111 and, when the upper limit is exceeded, delete frames from the video data 111 starting with the oldest capture date and time.
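
The retention rule can be sketched in Python as a small store that keeps frames ordered by capture time and evicts the oldest ones once a frame-count limit is exceeded. The class name `FrameStore`, the use of numeric capture times, and the choice of a frame-count limit rather than a data-size limit are assumptions made for illustration.

```python
import bisect


class FrameStore:
    """Keeps at most `max_frames` frames, evicting the oldest first."""

    def __init__(self, max_frames: int) -> None:
        if max_frames < 1:
            raise ValueError("max_frames must be at least 1")
        self.max_frames = max_frames
        # (capture_time, frame_id) pairs kept sorted by capture time.
        self._frames: list[tuple[float, str]] = []

    def add(self, capture_time: float, frame_id: str) -> list[str]:
        """Insert a frame and return the ids of any frames evicted."""
        bisect.insort(self._frames, (capture_time, frame_id))
        overflow = max(len(self._frames) - self.max_frames, 0)
        evicted = [fid for _, fid in self._frames[:overflow]]
        del self._frames[:overflow]
        return evicted

    def frame_ids(self) -> list[str]:
        """Frame ids currently held, oldest first."""
        return [fid for _, fid in self._frames]
```

Keeping the list sorted by capture time, rather than by arrival order, means a late-arriving old frame is still the first one to be evicted.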

In the seat position estimation phase, the static box determination unit 122 and the seat estimation unit 123 cooperate with the NN 13 to estimate the seat positions of the store 3 from the video data 111. In other words, the static box determination unit 122 and the seat estimation unit 123 are an example of the grouping unit and the estimation unit described next.

The grouping unit may, for objects detected from each of a plurality of first images capturing a space, group objects that are mutually related across the plurality of first images, based on position information of each detected object and on the detection status of each object in each of the plurality of first images. The estimation unit may estimate, based on the grouping result, a staying area in which each object stays in the space.

The congestion degree calculation unit 124 calculates the congestion degree of the store 3 based on the estimated seat positions. In other words, the congestion degree calculation unit 124 is an example of a congestion degree estimation unit that estimates the congestion degree of the staying areas based on position information of objects detected from a second image capturing the space and on information on the estimated staying areas in the space.
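
One way to picture this calculation is the Python sketch below. It makes two assumptions that this passage does not state: a staying area counts as occupied when the center of any detected box falls inside it, and the congestion degree is the fraction of staying areas that are occupied. The function names and the box layout `(x_min, y_min, x_max, y_max)` are likewise hypothetical.

```python
# Box layout is an assumption: (x_min, y_min, x_max, y_max) in pixels.
Box = tuple[float, float, float, float]


def center(box: Box) -> tuple[float, float]:
    """Center point of a box."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def contains(area: Box, point: tuple[float, float]) -> bool:
    """True if `point` lies inside `area` (edges included)."""
    return area[0] <= point[0] <= area[2] and area[1] <= point[1] <= area[3]


def congestion_degree(staying_areas: list[Box], detected_boxes: list[Box]) -> float:
    """Fraction of staying areas occupied by at least one detection.

    Assumed definition for illustration: occupied areas divided by all
    estimated staying areas; returns 0.0 when no areas are known.
    """
    if not staying_areas:
        return 0.0
    centers = [center(b) for b in detected_boxes]
    occupied = sum(
        1 for area in staying_areas if any(contains(area, c) for c in centers)
    )
    return occupied / len(staying_areas)
```

A detection outside every staying area, such as a person walking down an aisle, does not affect the result under this definition.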

The information presentation unit 14 may have, for example, the function of a Web server or a DB (database) server, and may present the information on the congestion degree calculated by the congestion degree calculation unit 124 to the terminal device 6 in response to a request from the terminal device 6.

An example of the functions and operation of the server 4 is described below.

[1-3-1] Description of the NN
First, the NN 13 is described. The NN 13 detects objects from each of the plurality of first images of the space captured by the surveillance camera 2. For example, the NN 13 may detect objects based on the video data 111, which is stored in the memory unit 11 and contains a plurality of image data items at five-minute intervals.

The NN 13 may be a system that has been trained in advance by machine learning to detect target objects from image data. For example, a detection model that detects a person's head from image data may be applied to the NN 13. A person's head is an example of an object to be detected.

FIG. 3 shows an example of object detection by the NN 13. As illustrated in FIG. 3, the NN 13 may calculate the position and score of each person's head from the input image data. FIG. 3 shows the head positions and scores detected by the NN 13 overlaid on the image of the imaging space 30 captured by the surveillance camera 2. The image of the imaging space 30 may have a coordinate system in which the upper left of the image is (0, 0) and the lower right is (x, y). Here, x may be the width of the image and y may be the height of the image, each expressed as a number of pixels.

The position of a head may be specified, for example, as a rectangular region surrounding the head. Hereinafter, the rectangle surrounding a head may be referred to as a "box". The shape of the box is not limited to a rectangle and may be a circle, an ellipse, a polygon, or the like.

The score is an example of information indicating the certainty (likelihood) that the detected region is a human head, with a maximum value of "1.000". The example of FIG. 3 is a frame captured at 12 o'clock ("12:00").

Using the head of a person (human body) as the object to be detected makes it possible to capture the part of the body that is most likely to appear in the view of the surveillance camera 2, which improves detection accuracy. In addition, detecting objects by deep learning makes it possible to detect a person's head in various postures or states, not only a face seen from the front.

For example, the NN 13 can accurately detect a head in various orientations, such as from the back (back of the head; see reference A in FIG. 3), from above (top of the head, in a looking-down posture; see reference B), and from the side (profile; see reference C).

The NN 13 can also accurately detect a head that is partly hidden by another object, for example, a head occluded by a chair (see reference D), a head wearing a hat (see reference E), or a head occluded by other people or obstacles in a crowded place.

The NN 13 may store the detected head positions and scores in the memory unit 11 as the detection information 112. The detection information 112 may be generated for each frame, or frame identification information may be associated with each object in the detection information 112. The detection information 112 may be a file in any of various formats, for example, a CSV (Comma-Separated Values) file.

FIG. 4 shows an example of the detection information 112. As shown in FIG. 4, the detection information 112 may include, by way of example, the items "No.", "x1", "y1", "x2", "y2", and "score". The example of FIG. 4 shows information on the objects detected in one frame (image).

"No." is information that identifies a detected object, and may further include information that identifies the frame. "x1" and "y1" may indicate the coordinates (x1, y1) of the upper-left vertex of the detected head box, and "x2" and "y2" may indicate the coordinates (x2, y2) of its lower-right vertex. The score may indicate the likelihood of the box defined by (x1, y1) and (x2, y2). The boxes that the NN 13 records in the detection information 112 may be limited to boxes whose score is "0.500" or higher.
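As an illustration of the detection information described above, the following is a minimal sketch that reads one frame of detections from CSV text and keeps only boxes whose score is 0.500 or higher. The column names (`no`, `x1`, `y1`, `x2`, `y2`, `score`), the `Box` class, and the sample rows are assumptions made for this example; the actual file layout of the detection information 112 is not specified beyond the items listed.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    no: str       # identifier of the detected object ("No." item)
    x1: float     # upper-left vertex
    y1: float
    x2: float     # lower-right vertex
    y2: float
    score: float  # likelihood that the box is a head (at most 1.0)


MIN_SCORE = 0.5  # lower bound on the score mentioned in the text


def load_detections(csv_text: str, min_score: float = MIN_SCORE) -> list[Box]:
    """Parse one frame of detection rows and keep boxes with score >= min_score."""
    reader = csv.DictReader(io.StringIO(csv_text))
    boxes = [
        Box(row["no"], float(row["x1"]), float(row["y1"]),
            float(row["x2"]), float(row["y2"]), float(row["score"]))
        for row in reader
    ]
    return [b for b in boxes if b.score >= min_score]


# Hypothetical sample data for a single frame.
SAMPLE = """no,x1,y1,x2,y2,score
1,100,50,140,95,0.998
2,300,200,338,244,0.612
3,10,10,30,30,0.412
"""

detections = load_detections(SAMPLE)
```

In this sample, the third row falls below the score limit and is dropped, so only two boxes are passed on to later processing.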

[1-3-2] Description of static box determination unit
As preprocessing for seat position estimation, the static box determination unit 122 may determine the head positions of people sitting in seats, based on the detection information 112, which is the result of detecting people's heads.

At least part of the following processing by the static box determination unit 122 may be performed by deep learning using an NN. For example, the static box determination unit 122 may include an NN, or the NN 13 may be configured to additionally perform the following processing.

The static box determination unit 122, for example, compares the head detection result for a given frame with the head detection results for the past several frames, and extracts boxes that remain at the same position. Boxes that remain at the same position may include, for example, (the heads of) people seated in the store 3. In other words, the static box determination unit 122 detects, for each image captured by the surveillance camera 2, staying objects, including objects that remain in a staying area.

To determine whether boxes in different frames are identical, a similarity determination on the box coordinates (for example, size and position) and a similarity determination on the boxes themselves may be used.

Hereinafter, an example of how the static box determination unit 122 obtains head position data for "sitting people" will be described. In the following description, the box at the head position of a sitting person is referred to as a "static box".

The static box determination unit 122 may manage the head detection results for past frames as the following two types of data.

(a) Static box data.

(b) Data on non-static boxes, that is, boxes other than static boxes, in the past N frames. Here, N = 5 is assumed.

The static box determination unit 122 may store the static box and non-static box data in the memory unit 11 as the box information 113, for example. As static box data, the box information 113 may include, for each static box, identification information of the boxes in a plurality of frames that were determined to correspond to that static box. As non-static box data, the box information 113 may include identification information of the boxes in a plurality of frames that are not static boxes.

In this way, the static box determination unit 122 is an example of a management unit that manages a predetermined number or more of objects, determined to be present at the same position across a predetermined number or more of first images, in association with one static object.

For the image data at each time, the static box determination unit 122 may execute the following processes (i) to (iii) based on the positions of the detected objects and the detection status of each object in each frame.

In the following description, the "frame at the current time" may be read as the frame under determination. That is, the static box determination unit 122 may compare each object detected in the frame under determination with the N frames preceding that frame. After making the determination, the static box determination unit 122 may take the next frame (5 minutes later) as the frame under determination and compare it with the N frames preceding it.

(i) The static box determination unit 122 compares each box detected in the frame at the current time with the static boxes of the frames at past times, and adds each box determined to be identical to the data of the corresponding static box.

The static box determination unit 122 may acquire information on the boxes detected in the frame at each time from the detection information 112. The method of determining whether static boxes are identical between frames is described later.

(ii) Among the boxes detected in the frame at the current time, the static box determination unit 122 compares those not determined to be static boxes with the non-static box data of the past N frames. When it detects a combination of boxes that satisfies the static box condition, it adds those boxes to the static box data as a new static box.

For example, if M of the N (N = 5) frames contain boxes that can be determined to be identical to one another, the static box determination unit 122 may determine the group of these boxes to be a new static box. M is an integer less than N and may be, for example, "3".

(iii) For a static box that has not been observed (detected) for M consecutive frames, the static box determination unit 122 may write the data of the box to, for example, a file in the storage unit and delete it from the static box data in memory. Such a situation arises, for example, when a seated person leaves the seat.

As described above, the memory unit 11 may be one or both of a memory and a storage unit. Process (iii) therefore means keeping in memory the data of active static boxes, which were detected in M of the past N frames, and saving to the storage unit the data of inactive static boxes, which were not detected in the past M frames.
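Processes (i) to (iii) can be sketched as a per-frame update, as below. This is a rough illustration under several assumptions, not the implementation of the static box determination unit 122: the identity test is passed in as a hypothetical `same(a, b)` predicate, a static box is matched against its most recent member only, a new static box is formed when matches are found in at least M of the past N frames, and the `archive` list stands in for the file in the storage unit.

```python
from collections import deque

N = 5  # number of past frames of non-static boxes that are kept
M = 3  # frames needed to form a static box / misses before it is retired


def update(frame_boxes, statics, recent, same, archive):
    """Apply processes (i) to (iii) to the boxes of one frame, in place."""
    matched = set()
    leftovers = []
    # (i) Attach each box to an existing static box when they are identical.
    for box in frame_boxes:
        hit = next((s for s in statics
                    if id(s) not in matched and same(box, s["members"][-1])), None)
        if hit is None:
            leftovers.append(box)
        else:
            hit["members"].append(box)
            hit["missed"] = 0
            matched.add(id(hit))
    # (iii) Retire static boxes that were not observed for M consecutive frames.
    for s in list(statics):
        if id(s) not in matched:
            s["missed"] += 1
            if s["missed"] >= M:
                statics.remove(s)
                archive.append(s)
    # (ii) Compare the remaining boxes with non-static boxes of the past N frames.
    still_non_static = []
    for box in leftovers:
        hits = []
        for past in recent:
            match = next((p for p in past if same(box, p)), None)
            if match is not None:
                hits.append((past, match))
        if len(hits) >= M:
            for past, match in hits:
                past.remove(match)  # promoted boxes are no longer non-static
            statics.append({"members": [m for _, m in hits] + [box], "missed": 0})
        else:
            still_non_static.append(box)
    recent.append(still_non_static)  # the deque drops frames older than N
```

A caller would create `recent = deque(maxlen=N)` once and call `update` for each 5-minute frame in time order.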

FIG. 5 is a diagram showing an example of static boxes and non-static boxes in three frames at times "10:00", "10:05", and "10:10". As shown in FIG. 5, for example, the boxes hatched with upward-sloping lines and the boxes hatched with downward-sloping lines, each determined to be identical across the three frames, are determined to be static boxes ("static box 1" and "static box 2", respectively). In contrast, the boxes with a white background, for which fewer than three of the past five frames contain a box determined to be identical, are determined to be non-static boxes ("non-static box").

Next, the method of determining whether boxes are identical between frames will be described.

(Similarity determination of box coordinates)
First, the similarity determination on box coordinates will be described. The static box determination unit 122 may compare the coordinates of boxes in different frames and, if the comparison result exceeds a threshold, determine that the two are "different boxes". The coordinates to be compared may include at least one of size and position.

Through the similarity determination on box coordinates, the static box determination unit 122 may exclude from static box determination any pair of boxes whose sizes differ greatly and/or whose positions are far apart (in other words, whose distance is large). This reduces the amount of computation in the subsequent similarity determination on the boxes themselves.

FIG. 6 is a diagram showing an example of the similarity determination on box coordinates. As illustrated in FIG. 6, the static box determination unit 122 may perform at least one of a comparison of box sizes between frames and a comparison of box positions between frames. In the example of FIG. 6, the two frames being compared are the current frame (f) and the immediately preceding frame (f-1).

For example, as shown in the upper left of FIG. 6, the static box determination unit 122 determines whether the size ratio of the two boxes being compared is at or below a threshold, for example "1.3". The box size ratio may be obtained as, for example, (larger box size) / (smaller box size). The box size may be, for example, the average of the width and height of the box, obtained as (width + height) / 2.

Also, for example, as shown in the upper right of FIG. 6, the static box determination unit 122 determines whether the distance between the two boxes being compared is at or below a threshold, for example "average box size × 1.5". The average box size may be the average box size of the two boxes being compared or of the static box. Alternatively, the average box size may be the average size of the boxes detected in the current frame, in the past N frames, or in all frames analyzed so far.

If these two determinations show that at least one of the box size ratio and the distance between the boxes is larger than the corresponding threshold, the static box determination unit 122 may exclude the two boxes from static box determination.
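The two coordinate checks can be written as a small pre-filter, sketched below. The thresholds 1.3 and 1.5 are the example values from the text; measuring the distance between box centers with the Euclidean norm, and the function names, are assumptions made for this sketch.

```python
import math

SIZE_RATIO_MAX = 1.3  # example threshold for the box size ratio
DIST_FACTOR = 1.5     # example multiplier applied to the average box size


def box_size(box):
    """Box size as the average of width and height: (width + height) / 2."""
    x1, y1, x2, y2 = box
    return ((x2 - x1) + (y2 - y1)) / 2


def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def is_candidate_pair(a, b, avg_box_size=None):
    """Return True if boxes a and b pass both the size-ratio and distance checks."""
    size_a, size_b = box_size(a), box_size(b)
    if min(size_a, size_b) <= 0:
        return False  # degenerate box; cannot form a ratio
    ratio = max(size_a, size_b) / min(size_a, size_b)
    if avg_box_size is None:
        # Default: average size of the two boxes being compared.
        avg_box_size = (size_a + size_b) / 2
    distance = math.dist(center(a), center(b))  # assumed: center-to-center distance
    return ratio <= SIZE_RATIO_MAX and distance <= DIST_FACTOR * avg_box_size
```

Pairs rejected here never reach the more expensive image-based comparison, which is the computational saving the text describes.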

(Similarity determination of the boxes themselves)
Next, the similarity determination on the boxes themselves will be described. The static box determination unit 122 may, for example, score the similarity of the cropped images of the two boxes being compared and the overlap between the boxes, and calculate a total score from these scores. The static box determination unit 122 may then determine that the two boxes are identical if the total score is at or above a threshold (for example, "1").

A cropped image is an image of an object (a head, in one embodiment) cut out from the image data. Obtaining the similarity score of cropped images may include, for example, obtaining a similarity score of color histograms and obtaining a similarity score based on pixel differences.

FIG. 7 is a diagram showing an example of the similarity determination on the boxes themselves. As illustrated in FIG. 7, the static box determination unit 122 may calculate the following scores.

(I) Similarity score of color histograms: "sim_color"

An example of a color histogram is a histogram of RGB (Red Green Blue) values.

(II) Similarity score based on pixel differences: "sim_pixel"

The similarity based on pixel differences may be obtained by a method such as NRMSE (Normalized Root Mean-Squared Error). A method of calculating pixel differences by NRMSE is described, for example, at [http://scikit.image.org/docs/dev/api/skimage.measure.html#skimage.measure.compare_nrmse].

(III) Degree of box overlap: IoU (Intersection over Union)

IoU is information indicating the degree of overlap between two boxes (in other words, regions). For example, as shown in FIG. 7, when two regions are overlaid in the coordinate system, IoU is the ratio of the overlapping region (AND region) to the combined region (OR region) of the two regions.
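For axis-aligned rectangular boxes given as (x1, y1, x2, y2), the IoU described above can be computed as in the following sketch. The tuple layout and the function name are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlapping (AND) region; zero if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Area of the combined (OR) region.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

The value is 1.0 for identical boxes and 0.0 for boxes that do not overlap.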

The scores (I) to (III) may be calculated by a neural network.

Based on the calculated scores (I) to (III), the static box determination unit 122 calculates "sim_color * sim_pixel + IoU" as the total score. The total score is an example of an index for determining whether the image components of objects match.

The static box determination unit 122 then determines whether the total score is at or above a threshold (for example, "1"). If the total score is at or above the threshold, the static box determination unit 122 may determine that the two boxes being compared are the same box across frames.

In process (i) above, when comparing a box in the current frame with the static boxes of past frames, the static box determination unit 122 may determine similarity as follows. For example, the static box determination unit 122 may calculate a total similarity score between the box under comparison in the current frame and the static box under comparison in each of a plurality of past frames. The static box determination unit 122 may then calculate the average of the calculated total scores and determine whether the average is larger than a threshold (for example, "1"). If the average is larger than the threshold, the box under comparison may be added to the static box.

In process (ii) above, when comparing a box in the current frame with the non-static boxes of past frames, the static box determination unit 122 may determine similarity as follows. For example, the static box determination unit 122 may calculate the similarity between the box under comparison in the current frame and the non-static boxes in each of a plurality of past frames. The static box determination unit 122 may then determine whether the calculated total similarity score is larger than a threshold (for example, "1") in M (for example, "3") of the past N (for example, "5") frames. If the total score is larger than the threshold in M or more frames, the box under comparison and the non-static boxes it was compared with may be newly determined to be a static box.
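The total score and the two decision rules for processes (i) and (ii) can be sketched as follows. The threshold 1 and M = 3 are the example values from the text; the function names are hypothetical, and the sketch assumes that the individual similarity scores have already been computed elsewhere.

```python
THRESHOLD = 1.0  # example threshold for the total score
M = 3            # example number of frames required in process (ii)


def total_score(sim_color, sim_pixel, iou):
    """Total score: sim_color * sim_pixel + IoU."""
    return sim_color * sim_pixel + iou


def joins_static_box(scores_vs_members):
    """Process (i): the mean total score against the static box's past
    frames must be larger than the threshold."""
    if not scores_vs_members:
        return False
    return sum(scores_vs_members) / len(scores_vs_members) > THRESHOLD


def forms_new_static_box(scores_per_past_frame):
    """Process (ii): the total score must be larger than the threshold in at
    least M of the past N frames (one score per compared past frame)."""
    return sum(1 for s in scores_per_past_frame if s > THRESHOLD) >= M
```

Because the product term and the IoU term are added, a pair needs both a reasonable overlap and a reasonable image similarity to exceed 1 when each score lies in the range 0 to 1, which is an assumption about the score ranges rather than something stated in the text.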

As described above, by determining the similarity of the boxes themselves, the static box determination unit 122 can reduce the probability of falsely detecting a seat position.

For example, locations such as the area in front of the cash register or in front of the toilet are places other than seats where customers may stay. At such locations, boxes may be detected in consecutive frames taken at 5-minute intervals.

In contrast, because the static box determination unit 122 determines the similarity of the boxes themselves, when the boxes being compared are the heads of different people, for example, the total score falls to or below the threshold and the boxes are not determined to be a static box. As a result, even if the heads of different people are detected over multiple frames at a location other than a seat, such as in front of the cash register or the toilet, the possibility that the location is detected as a seat position can be reduced or eliminated, and the seat positions can be estimated accurately.

[1-3-3] Description of seat estimation unit
The seat estimation unit 123 estimates seat positions based on the information on "sitting people" accumulated as the box information 113, and stores the estimated seat position information in the memory unit 11 as the seat information 114.

At least part of the following processing by the seat estimation unit 123 may be performed by deep learning using an NN. For example, the seat estimation unit 123 may include an NN, or the NN 13 may be configured to additionally perform the following processing.

For example, the seat estimation unit 123 may estimate the seat positions and the number of seats using the box information 113, in which the position data (static boxes) of "sitting people" determined by the static box determination unit 122 has been accumulated for a certain period (about three days in one embodiment).

As an example, the seat estimation unit 123 may perform clustering based on the static box information, treat the estimated number of clusters as the number of seats, and treat the position of each cluster as the position information of each seat. The static box information includes the position, the size, and the observed time (frame). The position of each cluster may be calculated based on the average of the positions of the static boxes included in the cluster.

In one embodiment, the number of seats in the store 3 is assumed not to be acquired or set, in order to simplify the initial setup. In this case, the clustering algorithm may be based on hierarchical clustering.

For example, as shown in FIG. 8, the seat estimation unit 123 may start from an initial state in which every observed static box is a separate cluster (see "static boxes") and merge clusters that are close to each other into the same cluster (see "clustering result"). Because a threshold is set on the merging distance, clusters that are far apart remain unmerged, and the seat estimation unit 123 may estimate each remaining cluster to be final seat data (see "estimated seat positions").

Here, a cluster is a group of observed static boxes, and is an example of objects mutually related among a plurality of first images. The static boxes in one cluster can be regarded as observations of different people who sat in the same seat. That is, in one embodiment, the position of each cluster that finally remains is regarded as a position where objects stay, and that position is in turn regarded as a seat position. The seat positions are estimated in this way.

Hereinafter, the distance index used for the hierarchical clustering will be described with reference to FIG. 9. FIG. 9 is a diagram showing an example of the distance index of the hierarchical clustering.

The seat estimation unit 123 may set representative values of each cluster for the distance calculation. In the example of FIG. 9, for the static boxes in each cluster, the seat estimation unit 123 may let (c_x, c_y) be the mean of their center coordinates, size be the mean of their box sizes, and n be the number of static boxes included in the cluster. In addition, for the static boxes included in a cluster, the seat estimation unit 123 may generate a binary array in which each time (frame) with an observation is "1" and each time (frame) without an observation is "0", and hold the array in, for example, the memory unit 11. The array is an example of information indicating the detection status of each object in each of the plurality of first images.
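The representative values of a cluster can be derived as in the sketch below. It assumes, for illustration only, that each static box observation is a tuple (x1, y1, x2, y2, frame_index) and that the box size is (width + height) / 2 as defined earlier; the `Cluster` class and `make_cluster` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    c_x: float      # mean x of the box centers
    c_y: float      # mean y of the box centers
    size: float     # mean box size
    n: int          # number of static boxes in the cluster
    observed: list  # 1 for frames with an observation, 0 otherwise


def make_cluster(static_boxes, num_frames):
    """Build the representative values from (x1, y1, x2, y2, frame_index) tuples."""
    if not static_boxes:
        raise ValueError("a cluster needs at least one static box")
    n = len(static_boxes)
    c_x = sum((b[0] + b[2]) / 2 for b in static_boxes) / n
    c_y = sum((b[1] + b[3]) / 2 for b in static_boxes) / n
    size = sum(((b[2] - b[0]) + (b[3] - b[1])) / 2 for b in static_boxes) / n
    observed = [0] * num_frames
    for b in static_boxes:
        observed[b[4]] = 1  # mark the frame in which this box was observed
    return Cluster(c_x, c_y, size, n, observed)
```

For two boxes observed in frames 0 and 2 of a four-frame period, the `observed` array becomes `[1, 0, 1, 0]`.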

Using the above representative values, the seat estimation unit 123 may obtain the distance index between clusters by calculating the following equations (1) to (3), as illustrated in FIG. 9.

Here, (c_xA, c_yA), size_A, and n_A denote the coordinates, size, and number of static boxes of cluster A, respectively, and (c_xB, c_yB), size_B, and n_B denote the coordinates, size, and number of static boxes of cluster B, respectively.

In equation (3), the part other than "penalty" is the same as the distance index of hierarchical clustering using Ward's method. In one embodiment, adding "penalty" to the distance index of Ward's method improves the accuracy of estimating the seat positions and the number of seats.

For example, when static boxes, in other words "sitting people", are observed at the same time (in the same frame) in both cluster A and cluster B, these clusters are expected to correspond to different seats.

For this reason, equation (3) includes a penalty term that increases the distance between clusters according to the number of times static boxes are observed at the same time (in the same frame) in both cluster A and cluster B. For example, the seat estimation unit 123 may take the AND of the two observation time arrays and apply the SUM (total) of the resulting array to equation (3) as the penalty.
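The penalty term, and one way it could enter a Ward-style distance, can be sketched as follows. The penalty itself (SUM of the element-wise AND) follows the text. The Ward term is the standard increase in squared error for merging two clusters, and adding the penalty with a `weight` factor is purely an assumption of this sketch, since equations (1) to (3) are not reproduced in this text; any normalization by box size is likewise omitted.

```python
def penalty(observed_a, observed_b):
    """Number of frames in which both clusters have an observation (SUM of AND)."""
    return sum(a & b for a, b in zip(observed_a, observed_b))


def ward_term(center_a, n_a, center_b, n_b):
    """Standard Ward merge cost: n_a * n_b / (n_a + n_b) * squared center distance."""
    dx = center_a[0] - center_b[0]
    dy = center_a[1] - center_b[1]
    return (n_a * n_b) / (n_a + n_b) * (dx * dx + dy * dy)


def d_ward(center_a, n_a, observed_a, center_b, n_b, observed_b, weight=1.0):
    """Assumed combination: Ward term plus a weighted penalty (hypothetical form)."""
    return (ward_term(center_a, n_a, center_b, n_b)
            + weight * penalty(observed_a, observed_b))
```

With this form, two clusters that are spatially close but frequently occupied in the same frames receive a larger distance and are therefore less likely to be merged into one seat.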

このように、座席推定部１２３は、静的物体ごとに、複数の第１画像の各々において当該静的物体に対応付けられた物体が存在するか否かを示す情報を生成してよい。そして、座席推定部１２３は、生成した情報に基づき、対応する物体が一の第１画像内に存在する静的物体同士をグループ化の対象から除外してよい。 As described above, the seat estimation unit 123 may generate, for each static object, information indicating whether or not there is an object associated with the static object in each of the plurality of first images. Then, based on the generated information, the seat estimation unit 123 may exclude static objects whose corresponding objects are present in one first image from grouping targets.

座席推定部１２３は、上記（３）式により得られるｄ_ｗａｒｄを距離指標として用い、全ての静的ボックスのデータが別々のクラスタである初期状態から（図１０の「初期状態」参照）、以下の手順により、階層化クラスタリングを実行してよい。 The seat estimation unit 123 uses d _ward obtained by the equation (3) as a distance index, and from the initial state in which the data of all static boxes are separate clusters (see “initial state” in FIG. 10), Hierarchical clustering may be performed according to the following procedure.

For example, the seat estimation unit 123 determines the cluster pair with the smallest inter-cluster distance index (d_ward) and determines whether the stop condition of equation (4) below is satisfied.

上記（４）式の停止条件を満たす場合、座席推定部１２３は、階層化クラスタリングを終了し、残っているクラスタの情報を座席情報１１４としてメモリ部１１に格納する。 If the stop condition of the equation (4) is satisfied, the seat estimation unit 123 ends hierarchical clustering and stores information on remaining clusters as the seat information 114 in the memory unit 11.

一方、上記（４）式の停止条件を満たさない場合、座席推定部１２３は、当該クラスタペアをマージして新しいクラスタとする。 On the other hand, when the stop condition of the equation (4) is not satisfied, the seat estimation unit 123 merges the cluster pair and sets it as a new cluster.

座席推定部１２３は、上記の処理を、停止条件が満たされるか、或いは、クラスタ数が１になるまで繰り返し実行する。 The seat estimation unit 123 repeatedly executes the above process until the stop condition is satisfied or the number of clusters becomes one.

In this way, each cluster that remains unmerged after hierarchical clustering is estimated to be a seat position (see "estimation result of seat positions" in FIG. 10). The seat estimation unit 123 may use the average (c_x, c_y) of the center coordinates of the static boxes in a cluster and the average box size (size) as the position and reference size of the seat.

Among the clusters that remain unmerged, the seat estimation unit 123 may treat a cluster containing an extremely small number of samples (value of n) as noise and exclude it from the seat information. A cluster with an extremely small number of samples may be, for example, a cluster whose sample count is no more than a few percent (illustratively, 2%) of the average sample count of the other clusters.
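The noise rule above can be illustrated with a small sketch. It assumes that "the other clusters" means every remaining cluster except the one under test, and that each cluster is represented only by its sample count n; the function name and the default ratio of 0.02 are illustrative.

```python
def keep_non_noise(sizes, ratio=0.02):
    """Return the indices of clusters that are not treated as noise.

    A cluster is dropped when its sample count is at most `ratio` times
    the mean sample count of all other clusters (assumed interpretation).
    """
    total = sum(sizes)
    kept = []
    for i, n in enumerate(sizes):
        others = len(sizes) - 1
        if others == 0:
            kept.append(i)  # a lone cluster has nothing to compare against
            continue
        mean_others = (total - n) / others
        if n > ratio * mean_others:
            kept.append(i)
    return kept


# The last cluster has 3 samples; 2% of the others' mean (500) is 10.
kept = keep_non_noise([500, 480, 520, 3])
```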

[1-3-4] Description of the Congestion Degree Calculation Unit
In the congestion state estimation phase, the congestion degree calculation unit 124 calculates the degree of congestion using the head detection results and the seat information 114.

For example, the congestion degree calculation unit 124 calculates the distance between each detected human head and each seat position in the seat information 114. Then, taking the pairs of detection result and seat position in ascending order of distance, it assigns to each seat position a head detection result that lies within a threshold distance of that seat position.

This yields information on the seats to which a head detection result has been assigned (occupied seats) and on the seats to which none has been assigned (vacant seats). When the detected position of a head is a predetermined distance or more away from a seat position, the head need not be assigned to that seat position.
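The nearest-first assignment described above can be sketched as follows. This is a minimal sketch under stated assumptions: heads and seats are (x, y) center coordinates, each seat receives at most one head and each head at most one seat, and `max_dist` plays the role of the threshold; these names and the one-to-one constraint are not specified in the text.

```python
import math


def assign_heads_to_seats(heads, seats, max_dist):
    """Greedily match heads to seats in ascending order of distance.

    Returns a dict mapping seat index -> head index for occupied seats.
    """
    pairs = sorted(
        (math.dist(h, s), hi, si)
        for hi, h in enumerate(heads)
        for si, s in enumerate(seats)
    )
    seat_to_head = {}
    used_heads = set()
    for d, hi, si in pairs:
        if d > max_dist:
            break  # every remaining pair is even farther apart
        if si in seat_to_head or hi in used_heads:
            continue
        seat_to_head[si] = hi
        used_heads.add(hi)
    return seat_to_head


seats = [(0, 0), (10, 0), (20, 0)]
heads = [(1, 0), (11, 1), (50, 50)]
occupied = assign_heads_to_seats(heads, seats, max_dist=3)
```

In this toy input the third seat stays vacant and the third head, which is far from every seat, is left unassigned.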

As the head detection results, for example, the object detection results for a second image acquired in the congestion state estimation phase may be used. As one example, the information on the objects detected in the latest frame to be processed (or the frame at the time requested by the terminal device 6), which is included in the detection information 112 stored in the memory unit 11, may be used as the head detection results.

The congestion degree calculation unit 124 may calculate the congestion rate for the frame being processed by computing, for example, "number of occupied seats / total number of seats". The congestion degree calculation unit 124 may also apply thresholds such as "0.3" and "0.6" to the calculated congestion rate to estimate a degree of congestion (a discrete value) such as "vacant" / "normal" / "crowded". Although this example estimates three levels of congestion, the number of levels may be increased or decreased by changing the number of thresholds.
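The rate calculation and thresholding above can be sketched as follows, using the example thresholds 0.3 and 0.6. The function names and the English labels are illustrative; adding or removing thresholds (with one more label than thresholds) changes the number of levels.

```python
from bisect import bisect_right


def congestion_rate(occupied_seats: int, total_seats: int) -> float:
    """Occupied seats divided by all estimated seats."""
    if total_seats <= 0:
        raise ValueError("total_seats must be positive")
    return occupied_seats / total_seats


def congestion_level(rate, thresholds=(0.3, 0.6),
                     labels=("vacant", "normal", "crowded")):
    """Map a rate to a discrete level.

    A rate equal to a threshold falls into the upper level, matching
    "less than 0.3" / "0.3 or more and less than 0.6" / "0.6 or more".
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    return labels[bisect_right(thresholds, rate)]


level = congestion_level(congestion_rate(4, 10))
```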

As described above, when the seat position estimation (update) phase is executed during the congestion state estimation phase, the static box determination unit 122 and the seat estimation unit 123 may estimate (update) the seat information 114 using the object detection results for the second (first) images.

FIG. 11 shows an example of the congestion degree estimated by the congestion degree calculation unit 124. FIG. 11 shows the occupied seats and vacant seats, as detected by the congestion degree calculation unit 124 from an image of the imaging space 30 captured by the monitoring camera 2, overlaid on that image. In FIG. 11, seat positions are indicated by circles, and seat positions at which a person is seated are indicated by squares.

FIG. 11(a) shows a case where the congestion degree is "vacant", for example, where the congestion rate is less than "0.3". FIG. 11(b) shows a case where the congestion degree is "normal", for example, where the congestion rate is "0.3" or more and less than "0.6". FIG. 11(c) shows a case where the congestion degree is "crowded", for example, where the congestion rate is "0.6" or more.

混雑度算出部１２４は、推定した混雑度の情報を、混雑度情報１１５としてメモリ部１１に格納してよい。 The congestion degree calculation unit 124 may store information on the estimated congestion degree as the congestion degree information 115 in the memory unit 11.

As indicated by reference sign A in FIG. 11(a), an object detected in an area for which no seat position information exists need not be assigned a seat, in which case the object need not be taken into account in calculating the congestion rate. This prevents store clerks, customers who are moving about, and the like from affecting the congestion rate. In FIG. 11(a), objects to which no seat is assigned are indicated by double-lined squares.

[1-3-5] Description of the Information Presentation Unit
The information presentation unit 14 may present information on the congestion degree estimated by the control unit 12 to, for example, the terminal device 6.

FIG. 12 is a diagram showing an example of information presented to the terminal device 6 by the information presentation unit 14. For example, as shown in FIG. 12, in response to a request from the terminal device 6, the information presentation unit 14 may present, on the display device 60 of the terminal device 6, the congestion degree of the store 3 based on the congestion degree information 115, which is the estimation result of the congestion degree calculation unit 124.

The presented information may concern a single store 3 or, as shown in FIG. 12, each of a plurality of stores 3. As shown in FIG. 12, the information presentation unit 14 may also make the vacant seat status of the store 3 viewable. The vacant seat information may be, for example, information in which a map is generated from the estimated seat positions and the vacant seats or the occupied seats are identified on the map (for example, by marking them).

As one example, the terminal device 6 may function as a Web client and the information presentation unit 14 as a Web server. In this case, the information presentation unit 14 may generate a Web page in response to a request from the terminal device 6 and return the generated Web page to the terminal device 6. The request may be, for example, an HTTP (Hypertext Transfer Protocol) request to a predetermined URL (Uniform Resource Locator). The Web page may be generated in a markup language such as HTML (HyperText Markup Language).

As described above, according to one embodiment, the server 4 may detect the heads of persons in the video of the monitoring camera 2 using the NN 13, and may learn the positions of the detected heads, for example by deep learning, using the static box determination unit 122 and the seat estimation unit 123.

A place where persons' heads frequently stay, for example, come to rest, can be estimated to be a seat position. Therefore, by determining whether the head of a person detected in the video overlaps an estimated seat position, it is possible to judge whether a seat is occupied and to calculate the congestion rate.

In addition, because the processing by the static box determination unit 122 and the seat estimation unit 123 can estimate seat positions from the video of the monitoring camera 2, the method according to one embodiment can be applied without obtaining a seat map or the number of seats of the store 3 in advance.

For example, with only a simple initial setup at the store 3, such as arranging for the video of the monitoring camera 2 to be sent to the server 4, the method according to one embodiment can easily be applied to the store 3, and the seat congestion rate can be grasped with high accuracy. In particular, when grasping the seat congestion status of many stores 3, no complicated initial setup is needed, so the considerable cost of conventional approaches, such as configuring the imaging area of the monitoring camera 2 store by store, can be reduced.

The method according to one embodiment can be applied or adapted to, for example, the following situations.

・The congestion status of the store 3 is monitored and quantified in real time.
・Managers of facilities and stores 3 grasp the congestion status and use it in their store strategy.
・Users of facilities and stores 3 check the congestion status with a mobile application or the like on the terminal device 6 and decide whether to go to the facility or store 3.

[1-4] Operation Example
Next, an operation example of the congestion degree estimation system 1 configured as described above will be described with reference to FIGS. 13 to 15.

[1-4-1] Operation Example of the Seat Position Estimation Phase
First, an operation example of the seat position estimation phase will be described with reference to FIGS. 13 and 14.

サーバ４では、制御部１２の情報取得部１２１が、店舗３に設置された監視カメラ２の映像データを取得し、映像データ１１１としてメモリ部１１に格納する。 In the server 4, the information acquisition unit 121 of the control unit 12 acquires video data of the monitoring camera 2 installed in the store 3 and stores the video data as the video data 111 in the memory unit 11.

As illustrated in FIG. 13, the NN 13 of the server 4 acquires frames from the video data 111 at regular intervals (for example, every 5 minutes) (step S1) and detects the boxes of persons' heads in each frame (step S2). The NN 13 may store the detected boxes in the memory unit 11 as the detection information 112.

Based on the detection information 112, the static box determination unit 122 of the control unit 12 compares each box in the current (for example, latest) frame with the boxes in the past N (for example, N = 5) frames (step S3).

なお、ボックス間の比較は、ボックスのサイズ及び／又はボックス間の距離、の比較によるスクリーニングと、色ヒストグラム、ピクセル差分、及びＩｏＵ等を用いたボックスの類似性の総合スコア同士の比較と、を含んでよい（図６及び図７参照）。 Note that comparisons between boxes include screening based on comparison of box size and / or distance between boxes, and comparison of overall scores of box similarity using color histograms, pixel differences, IoU, etc. May be included (see FIGS. 6 and 7).

静的ボックス判定部１２２は、比較した現在のフレーム内のボックスが過去のフレーム内のいずれかの静的ボックスと同一か否かを判定する（ステップＳ４）。 The static box determination unit 122 determines whether the box in the compared current frame is identical to any static box in the past frame (step S4).

If the box is identical to one of the static boxes in the past frames (Yes in step S4), the static box determination unit 122 adds the box to the data of that static box in the box information 113 (step S5), and the process proceeds to step S8.

On the other hand, if the box is not identical to any static box in the past frames (No in step S4), the static box determination unit 122 determines whether a non-static box identical to the box exists in M (for example, M = 3) of the past N frames (step S6). If the condition of step S6 is not satisfied (No in step S6), the process proceeds to step S8.

On the other hand, if the condition of step S6 is satisfied (Yes in step S6), the static box determination unit 122 manages the box and the matching non-static boxes as a new static box in the box information 113 (step S7), and the process proceeds to step S8.
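The check in steps S6 and S7 can be sketched as below. This is only an illustration under several assumptions: "in M of the past N frames" is read as "in at least M", `history` holds the boxes of the past N frames, and `is_same` is a hypothetical predicate standing in for the box comparison (screening plus similarity score) described earlier.

```python
from collections import deque


def frames_with_match(box, history, is_same):
    """Number of past frames that contain at least one box matching `box`."""
    return sum(any(is_same(box, past) for past in frame) for frame in history)


def is_new_static_box(box, history, is_same, m=3):
    """True when the box recurs in at least m of the stored past frames."""
    return frames_with_match(box, history, is_same) >= m


# Toy predicate: two boxes match when their centers are within 2 pixels.
def close(a, b):
    return abs(a[0] - b[0]) <= 2 and abs(a[1] - b[1]) <= 2


# Past N = 5 frames; each frame is a list of box centers (assumed layout).
history = deque(
    [[(10, 10)], [(11, 10), (40, 40)], [], [(10, 11)], [(90, 90)]],
    maxlen=5,
)
promoted = is_new_static_box((10, 10), history, close, m=3)
```

Using a `deque` with `maxlen=N` means appending the newest frame automatically discards the oldest one, which keeps the window at N frames.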

ステップＳ８では、過去Ｍフレーム連続で観測のない（すなわち検出されていない）静的ボックスが存在するか否かを判定する。該当する静的ボックスが存在しない場合（ステップＳ８でＮｏ）、処理がステップＳ１０に移行する。 In step S8, it is determined whether there is a static box without observation (ie, not detected) in the past M frames in a row. If there is no corresponding static box (No in step S8), the process proceeds to step S10.

On the other hand, if such a static box exists (Yes in step S8), the static box determination unit 122 writes the static box out to a file and deletes it from memory (step S9). The static box determination unit 122 then determines whether all the boxes in the current frame have been compared (step S10).

If not all the boxes have been compared (No in step S10), the process returns to step S3, and the comparison is performed for the boxes not yet compared. On the other hand, if all the boxes have been compared (Yes in step S10), the static box determination unit 122 determines whether a predetermined period, for example three days, has elapsed since accumulation of the box information 113 began (step S11).

所定期間が経過していない場合（ステップＳ１１でＮｏ）、処理がステップＳ１に移行し、次に入力されるフレームについて、上記の処理を実行する。 When the predetermined period has not elapsed (No in step S11), the process proceeds to step S1, and the above process is performed on a frame to be input next.

一方、所定期間が経過した場合（ステップＳ１１でＹｅｓ）、処理が図１４のステップＳ１２に移行する。なお、この場合、ＮＮ１３及び静的ボックス判定部１２２は、座席位置推定（更新）フェーズとして、入力される映像データ１１１に基づき処理を継続してもよい。 On the other hand, when the predetermined period has elapsed (Yes in step S11), the process proceeds to step S12 in FIG. In this case, the NN 13 and the static box determination unit 122 may continue the processing based on the input video data 111 as a seat position estimation (update) phase.

In step S12 of FIG. 14, the seat estimation unit 123 treats each static box as an initial cluster and sets representative values for each cluster. For example, the seat estimation unit 123 may set the average (c_x, c_y) of the center coordinates of the static boxes in the cluster, the average box size "size", the number "n" of static boxes in the cluster, an array indicating whether a static box of the cluster was observed, and so on. In the array, for example, "1" may be set for each frame in which a static box was observed and "0" for each frame in which none was observed.

Next, the seat estimation unit 123 calculates the inter-cluster distance index (d_ward) for each pair of clusters (step S13). The distance index (d_ward) may be calculated by equation (3). The calculation may include the penalty obtained as the SUM (total) of the AND of the two clusters' arrays (see FIG. 9).

The seat estimation unit 123 determines the cluster pair with the smallest distance index (d_ward) (step S14) and determines whether that cluster pair satisfies the stop condition (step S15). The stop condition may be evaluated using equation (4) (see FIG. 9).

クラスタペアが停止条件を満たさない場合（ステップＳ１５でＮｏ）、座席推定部１２３は、当該クラスタペアをマージして新しいクラスタとする（ステップＳ１６）。 If the cluster pair does not satisfy the stop condition (No in step S15), the seat estimation unit 123 merges the cluster pair into a new cluster (step S16).

座席推定部１２３は、残りクラスタ数が１であるか否かを判定する（ステップＳ１７）。残りクラスタ数が１ではない場合（ステップＳ１７でＮｏ）、処理がステップＳ１３に移行する。 The seat estimation unit 123 determines whether the number of remaining clusters is one (step S17). If the number of remaining clusters is not 1 (No in step S17), the process proceeds to step S13.

ステップＳ１５でクラスタペアが停止条件を満たす場合（ステップＳ１５でＹｅｓ）、又は、ステップＳ１７で残りクラスタ数が１の場合（ステップＳ１７でＹｅｓ）、処理がステップＳ１８に移行する。 If the cluster pair satisfies the stop condition in step S15 (Yes in step S15), or if the number of remaining clusters is 1 in step S17 (Yes in step S17), the process proceeds to step S18.

ステップＳ１８では、座席推定部１２３は、クラスタ内の静的ボックス数が他のクラスタの静的ボックス数の所定割合（例えば２％）以下であるクラスタを削除する。 In step S18, the seat estimation unit 123 deletes a cluster whose static box number in the cluster is equal to or less than a predetermined ratio (for example, 2%) of the static box number of other clusters.

The seat estimation unit 123 then outputs the remaining clusters as the seat information 114, for example by storing them in the memory unit 11 (step S19), and the processing of the seat position estimation phase ends.

[1-4-2] Operation Example of the Congestion State Estimation Phase
Next, an operation example of the congestion state estimation phase will be described with reference to FIG. 15.

As illustrated in FIG. 15, the NN 13 acquires frames from the video data 111 at regular intervals (for example, every 5 minutes) (step S21) and detects the boxes of persons' heads in each frame (step S22). The NN 13 may store the detected boxes in the memory unit 11 as the detection information 112. When the seat position estimation (update) phase is also performed, the processing of steps S1 and S2 of FIG. 13 in that phase may be replaced by the processing of steps S21 and S22 (that is, steps S1 and S2 may be omitted).

制御部１２の混雑度算出部１２４は、検出情報１１２における現在（例えば最新）のフレームのボックスと、座席情報１１４における座席位置との間の距離を算出する（ステップＳ２３）。なお、距離は、ボックス及び座席位置のそれぞれの中心座標或いは平均座標間の距離であってよい。 The congestion degree calculation unit 124 of the control unit 12 calculates the distance between the box of the current (for example, the latest) frame in the detection information 112 and the seat position in the seat information 114 (step S23). The distance may be a distance between center coordinates or average coordinates of the box and the seat position.

次いで、混雑度算出部１２４は、距離の近い順に、座席位置に対して、距離が閾値以下のボックスを割り当てる（ステップＳ２４）。 Next, the congestion degree calculation unit 124 assigns a box having a distance equal to or less than a threshold to the seat position in the order of closeness (step S24).

The congestion degree calculation unit 124 then counts the seats to which a box has been assigned, in other words the occupied seats (step S25), and calculates "number of occupied seats / total number of seats" as the congestion rate (step S26).

混雑度算出部１２４は、算出した混雑率が、第１の閾値の一例である“０．３”未満か否かを判定する（ステップＳ２７）。混雑率が“０．３”未満の場合（ステップＳ２７でＹｅｓ）、混雑度算出部１２４は、混雑度を「空いている」と推定し（ステップＳ２８）、混雑度情報１１５をメモリ部１１に格納して、処理がステップＳ３２に移行する。 The congestion degree calculation unit 124 determines whether the calculated congestion rate is less than “0.3” which is an example of the first threshold (step S27). If the congestion rate is less than "0.3" (Yes in step S27), the congestion degree calculating unit 124 estimates the congestion degree as "vacant" (step S28), and transmits the congestion degree information 115 to the memory unit 11. After storing, the process proceeds to step S32.

混雑率が“０．３”以上の場合（ステップＳ２７でＮｏ）、混雑度算出部１２４は、混雑率が、第２の閾値の一例である“０．６”未満か否かを判定する（ステップＳ２９）。混雑率が“０．６”未満（且つ“０．３”以上）の場合（ステップＳ２９でＹｅｓ）、混雑度算出部１２４は、混雑度を「通常」と推定し（ステップＳ３０）、混雑度情報１１５をメモリ部１１に格納して、処理がステップＳ３２に移行する。 When the congestion rate is “0.3” or more (No in step S27), the congestion degree calculation unit 124 determines whether the congestion rate is less than “0.6” which is an example of the second threshold ( Step S29). When the congestion rate is less than "0.6" (and "0.3" or more) (Yes in step S29), the congestion degree calculating unit 124 estimates the congestion degree as "normal" (step S30), and the congestion degree The information 115 is stored in the memory unit 11, and the process proceeds to step S32.

On the other hand, if the congestion rate is "0.6" or more (No in step S29), the congestion degree calculation unit 124 estimates the congestion degree to be "crowded" (step S31), stores the congestion degree information 115 in the memory unit 11, and the process proceeds to step S32.

ステップＳ３２では、サーバ４の情報提示部１４が、混雑度情報１１５に基づき、店舗３の混雑度の情報を提示する。例えば、情報提示部１４は、端末装置６からの要求に応じて、店舗３の混雑度の情報を含むＷｅｂページを端末装置６に表示させてよい。 In step S32, the information presentation unit 14 of the server 4 presents information on the degree of congestion of the store 3 based on the degree of congestion information 115. For example, the information presentation unit 14 may cause the terminal device 6 to display a Web page including information on the degree of congestion of the store 3 in response to a request from the terminal device 6.

以上により、混雑状況推定フェーズの処理が終了する。 Thus, the process of the congestion state estimation phase ends.

[1-5] Hardware Configuration Example
Next, a hardware configuration example of the server 4 according to one embodiment will be described with reference to FIG. 16. In the following, the hardware configuration of a computer 10 is described, taking the computer 10 as an example of the server 4. The terminal device 6 may have the same hardware configuration as the server 4.

As illustrated in FIG. 16, the computer 10 may include, by way of example, a processor 10a, a memory 10b, a storage unit 10c, an IF (Interface) unit 10d, an I/O (Input / Output) unit 10e, and a reading unit 10f.

プロセッサ１０ａは、種々の制御や演算を行なう演算処理装置の一例である。プロセッサ１０ａは、コンピュータ１０内の各ブロックとバス１０ｉで相互に通信可能に接続されてよい。プロセッサ１０ａとしては、例えば、ＣＰＵ、ＭＰＵ、ＤＳＰ、ＡＳＩＣ、ＦＰＧＡ等の集積回路（ＩＣ）が用いられてもよい。なお、ＣＰＵはCentral Processing Unitの略称であり、ＭＰＵはMicro Processing Unitの略称である。ＤＳＰはDigital Signal Processorの略称であり、ＡＳＩＣはApplication Specific Integrated Circuitの略称であり、ＦＰＧＡはField-Programmable Gate Arrayの略称である。 The processor 10a is an example of an arithmetic processing unit that performs various controls and calculations. The processor 10a may be communicably connected to each block in the computer 10 by a bus 10i. As the processor 10a, for example, an integrated circuit (IC) such as a CPU, an MPU, a DSP, an ASIC, or an FPGA may be used. CPU is an abbreviation for Central Processing Unit, and MPU is an abbreviation for Micro Processing Unit. DSP is an abbreviation for Digital Signal Processor, ASIC is an abbreviation for Application Specific Integrated Circuit, and FPGA is an abbreviation for Field-Programmable Gate Array.

メモリ１０ｂは、種々のデータやプログラム等の情報を格納するハードウェアの一例である。メモリ１０ｂとしては、例えばＲＡＭ等の揮発性メモリが挙げられる。 The memory 10 b is an example of hardware that stores information such as various data and programs. Examples of the memory 10 b include volatile memory such as RAM.

記憶部１０ｃは、種々のデータやプログラム等の情報を格納するハードウェアの一例である。記憶部１０ｃとしては、例えばＨＤＤ等の磁気ディスク装置、ＳＳＤ等の半導体ドライブ装置、不揮発性メモリ等の各種記憶装置が挙げられる。不揮発性メモリとしては、例えば、フラッシュメモリ、ＳＣＭ（Storage Class Memory）、ＲＯＭ（Read Only Memory）等が挙げられる。 The storage unit 10 c is an example of hardware that stores information such as various data and programs. Examples of the storage unit 10 c include various storage devices such as a magnetic disk device such as an HDD, a semiconductor drive device such as an SSD, and a non-volatile memory. Examples of the non-volatile memory include a flash memory, a storage class memory (SCM), and a read only memory (ROM).

The memory unit 11 shown in FIG. 1 may be implemented by, for example, a storage area of at least one of the memory 10b and the storage unit 10c of the server 4.

また、記憶部１０ｃは、コンピュータ１０の各種機能の全部若しくは一部を実現するプログラム１０ｇを格納してよい。プロセッサ１０ａは、記憶部１０ｃに格納されたプログラム１０ｇをメモリ１０ｂに展開して実行することにより、図１に示すサーバ４としての機能を実現できる。 In addition, the storage unit 10 c may store a program 10 g for realizing all or part of various functions of the computer 10. The processor 10a can realize the function as the server 4 illustrated in FIG. 1 by developing the program 10g stored in the storage unit 10c in the memory 10b and executing the program 10g.

ＩＦ部１０ｄは、ネットワーク５との間の接続及び通信の制御等を行なう通信インタフェースの一例である。例えば、ＩＦ部１０ｄは、ＬＡＮ、或いは、光通信（例えばＦＣ（Fibre Channel；ファイバチャネル））等に準拠したアダプタを含んでよい。例えば、プログラム１０ｇは、当該通信インタフェースを介してネットワーク５からコンピュータ１０にダウンロードされ、記憶部１０ｃに格納されてもよい。 The IF unit 10 d is an example of a communication interface that performs connection with the network 5 and control of communication. For example, the IF unit 10d may include an adapter conforming to a LAN or optical communication (for example, FC (Fibre Channel; Fiber Channel)). For example, the program 10g may be downloaded from the network 5 to the computer 10 via the communication interface and stored in the storage unit 10c.

The I/O unit 10e may include one or both of an input unit, such as a mouse, a keyboard, or operation buttons, and an output unit, such as a touch panel display, a monitor such as an LCD (Liquid Crystal Display), a projector, or a printer.

読取部１０ｆは、記録媒体１０ｈに記録されたデータやプログラムの情報を読み出すリーダの一例である。読取部１０ｆは、記録媒体１０ｈを接続可能又は挿入可能な接続端子又は装置を含んでよい。読取部１０ｆとしては、例えば、ＵＳＢ（Universal Serial Bus）等に準拠したアダプタ、記録ディスクへのアクセスを行なうドライブ装置、ＳＤカード等のフラッシュメモリへのアクセスを行なうカードリーダ等が挙げられる。なお、記録媒体１０ｈにはプログラム１０ｇが格納されてもよく、読取部１０ｆが記録媒体１０ｈからプログラム１０ｇを読み出して記憶部１０ｃに格納してもよい。 The reading unit 10 f is an example of a reader that reads information of data and programs recorded on the recording medium 10 h. The reading unit 10 f may include a connection terminal or device capable of connecting to or inserting the recording medium 10 h. Examples of the reading unit 10 f include an adapter conforming to a USB (Universal Serial Bus) or the like, a drive device for accessing a recording disk, a card reader for accessing a flash memory such as an SD card, and the like. The program 10g may be stored in the recording medium 10h, and the reading unit 10f may read the program 10g from the recording medium 10h and store the program 10g in the storage unit 10c.

Examples of the recording medium 10h include non-transitory recording media such as magnetic/optical disks and flash memory. Examples of magnetic/optical disks include a flexible disk, a CD (Compact Disc), a DVD (Digital Versatile Disc), a Blu-ray disc, and an HVD (Holographic Versatile Disc). Examples of flash memory include a USB memory and an SD card. Examples of the CD include a CD-ROM, a CD-R, and a CD-RW. Examples of the DVD include a DVD-ROM, a DVD-RAM, a DVD-R, a DVD-RW, a DVD+R, and a DVD+RW.

The hardware configuration of the computer 10 described above is an example. Accordingly, in the server 4, the hardware in the computer 10 may be increased or decreased (for example, by adding or deleting arbitrary blocks), divided, or integrated in any combination, and buses may be added or deleted, as appropriate.

[2] Others
The technology according to the embodiment described above can be modified or changed as follows.

For example, the functional blocks of the server 4 shown in FIG. 1 may be merged in any combination or may be divided. Likewise, the functional blocks of the control unit 12 shown in FIG. 1 may be merged in any combination or may be divided.

Furthermore, the processor 10a of the computer 10 shown in FIG. 16 is not limited to a single processor or a single-core processor, and may be a multiprocessor or a multi-core processor.

The embodiment described above explained processing that calculates the degree of congestion from input store video, using a method that estimates staying areas based on the recognition results of a human-head detection model applied to the video. However, the estimated staying-area information is not limited to this use and may be used for various kinds of analysis or estimation.
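As a minimal illustrative sketch of the congestion calculation referred to above, the following Python code assumes that each estimated seat area is an axis-aligned rectangle and that each detected head is reduced to a center point. Neither representation is specified in this section, and the names `seat_occupied` and `congestion_rate` are hypothetical. Under these assumptions, the congestion rate is taken to be the fraction of seat areas that contain at least one detected head.

```python
def seat_occupied(seat, heads):
    """Return True if any head center lies inside the seat rectangle.

    seat  -- (x1, y1, x2, y2) rectangle; an assumed representation
    heads -- iterable of (x, y) head center points; also an assumption
    """
    x1, y1, x2, y2 = seat
    return any(x1 <= hx <= x2 and y1 <= hy <= y2 for hx, hy in heads)


def congestion_rate(seats, heads):
    """Fraction of seat areas that contain at least one detected head."""
    if not seats:
        raise ValueError("at least one seat area is required")
    occupied = sum(1 for seat in seats if seat_occupied(seat, heads))
    return occupied / len(seats)
```

For instance, four seat rectangles with heads detected inside two of them would give a rate of 0.5 under this definition; a head outside every rectangle does not affect the result.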

In the embodiment, the seat estimation unit 123 performs hierarchical clustering, but the clustering is not limited to this. For example, when the number of seats provided in the store 3 is known in advance, the seat estimation unit 123 may perform non-hierarchical clustering so as to create the same number of clusters as the number of seats.

The information on the number of seats may be acquired, for example, by the server 4 or an administrator from the Internet or the like via the network 5, and stored in a database, for example the memory unit 11. In this case, the information acquisition unit 121 may function as an example of an acquisition unit that acquires the information on the number of seats in the imaging space 30 from the memory unit 11 when the seat estimation unit 123 performs clustering.

The number of seats in the store 3 is often published on the Internet, for example on the website of the facility or the store 3 or on an introduction site, and is therefore easy to obtain. Accordingly, seat positions can be estimated accurately based on the correct number of seats while suppressing an increase in initial-setup cost.
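The non-hierarchical clustering with a known seat count can be sketched as follows. The text does not name a specific algorithm; this sketch uses Lloyd's k-means, one common non-hierarchical method, purely as an assumption, with the number of clusters `k` set to the seat count. The function name `kmeans` and the representation of positions as 2-D `(x, y)` tuples are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means over 2-D points; returns (centroids, labels).

    points -- list of (x, y) tuples, e.g. positions of static head boxes
    k      -- number of clusters, here assumed to equal the seat count
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels
```

Each resulting centroid could then be treated as one estimated seat position. Fixing `k` in advance removes the need to choose a cut-off for a hierarchical dendrogram, which is the benefit the text attributes to knowing the seat count.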

Furthermore, in the embodiment, the server 4 estimates the seat positions and the degree of congestion based on video data from one monitoring camera 2, but the configuration is not limited to this.

For example, when acquiring video data from each of a plurality of monitoring cameras 2 provided in the store 3, the server 4 may estimate the seat positions and the degree of congestion for each monitoring camera 2. The server 4 may then estimate the degree of congestion in the store 3 by, for example, averaging the degrees of congestion (or congestion rates) estimated for the individual monitoring cameras 2.
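The averaging over cameras can be sketched as below. The sketch assumes, as an illustration only, that each camera reports a count of occupied seats and a count of estimated seats, and that a per-camera congestion rate is their ratio; the names `camera_rate` and `store_congestion` and the camera identifiers are hypothetical.

```python
def camera_rate(occupied, seats):
    """Congestion rate for one camera: occupied seats over estimated seats."""
    if seats <= 0:
        raise ValueError("a camera must cover at least one seat")
    return occupied / seats


def store_congestion(per_camera):
    """Unweighted mean of the per-camera congestion rates.

    per_camera -- dict mapping a camera id to (occupied, seats)
    """
    if not per_camera:
        raise ValueError("at least one camera estimate is required")
    rates = [camera_rate(occupied, seats) for occupied, seats in per_camera.values()]
    return sum(rates) / len(rates)
```

With `{"cam_a": (2, 4), "cam_b": (3, 4)}` the per-camera rates are 0.5 and 0.75, giving a store-level value of 0.625. An unweighted mean treats every camera equally; weighting by seat count would be a different design choice that the text does not discuss.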

[3] Supplementary Notes
The following supplementary notes are further disclosed with regard to the above embodiment.

(Supplementary Note 1)
An estimation program that causes a computer to execute a process comprising:
grouping, for objects detected from each of a plurality of first images obtained by imaging a space, objects that are related to one another across the plurality of first images, based on position information of each detected object and a detection status of each object in each of the plurality of first images; and
estimating, based on a result of the grouping, a staying area in which each object stays in the space.

(Supplementary Note 2)
The estimation program according to Supplementary Note 1, causing the computer to further execute a process of estimating a degree of congestion of the staying area based on position information of an object detected from a second image obtained by imaging the space and on information on the estimated staying area in the space.

(Supplementary Note 3)
The estimation program according to Supplementary Note 1 or 2, causing the computer to further execute a process of managing, in association with one static object, a predetermined number or more of objects determined to be present at the same position across a predetermined number or more of the first images,
wherein the grouping groups, as the objects related to one another, static objects that satisfy a condition regarding distance across the plurality of first images.

(Supplementary Note 4)
The estimation program according to Supplementary Note 3, wherein the managing manages, in association with the one static object, the predetermined number or more of objects whose image components are determined to match across the predetermined number or more of the first images.

(Supplementary Note 5)
The estimation program according to Supplementary Note 3 or 4, wherein the grouping includes:
generating information that indicates the detection status of each object in each of the plurality of first images and that indicates, for each static object, whether an object associated with that static object is present in each of the plurality of first images; and
excluding from the grouping, based on the generated information, static objects whose corresponding objects are present in the same first image.

(Supplementary Note 6)
The estimation program according to any one of Supplementary Notes 1 to 5, wherein the grouping performs hierarchical clustering.

(Supplementary Note 7)
The estimation program according to any one of Supplementary Notes 1 to 5, causing the computer to further execute a process of acquiring the number of staying areas in the space from a database,
wherein the grouping performs non-hierarchical clustering so as to create as many groups as the acquired number of staying areas.

(Supplementary Note 8)
The estimation program according to any one of Supplementary Notes 1 to 7, wherein
the object is a human head,
the staying area is a seat area, and
the program causes the computer to further execute a process of detecting the object, which is a human head, from each of the plurality of first images using a neural network that detects feature amounts of an image.

(Supplementary Note 9)
An estimation system comprising:
a grouping unit that groups, for objects detected from each of a plurality of first images obtained by imaging a space, objects that are related to one another across the plurality of first images, based on position information of each detected object and a detection status of each object in each of the plurality of first images; and
an estimation unit that estimates, based on a result of the grouping, a staying area in which each object stays in the space.

(Supplementary Note 10)
The estimation system according to Supplementary Note 9, further comprising a congestion degree estimation unit that estimates a degree of congestion of the staying area based on position information of an object detected from a second image obtained by imaging the space and on information on the estimated staying area in the space.

(Supplementary Note 11)
The estimation system according to Supplementary Note 9 or 10, further comprising a management unit that manages, in association with one static object, a predetermined number or more of objects determined to be present at the same position across a predetermined number or more of the first images,
wherein the grouping unit groups, as the objects related to one another, static objects that satisfy a condition regarding distance across the plurality of first images.

(Supplementary Note 12)
The estimation system according to Supplementary Note 11, wherein the management unit manages, in association with the one static object, the predetermined number or more of objects whose image components are determined to match across the predetermined number or more of the first images.

(Supplementary Note 13)
The estimation system according to Supplementary Note 11 or 12, wherein the grouping unit:
generates information that indicates the detection status of each object in each of the plurality of first images and that indicates, for each static object, whether an object associated with that static object is present in each of the plurality of first images; and
excludes from the grouping, based on the generated information, static objects whose corresponding objects are present in the same first image.

(Supplementary Note 14)
The estimation system according to any one of Supplementary Notes 9 to 13, wherein the grouping unit performs the grouping by hierarchical clustering.

(Supplementary Note 15)
The estimation system according to any one of Supplementary Notes 9 to 13, further comprising an acquisition unit that acquires the number of staying areas in the space from a database,
wherein the grouping unit performs the grouping by non-hierarchical clustering so as to create as many groups as the acquired number of staying areas.

(Supplementary Note 16)
The estimation system according to any one of Supplementary Notes 9 to 15, wherein
the object is a human head,
the staying area is a seat area, and
the estimation system further comprises a neural network that detects feature amounts of an image and that detects the object, which is a human head, from each of the plurality of first images.

(Supplementary Note 17)
An estimation method comprising:
grouping, for objects detected from each of a plurality of first images obtained by imaging a space, objects that are related to one another across the plurality of first images, based on position information of each detected object and a detection status of each object in each of the plurality of first images; and
estimating, based on a result of the grouping, a staying area in which each object stays in the space.

(Supplementary Note 18)
The estimation method according to Supplementary Note 17, further comprising estimating a degree of congestion of the staying area based on position information of an object detected from a second image obtained by imaging the space and on information on the estimated staying area in the space.

(Supplementary Note 19)
The estimation method according to Supplementary Note 17 or 18, further comprising managing, in association with one static object, a predetermined number or more of objects determined to be present at the same position across a predetermined number or more of the first images,
wherein the grouping groups, as the objects related to one another, static objects that satisfy a condition regarding distance across the plurality of first images.

(Supplementary Note 20)
The estimation method according to Supplementary Note 19, wherein the managing manages, in association with the one static object, the predetermined number or more of objects whose image components are determined to match across the predetermined number or more of the first images.
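
As an illustration only, the exclusion described in Supplementary Notes 5 and 13 can be sketched in Python. The sketch assumes that each static object carries a 2-D position and a per-image presence flag, and it uses a simple greedy centroid-merging procedure with a distance threshold as a stand-in for the clustering step. The procedure, the threshold `max_distance`, and the names `co_occur` and `group_static_objects` are hypothetical and are not taken from the embodiment.

```python
import math
from itertools import combinations


def co_occur(presence_a, presence_b):
    """True if both static objects have a detection in at least one common image."""
    return any(a and b for a, b in zip(presence_a, presence_b))


def group_static_objects(positions, presence, max_distance):
    """Greedily merge the closest groups, never joining co-occurring objects.

    positions    -- list of (x, y) per static object
    presence     -- list of per-image flags per static object (same order)
    max_distance -- assumed distance condition for merging two groups
    """
    groups = [[i] for i in range(len(positions))]

    def centroid(group):
        return (
            sum(positions[i][0] for i in group) / len(group),
            sum(positions[i][1] for i in group) / len(group),
        )

    def allowed(g1, g2):
        # Objects seen in the same image are distinct, so they must not be grouped.
        return not any(co_occur(presence[i], presence[j]) for i in g1 for j in g2)

    while True:
        candidates = [
            (math.dist(centroid(groups[a]), centroid(groups[b])), a, b)
            for a, b in combinations(range(len(groups)), 2)
            if allowed(groups[a], groups[b])
        ]
        candidates = [c for c in candidates if c[0] <= max_distance]
        if not candidates:
            return groups
        _, a, b = min(candidates)
        groups[a] = groups[a] + groups[b]  # a < b, so deleting b keeps a valid
        del groups[b]
```

The rationale for the exclusion is that two static objects detected in the same image cannot be the same object observed at different times, so they are kept in separate groups even when they are close together.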

DESCRIPTION OF SYMBOLS
1 Congestion degree estimation system
2 Monitoring camera
3 Store
4 Server
5 Network
6 Terminal device
10 Computer
11 Memory unit
111 Video data
112 Detection information
113 Box information
114 Seat information
115 Congestion degree information
12 Control unit
121 Information acquisition unit
122 Static box determination unit
123 Seat estimation unit
124 Congestion degree calculation unit
13 NN (neural network)
14 Information presentation unit

Claims

An estimation program that causes a computer to execute a process comprising:
grouping, for objects detected from each of a plurality of first images obtained by imaging a space, objects that are related to one another across the plurality of first images, based on position information of each detected object and a detection status of each object in each of the plurality of first images; and
estimating, based on a result of the grouping, a staying area in which each object stays in the space.

The estimation program according to claim 1, causing the computer to further execute a process of estimating a degree of congestion of the staying area based on position information of an object detected from a second image obtained by imaging the space and on information on the estimated staying area in the space.

The estimation program according to claim 1 or 2, causing the computer to further execute a process of managing, in association with one static object, a predetermined number or more of objects determined to be present at the same position across a predetermined number or more of the first images,
wherein the grouping groups, as the objects related to one another, static objects that satisfy a condition regarding distance across the plurality of first images.

The estimation program according to claim 3, wherein the managing manages, in association with the one static object, the predetermined number or more of objects whose image components are determined to match across the predetermined number or more of the first images.

The estimation program according to claim 3 or 4, wherein the grouping includes:
generating information that indicates the detection status of each object in each of the plurality of first images and that indicates, for each static object, whether an object associated with that static object is present in each of the plurality of first images; and
excluding from the grouping, based on the generated information, static objects whose corresponding objects are present in the same first image.

The estimation program according to any one of claims 1 to 5, wherein the grouping performs hierarchical clustering.

The estimation program according to any one of claims 1 to 5, causing the computer to further execute a process of acquiring the number of staying areas in the space from a database,
wherein the grouping performs non-hierarchical clustering so as to create as many groups as the acquired number of staying areas.

The estimation program according to any one of claims 1 to 7, wherein
the object is a human head,
the staying area is a seat area, and
the program causes the computer to further execute a process of detecting the object, which is a human head, from each of the plurality of first images using a neural network that detects feature amounts of an image.

An estimation system comprising:
a grouping unit that groups, for objects detected from each of a plurality of first images obtained by imaging a space, objects that are related to one another across the plurality of first images, based on position information of each detected object and a detection status of each object in each of the plurality of first images; and
an estimation unit that estimates, based on a result of the grouping, a staying area in which each object stays in the space.

An estimation method comprising:
grouping, for objects detected from each of a plurality of first images obtained by imaging a space, objects that are related to one another across the plurality of first images, based on position information of each detected object and a detection status of each object in each of the plurality of first images; and
estimating, based on a result of the grouping, a staying area in which each object stays in the space.