JP7336835B2

JP7336835B2 - Attribute determination device, attribute determination system, and attribute determination method

Info

Publication number: JP7336835B2
Application number: JP2018090459A
Authority: JP
Inventors: 望仲尾
Original assignee: Konica Minolta Inc
Current assignee: Konica Minolta Inc
Priority date: 2018-05-09
Filing date: 2018-05-09
Publication date: 2023-09-01
Anticipated expiration: 2038-05-09
Also published as: JP2019197353A

Description

The present invention relates to an attribute determination device, an attribute determination system, and an attribute determination method that determine the attributes of a person based on the image of each frame in which the person is photographed from above.

Conventionally, attribute recognition methods using face images are widely known as techniques for accurately recognizing attributes such as gender and age from an image of a person captured by a camera (see, for example, Patent Documents 1 and 2). However, in a store, for example, recognizing from face images the attributes of a person who comes into contact with a certain product group requires installing a camera for acquiring face images at each sales floor and each display shelf. In this case, the number of installed cameras increases, which not only raises costs but also increases the psychological burden on the users at whom the cameras are pointed.

Therefore, from the viewpoint of reducing cost and the psychological burden on users, one conceivable method is to install a camera on the ceiling of the store or the like, photograph people from above over a wide area, and recognize their attributes from the acquired images. For example, in the system of Patent Document 3, a main camera installed on the ceiling or a wall photographs the entire store, and the image portion of each person is analyzed. Gender and age group are estimated by comprehensively considering the customer's face, hairstyle, clothes, height, accessories, shoes, and so on. When identification is difficult, a face image captured by an auxiliary camera installed near a display shelf on the sales floor is analyzed in detail to estimate gender and age group.

Patent Document 1: Japanese Patent Application Laid-Open No. 2010-61465 (see claims 1 to 2, FIG. 1, FIG. 5, etc.)
Patent Document 2: Japanese Patent Application Laid-Open No. 2008-176689 (see claims 1 to 2, FIG. 1, etc.)
Patent Document 3: Japanese Patent Application Laid-Open No. 2007-74330 (see claims 1 to 3, paragraphs [0021] and [0023], FIG. 1, etc.)

However, accurately determining a person's attributes from an image captured by a camera installed on a ceiling or the like is difficult. This is readily understood from the fact that, in Patent Document 3, the attributes are estimated using face images acquired by the auxiliary camera when they are difficult to identify.

For example, in a store, a person walks, stops, crouches down to pick up products from the lower part of a display shelf, and stands up. When the position, action, posture, and so on of a person change over time in this way, the result of recognizing the person's attributes from the captured image may differ from frame to frame. For example, a frame in which the person is walking may yield the recognition result "man in his 20s", while a frame in which the same person is standing still may yield the recognition result "man in his 40s". This is because, when a person is walking, the person's image is blurred within the captured image, which lowers the accuracy of attribute recognition based on that image.

Similarly, for example, a frame in which the person is crouching may yield the recognition result "man in his 20s", while a frame in which the same person has stood up may yield the recognition result "man in his 40s". This is because, when a crouching person is photographed from above, the acquired image hides part of the person's body and image data of the whole body cannot be obtained, which lowers the accuracy of attribute recognition based on that image.

Consider the case where the attributes of the same person are determined across frames based on the image of each frame in which the person is photographed from above. If an event that lowers attribute recognition accuracy (an event that affects attribute recognition), such as "walking" or "crouching", continues for several frames (for example, m frames, where m is a natural number of 2 or more), then even if the attributes can be recognized with high accuracy in subsequent frames once the event has ceased, the low-accuracy recognition results from those several frames may cause an erroneous attribute to be determined over all frames (for example, M frames, where M is a natural number of 3 or more and larger than m). For example, a person who is actually a "man in his 40s" may be determined to be a "man in his 20s" (this example is described as a comparative example in the embodiments below). In that case, the person's attributes cannot be said to have been determined accurately.

The present invention has been made to solve the above problem. Its object is to provide an attribute determination device, an attribute determination system, and an attribute determination method that, when determining the attributes of the same person across frames based on the image of each frame in which the person is photographed from above, can accurately determine the person's attributes as a whole (over all frames in total) even when an event that affects attribute recognition continues over several frames.

An attribute determination device according to one aspect of the present invention determines the attributes of a person based on the image of each frame in which the person is photographed from above, and includes: a person recognition unit that recognizes, for each frame and based on the image of that frame, person information indicating information about the person's image within the image, the attributes of the person, and an event that affects recognition of the attributes; a person identification unit that determines, based on the person information of each frame, whether the person's images in different frames are images of the same person; and an attribute determination unit that, for a person whose images in different frames have been determined to be images of the same person, obtains for each frame and for each class of the recognized attributes attribute information in which the recognition result of the event is taken into account in the recognition result of the attributes, and determines the attributes of the person based on the result of integrating the attribute information over a plurality of frames for each class.

An attribute determination system according to another aspect of the present invention includes the above attribute determination device and a management server connected to the attribute determination device via a communication line. The management server includes a storage unit that stores information sent from the attribute determination device, and the information includes the attributes determined by the attribute determination unit of the attribute determination device.

An attribute determination method according to still another aspect of the present invention determines the attributes of a person based on the image of each frame in which the person is photographed from above, and includes: a person recognition step of recognizing, for each frame and based on the image of that frame, person information indicating information about the person's image within the image, the attributes of the person, and an event that affects recognition of the attributes; a person identification step of determining, based on the person information of each frame, whether the person's images in different frames are images of the same person; and an attribute determination step of, for a person whose images in different frames have been determined to be images of the same person, obtaining for each frame and for each class of the recognized attributes attribute information in which the recognition result of the event is taken into account in the recognition result of the attributes, and determining the attributes of the person based on the result of integrating the attribute information over a plurality of frames for each class.

Even when an event that affects attribute recognition continues over several frames, the adverse effect of those frames on the final attribute determination is reduced, and the person's attributes can be determined accurately as a whole (over all frames in total).

FIG. 1 is a block diagram showing the schematic configuration of an attribute determination system according to one embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of the attribute determination device included in the attribute determination system.
FIG. 3 is an explanatory diagram showing an example of a person's image and a person rectangle in the image of an arbitrary frame.
FIG. 4 is an explanatory diagram schematically showing various positions of a person's image within an image.
FIG. 5 is an explanatory diagram schematically showing the image of the n-th frame and the image of the (n+1)-th frame.
FIG. 6 is a block diagram showing the detailed configuration of the management server included in the attribute determination system.
FIG. 7 is a flowchart showing the flow of processing in the attribute determination system.
FIG. 8 is an explanatory diagram showing an example of the information obtained for the first frame in the attribute determination system.
FIG. 9 is an explanatory diagram showing an example of the information obtained for the second frame in the attribute determination system.
FIG. 10 is an explanatory diagram showing an example of the information obtained for the third frame in the attribute determination system.
FIG. 11 is an explanatory diagram schematically showing the images of some of a plurality of temporally different frames obtained in an attribute determination system according to another embodiment of the present invention.
FIG. 12 is a flowchart showing the flow of processing in the attribute determination system.
FIG. 13 is an explanatory diagram showing an example of the information obtained for the first frame in the attribute determination system.
FIG. 14 is an explanatory diagram showing an example of the information obtained for the second frame in the attribute determination system.
FIG. 15 is an explanatory diagram showing an example of the information obtained for the third frame in the attribute determination system.
FIG. 16 is an explanatory diagram schematically showing the images of some of a plurality of temporally different frames obtained in an attribute determination system according to still another embodiment of the present invention.
FIG. 17 is a flowchart showing the flow of processing in the attribute determination system.
FIG. 18 is an explanatory diagram showing an example of the information obtained for the first and second frames in the attribute determination system.
FIG. 19 is an explanatory diagram showing an example of the information obtained for the third frame in the attribute determination system.
FIG. 20 is an explanatory diagram schematically showing the image of an arbitrary frame in which two persons are photographed from above in an attribute determination system according to still another embodiment of the present invention.
FIG. 21 is an explanatory diagram showing the person rectangles that define the positions of the images of the two persons within the image.
FIG. 22 is a flowchart showing the flow of processing in the attribute determination system.
FIG. 23 is an explanatory diagram showing an example of the information obtained for the first and second frames in the attribute determination system.
FIG. 24 is an explanatory diagram showing an example of the information obtained for the third frame in the attribute determination system.

Embodiments of the present invention are described below with reference to the drawings. The present invention is not limited to the following content.

<Embodiment 1>
[Attribute determination system]
FIG. 1 is a block diagram showing the schematic configuration of an attribute determination system 1 of this embodiment. The attribute determination system 1 includes an imaging unit 2, an attribute determination device 3, and a management server 4. The imaging unit 2 and the attribute determination device 3 are communicably connected via a communication line N1, and the attribute determination device 3 and the management server 4 are communicably connected via a communication line N2. The communication lines N1 and N2 are each configured by selecting as appropriate from, for example, a cable, an optical fiber, a wired LAN (Local Area Network), a wireless LAN, and an Internet line. The imaging unit 2, the attribute determination device 3, and the management server 4 are described in detail below.

(Imaging unit)
The imaging unit 2 is a camera installed, for example, on the ceiling or a wall of a store; it photographs people in the store from above and acquires the images of temporally different frames. The number of imaging units 2 installed in the store is not particularly limited and may be one, or two or more. The image data acquired by at least one imaging unit 2 is output to the attribute determination device 3 via the communication line N1.

(Attribute determination device)
The attribute determination device 3 is a terminal device that determines the attributes of a person based on the images acquired by the imaging unit 2 and input to the attribute determination device 3, that is, the image of each frame in which the person is photographed from above. It is configured by, for example, a personal computer. The attribute determination device 3 may be installed in the same store as the imaging unit 2, or may be installed outside the store so that it can communicate with the imaging unit 2.

FIG. 2 is a block diagram showing the detailed configuration of the attribute determination device 3. The attribute determination device 3 includes a recognition processing unit 11, a storage unit 12, an input unit 13, a display unit 14, a communication unit 15, and a control unit 16.

The storage unit 12 is a memory that stores an operation program for operating each unit of the attribute determination device 3, data obtained by processing in the recognition processing unit 11 (for example, information about the determined attributes of a person), and the like. It is configured by, for example, a hard disk. The storage unit 12 may also be configured by selecting as appropriate from recording media such as a RAM (Random Access Memory), a ROM (Read Only Memory), an optical disk, a magneto-optical disk, and a nonvolatile memory.

The input unit 13 is configured by, for example, a keyboard, a mouse, a touch pad, or a touch panel, and accepts various instruction inputs from, for example, the operator (user) of the attribute determination device 3. The display unit 14 is a device that displays various kinds of information, including the processing results of the recognition processing unit 11 (for example, the determined attributes), and is configured by, for example, a liquid crystal display device. The communication unit 15 is an interface including input/output ports for communicating with the outside. When performing wireless communication with the outside, the communication unit 15 includes an antenna, a transmission/reception circuit, a modulation circuit, a demodulation circuit, and the like. The control unit 16 is configured by a central processing unit (CPU) that controls the operation of each unit of the attribute determination device 3, and operates according to the operation program stored in the storage unit 12.

The recognition processing unit 11 is an arithmetic device that performs the processing involved in determining a person's attributes, and is configured by, for example, a GPU (Graphics Processing Unit), an arithmetic device specialized for real-time image processing. The recognition processing unit 11 may instead be configured by the same CPU as the control unit 16 or by a separate CPU. The recognition processing unit 11 has a person recognition unit 11a, a person identification unit 11b, and an attribute determination unit 11c. That is, the GPU constituting the recognition processing unit 11 functions as the person recognition unit 11a, the person identification unit 11b, and the attribute determination unit 11c.

The person recognition unit 11a recognizes, for each frame and based on the image of that frame in which a person is photographed from above, person information indicating information about the person's image within the image, the attributes of the person, and an event that affects recognition of the attributes.

The person information includes, for example, the position of the person's image within the image, the types and proportions of the colors contained in the person's image, and the size of the person's image. The position of the person's image within the image is defined, for example, by a person rectangle that contains the image. FIG. 3 shows an example of a person's image P and a person rectangle R in the image 2a of an arbitrary frame. In FIG. 3, reference numeral 5 denotes a product shelf in the store, and reference numeral 6 denotes an aisle in the store. The person's image P in the image 2a can be recognized (that is, the presence and position of a person's image can be recognized) by, for example, processing based on known image processing software that identifies the shape and position of the person's head, the positions of the joints, and so on. When the person recognition unit 11a recognizes the person's image P in the image 2a, it can set a person rectangle R surrounding the person's image P in the image 2a, and this person rectangle R defines the position of the person's image P within the image 2a.

To define the position of the person's image P within the image 2a accurately, the person rectangle R is desirably a rectangle (frame) that surrounds the person's image P while touching at least part of it, but it does not necessarily have to touch the person's image P. For example, there may be a predetermined (small) margin between the person rectangle R and the person's image P. The term "rectangle" here covers not only oblongs but also squares, which are a special case of a rectangle.
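As a minimal sketch of the person rectangle described above, the following Python computes the tight rectangle around a set of pixels belonging to the person's image P and optionally widens it by a margin. The function name `person_rectangle`, the pixel-list input, and the `margin` parameter are illustrative assumptions; the description does not specify how the rectangle is computed.

```python
from typing import Iterable, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive pixel coordinates


def person_rectangle(pixels: Iterable[Tuple[int, int]], width: int, height: int,
                     margin: int = 0) -> Rect:
    """Return the rectangle enclosing all person pixels (hypothetical helper).

    With margin=0 the rectangle touches the person's image; a positive margin
    leaves a small gap. The result is clipped to the image bounds.
    """
    pts = list(pixels)
    if not pts:
        raise ValueError("no person pixels given")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    left = max(min(xs) - margin, 0)
    top = max(min(ys) - margin, 0)
    right = min(max(xs) + margin, width - 1)
    bottom = min(max(ys) + margin, height - 1)
    return (left, top, right, bottom)
```

Clipping to the image bounds is a design choice of this sketch, made so that a margin never produces coordinates outside the image 2a.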

The attributes of a person mentioned above are at least one of the person's age and gender. The person recognition unit 11a may recognize the person's age in one-year steps, in broad age groups such as 20s, 30s, and 40s, or in categories such as child, adult, and elderly. The gender is either male or female.

In this embodiment, the person recognition unit 11a includes a neural network capable of machine learning such as deep learning, and can use this neural network to recognize a person's attributes. More specifically, when the image data of each frame is input to a neural network trained in advance for attribute recognition, the neural network outputs the attribute recognition result together with a score indicating the likelihood of that result. This yields, for example, the recognition result that the person is in his 40s and male, together with a score (for example, 0.8) indicating its likelihood. The score is a value between 0 and 1; the closer it is to 1, the more likely the recognition result is to be correct.

The events that affect attribute recognition include, for example, the position of the person's image within the image. FIG. 4 schematically shows various positions of the person's image P within the image 2a. At position (1) in the image 2a, part of the person's image P is blocked by a hanging signboard 7 in the store, and at position (2), the person is directly below the imaging unit 2, so the person's whole body is difficult to capture in the image 2a. In these cases, image data of the person needed for attribute recognition is missing, which affects image-based recognition of the person's attributes. That is, the positions of the person's image P shown in (1) and (2) are events that affect recognition of the person's attributes. On the other hand, at position (3), where the person's image P is near the edge of the image 2a, the person's whole body is captured, so sufficient image data for attribute recognition exists and the position is optimal for image-based attribute recognition. That is, position (3) is not an event that affects recognition of the person's attributes. However, if the person's image P is closer to the image edge than position (3), it may not fit within the image 2a and image data of the person may be lost; in that case, the position can be an event that affects recognition of the person's attributes.

The person identification unit 11b determines, based on the person information (time-series information) of each frame recognized by the person recognition unit 11a, whether the person's images (within the person rectangles) in different frames are images of the same person. FIG. 5 schematically shows the image 2a of the n-th frame and the image 2a of the (n+1)-th frame (where n is a natural number). For example, the person identification unit 11b compares the position of the person rectangle R_n in the image 2a of the n-th frame with the position of the person rectangle R_(n+1) in the image 2a of the (n+1)-th frame, and determines whether the difference between these positions (the amount of movement of the person rectangle) is within a predetermined range (determined according to the frame rate). In this way, it can determine whether the person rectangles R_n and R_(n+1) represent the same person, that is, whether the person's image P_n in the person rectangle R_n and the person's image P_(n+1) in the person rectangle R_(n+1) are images of the same person.
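The position comparison can be sketched in Python as follows. This is only an illustration under stated assumptions: the description says the allowed range depends on the frame rate but gives no formula, so the rule "maximum speed in pixels per second divided by frame rate" and the default of 300 pixels per second are hypothetical, as are all function names.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def center(rect: Rect) -> Tuple[float, float]:
    """Center point of a person rectangle."""
    left, top, right, bottom = rect
    return ((left + right) / 2.0, (top + bottom) / 2.0)


def max_shift_px(frame_rate_hz: float, max_speed_px_per_s: float = 300.0) -> float:
    """Assumed rule: the higher the frame rate, the smaller the shift allowed per frame."""
    return max_speed_px_per_s / frame_rate_hz


def same_person_by_position(rect_n: Rect, rect_n1: Rect, frame_rate_hz: float) -> bool:
    """True if the rectangle moved no farther than the allowed range between frames."""
    (cx0, cy0), (cx1, cy1) = center(rect_n), center(rect_n1)
    shift = ((cx1 - cx0) ** 2 + (cy1 - cy0) ** 2) ** 0.5
    return shift <= max_shift_px(frame_rate_hz)
```

For instance, at 10 frames per second this sketch allows a shift of 30 pixels, so a rectangle that moved 10 pixels is treated as the same person and one that moved 200 pixels is not.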

The person identification unit 11b may also determine whether the person's image P_n in the person rectangle R_n and the person's image P_(n+1) in the person rectangle R_(n+1) are images of the same person by determining, for example, whether the difference between the vertical (horizontal) length of the person rectangle R_n and that of the person rectangle R_(n+1) is within a predetermined range, or whether the difference (or ratio) between the area occupied by each color in the person rectangle R_n and the area occupied by that color in the person rectangle R_(n+1) is within a predetermined range.
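The size and color checks can be sketched in the same way. The thresholds (20 pixels for side lengths, 500 square pixels for per-color area) and the representation of color areas as a dictionary keyed by color name are assumptions made for illustration; the description only requires that the differences fall within a predetermined range.

```python
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def size_similar(rect_n: Rect, rect_n1: Rect, max_diff_px: float = 20.0) -> bool:
    """Compare the horizontal and vertical lengths of two person rectangles."""
    width_n, height_n = rect_n[2] - rect_n[0], rect_n[3] - rect_n[1]
    width_n1, height_n1 = rect_n1[2] - rect_n1[0], rect_n1[3] - rect_n1[1]
    return (abs(width_n - width_n1) <= max_diff_px
            and abs(height_n - height_n1) <= max_diff_px)


def color_similar(areas_n: Dict[str, float], areas_n1: Dict[str, float],
                  max_diff_px2: float = 500.0) -> bool:
    """Compare the area (in pixels) occupied by each color in two rectangles.

    A color missing from one rectangle counts as area 0 there.
    """
    colors = set(areas_n) | set(areas_n1)
    return all(abs(areas_n.get(c, 0.0) - areas_n1.get(c, 0.0)) <= max_diff_px2
               for c in colors)
```

This sketch uses the difference form of the color comparison; the ratio form mentioned in the text would replace the subtraction with a division and a range around 1.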

For a person whose images in different frames have been determined by the person identification unit 11b to be images of the same person, the attribute determination unit 11c obtains, for each frame and for each class of the recognized attributes, attribute information in which the recognition result of the event is taken into account in the recognition result of the attributes. For example, when age and gender are considered as attributes, the age classes are, for example, 20s, 30s, 40s, and so on, and the gender classes are the two classes male and female. The number of classes for the attributes as a whole is therefore the number of age classes multiplied by the number of gender classes. The attribute determination unit 11c thus obtains attribute information for each class (for example, for each of the classes men in their 20s, men in their 30s, ..., women in their 40s). Specific examples of attribute information are given later. In particular, the attribute determination unit 11c sets a reliability (adoption rate) for the score according to the recognition result of the event that affects attribute recognition (here, the position of the person's image), and obtains the attribute information for each class based on the score calculated by the person recognition unit 11a and the reliability that has been set.
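The class count described above is a Cartesian product, which the following Python sketch makes concrete. The three age classes are taken from the examples in the text; the actual list of age classes is left open there, and the function name is hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Example classes from the description; the real age list may be longer.
AGE_CLASSES = ["20s", "30s", "40s"]
GENDER_CLASSES = ["male", "female"]


def attribute_classes(ages: Sequence[str],
                      genders: Sequence[str]) -> List[Tuple[str, str]]:
    """Enumerate every (age, gender) class; the count is len(ages) * len(genders)."""
    return list(product(ages, genders))
```

With three age classes and two gender classes this gives six classes, and attribute information is then obtained for each of them in every frame.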

Here, the reliability can be, for example, a value in the range of 0 to 1. For example, when the image P of the person is at position (3) in the image 2a shown in FIG. 4, the position has almost no effect on the recognition of the attributes of the person, as described above, so the reliability of the score is set to 1.0. When the image P of the person is at position (1) or (2), the position may adversely affect the recognition of the attributes of the person, as described above. Therefore, for example, the reliability of the score is set to 0.7 for position (2) and to 0.2 for position (1). Note that the user may set arbitrary areas within the image 2a and set the reliability for each of those areas.

The attribute determination unit 11c obtains the attribute information for each class by multiplying the score calculated by the person recognition unit 11a by the reliability, and determines the attribute of the person based on the result of integrating, for each class, the attribute information over a plurality of frames. The details of the attribute determination process will be given in the description of the operation below.
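The multiplication of score and reliability can be sketched as follows. The reliability values 0.2, 0.7, and 1.0 are the example values given for positions (1), (2), and (3); the table and function names are hypothetical.

```python
# Reliability per position in image 2a (example values from the text).
POSITION_RELIABILITY = {1: 0.2, 2: 0.7, 3: 1.0}


def attribute_info(scores: dict, position: int) -> dict:
    """Return the attribute information for every class.

    scores maps a class name to the score from the person recognition unit;
    the result maps the same class names to score * reliability.
    """
    reliability = POSITION_RELIABILITY[position]
    return {cls: score * reliability for cls, score in scores.items()}
```

The same reliability is applied to every class of a frame, so the ranking of classes within one frame is unchanged; only the weight of that frame in the later multi-frame integration changes.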

(Management server)
The management server 4 shown in FIG. 1 is a terminal device that stores information about the attributes of persons determined by the attribute determination device 3, and is configured by, for example, a personal computer. FIG. 6 is a block diagram showing the detailed configuration of the management server 4. The management server 4 has a storage unit 21, a communication unit 22, and a control unit 23.

The storage unit 21 is a memory that stores an operation program for operating each unit of the management server 4 and information sent from the attribute determination device 3 (for example, the attribute determined by the attribute determination unit 11c), and is configured by, for example, a hard disk. The storage unit 21 may instead be configured from a recording medium selected as appropriate from RAM, ROM, an optical disk, a magneto-optical disk, a non-volatile memory, and the like.

The communication unit 22 is an interface including an input/output port for communicating with the outside. When performing wireless communication with the outside, the communication unit 22 includes an antenna, a transmission/reception circuit, a modulation circuit, a demodulation circuit, and the like. The control unit 23 is configured by a CPU that controls the operation of each unit of the management server 4, and operates according to the operation program stored in the storage unit 21.

The management server 4 may additionally include an input unit such as a keyboard, a display unit such as a display, and an arithmetic processing unit that performs the same processing as the recognition processing unit 11 of the attribute determination device 3.

[Attribute determination method]
Next, the operation (attribute determination method) of the attribute determination system 1 of this embodiment will be described. FIG. 7 is a flowchart showing the flow of processing in the attribute determination system 1. To simplify the explanation below, the attributes of a person are classified here into the two classes "male in his 40s" and "male in his 20s", and the person whose attribute is to be determined is assumed to be a "male in his 40s" (that is, "male in his 40s" is the correct attribute). The attribute A_n used below is the attribute recognized by the person recognition unit 11a based on the image of the n-th frame and indicates the attribute of the person appearing in that image. It is distinguished from the attribute B that the attribute determination unit 11c finally determines.

First, with n = 1 (S1), when the attribute determination device 3 acquires the image of the first frame from the imaging unit 2 (S2), the person recognition unit 11a recognizes, based on the image and by the method described above, the person rectangle R_n (= R₁), the attribute A_n (= A₁) of the person, and the event that affects the recognition of the attribute A_n (here, the position of the image P_n (= P₁) of the person defined by the person rectangle R_n), and calculates a score C_n (= C₁) indicating the certainty of the recognition result of the attribute A_n (S3; person recognition step). These recognition results and the score C_n are stored in the storage unit 12.

Next, the person identification unit 11b identifies the person in the person rectangle R_n (S4; person identification step). Because this is the first frame (n = 1), the person is identified simply by assigning an identification number to the image P_n of the person in the person rectangle R_n (for example, ID = 0001). Information such as the person rectangle R_n recognized in S3 is stored in the storage unit 12 in association with the identification number assigned in S4. Once the person rectangle R_n has been recognized, the process of S4 may be performed in parallel with the recognition of the attribute A_n, the recognition of the event, and the calculation of the score C_n in S3.

Subsequently, the attribute determination unit 11c sets the reliability f(A_n) (= f(A₁)) of the score C_n in accordance with the recognition result of the event that affects the recognition of the attribute A_n in S3 (S5 to S7). That is, when the event recognized in S3 (the position of the image P_n of the person) is a position that affects attribute recognition (Yes in S5), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to less than 1 in accordance with the recognition result of the event (S6). On the other hand, when the event recognized in S3 (the position of the image P_n of the person) is a position that does not affect attribute recognition (No in S5), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to 1 (S7). Note that the reliability f(A_n) is a value set in accordance with the result in S5, that is, the recognition result of the event that affects the recognition of the attribute A_n, and does not depend on the recognition result (class) of the attribute A_n in S3 (the same applies to the following embodiments).

Next, the attribute determination unit 11c obtains attribute information Q_n (= Q₁) for each class of the attribute A_n based on the score C_n calculated in S3 and the reliability f(A_n) set above (S8). For example, the value of score C_n × reliability f(A_n) is obtained as the attribute information Q_n for each of the classes "male in his 20s" and "male in his 40s". The obtained attribute information Q_n is stored in the storage unit 12 in association with the identification information of the person.

FIG. 8 shows an example of the information obtained for the first frame. In this example, for the person with ID = 0001, the score C₁(P_20M) indicating the certainty that the attribute A₁ is "male in his 20s" (this class is denoted "P_20M") is 0.7, and the score C₁(P_40M) indicating the certainty that the attribute A₁ is "male in his 40s" (this class is denoted "P_40M") is 0.01. In the image, the position of the image P₁ of the person is a position that affects the recognition of the attribute A₁ (the same as position (1) in FIG. 4), and the person recognition unit 11a could not accurately recognize the attribute A₁ of the person based on the image. As a result, although "male in his 40s" is correct, the score C₁(P_20M) for "male in his 20s" is higher than the score C₁(P_40M) for "male in his 40s".

Therefore, considering that the position of the image P₁ of the person is a position that affects the recognition of the attribute A₁, the reliability f(A₁) of the score C₁ is set to f(A₁) = f₁(A₁) = 0.2 in the example of FIG. 8. As a result, as the attribute information Q₁ in which the recognition result of the position is added to the recognition result of the attribute A₁, Q₁(P_20M) = C₁(P_20M) × f₁(A₁) = 0.7 × 0.2 = 0.14 is obtained for the class "male in his 20s", and Q₁(P_40M) = C₁(P_40M) × f₁(A₁) = 0.01 × 0.2 = 0.002 is obtained for the class "male in his 40s".

Next, the attribute determination unit 11c determines whether to continue the processing, that is, whether to perform the same processing on the image of the next, (n+1)-th frame (S9). Basically, the attribute determination unit 11c determines in S9 to continue the processing and proceeds to S10. In S10, the attribute determination device 3 sets n = n + 1 and then repeats the processing from S2 onward. That is, the attribute determination device 3 acquires the image of the second frame from the imaging unit 2 and repeats the processing from S2 onward. In this case, in S4, the person identification unit 11b determines whether the images of the person in the person rectangles of the two frames are images of the same person, based on the person information of the first frame and that of the second frame (for example, the position (movement amount) and size of the person rectangle in each frame).

Thereafter, the processing from S2 onward is repeated in the same way for the images of the (n+2)-th and subsequent frames. Then, for example, when the ID of the person recognized based on the image of the (n+k)-th frame (where k is a natural number of 3 or more) differs from the ID of the person recognized based on the image of the (n+(k-1))-th frame, the attribute determination unit 11c can no longer determine the attribute B for the same person across the frames. It therefore determines in S9 not to continue the processing and proceeds to S11.

FIG. 9 shows an example of the information obtained for the second frame. In this example, for the person with ID = 0001, as in the first frame, the score C₂(P_20M) indicating the certainty that the attribute A₂ is "male in his 20s" is 0.7, and the score C₂(P_40M) indicating the certainty that the attribute A₂ is "male in his 40s" is 0.01. In the image, the position of the image P₂ of the person is a position that affects the recognition of the attribute A₂ (the same as position (2) in FIG. 4), and the person recognition unit 11a could not accurately recognize the attribute A₂ of the person based on the image. As a result, although "male in his 40s" is correct, the score C₂(P_20M) for "male in his 20s" is higher than the score C₂(P_40M) for "male in his 40s".

Therefore, considering that the position of the image P₂ of the person is a position that affects the recognition of the attribute A₂, the reliability f(A₂) of the score C₂ is set to f(A₂) = f₂(A₂) = 0.7 in the example of FIG. 9. As a result, as the attribute information Q₂ in which the recognition result of the position is added to the recognition result of the attribute A₂, Q₂(P_20M) = C₂(P_20M) × f₂(A₂) = 0.7 × 0.7 = 0.49 is obtained for the class "male in his 20s", and Q₂(P_40M) = C₂(P_40M) × f₂(A₂) = 0.01 × 0.7 = 0.007 is obtained for the class "male in his 40s". Because position (2) has a smaller effect on attribute recognition than position (1), the reliability f(A₂) of the score C₂ is set higher than the reliability f(A₁) of the score C₁.

On the other hand, FIG. 10 shows an example of the information obtained for the third frame. In this example, for the person with ID = 0001, the score C₃(P_20M) indicating the certainty that the attribute A₃ is "male in his 20s" is 0.05, and the score C₃(P_40M) indicating the certainty that the attribute A₃ is "male in his 40s" is 0.9. In the image, the position of the image P₃ of the person is a position that has almost no effect on the recognition of the attribute A₃ (the same as position (3) in FIG. 4), and the person recognition unit 11a was able to accurately recognize the attribute A₃ of the person based on the image. As a result, the score C₃(P_40M) for "male in his 40s" is higher than the score C₃(P_20M) for "male in his 20s". This magnitude relationship between the scores C₃(P_20M) and C₃(P_40M) is consistent with "male in his 40s" being the correct answer.

Considering that the position of the image P₃ of the person is a position that has almost no effect on the recognition of the attribute A₃, the reliability f(A₃) of the score C₃ is set to f(A₃) = f₃(A₃) = 1.0 in the example of FIG. 10. As a result, as the attribute information Q₃ in which the recognition result of the position is added to the recognition result of the attribute A₃, Q₃(P_20M) = C₃(P_20M) × f₃(A₃) = 0.05 × 1.0 = 0.05 is obtained for the class "male in his 20s", and Q₃(P_40M) = C₃(P_40M) × f₃(A₃) = 0.9 × 1.0 = 0.9 is obtained for the class "male in his 40s".

In S11, the attribute determination unit 11c integrates, for each class of the attribute A_n, the attribute information Q_n obtained for each frame over the plurality of frames, and determines the attribute B of the person based on the result. The steps S5 to S9 and S11 described above correspond to the attribute determination step.
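The loop from S2 to S11 can be sketched as follows, assuming that recognition and identification have already produced, for each frame, a person ID, the per-class scores, and a reliability. The function names and the tuple layout are hypothetical, and the sketch integrates by simple summation and picks the largest total, which matches the worked example in this description but is only one possible way to integrate.

```python
def integrate_until_id_changes(frames):
    """Accumulate Q_n per class over consecutive frames of one person.

    frames: iterable of (person_id, scores, reliability) tuples, where
    scores maps a class name to C_n and reliability is f(A_n).
    Returns (person_id, totals) for the first person in the sequence.
    """
    totals = {}
    current_id = None
    for person_id, scores, reliability in frames:
        if current_id is None:
            current_id = person_id         # first frame: ID assigned in S4
        elif person_id != current_id:
            break                          # S9: stop and move on to S11
        for cls, score in scores.items():  # S8: Q_n = C_n * f(A_n)
            totals[cls] = totals.get(cls, 0.0) + score * reliability
    return current_id, totals


def decide_attribute(totals):
    """S11: choose the class with the largest integrated value."""
    return max(totals, key=totals.get)
```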

Here, in the above example with three frames, suppose that the attribute B were determined without considering the position that affects attribute recognition, that is, without the reliability f(A_n) (Comparative Example 1). The evaluation value Z(P_20M)' indicating the recognition result of "male in his 20s" over the three frames is then calculated from the scores C_n as follows.
Z(P_20M)' = C₁(P_20M) + C₂(P_20M) + C₃(P_20M)
= 0.7 + 0.7 + 0.05
= 1.45
On the other hand, the evaluation value Z(P_40M)' indicating the recognition result of "male in his 40s" over the three frames is calculated from the scores C_n as follows.
Z(P_40M)' = C₁(P_40M) + C₂(P_40M) + C₃(P_40M)
= 0.01 + 0.01 + 0.9
= 0.92
From the above, Z(P_20M)' > Z(P_40M)', so in this case the attribute B is determined to be "male in his 20s". In other words, although "male in his 40s" is correct, the attribute B is erroneously determined to be "male in his 20s" over the three frames.

In contrast, when the attribute B is determined in consideration of the position that affects attribute recognition (the reliability f(A_n)), as in this embodiment, the evaluation value Z(P_20M) indicating the recognition result of "male in his 20s" over the three frames is calculated from the attribute information Q_n as follows.
Z(P_20M) = Q₁(P_20M) + Q₂(P_20M) + Q₃(P_20M)
= C₁(P_20M)·f₁(A₁) + C₂(P_20M)·f₂(A₂) + C₃(P_20M)·f₃(A₃)
= 0.14 + 0.49 + 0.05
= 0.68
On the other hand, the evaluation value Z(P_40M) indicating the recognition result of "male in his 40s" over the three frames is calculated from the attribute information Q_n as follows.
Z(P_40M) = Q₁(P_40M) + Q₂(P_40M) + Q₃(P_40M)
= C₁(P_40M)·f₁(A₁) + C₂(P_40M)·f₂(A₂) + C₃(P_40M)·f₃(A₃)
= 0.002 + 0.007 + 0.9
= 0.909
From the above, Z(P_20M) < Z(P_40M), so the attribute determination unit 11c determines that, over the three frames, the attribute B of the person is "male in his 40s". In this case, the determined attribute B matches the correct attribute.
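The two calculations can be reproduced with a short script. The scores and reliabilities are the example values of FIGS. 8 to 10; the function and variable names are hypothetical, and setting every weight to 1 is used here to model Comparative Example 1.

```python
scores = [  # C_n per class for frames 1 to 3
    {"P_20M": 0.7, "P_40M": 0.01},
    {"P_20M": 0.7, "P_40M": 0.01},
    {"P_20M": 0.05, "P_40M": 0.9},
]
reliability = [0.2, 0.7, 1.0]  # f_n(A_n) for frames 1 to 3


def evaluate(frame_scores, weights):
    """Sum weight * score over all frames for each class."""
    totals = {}
    for per_class, weight in zip(frame_scores, weights):
        for cls, score in per_class.items():
            totals[cls] = totals.get(cls, 0.0) + score * weight
    return totals


z_unweighted = evaluate(scores, [1.0, 1.0, 1.0])  # Comparative Example 1
z_weighted = evaluate(scores, reliability)        # this embodiment

print(max(z_unweighted, key=z_unweighted.get))  # prints "P_20M" (incorrect)
print(max(z_weighted, key=z_weighted.get))      # prints "P_40M" (correct)
```

Up to floating-point rounding, the unweighted totals are 1.45 and 0.92, and the weighted totals are 0.68 and 0.909, so only the weighted evaluation selects "male in his 40s".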

The attribute B determined by the attribute determination unit 11c in S11 is stored in the storage unit 12 (S12; storage step). In S12, instead of storing the attribute B in the storage unit 12, the information on the attribute B may be sent to the management server 4 via the communication unit 15 and stored in the storage unit 21 of the management server 4 (see FIG. 6). Alternatively, the information on the attribute B may be stored in both the storage unit 12 and the storage unit 21.

[Effects]
As described above, for a person determined to be the same person across the frames, the attribute determination unit 11c obtains, for each frame and for each class of the recognized attribute A_n, attribute information Q_n in which the recognition result of the event that affects the recognition of the attribute A_n (here, the position of the image P_n of the person in the image) is added to the recognition result of the attribute A_n by the person recognition unit 11a (S5 to S8). Consequently, when the attribute determination unit 11c determines the attribute B of the person based on the result of integrating the attribute information Q_n for each class over a plurality of frames (S11), the contribution of the recognition result of the attribute A_n to the final attribute determination can be made relatively small for frames in which an event affecting the recognition of the attribute A_n occurred, and relatively large for frames in which no such event occurred. As a result, even when an event affecting the recognition of the attribute A_n continues over several frames (the first and second frames in the above example), the adverse effect of those frames on the final determination of the attribute B (the effect of the reduced recognition accuracy of the attribute A_n) can be reduced, and the attribute B of the person can be determined accurately as a whole (over the plurality of frames in total).

Further, the attribute determination unit 11c sets the reliability f(A_n) of the score C_n in accordance with the recognition result of the event that affects the recognition of the attribute A_n, and obtains the attribute information Q_n for each class based on the score C_n calculated by the person recognition unit 11a and the reliability f(A_n). By setting the reliability f(A_n) in accordance with the recognition result of the event and obtaining the attribute information Q_n for each class in this way, the contribution of the recognition result obtained for each frame to the determination of the attribute B can be appropriately adjusted for each class according to the event, and the attribute B of the person can be determined reliably and accurately.

Further, the reliability f(A_n) is set based on the position of the image P_n of the person, which is the event that affects the recognition of the attribute A_n. This allows the attribute determination unit 11c to obtain, for each frame, appropriate attribute information Q_n that takes the position of the image P_n of the person into account by using the reliability f(A_n).

Further, the reliability f(A_n) is set based on whether the position of the image P_n of the person in the image is a position at which the whole body is photographed. As in this embodiment, the reliability f(A_n) can thus be made to differ between the case where the whole body is photographed at the position of the image P_n and the case where it is not, so that appropriate attribute information Q_n corresponding to the position of the image P_n of the person can be obtained.

In particular, in this embodiment, the reliability f(A_n) for the case where the position of the image P_n of the person in the image is a position at which the whole body is photographed (for example, position (3) in FIG. 4) is set higher than the reliability f(A_n) for the case where the position is one at which only a part of the body is photographed (for example, position (1) or (2) in FIG. 4). Accordingly, for an attribute A_n recognized with high accuracy from an image in which the whole body of the person is photographed, the reliability f(A_n) of the recognition result is raised, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is increased. Conversely, for an attribute A_n recognized with low accuracy from an image in which only a part of the body of the person is photographed, the reliability f(A_n) of the recognition result is lowered, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is reduced.

Further, the attributes of the person (A_n and B) are the age and gender of the person. The age and gender of the person can therefore be determined accurately over a plurality of frames in total. The attribute may be only one of the age and the gender of the person. Even in this case, the age or gender of the person can be determined accurately over a plurality of frames in total by adopting the attribute determination method of this embodiment described above.

Further, the attribute determination device 3 of this embodiment includes the storage unit 12 that stores the attribute B determined by the attribute determination unit 11c. This enables, for example, an administrator (person in charge) of a store or of the system to analyze the people who visit the store (for example, which age groups visit most often), to develop and sell products according to the attribute B of the people, and to carry out marketing analysis, based on the information on the attribute B stored in the storage unit 12.

Further, the attribute determination system 1 of this embodiment includes the attribute determination device 3 described above and the management server 4. The management server 4 includes the storage unit 21 that stores information sent from the attribute determination device 3, and this information includes the attribute B determined by the attribute determination unit 11c of the attribute determination device 3. This enables the system administrator (person in charge) to analyze the people who visit the store based on the information on the attribute B stored in the storage unit 21 of the management server 4. In addition, when there are a plurality of stores and each store is provided with an attribute determination device 3, the information (attribute B) sent from each attribute determination device 3 can be managed collectively (centrally) in the storage unit 21 of the management server 4, which makes it easy to compare analysis results among the stores based on the stored information.

<Embodiment 2>
This embodiment is the same as Embodiment 1 except that the events that affect the recognition of the attribute A_n include the behavior of the person, and the attribute B is determined with that behavior taken into account. As described later, the behavior of the person can be identified from the image (image data) of the person in the image. The parts that differ from Embodiment 1 are described below.

FIG. 11 schematically shows images 2a₁ to 2a₄ of some of a plurality of temporally different frames. The images 2a₁ to 2a₄ in FIG. 11 are all obtained by photographing a person obliquely from above. As shown in the figure, the behavior patterns of a person in a store include walking and standing still, as well as running, turning, and the like. When the behavior of the person involves movement, such as walking, running, or turning, the image of the person tends to be blurred in the image. In this case, the recognition accuracy of the attribute A_n of the person based on the image tends to decrease. On the other hand, when the behavior of the person is a staying behavior such as standing still (a behavior in which the person stops walking), the image of the person is hardly blurred in the image, so the recognition accuracy of the attribute A_n of the person based on the image hardly decreases.

Therefore, in this embodiment, as shown in FIG. 11, the attribute determination unit 11c sets the reliability f(A_n) to less than 1 (for example, 0.2) for a behavior that affects the recognition of the attribute A_n (a behavior involving movement, such as walking), and sets the reliability f(A_n) to 1 for a behavior that does not affect the recognition of the attribute A_n (a staying behavior such as standing still). It then obtains the attribute information Q_n for each class for each frame, and determines the attribute B of the person based on the result of integrating the obtained attribute information Q_n over a plurality of frames.
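The behavior-dependent reliability can be sketched as a simple lookup. The behavior labels and the function name are hypothetical; 0.2 is the example value mentioned above, and treating any label outside the listed set as a staying behavior is a simplifying assumption of this sketch.

```python
# Behaviors that involve movement and therefore tend to blur the image.
MOVING_BEHAVIORS = {"walking", "running", "turning"}


def behavior_reliability(behavior: str, moving_value: float = 0.2) -> float:
    """Return the reliability f(A_n) for a recognized behavior.

    Behaviors involving movement get a value below 1; staying behaviors
    such as standing still get 1.
    """
    return moving_value if behavior in MOVING_BEHAVIORS else 1.0
```

The returned value would then be multiplied by the score of each class in the same way as the position-based reliability of Embodiment 1.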

FIG. 12 is a flowchart showing the flow of processing in the attribute determination system 1 of this embodiment. The flowchart of FIG. 12 is obtained by replacing S3 and S5 in the flowchart of FIG. 7 with S3-1 and S5-1, respectively. Here, as in Embodiment 1, the person whose attribute is to be determined is assumed to be a "male in his 40s" (that is, "male in his 40s" is the correct attribute).

In S3-1 (person recognition step), the person recognition unit 11a recognizes, based on the n-th frame image of the person photographed from above, the person rectangle R_n, the person's attribute A_n, and an event that affects recognition of the attribute A_n (here, the person's behavior), and calculates a score C_n indicating the likelihood of the recognition result of the attribute A_n.

Here, the person's behavior can be recognized and the score C_n calculated by using a neural network trained in advance. That is, the person recognition unit 11a inputs the data of each of the images 2a_1 to 2a_4 to a neural network trained in advance for behavior recognition, and thereby obtains from the neural network the recognition result of the person's behavior and a score C_n indicating its likelihood. Based on the output of the neural network, the person recognition unit 11a can therefore recognize whether the person's behavior is behavior that affects attribute recognition (behavior involving movement) or staying behavior that does not affect attribute recognition. The recognition result and the score C_n obtained in S3-1 are stored in the storage unit 12.

In S5-1, the attribute determination unit 11c sets the reliability f(A_n) of the score C_n according to the recognition result, obtained in S3-1, of the event that affects recognition of the attribute A_n (S5-1 to S7). That is, when the event (the person's behavior) recognized in S3-1 is behavior that affects attribute recognition (behavior involving movement) (Yes in S5-1), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to less than 1 (for example, 0.2) in accordance with that recognition result (S6). On the other hand, when the event (the person's behavior) recognized in S3-1 is staying behavior that does not affect attribute recognition (No in S5-1), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to 1 (S7).

Next, the attribute determination unit 11c obtains the attribute information Q_n (= Q_1) for each class of the attribute A_n, based on the score C_n calculated in S3-1 and the reliability f(A_n) set above (S8). The obtained attribute information Q_n is stored in the storage unit 12 in association with the person's identification information.

The processing from S2 onward is repeated for the images of the (n+1)-th and subsequent frames (S9, S10). When continuation of the processing becomes unnecessary, for example when the person's ID differs between frames (No in S9), it is determined that the processing is not to be continued, and the processing proceeds to S11. In S11, the attribute determination unit 11c integrates, for each class of the attribute A_n, the attribute information Q_n obtained for each frame over the plurality of frames, and determines the person's attribute B based on the result. The steps S5-1 to S9 and S11 described above correspond to the attribute determination step.
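The loop over frames (S8 to S11) can be sketched as a small accumulator. This is an illustrative sketch under stated assumptions: the class `AttributeAccumulator` and its method names are hypothetical, integration is taken to be a per-class sum of Q_n = C_n × f(A_n), and the attribute B is taken to be the class with the largest sum, which matches the worked examples in this description but is not stated as the only possible integration rule.

```python
from collections import defaultdict


class AttributeAccumulator:
    """Accumulates Q_n = C_n * f(A_n) per class for one tracked person."""

    def __init__(self, person_id):
        self.person_id = person_id
        self.totals = defaultdict(float)  # class label -> sum of Q_n over frames

    def add_frame(self, person_id, scores, reliability):
        """Add one frame (S8). Returns False when the ID changes (No in S9)."""
        if person_id != self.person_id:
            return False
        for cls, score in scores.items():
            self.totals[cls] += score * reliability
        return True

    def decide(self):
        """Return the class with the largest integrated value (S11)."""
        if not self.totals:
            raise ValueError("no frames accumulated")
        return max(self.totals, key=self.totals.get)
```

A caller would feed `add_frame` one frame at a time and call `decide` once it returns False or the frames run out.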

FIG. 13 shows an example of the information obtained for the first frame. In this example, for the person with ID = 0001, the score C_1(P_20M) indicating the likelihood that the attribute A_1 is "male in his 20s" is 0.7, and the score C_1(P_40M) indicating the likelihood that the attribute A_1 is "male in his 40s" is 0.01. Because the person's behavior involves movement (walking), the image of the person is blurred and the person recognition unit 11a could not accurately recognize the attribute A_1 from the image. As a result, although "male in his 40s" is correct, the score C_1(P_20M) for "male in his 20s" is higher than the score C_1(P_40M) for "male in his 40s".

Therefore, considering that the person's behavior affects recognition of the attribute A_1, the reliability f(A_1) of the score C_1 is set to f(A_1) = f_1(A_1) = 0.2 in the example of FIG. 13. As a result, the attribute information Q_1, in which the recognition result of the behavior is added to the recognition result of the attribute A_1, is obtained for each class: for the "male in his 20s" class, Q_1(P_20M) = C_1(P_20M) × f_1(A_1) = 0.7 × 0.2 = 0.14, and for the "male in his 40s" class, Q_1(P_40M) = C_1(P_40M) × f_1(A_1) = 0.01 × 0.2 = 0.002.

FIG. 14 shows an example of the information obtained for the second frame. In this example as well, the behavior of the person with ID = 0001 involves movement (walking), so, as in the first frame, the score C_2(P_20M) for "male in his 20s" is higher than the score C_2(P_40M) for "male in his 40s" even though "male in his 40s" is correct. Therefore, as in the first frame, considering that the person's behavior affects recognition of the attribute A_2, the reliability f(A_2) of the score C_2 is set to f(A_2) = f_2(A_2) = 0.2. As a result, the attribute information Q_2, in which the recognition result of the behavior is added to the recognition result of the attribute A_2, is obtained for each class: for the "male in his 20s" class, Q_2(P_20M) = C_2(P_20M) × f_2(A_2) = 0.7 × 0.2 = 0.14, and for the "male in his 40s" class, Q_2(P_40M) = C_2(P_40M) × f_2(A_2) = 0.01 × 0.2 = 0.002.

FIG. 15 shows an example of the information obtained for the third frame. In this example, for the person with ID = 0001, the score C_3(P_20M) indicating the likelihood that the attribute A_3 is "male in his 20s" is 0.05, and the score C_3(P_40M) indicating the likelihood that the attribute A_3 is "male in his 40s" is 0.9. Because the person's behavior is staying behavior (standing still), the image of the person is not blurred and the person recognition unit 11a could accurately recognize the attribute A_3 from the image. As a result, the score C_3(P_40M) for "male in his 40s" is higher than the score C_3(P_20M) for "male in his 20s". This ordering of the scores C_3(P_20M) and C_3(P_40M) is consistent with "male in his 40s" being the correct answer.

Considering that the person's behavior has almost no effect on recognition of the attribute A_3, the reliability f(A_3) of the score C_3 is set to f(A_3) = f_3(A_3) = 1.0 in the example of FIG. 15. As a result, the attribute information Q_3, in which the recognition result of the behavior is added to the recognition result of the attribute A_3, is obtained for each class: for the "male in his 20s" class, Q_3(P_20M) = C_3(P_20M) × f_3(A_3) = 0.05 × 1.0 = 0.05, and for the "male in his 40s" class, Q_3(P_40M) = C_3(P_40M) × f_3(A_3) = 0.9 × 1.0 = 0.9.

In the above example with three frames (n = 3), suppose that the attribute B is determined without considering the person's behavior that affects attribute recognition, that is, without the reliability f(A_n) (Comparative Example 2). The evaluation value Z(P_20M)' of the recognition result for "male in his 20s" over the three frames is then calculated from the scores C_n as follows.
Z(P_20M)' = C_1(P_20M) + C_2(P_20M) + C_3(P_20M)
          = 0.7 + 0.7 + 0.05
          = 1.45
Likewise, the evaluation value Z(P_40M)' of the recognition result for "male in his 40s" over the three frames is calculated from the scores C_n as follows.
Z(P_40M)' = C_1(P_40M) + C_2(P_40M) + C_3(P_40M)
          = 0.01 + 0.01 + 0.9
          = 0.92
From the above, Z(P_20M)' > Z(P_40M)', so in this case the attribute B is determined to be "male in his 20s". In other words, although "male in his 40s" is correct, the attribute B is erroneously determined to be "male in his 20s" over the three frames in total.

In contrast, when the attribute B is determined in consideration of the person's behavior that affects attribute recognition (the reliability f(A_n)), as in this embodiment, the evaluation value Z(P_20M) of the recognition result for "male in his 20s" over the three frames is calculated from the attribute information Q_n as follows.
Z(P_20M) = Q_1(P_20M) + Q_2(P_20M) + Q_3(P_20M)
         = C_1(P_20M)·f_1(A_1) + C_2(P_20M)·f_2(A_2) + C_3(P_20M)·f_3(A_3)
         = 0.14 + 0.14 + 0.05
         = 0.33
Likewise, the evaluation value Z(P_40M) of the recognition result for "male in his 40s" over the three frames is calculated from the attribute information Q_n as follows.
Z(P_40M) = Q_1(P_40M) + Q_2(P_40M) + Q_3(P_40M)
         = C_1(P_40M)·f_1(A_1) + C_2(P_40M)·f_2(A_2) + C_3(P_40M)·f_3(A_3)
         = 0.002 + 0.002 + 0.9
         = 0.904
From the above, Z(P_20M) < Z(P_40M), so the attribute determination unit 11c determines that, over the three frames in total, the person's attribute B is "male in his 40s". In this case, the determined attribute B matches the correct attribute.
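The two calculations above can be reproduced side by side with a short Python sketch. The scores and reliabilities are the values given for FIGS. 13 to 15; the function name `evaluate` and the class labels "20M" and "40M" are hypothetical names chosen for this illustration.

```python
# Scores C_n per class for frames 1 to 3 (values from FIGS. 13 to 15).
scores = [
    {"20M": 0.7, "40M": 0.01},   # frame 1: walking
    {"20M": 0.7, "40M": 0.01},   # frame 2: walking
    {"20M": 0.05, "40M": 0.9},   # frame 3: standing still
]
reliabilities = [0.2, 0.2, 1.0]  # f(A_n) per frame


def evaluate(frame_scores, weights):
    """Sum score * weight per class over all frames."""
    totals = {}
    for per_class, weight in zip(frame_scores, weights):
        for cls, score in per_class.items():
            totals[cls] = totals.get(cls, 0.0) + score * weight
    return totals


unweighted = evaluate(scores, [1.0, 1.0, 1.0])  # Comparative Example 2
weighted = evaluate(scores, reliabilities)      # this embodiment

print(max(unweighted, key=unweighted.get))  # 20M (about 1.45 vs 0.92)
print(max(weighted, key=weighted.get))      # 40M (about 0.904 vs 0.33)
```

Setting every weight to 1 recovers Comparative Example 2, so the only difference between the two results is the per-frame reliability.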

As described above, in this embodiment as well, the attribute determination unit 11c obtains, for each frame and for each class of the recognized attribute A_n, the attribute information Q_n in which the recognition result of the event affecting recognition of the attribute A_n (here, the person's behavior) is added to the recognition result of the attribute A_n by the person recognition unit 11a (S5-1 to S8). Consequently, when the attribute determination unit 11c determines the person's attribute B based on the result of integrating the attribute information Q_n of each class over a plurality of frames (S11), the contribution of the recognition result of the attribute A_n to the final attribute determination can be made relatively small for frames in which behavior affecting recognition of the attribute A_n (behavior involving movement) occurred, and relatively large for frames in which no such behavior occurred. As a result, even when behavior affecting recognition of the attribute A_n continues over several frames (the first and second frames in the above example), the adverse effect of those frames on the final determination of the attribute B (the effect of the reduced recognition accuracy of the attribute A_n) is reduced, and the person's attribute B can be determined accurately as a whole (over the plurality of frames in total).

The reliability f(A_n) is set based on an event that affects recognition of the attribute A_n, namely the person's behavior as determined from the image of the person in the frame. The attribute determination unit 11c can therefore use the reliability f(A_n) to obtain appropriate attribute information Q_n that takes the person's behavior into account.

The reliability f(A_n) is also set based on whether or not the person's behavior involves movement. This makes it possible to give the reliability f(A_n) different values depending on whether or not the behavior involves movement, and thus to obtain attribute information Q_n that reflects the person's behavior.

In particular, in this embodiment, the reliability f(A_n) for behavior involving movement is set lower than the reliability f(A_n) for staying behavior. Thus, for a person's attribute A_n recognized with low accuracy from an image capturing behavior involving movement, the reliability f(A_n) of the recognition result is lowered, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is reduced. Conversely, for a person's attribute A_n recognized with high accuracy from an image capturing staying behavior, the reliability f(A_n) of the recognition result is raised, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is increased.

<Embodiment 3>
This embodiment is the same as Embodiment 2 except that the events affecting recognition of the attribute A_n further include the person's posture, and the attribute B is determined with that posture also taken into account. As described later, the person's posture can be determined from the image data of the person in each frame. The parts that differ from Embodiment 2 are described below. Note that "posture" refers to the stance of the body and is thus distinguished from "behavior", which focuses on the presence or absence of movement; however, some postures, such as a standing-still posture, may overlap with behavior (standing still).

FIG. 16 schematically shows images 2a_11 to 2a_15 of some of a plurality of temporally different frames. Each of the images 2a_11 to 2a_15 in FIG. 16 was obtained by photographing a person obliquely from above. As shown in the figure, postures that a person takes in a store include, for example, "crouching" when looking at products on the lowest level of a product shelf. Note that the motion from standing still to crouching (for example, starting to crouch) and the motion from crouching to standing still (for example, standing up) can be recognized as behavior.

When the person's posture is "crouching", an image of the person photographed from above shows the lower body hidden by the upper body. Likewise, in a posture where part of the body is hidden by a product shelf when viewed from above, only part of the body is captured in the image. In these cases, image data of the person is missing (image data of the whole body cannot be obtained), so the accuracy of recognizing the person's attribute A_n from the image is likely to decrease. On the other hand, when the person's posture is one in which the whole body is captured, such as standing still, no image data of the person is missing from the captured image, so the accuracy of recognizing the attribute A_n from the image hardly decreases.

Therefore, in this embodiment, as shown in FIG. 16, the attribute determination unit 11c sets the reliability f(A_n) to less than 1 when the person's posture affects recognition of the attribute A_n (for example, a posture in which only part of the body is captured), and sets the reliability f(A_n) to 1 when the person's posture does not affect recognition of the attribute A_n (for example, a posture in which the whole body is captured). It then obtains attribute information Q_n for each class in each frame, and determines the person's attribute B based on the result of integrating the obtained attribute information Q_n over a plurality of frames. As for the person's behavior, as in Embodiment 2, the reliability f(A_n) is set to less than 1 for behavior involving movement and to 1 for staying behavior, and the attribute information Q_n is obtained for each frame and each class.
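One way to combine the behavior and posture conditions is sketched below. This is an assumption-laden illustration: the `FrameEvents` type, its field names, and the function `reliability` are hypothetical, and the text does not specify what value below 1 applies when both behavior and posture affect recognition, so a single `low` value (0.2, the example value used in the text) is used for every affected frame.

```python
from dataclasses import dataclass


@dataclass
class FrameEvents:
    """Events recognized in one frame that may affect attribute recognition."""
    moving: bool          # behavior involves movement (walking, running, turning)
    partly_hidden: bool   # posture in which only part of the body is captured


def reliability(events: FrameEvents, low: float = 0.2) -> float:
    """Return f(A_n): `low` if any event affects recognition, otherwise 1."""
    if not 0.0 <= low < 1.0:
        raise ValueError("low must be in the range [0, 1)")
    return low if (events.moving or events.partly_hidden) else 1.0
```

For example, a crouching person who is not moving gives `reliability(FrameEvents(moving=False, partly_hidden=True))`, which returns 0.2 in this sketch.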

FIG. 17 is a flowchart showing the flow of processing in the attribute determination system 1 of this embodiment. The flowchart of FIG. 17 is obtained by replacing S3-1 and S5-1 in the flowchart of FIG. 12 with S3-2 and S5-2, respectively. Here, as in Embodiment 2, the person whose attribute is to be determined is assumed to be a "male in his 40s" (that is, "male in his 40s" is the correct attribute).

In S3-2 (person recognition step), the person recognition unit 11a recognizes, based on the n-th frame image of the person photographed from above, the person rectangle R_n, the person's attribute A_n, and events that affect recognition of the attribute A_n (here, the person's behavior and posture), and calculates a score C_n indicating the likelihood of the recognition result of the attribute A_n.

Here, the person's posture can be recognized and the score C_n calculated by using a neural network trained in advance. That is, the person recognition unit 11a inputs the data of each of the images 2a_11 to 2a_15 to a neural network trained in advance for posture recognition, and thereby obtains from the neural network the recognition result of the person's posture and a score C_n indicating its likelihood. Based on the output of the neural network, the person recognition unit 11a can therefore recognize whether the person's posture is a posture that affects attribute recognition (a posture in which only part of the body is captured) or a posture that does not affect attribute recognition (a posture in which the whole body is captured). The recognition result and the score C_n obtained in S3-2 are stored in the storage unit 12.

In S5-2, the attribute determination unit 11c sets the reliability f(A_n) of the score C_n according to the recognition result, obtained in S3-2, of the events that affect recognition of the attribute A_n (S5-2 to S7). That is, when the event (the person's behavior or posture) recognized in S3-2 is an event that affects attribute recognition (Yes in S5-2), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to less than 1 in accordance with that recognition result (S6). On the other hand, when the event (the person's behavior or posture) recognized in S3-2 is an event that does not affect attribute recognition (No in S5-2), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to 1 (S7).

Next, the attribute determination unit 11c obtains the attribute information Q_n (= Q_1) for each class of the attribute A_n, based on the score C_n calculated in S3-2 and the reliability f(A_n) set above (S8). The obtained attribute information Q_n is stored in the storage unit 12 in association with the person's identification information.

The processing from S2 onward is repeated for the images of the (n+1)-th and subsequent frames (S9, S10). When continuation of the processing becomes unnecessary, for example when the person's ID differs between frames (No in S9), it is determined that the processing is not to be continued, and the processing proceeds to S11. In S11, the attribute determination unit 11c integrates the attribute information Q_n obtained for each frame over the plurality of frames, and determines the person's attribute B based on the result. The steps S5-2 to S9 and S11 described above correspond to the attribute determination step.

FIG. 18 shows an example of the information obtained for the first and second frames. In this example, for the person with ID = 0001, C_1(P_20M) = C_2(P_20M) = 0.8 and C_1(P_40M) = C_2(P_40M) = 0.01. Because the person's posture in the image is "crouching", a posture in which only part of the body is captured, the person recognition unit 11a could not accurately recognize the attributes A_1 and A_2 from the images. As a result, although "male in his 40s" is correct, the scores C_1(P_20M) and C_2(P_20M) for "male in his 20s" are higher than the scores C_1(P_40M) and C_2(P_40M) for "male in his 40s".

Therefore, considering that the person's posture affects recognition of the attributes A_1 and A_2, in the example of FIG. 18 the reliability f(A_1) of the first-frame score C_1 is set to f(A_1) = f_1(A_1) = 0.2, and the reliability f(A_2) of the second-frame score C_2 is set to f(A_2) = f_2(A_2) = 0.2. As a result, the attribute information Q_1, in which the recognition result of the posture is added to the recognition result of the attribute A_1 in the first frame, is Q_1(P_20M) = C_1(P_20M) × f_1(A_1) = 0.8 × 0.2 = 0.16 and Q_1(P_40M) = C_1(P_40M) × f_1(A_1) = 0.01 × 0.2 = 0.002. Similarly, the attribute information Q_2, in which the recognition result of the posture is added to the recognition result of the attribute A_2 in the second frame, is Q_2(P_20M) = C_2(P_20M) × f_2(A_2) = 0.8 × 0.2 = 0.16 and Q_2(P_40M) = C_2(P_40M) × f_2(A_2) = 0.01 × 0.2 = 0.002.

FIG. 19 shows an example of the information obtained for the third frame. In this example, for the person with ID = 0001, the score C_3(P_20M) is 0.05 and the score C_3(P_40M) is 0.9. Because the person's posture in the image is "standing still", a posture in which the whole body is captured, the person recognition unit 11a could accurately recognize the attribute A_3 from the image. As a result, the score C_3(P_40M) for "male in his 40s" is higher than the score C_3(P_20M) for "male in his 20s". This ordering of the scores C_3(P_20M) and C_3(P_40M) is consistent with "male in his 40s" being the correct answer.

Considering that the person's posture has almost no effect on recognition of the attribute A_3, the reliability f(A_3) of the score C_3 is set to f(A_3) = f_3(A_3) = 1.0 in the example of FIG. 19. As a result, the attribute information Q_3, in which the recognition result of the posture is added to the recognition result of the attribute A_3, is obtained for each class: for the "male in his 20s" class, Q_3(P_20M) = C_3(P_20M) × f_3(A_3) = 0.05 × 1.0 = 0.05, and for the "male in his 40s" class, Q_3(P_40M) = C_3(P_40M) × f_3(A_3) = 0.9 × 1.0 = 0.9.

In the above example with three frames (n = 3), suppose that the attribute B is determined without considering the person's posture that affects attribute recognition, that is, without the reliability f(A_n) (Comparative Example 3). The evaluation value Z(P_20M)' of the recognition result for "male in his 20s" over the three frames is then calculated from the scores C_n as follows.
Z(P_20M)' = C_1(P_20M) + C_2(P_20M) + C_3(P_20M)
          = 0.8 + 0.8 + 0.05
          = 1.65
Likewise, the evaluation value Z(P_40M)' of the recognition result for "male in his 40s" over the three frames is calculated from the scores C_n as follows.
Z(P_40M)' = C_1(P_40M) + C_2(P_40M) + C_3(P_40M)
          = 0.01 + 0.01 + 0.9
          = 0.92
From the above, Z(P_20M)' > Z(P_40M)', so in this case the attribute B is determined to be "male in his 20s". In other words, although "male in his 40s" is correct, the attribute B is erroneously determined to be "male in his 20s" over the three frames in total.

これに対して、本実施形態のように、属性認識に影響を与える人物の姿勢（信頼度ｆ（Ａ_n））を考慮して属性Ｂを決定する場合、３フレームトータルでの「２０代男性」の認識結果の評価値Ｚ（Ｐ_20M）は、属性情報Ｑ_nを用いて以下の式で算出される。
Ｚ（Ｐ_20M）＝Ｑ₁（Ｐ_20M）＋Ｑ₂（Ｐ_20M）＋Ｑ₃（Ｐ_20M）
＝Ｃ₁（Ｐ_20M）・ｆ₁（Ａ₁）＋Ｃ₂（Ｐ_20M）・ｆ₂（Ａ₂）
＋Ｃ₃（Ｐ_20M）・ｆ₃（Ａ₃）
＝０．１６＋０．１６＋０．０５
＝０．３７
一方、３フレームトータルでの「４０代男性」の認識結果の評価値Ｚ（Ｐ_40M）は、属性情報Ｑ_nを用いて以下の式で算出される。
Ｚ（Ｐ_40M）＝Ｑ₁（Ｐ_40M）＋Ｑ₂（Ｐ_40M）＋Ｑ₃（Ｐ_40M）
＝Ｃ₁（Ｐ_40M）・ｆ₁（Ａ₁）＋Ｃ₂（Ｐ_40M）・ｆ₂（Ａ₂）
＋Ｃ₃（Ｐ_40M）・ｆ₃（Ａ₃）
＝０．００２＋０．００２＋０．９
＝０．９０４
上記より、Ｚ（Ｐ_20M）＜Ｚ（Ｐ_40M）であるため、属性決定部１１ｃは、３フレームトータルで、人物の属性Ｂは「４０代男性」であると決定する。この場合、決定された属性Ｂは、正しい属性と一致している。
In contrast, when the attribute B is determined in consideration of the posture of the person (reliability f(A_n)) that affects attribute recognition, as in the present embodiment, the evaluation value Z(P_20M) of the recognition result "male in his 20s" over the three frames in total is calculated from the attribute information Q_n as follows.
Z(P_20M) = Q₁(P_20M) + Q₂(P_20M) + Q₃(P_20M)
= C₁(P_20M)·f₁(A₁) + C₂(P_20M)·f₂(A₂)
+ C₃(P_20M)·f₃(A₃)
= 0.16 + 0.16 + 0.05
= 0.37
On the other hand, the evaluation value Z(P_40M) of the recognition result "male in his 40s" over the three frames in total is calculated from the attribute information Q_n as follows.
Z(P_40M) = Q₁(P_40M) + Q₂(P_40M) + Q₃(P_40M)
= C₁(P_40M)·f₁(A₁) + C₂(P_40M)·f₂(A₂)
+ C₃(P_40M)·f₃(A₃)
= 0.002 + 0.002 + 0.9
= 0.904
From the above, since Z(P_20M) < Z(P_40M), the attribute determination unit 11c determines that the attribute B of the person is "male in his 40s" over the three frames in total. In this case, the determined attribute B matches the correct attribute.
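
The two calculations above can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function names `evaluate` and `decide` are hypothetical, the per-frame reliabilities 0.2, 0.2 and 1.0 are inferred from the products in the worked example (0.16 = 0.8 × 0.2), and selecting the class with the largest evaluation value is assumed from the comparison of Z(P_20M) and Z(P_40M).

```python
# Scores C_n per frame for each class, taken from the worked example above.
scores = {
    "20M": [0.8, 0.8, 0.05],
    "40M": [0.01, 0.01, 0.9],
}
# Reliabilities f_n(A_n) per frame; 0.2 is inferred from 0.16 = 0.8 x 0.2.
reliabilities = [0.2, 0.2, 1.0]


def evaluate(scores, weights):
    """Return Z(P) = sum over frames of C_n(P) * weight_n for every class P."""
    return {
        cls: sum(c * w for c, w in zip(per_frame, weights))
        for cls, per_frame in scores.items()
    }


def decide(evaluation):
    """Pick the class with the largest evaluation value (assumed decision rule)."""
    return max(evaluation, key=evaluation.get)


unweighted = evaluate(scores, [1.0, 1.0, 1.0])  # Comparative Example 3
weighted = evaluate(scores, reliabilities)      # present embodiment

print(decide(unweighted))  # 20M
print(decide(weighted))    # 40M
```

With all weights set to 1 the sketch reduces to the comparative example, which makes it easy to see that only the reliabilities change the outcome.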

以上のように、本実施形態においても、属性決定部１１ｃは、各フレームごとに、人物認識部１１ａによる属性Ａ_nの認識結果に、属性Ａ_nの認識に影響を与える事象（ここでは人物の姿勢）の認識結果を加味した属性情報Ｑ_nを各クラスについて求める（Ｓ５－２～Ｓ８）。これにより、属性決定部１１ｃが、各クラスについて、属性情報Ｑ_nを複数フレームで統合した結果に基づいて、人物の属性Ｂを決定する際に（Ｓ１１）、属性Ａ_nの認識に影響を与える姿勢が生じたフレームについては、属性Ａ_nの認識結果の最終的な属性決定への寄与度を相対的に小さくし、属性Ａ_nの認識に影響を与える姿勢が生じていないフレームについては、属性Ａ_nの認識結果の最終的な属性決定への寄与度を相対的に大きくすることができる。その結果、属性Ａ_nの認識に影響を与える姿勢が数フレーム（上記の例では１フレーム目、２フレーム目）にわたって続く場合でも、最終的な属性Ｂの決定に対する上記数フレームの悪影響（属性Ａ_nの認識精度の低下の影響）を低減して、複数フレームのトータルで人物の属性Ｂを精度よく決定することができる。 As described above, in the present embodiment as well, the attribute determination unit 11c obtains, for each frame and for each class, attribute information Q_n in which the recognition result of an event that affects the recognition of the attribute A_n (here, the posture of the person) is taken into account in the recognition result of the attribute A_n by the person recognition unit 11a (S5-2 to S8). As a result, when the attribute determination unit 11c determines the attribute B of the person based on the result of integrating the attribute information Q_n over a plurality of frames for each class (S11), the contribution of the recognition result of the attribute A_n to the final attribute determination can be made relatively small for frames in which a posture affecting the recognition of the attribute A_n occurs, and relatively large for frames in which no such posture occurs. Consequently, even if a posture that affects the recognition of the attribute A_n persists over several frames (the first and second frames in the above example), the adverse effect of those frames on the final determination of the attribute B (the effect of reduced recognition accuracy for the attribute A_n) is reduced, and the attribute B of the person can be determined accurately over the plurality of frames in total.
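
The integration described in this paragraph can be written compactly as follows. This is a notational summary rather than a formula taken from the specification: the class index k, the frame count N, and the use of an argmax for the final decision are assumptions inferred from the worked example, in which the class with the larger evaluation value is selected.

```latex
Q_n(P_k) = C_n(P_k)\, f_n(A_n), \qquad
Z(P_k) = \sum_{n=1}^{N} Q_n(P_k), \qquad
B = \arg\max_{k} Z(P_k)
```

With N = 3, f_1(A_1) = f_2(A_2) = 0.2 and f_3(A_3) = 1.0, this gives Z(P_20M) = 0.37 and Z(P_40M) = 0.904, matching the values in the worked example.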

また、上記の信頼度ｆ（Ａ_n）は、属性Ａ_nの認識に影響を与える事象、つまり、画像内における人物の像から把握される人物の姿勢に基づいて設定されている。これにより、属性決定部１１ｃは、上記信頼度ｆ（Ａ_n）を用いて、人物の姿勢を考慮した適切な属性情報Ｑ_nを取得することができる。 The reliability f(A_n) described above is set based on an event that affects the recognition of the attribute A_n, that is, the posture of the person ascertained from the image of the person in the image. Accordingly, the attribute determination unit 11c can use the reliability f(A_n) to obtain appropriate attribute information Q_n that takes the posture of the person into account.

また、上記の信頼度ｆ（Ａ_n）は、画像内における人物の姿勢が、全身の一部のみが撮影された姿勢であるか否かに基づいて設定されている。これにより、人物の姿勢が、全身が撮影された姿勢である場合とそうでない場合とで信頼度ｆ（Ａ_n）に差を持たせて、人物の姿勢に応じた属性情報Ｑ_nを取得することができる。 Further, the reliability f(A_n) is set based on whether or not the posture of the person in the image is a posture in which only a part of the whole body is photographed. As a result, the reliability f(A_n) differs between the case where the whole body is photographed and the case where it is not, so that attribute information Q_n corresponding to the posture of the person can be obtained.

特に、本実施形態では、画像内における人物の姿勢が、全身の一部のみが撮影された姿勢である場合の信頼度ｆ（Ａ_n）は、画像内における人物の姿勢が、全身が撮影された姿勢である場合の信頼度ｆ（Ａ_n）よりも低く設定されている。これにより、全身の一部のみが撮影された画像に基づき、低い精度で認識される人物の属性Ａ_nについては、その認識結果の信頼度ｆ（Ａ_n）を下げて、最終的な属性Ｂの決定に対する上記認識結果の寄与度を低くした属性情報Ｑ_nを得ることができる。一方、全身が撮影された画像に基づき、高い精度で認識される人物の属性Ａ_nについては、その認識結果の信頼度ｆ（Ａ_n）を上げて、最終的な属性Ｂの決定に対する上記認識結果の寄与度を高めた属性情報Ｑ_nを得ることができる。 In particular, in the present embodiment, the reliability f(A_n) for a posture in which only a part of the whole body is photographed is set lower than the reliability f(A_n) for a posture in which the whole body is photographed. As a result, for an attribute A_n that is recognized with low accuracy from an image showing only a part of the whole body, the reliability f(A_n) of the recognition result is lowered, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is reduced. Conversely, for an attribute A_n that is recognized with high accuracy from an image showing the whole body, the reliability f(A_n) of the recognition result is raised, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is increased.

＜実施の形態４＞ <Embodiment 4>
本実施形態では、属性Ａ_nの認識に影響を与える事象として、人物の位置を考えている点で実施の形態１と共通しているが、画像内での複数人の人物の像の位置関係、つまり、各人物矩形の位置関係を加味して属性Ｂを決定している点で、実施の形態１とは異なっている。以下、実施の形態１と異なる部分について説明する。 This embodiment is the same as Embodiment 1 in that the position of a person is considered as an event that affects the recognition of the attribute A_n. It differs from Embodiment 1 in that the attribute B is determined by taking into account the positional relationship between the images of a plurality of persons in the image, that is, the positional relationship between the person rectangles. The parts that differ from Embodiment 1 are described below.

図２０は、２人の人物を上方から撮影した任意のフレームの画像２ａを模式的に示している。例えば、店舗内（実空間）において、２人の人物が物理的に密着していたり、一方の人物が他方の人物に密着せずに覆いかぶさる状態であった場合には、２人の人物を上方から撮影して得られる画像２ａでは、同図のように、２人の人物の像Ｐａ・Ｐｂが互いに重なる。その結果、図２１に示すように、画像２ａ内では、２人の人物の像Ｐａ・Ｐｂの位置を規定する人物矩形Ｒａ・Ｒｂも互いに重なる。この場合、人物矩形Ｒａ内の情報のうち、人物矩形Ｒｂと重なる部分の情報は、人物矩形Ｒｂ内の像Ｐｂに対応する人物の属性の認識に影響を及ぼす。同様に、人物矩形Ｒｂ内の情報のうち、人物矩形Ｒａと重なる部分の情報は、人物矩形Ｒａ内の像Ｐａに対応する人物の属性の認識に影響を及ぼす。その結果、双方の人物の属性の認識精度が低下する可能性がある。一方、画像２ａ内で各人物矩形Ｒａ・Ｒｂが離れている場合は、各人物矩形Ｒａ・Ｒｂ内の情報が、各人物の属性認識に互いに影響を及ぼすことはなく、各人物の属性の認識精度は向上する。 FIG. 20 schematically shows an arbitrary frame image 2a of two persons photographed from above. For example, in a store (real space), if two people are physically in close contact, or if one person is not in close contact with the other and is covering the other, the two people In the image 2a obtained by photographing from above, images Pa and Pb of two persons overlap each other as shown in the figure. As a result, as shown in FIG. 21, the person rectangles Ra and Rb that define the positions of the images Pa and Pb of the two persons also overlap each other in the image 2a. In this case, of the information within the person rectangle Ra, the information of the portion overlapping the person rectangle Rb affects the recognition of the attribute of the person corresponding to the image Pb within the person rectangle Rb. Similarly, of the information within the person rectangle Rb, the information of the portion overlapping the person rectangle Ra affects the recognition of the attribute of the person corresponding to the image Pa within the person rectangle Ra. As a result, there is a possibility that the recognition accuracy of the attributes of both persons will decrease. On the other hand, when the person rectangles Ra and Rb are separated from each other in the image 2a, the information in the person rectangles Ra and Rb do not mutually affect the attribute recognition of each person. Accuracy is improved.

そこで、本実施形態では、属性決定部１１ｃは、画像２ａ内で、人物の像Ｐａの位置を規定する一の人物矩形Ｒａが、他の人物の像Ｐｂの位置を規定する他の人物矩形Ｒｂと重なっている場合には、信頼度ｆ（Ａ_n）を１未満に設定し、一の人物矩形Ｒａが他の人物矩形Ｒｂと離れている場合には、信頼度ｆ（Ａ_n）を１に設定して、各フレームごとに属性情報Ｑ_nを各クラスについて求め、求めた属性情報Ｑ_nを複数フレームで統合した結果に基づいて、人物の属性Ｂを決定するようにしている。 Therefore, in the present embodiment, the attribute determination unit 11c sets the reliability f(A_n) to less than 1 when, in the image 2a, one person rectangle Ra defining the position of the person image Pa overlaps another person rectangle Rb defining the position of another person image Pb, and sets the reliability f(A_n) to 1 when the one person rectangle Ra is separated from the other person rectangle Rb. It then obtains the attribute information Q_n for each class in each frame, and determines the attribute B of the person based on the result of integrating the obtained attribute information Q_n over a plurality of frames.

図２２は、本実施形態の属性決定システム１における処理の流れを示すフローチャートである。なお、図２２のフローチャートは、図７のフローチャートのＳ３およびＳ５を、それぞれＳ３－３およびＳ５－３に置き換えたものである。なお、ここでは、実施の形態１と同様に、属性を判断する対象となる人物（ＩＤ＝０００１の人物）は、「４０代男性」であるとする（「４０代男性」が属性として正解であるとする）。 FIG. 22 is a flowchart showing the flow of processing in the attribute determination system 1 of this embodiment. The flowchart of FIG. 22 is obtained by replacing S3 and S5 in the flowchart of FIG. 7 with S3-3 and S5-3, respectively. Here, as in Embodiment 1, the person whose attribute is to be determined (the person with ID=0001) is assumed to be a "male in his 40s" (that is, "male in his 40s" is the correct attribute).

Ｓ３－３（人物認識工程）では、人物認識部１１ａは、実施の形態１と同様の手法で、２人の人物を上方から撮影したｎフレーム目の画像に基づいて、２人の人物の像を認識し、一方の人物の人物矩形Ｒ_naと、その人物矩形Ｒ_na内の像に対応する人物の属性Ａ_nと、属性Ａ_nの認識に影響を与える事象（ここでは他の人物の人物矩形Ｒ_nb）とを認識するとともに、属性Ａ_nの認識結果の確からしさを示すスコアＣ_nを算出する（人物認識工程）。得られた認識結果およびスコアＣ_nは、記憶部１２に記憶される。Ｓ４では、一の人物矩形Ｒ_naの人物に、ＩＤ＝０００１の識別番号を付与し、他の人物矩形Ｒ_nbの人物に、ＩＤ＝０００２の識別番号を付与する。 In S3-3 (person recognition step), the person recognition unit 11a uses the same method as in Embodiment 1 to recognize the images of two persons based on the n-th frame image in which the two persons are photographed from above. It recognizes the person rectangle R_na of one person, the attribute A_n of the person corresponding to the image within the person rectangle R_na, and an event that affects the recognition of the attribute A_n (here, the person rectangle R_nb of the other person), and calculates a score C_n indicating the likelihood of the recognition result of the attribute A_n (person recognition step). The obtained recognition results and score C_n are stored in the storage unit 12. In S4, the identification number ID=0001 is assigned to the person of the one person rectangle R_na, and the identification number ID=0002 is assigned to the person of the other person rectangle R_nb.

Ｓ５－３では、属性決定部１１ｃは、Ｓ３－３での属性Ａ_nの認識に影響を与える事象の認識結果（他の人物矩形Ｒ_nbが一の人物矩形Ｒ_naと重なっているか否か）に対応してスコアＣ_nの信頼度ｆ（Ａ_n）を設定する（Ｓ５－３～Ｓ７）。つまり、画像内で、一の人物矩形Ｒ_naと他の人物矩形Ｒ_nbとが重なっており、他の人物矩形Ｒ_nbが一の人物矩形Ｒ_na内の像に対応する人物（ＩＤ＝０００１）の属性認識に影響を与える場合（Ｓ５－３でＹｅｓ）、属性決定部１１ｃは、上記認識結果に対応して、スコアＣ_nの信頼度ｆ（Ａ_n）を１未満に設定する（Ｓ６）。一方、画像内で、一の人物矩形Ｒ_naと他の人物矩形Ｒ_nbとが離れており、他の人物矩形Ｒ_nbが一の人物矩形Ｒ_na内の像に対応する人物の属性認識に影響を与えない場合（Ｓ５－３でＮｏ）、属性決定部１１ｃは、スコアＣ_nの信頼度ｆ（Ａ_n）を１に設定する（Ｓ７）。 In S5-3, the attribute determination unit 11c sets the reliability f(A_n) of the score C_n in accordance with the recognition result, obtained in S3-3, of the event that affects the recognition of the attribute A_n (whether or not the other person rectangle R_nb overlaps the one person rectangle R_na) (S5-3 to S7). That is, when the one person rectangle R_na and the other person rectangle R_nb overlap in the image, so that the other person rectangle R_nb affects the attribute recognition of the person (ID=0001) corresponding to the image within the one person rectangle R_na (Yes in S5-3), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to less than 1 in accordance with that recognition result (S6). On the other hand, when the one person rectangle R_na and the other person rectangle R_nb are separated in the image, so that the other person rectangle R_nb does not affect the attribute recognition of the person corresponding to the image within the one person rectangle R_na (No in S5-3), the attribute determination unit 11c sets the reliability f(A_n) of the score C_n to 1 (S7).
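
The branch in S5-3 to S7 can be sketched in Python as follows. This is a minimal illustration under assumptions that are not in the specification: person rectangles are taken to be axis-aligned boxes given by pixel coordinates, the names `Rect`, `rects_overlap` and `reliability` are hypothetical, rectangles that merely touch at an edge are treated as separated, and the default value 0.1 for the overlapping case is borrowed from the FIG. 23 example (the text only requires a value below 1).

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned person rectangle in image coordinates (assumed layout)."""
    left: float
    top: float
    right: float
    bottom: float


def rects_overlap(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share a region of non-zero area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def reliability(r_na: Rect, r_nb: Rect, overlap_value: float = 0.1) -> float:
    """S5-3 to S7: a value below 1 when the rectangles overlap, otherwise 1."""
    return overlap_value if rects_overlap(r_na, r_nb) else 1.0


r_na = Rect(100, 100, 200, 260)
overlapping = Rect(180, 120, 280, 280)  # intersects r_na
separate = Rect(300, 100, 400, 260)     # lies entirely to the right of r_na

print(reliability(r_na, overlapping))  # 0.1
print(reliability(r_na, separate))     # 1.0
```

A graded reliability, for example one that decreases with the overlapping area, would also satisfy the "less than 1" condition; the binary form is used here only because it mirrors the flowchart branch.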

次に、属性決定部１１ｃは、Ｓ３－３で算出されたスコアＣ_nと、上記で設定した信頼度ｆ（Ａ_n）とに基づいて、属性情報Ｑ_n（＝Ｑ₁）を属性Ａ_nのクラスごとに求める（Ｓ８）。求めた属性情報Ｑ_nは、記憶部１２に人物の識別情報（ＩＤ＝０００１）と対応付けて記憶される。 Next, the attribute determination unit 11c obtains the attribute information Q_n (= Q₁) for each class of the attribute A_n based on the score C_n calculated in S3-3 and the reliability f(A_n) set above (S8). The obtained attribute information Q_n is stored in the storage unit 12 in association with the identification information of the person (ID=0001).

（ｎ＋１）フレーム目以降の画像についてもＳ２以降の処理を繰り返し（Ｓ９、Ｓ１０）、処理の継続が不要となった時点でＳ１１に移行する。Ｓ１１では、属性決定部１１ｃは、各フレームごとに求めた属性情報Ｑ_nを複数フレームで統合し、その結果に基づいて人物の属性Ｂを決定する。なお、上述したＳ５－３～Ｓ９、Ｓ１１の工程は、属性決定工程に対応する。 The processing from S2 onward is repeated for the (n+1)-th and subsequent frames (S9, S10), and the process proceeds to S11 once the processing no longer needs to continue. In S11, the attribute determination unit 11c integrates the attribute information Q_n obtained for each frame over a plurality of frames, and determines the attribute B of the person based on the result. The steps S5-3 to S9 and S11 described above correspond to the attribute determination step.
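
The per-frame loop and the final integration in S11 can be sketched as a running accumulation keyed by person ID. The sketch below is hypothetical: the function name `integrate`, the input layout (one dictionary per frame mapping a person ID to that frame's scores and reliability), and the choice of the largest total as the decided attribute are all assumptions made for illustration. The sample numbers are those of the FIG. 23 and FIG. 24 example for the person with ID=0001.

```python
from collections import defaultdict


def integrate(frames):
    """Accumulate Q_n = C_n * f_n per person and class, then decide per person.

    frames: iterable of dicts {person_id: (scores, f)}, where scores maps a
    class label to the score C_n and f is the reliability set for that frame.
    Returns {person_id: class label with the largest accumulated value}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for observations in frames:                    # repeat per frame (S9, S10)
        for person_id, (scores, f) in observations.items():
            for cls, c in scores.items():          # attribute information (S8)
                totals[person_id][cls] += c * f
    # Integration over frames and decision (S11).
    return {pid: max(z, key=z.get) for pid, z in totals.items()}


frames = [
    {"0001": ({"20M": 0.7, "40M": 0.01}, 0.1)},   # frame 1: rectangles overlap
    {"0001": ({"20M": 0.7, "40M": 0.01}, 0.1)},   # frame 2: rectangles overlap
    {"0001": ({"20M": 0.01, "40M": 0.9}, 1.0)},   # frame 3: rectangles separated
]

print(integrate(frames))  # {'0001': '40M'}
```

Keying the accumulator by person ID reflects the role of the person identification step: only frames judged to show the same person contribute to that person's totals.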

図２３は、１フレーム目および２フレーム目について得られた情報の一例を示している。この例では、ＩＤ＝０００１の人物について、Ｃ₁（Ｐ_20M）＝Ｃ₂（Ｐ_20M）＝０．７であり、Ｃ₁（Ｐ_40M）＝Ｃ₂（Ｐ_40M）＝０．０１となっている。画像中で人物矩形Ｒ_na・Ｒ_nbが重なっており、人物認識部１１ａが、人物矩形Ｒ_na内の像に対応する人物（ＩＤ＝０００１）の属性Ａ₁・Ａ₂の認識を精度よく行うことができなかった結果、「４０代男性」が正解であるにもかかわらず、「４０代男性」のスコアＣ₁（Ｐ_40M）およびＣ₂（Ｐ_40M）よりも、「２０代男性」のスコアＣ₁（Ｐ_20M）およびＣ₂（Ｐ_20M）のほうが高くなっている。 FIG. 23 shows an example of the information obtained for the first and second frames. In this example, for the person with ID=0001, C₁(P_20M) = C₂(P_20M) = 0.7 and C₁(P_40M) = C₂(P_40M) = 0.01. Because the person rectangles R_na and R_nb overlap in the image, the person recognition unit 11a could not accurately recognize the attributes A₁ and A₂ of the person (ID=0001) corresponding to the image within the person rectangle R_na. As a result, even though "male in his 40s" is the correct answer, the scores C₁(P_20M) and C₂(P_20M) for "male in his 20s" are higher than the scores C₁(P_40M) and C₂(P_40M) for "male in his 40s".

そこで、人物矩形Ｒ_nbが、人物（ＩＤ＝０００１）の属性Ａ₁・Ａ₂の認識に影響を与える位置にあることを考慮し、図２３の例では、１フレーム目のスコアＣ₁の信頼度ｆ（Ａ₁）を、ｆ（Ａ₁）＝ｆ₁（Ａ₁）＝０．１に設定し、２フレーム目のスコアＣ₂の信頼度ｆ（Ａ₂）を、ｆ（Ａ₂）＝ｆ₂（Ａ₂）＝０．１に設定している。これにより、１フレーム目の属性Ａ₁の認識結果に上記人物矩形Ｒ_nbの認識結果（位置）を加味した属性情報Ｑ₁として、Ｑ₁（Ｐ_20M）＝Ｃ₁（Ｐ_20M）×ｆ₁（Ａ₁）＝０．７×０．１＝０．０７が得られており、Ｑ₁（Ｐ_40M）＝Ｃ₁（Ｐ_40M）×ｆ₁（Ａ₁）＝０．０１×０．１＝０．００１が得られている。また、２フレーム目の属性Ａ₂の認識結果に上記人物矩形Ｒ_nbの認識結果（位置）を加味した属性情報Ｑ₂として、Ｑ₂（Ｐ_20M）＝Ｃ₂（Ｐ_20M）×ｆ₂（Ａ₂）＝０．７×０．１＝０．０７が得られており、Ｑ₂（Ｐ_40M）＝Ｃ₂（Ｐ_40M）×ｆ₂（Ａ₂）＝０．０１×０．１＝０．００１が得られている。 Considering that the person rectangle R_nb is located at a position that affects the recognition of the attributes A₁ and A₂ of the person (ID=0001), in the example of FIG. 23 the reliability f(A₁) of the score C₁ of the first frame is set to f(A₁) = f₁(A₁) = 0.1, and the reliability f(A₂) of the score C₂ of the second frame is set to f(A₂) = f₂(A₂) = 0.1. As a result, the attribute information Q₁, in which the recognition result (position) of the person rectangle R_nb is taken into account in the recognition result of the attribute A₁ of the first frame, is Q₁(P_20M) = C₁(P_20M) × f₁(A₁) = 0.7 × 0.1 = 0.07 and Q₁(P_40M) = C₁(P_40M) × f₁(A₁) = 0.01 × 0.1 = 0.001. Likewise, the attribute information Q₂ for the second frame is Q₂(P_20M) = C₂(P_20M) × f₂(A₂) = 0.7 × 0.1 = 0.07 and Q₂(P_40M) = C₂(P_40M) × f₂(A₂) = 0.01 × 0.1 = 0.001.

図２４は、３フレーム目について得られた情報の一例を示している。この例では、ＩＤ＝０００１の人物について、スコアＣ₃（Ｐ_20M）が０．０１であり、スコアＣ₃（Ｐ_40M）が０．９となっている。画像中で人物矩形Ｒ_na・Ｒ_nbが互いに離れており、人物認識部１１ａが上記画像（人物矩形Ｒ_na内の人物（ＩＤ＝０００１）の像）に基づいて人物の属性Ａ₃の認識を精度よく行うことができた結果、「２０代男性」のスコアＣ₃（Ｐ_20M）よりも、「４０代男性」のスコアＣ₃（Ｐ_40M）のほうが高くなっている。これらのスコアＣ₃（Ｐ_20M）およびＣ₃（Ｐ_40M）の大小関係は、「４０代男性」を正解とする答えと対応する関係と言える。 FIG. 24 shows an example of the information obtained for the third frame. In this example, for the person with ID=0001, the score C₃(P_20M) is 0.01 and the score C₃(P_40M) is 0.9. Because the person rectangles R_na and R_nb are separated from each other in the image, the person recognition unit 11a could accurately recognize the attribute A₃ of the person based on the image (the image of the person (ID=0001) within the person rectangle R_na). As a result, the score C₃(P_40M) for "male in his 40s" is higher than the score C₃(P_20M) for "male in his 20s". The magnitude relationship between the scores C₃(P_20M) and C₃(P_40M) is consistent with "male in his 40s" being the correct answer.

人物矩形Ｒ_nbが、人物（ＩＤ＝０００１）の属性Ａ₃の認識にほとんど影響を与えない位置であることを考慮し、図２４の例では、スコアＣ₃の信頼度ｆ（Ａ₃）を、ｆ（Ａ₃）＝ｆ₃（Ａ₃）＝１．０に設定している。これにより、各クラスについて、人物（ＩＤ＝０００１）の属性Ａ₃の認識結果に上記人物矩形Ｒ_nbの認識結果を加味した属性情報Ｑ₃として、「２０代男性」のクラスについては、Ｑ₃（Ｐ_20M）＝Ｃ₃（Ｐ_20M）×ｆ₃（Ａ₃）＝０．０１×１．０＝０．０１が得られており、「４０代男性」のクラスについては、Ｑ₃（Ｐ_40M）＝Ｃ₃（Ｐ_40M）×ｆ₃（Ａ₃）＝０．９×１．０＝０．９が得られている。 Considering that the person rectangle R_nb is at a position that hardly affects the recognition of the attribute A₃ of the person (ID=0001), in the example of FIG. 24 the reliability f(A₃) of the score C₃ is set to f(A₃) = f₃(A₃) = 1.0. As a result, the attribute information Q₃, in which the recognition result of the person rectangle R_nb is taken into account in the recognition result of the attribute A₃ of the person (ID=0001), is Q₃(P_20M) = C₃(P_20M) × f₃(A₃) = 0.01 × 1.0 = 0.01 for the "male in his 20s" class and Q₃(P_40M) = C₃(P_40M) × f₃(A₃) = 0.9 × 1.0 = 0.9 for the "male in his 40s" class.

フレーム数ｎが３つの上記の例において、仮に、属性認識に影響を与える人物矩形Ｒ_nbの位置（信頼度ｆ（Ａ_n））を考慮せずに属性Ｂを決定する場合（比較例４とする）、３フレームトータルでの「２０代男性」の認識結果の評価値Ｚ（Ｐ_20M）’は、スコアＣ_nを用いて以下の式で算出される。
Ｚ（Ｐ_20M）’＝Ｃ₁（Ｐ_20M）＋Ｃ₂（Ｐ_20M）＋Ｃ₃（Ｐ_20M）
＝０．７＋０．７＋０．０１
＝１．４１
一方、３フレームトータルでの「４０代男性」の認識結果の評価値Ｚ（Ｐ_40M）’ は、スコアＣ_nを用いて以下の式で算出される。
Ｚ（Ｐ_40M）’＝Ｃ₁（Ｐ_40M）＋Ｃ₂（Ｐ_40M）＋Ｃ₃（Ｐ_40M）
＝０．０１＋０．０１＋０．９
＝０．９２
上記より、Ｚ（Ｐ_20M）’＞Ｚ（Ｐ_40M）’であるため、この場合は、属性Ｂが「２０代男性」と決定されることになる。つまり、「４０代男性」が正解であるにもかかわらず、３フレームトータルでは、属性Ｂは「２０代男性」と誤った決定がされることになる。
In the above example where the number of frames n is 3, suppose that the attribute B is determined without considering the position of the person rectangle R_nb (reliability f(A_n)) that affects attribute recognition (Comparative Example 4). The evaluation value Z(P_20M)′ of the recognition result "male in his 20s" over the three frames in total is then calculated from the scores C_n as follows.
Z(P_20M)′ = C₁(P_20M) + C₂(P_20M) + C₃(P_20M)
= 0.7 + 0.7 + 0.01
= 1.41
On the other hand, the evaluation value Z(P_40M)′ of the recognition result "male in his 40s" over the three frames in total is calculated from the scores C_n as follows.
Z(P_40M)′ = C₁(P_40M) + C₂(P_40M) + C₃(P_40M)
= 0.01 + 0.01 + 0.9
= 0.92
From the above, Z(P_20M)′ > Z(P_40M)′, so in this case the attribute B is determined to be "male in his 20s". In other words, even though the correct answer is "male in his 40s", the attribute B is erroneously determined to be "male in his 20s" over the three frames in total.

これに対して、本実施形態のように、属性認識に影響を与える人物矩形Ｒ_nbの位置（信頼度ｆ（Ａ_n））を考慮して属性Ｂを決定する場合、３フレームトータルでの「２０代男性」の認識結果の評価値Ｚ（Ｐ_20M）は、属性情報Ｑ_nを用いて以下の式で算出される。
Ｚ（Ｐ_20M）＝Ｑ₁（Ｐ_20M）＋Ｑ₂（Ｐ_20M）＋Ｑ₃（Ｐ_20M）
＝Ｃ₁（Ｐ_20M）・ｆ₁（Ａ₁）＋Ｃ₂（Ｐ_20M）・ｆ₂（Ａ₂）
＋Ｃ₃（Ｐ_20M）・ｆ₃（Ａ₃）
＝０．０７＋０．０７＋０．０１
＝０．１５
一方、３フレームトータルでの「４０代男性」の認識結果の評価値Ｚ（Ｐ_40M）は、属性情報Ｑ_nを用いて以下の式で算出される。
Ｚ（Ｐ_40M）＝Ｑ₁（Ｐ_40M）＋Ｑ₂（Ｐ_40M）＋Ｑ₃（Ｐ_40M）
＝Ｃ₁（Ｐ_40M）・ｆ₁（Ａ₁）＋Ｃ₂（Ｐ_40M）・ｆ₂（Ａ₂）
＋Ｃ₃（Ｐ_40M）・ｆ₃（Ａ₃）
＝０．００１＋０．００１＋０．９
＝０．９０２
上記より、Ｚ（Ｐ_20M）＜Ｚ（Ｐ_40M）であるため、属性決定部１１ｃは、３フレームトータルで、人物の属性Ｂは「４０代男性」であると決定する。この場合、決定された属性Ｂは、正しい属性と一致している。
In contrast, when the attribute B is determined in consideration of the position of the person rectangle R_nb (reliability f(A_n)) that affects attribute recognition, as in the present embodiment, the evaluation value Z(P_20M) of the recognition result "male in his 20s" over the three frames in total is calculated from the attribute information Q_n as follows.
Z(P_20M) = Q₁(P_20M) + Q₂(P_20M) + Q₃(P_20M)
= C₁(P_20M)·f₁(A₁) + C₂(P_20M)·f₂(A₂)
+ C₃(P_20M)·f₃(A₃)
= 0.07 + 0.07 + 0.01
= 0.15
On the other hand, the evaluation value Z(P_40M) of the recognition result "male in his 40s" over the three frames in total is calculated from the attribute information Q_n as follows.
Z(P_40M) = Q₁(P_40M) + Q₂(P_40M) + Q₃(P_40M)
= C₁(P_40M)·f₁(A₁) + C₂(P_40M)·f₂(A₂)
+ C₃(P_40M)·f₃(A₃)
= 0.001 + 0.001 + 0.9
= 0.902
From the above, since Z(P_20M) < Z(P_40M), the attribute determination unit 11c determines that the attribute B of the person is "male in his 40s" over the three frames in total. In this case, the determined attribute B matches the correct attribute.
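
As a supplementary check that is not part of the specification, the same scores can be used to see how small the reliability for the overlapping frames has to be for the correct class to win. Assuming a common reliability f for frames 1 and 2 and a reliability of 1 for frame 3:

```latex
\begin{aligned}
Z(P_{20M}) &= 0.7f + 0.7f + 0.01 = 1.4f + 0.01,\\
Z(P_{40M}) &= 0.01f + 0.01f + 0.9 = 0.02f + 0.9,\\
Z(P_{20M}) < Z(P_{40M}) &\iff 1.38f < 0.89 \iff f < \tfrac{0.89}{1.38} \approx 0.645.
\end{aligned}
```

For this particular set of scores, any reliability below roughly 0.645 for the overlapping frames would yield "male in his 40s"; the value 0.1 used in the example satisfies this with a wide margin.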

以上のように、本実施形態においても、属性決定部１１ｃは、各フレームごとに、人物認識部１１ａによる属性Ａ_nの認識結果に、属性Ａ_nの認識に影響を与える事象（人物の位置（特に人物矩形Ｒ_na・Ｒ_nbの重なり））の認識結果を加味した属性情報Ｑ_nを各クラスについて求める（Ｓ５－３～Ｓ８）。これにより、属性決定部１１ｃが、各クラスについて、属性情報Ｑ_nを複数フレームで統合した結果に基づいて、人物の属性Ｂを決定する際に（Ｓ１１）、属性Ａ_nの認識に影響を与える事象が生じたフレーム（画像内で人物矩形Ｒ_na・Ｒ_nbが重なっているフレーム）については、属性Ａ_nの認識結果の最終的な属性決定への寄与度を相対的に小さくし、属性Ａ_nの認識に影響を与える事象が生じていないフレーム（画像内で人物矩形Ｒ_na・Ｒ_nbが離れているフレーム）については、属性Ａ_nの認識結果の最終的な属性決定への寄与度を相対的に大きくすることができる。その結果、属性Ａ_nの認識に影響を与える事象（人物矩形Ｒ_na・Ｒ_nbの重なり）が数フレーム（上記の例では１フレーム目、２フレーム目）にわたって続く場合でも、最終的な属性Ｂの決定に対する上記数フレームの悪影響（属性Ａ_nの認識精度の低下の影響）を低減して、複数フレームのトータルで人物の属性Ｂを精度よく決定することができる。 As described above, in the present embodiment as well, the attribute determination unit 11c obtains, for each frame and for each class, attribute information Q_n in which the recognition result of an event that affects the recognition of the attribute A_n (the position of the person, in particular the overlap of the person rectangles R_na and R_nb) is taken into account in the recognition result of the attribute A_n by the person recognition unit 11a (S5-3 to S8). As a result, when the attribute determination unit 11c determines the attribute B of the person based on the result of integrating the attribute information Q_n over a plurality of frames for each class (S11), the contribution of the recognition result of the attribute A_n to the final attribute determination can be made relatively small for frames in which an event affecting the recognition of the attribute A_n occurs (frames in which the person rectangles R_na and R_nb overlap in the image), and relatively large for frames in which no such event occurs (frames in which the person rectangles R_na and R_nb are separated in the image). Consequently, even if the event that affects the recognition of the attribute A_n (the overlap of the person rectangles R_na and R_nb) persists over several frames (the first and second frames in the above example), the adverse effect of those frames on the final determination of the attribute B (the effect of reduced recognition accuracy for the attribute A_n) is reduced, and the attribute B of the person can be determined accurately over the plurality of frames in total.

また、上記の信頼度ｆ（Ａ_n）は、画像内で、一の人物矩形Ｒ_naが他の人物矩形Ｒ_nbと重なっているか否かに基づいて設定されている。これにより、画像内で、一の人物矩形Ｒ_naが他の人物矩形Ｒ_nbと重なっている場合とそうでない場合とで信頼度ｆ（Ａ_n）に差を持たせて、人物矩形Ｒ_na・Ｒ_nbの位置に応じた属性情報Ｑ_nを取得することができる。 Further, the reliability f(A_n) is set based on whether or not the one person rectangle R_na overlaps the other person rectangle R_nb in the image. As a result, the reliability f(A_n) differs between the case where the one person rectangle R_na overlaps the other person rectangle R_nb in the image and the case where it does not, so that attribute information Q_n corresponding to the positions of the person rectangles R_na and R_nb can be obtained.

特に、本実施形態では、画像内で、一の人物矩形Ｒ_naが他の人物矩形Ｒ_nbと重なっている場合の信頼度ｆ（Ａ_n）は、一の人物矩形Ｒ_naが他の人物矩形Ｒ_nbから離れている場合の信頼度ｆ（Ａ_n）よりも低く設定されている。これにより、人物矩形Ｒ_na・Ｒ_nbが重なっている画像に基づき、低い精度で認識される人物の属性Ａ_nについては、その認識結果の信頼度ｆ（Ａ_n）を下げて、最終的な属性Ｂの決定に対する上記認識結果の寄与度を低くした属性情報Ｑ_nを得ることができる。一方、人物矩形Ｒ_na・Ｒ_nbが離れている画像に基づき、高い精度で認識される人物の属性Ａ_nについては、その認識結果の信頼度ｆ（Ａ_n）を上げて、最終的な属性Ｂの決定に対する上記認識結果の寄与度を高めた属性情報Ｑ_nを得ることができる。 In particular, in the present embodiment, the reliability f(A_n) when the one person rectangle R_na overlaps the other person rectangle R_nb in the image is set lower than the reliability f(A_n) when the one person rectangle R_na is separated from the other person rectangle R_nb. As a result, for an attribute A_n that is recognized with low accuracy from an image in which the person rectangles R_na and R_nb overlap, the reliability f(A_n) of the recognition result is lowered, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is reduced. Conversely, for an attribute A_n that is recognized with high accuracy from an image in which the person rectangles R_na and R_nb are separated, the reliability f(A_n) of the recognition result is raised, yielding attribute information Q_n in which the contribution of that recognition result to the final determination of the attribute B is increased.

＜プログラムおよび記録媒体＞ <Program and recording medium>
以上の各実施の形態で説明した属性決定装置３は、例えば、所定のプログラム（アプリケーションソフトウェア）をインストールしたコンピュータ（ＰＣ）で構成することができる。上記プログラムをコンピュータ（例えばＣＰＵとしての制御部１６）が読み取って実行することにより、属性決定装置３の各部を動作させて上述した各処理（各工程）を実行させることができる。このようなプログラムは、例えばネットワークを介して外部からダウンロードすることによって取得されて記憶部１２に記憶される。また、上記プログラムは、例えばＣＤ－ＲＯＭ（Compact Disk-Read Only Memory）などのコンピュータ読取可能な記録媒体に記録され、この記録媒体から上記プログラムをコンピュータが読み取って記憶部１２に記憶する形態であってもよい。 The attribute determination device 3 described in each of the above embodiments can be configured by, for example, a computer (PC) on which a predetermined program (application software) is installed. When a computer (for example, the control unit 16 as a CPU) reads and executes the program, each unit of the attribute determination device 3 can be operated to execute each of the processes (steps) described above. Such a program is acquired, for example, by downloading it from outside via a network, and is stored in the storage unit 12. Alternatively, the program may be recorded on a computer-readable recording medium such as a CD-ROM (Compact Disk-Read Only Memory), and the computer may read the program from the recording medium and store it in the storage unit 12.

＜補足＞ <Supplement>
以上の各実施の形態を組み合わせて、複数フレームのトータルで人物の属性を決定することも可能である。例えば、属性の認識に影響を与える事象として、人物の位置、行動、姿勢を適宜組み合わせて信頼度を設定し、上記信頼度とスコアとに基づいて各フレームごとに属性情報を求め、複数フレームで属性情報を統合することによって、属性を決定するようにしてもよい。 The above embodiments can also be combined to determine the attribute of a person over a plurality of frames in total. For example, the reliability may be set by appropriately combining the position, behavior, and posture of the person as events that affect the recognition of the attribute, the attribute information may be obtained for each frame based on that reliability and the score, and the attribute may be determined by integrating the attribute information over a plurality of frames.
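
The text does not state how the individual reliabilities are to be combined. One possible reading, shown purely as an assumption, is to multiply the reliabilities set for position, behavior, and posture and to apply the product to the scores of that frame. The function names `combined_reliability` and `attribute_info`, the multiplication rule, and the restriction of each reliability to the range 0 to 1 are all hypothetical.

```python
def combined_reliability(position=1.0, behavior=1.0, posture=1.0):
    """Combine per-event reliabilities by multiplication (assumed rule)."""
    for value in (position, behavior, posture):
        if not 0.0 <= value <= 1.0:
            raise ValueError("each reliability is expected to lie in [0, 1]")
    return position * behavior * posture


def attribute_info(scores, reliability):
    """Attribute information Q_n = C_n * f_n for every class in one frame."""
    return {cls: c * reliability for cls, c in scores.items()}


# A frame in which the person rectangles overlap (0.1) and only part of the
# body is visible (0.2); the behavior is assumed not to affect recognition.
f = combined_reliability(position=0.1, posture=0.2)
q = attribute_info({"20M": 0.7, "40M": 0.01}, f)
print(q)
```

Under this rule a frame affected by several events at once contributes less than a frame affected by only one of them. Taking the minimum of the individual reliabilities would be an equally plausible alternative.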

以上で説明した本実施形態の属性決定装置、属性決定システムおよび属性決定方法は、以下のように表現されてもよい。また、本実施形態で説明した内容は、以下のプログラムおよび記録媒体を含む。 The attribute determination device, attribute determination system, and attribute determination method of this embodiment described above may be expressed as follows. Further, the contents described in this embodiment include the following programs and recording media.

１．人物を上方から撮影した各フレームの画像に基づいて、前記人物の属性を決定する属性決定装置であって、
各フレームの前記画像に基づいて、前記画像内における前記人物の像の情報を示す人物情報と、前記人物の属性と、前記属性の認識に影響を与える事象とを、各フレームごとに認識する人物認識部と、
各フレームの前記人物情報に基づいて、各フレーム間で前記人物の像が同一人の像であるか否かを判断する人物同定部と、
各フレーム間で前記人物の像が同一人の像であると判断された前記人物に関して、各フレームごとに、前記属性の認識結果に前記事象の認識結果を加味した属性情報を、認識した前記属性の各クラスについて求め、前記各クラスについて、前記属性情報を複数フレームで統合した結果に基づいて、前記人物の前記属性を決定する属性決定部とを備えていることを特徴とする属性決定装置。
1. An attribute determination device that determines an attribute of a person based on an image of each frame in which the person is photographed from above, the device comprising:
a person recognition unit that recognizes, for each frame and based on the image of that frame, person information indicating information on the image of the person in the image, the attribute of the person, and an event that affects recognition of the attribute;
a person identification unit that determines, based on the person information of each frame, whether or not the images of the person in the respective frames are images of the same person; and
an attribute determination unit that, for the person whose images in the respective frames are determined to be images of the same person, obtains for each frame, and for each class of the recognized attribute, attribute information in which the recognition result of the event is taken into account in the recognition result of the attribute, and determines the attribute of the person based on the result of integrating the attribute information over a plurality of frames for each class.

２．前記人物認識部は、各フレームの前記画像に基づいて、前記属性の認識結果の確からしさを示すスコアを算出し、
前記属性決定部は、前記事象の認識結果に対応して前記スコアの信頼度を設定し、前記人物認識部によって算出された前記スコアと、前記信頼度とに基づいて、前記クラスごとに前記属性情報を求めることを特徴とする前記１に記載の属性決定装置。
2. The attribute determination device according to 1 above, wherein the person recognition unit calculates, based on the image of each frame, a score indicating the likelihood of the recognition result of the attribute, and
the attribute determination unit sets a reliability of the score in accordance with the recognition result of the event, and obtains the attribute information for each class based on the score calculated by the person recognition unit and the reliability.

３．前記事象は、前記画像内における前記人物の像の位置を含み、
前記信頼度は、前記人物の像の位置に基づいて設定されていることを特徴とする前記２に記載の属性決定装置。
3. The attribute determination device according to 2 above, wherein the event includes the position of the image of the person in the image, and
the reliability is set based on the position of the image of the person.

４．前記信頼度は、前記画像内における前記人物の像の位置が、全身が撮影された位置であるか否かに基づいて設定されていることを特徴とする前記３に記載の属性決定装置。 4. The attribute determination device according to 3 above, wherein the reliability is set based on whether or not the position of the image of the person in the image is a position at which the whole body is photographed.

５．前記画像内における前記人物の像の位置が、全身が撮影された位置である場合の前記信頼度は、前記画像内における前記人物の像の位置が、全身の一部のみが撮影された位置である場合の前記信頼度よりも高く設定されていることを特徴とする前記４に記載の属性決定装置。 5. The attribute determination device according to 4 above, wherein the reliability when the position of the image of the person in the image is a position at which the whole body is photographed is set higher than the reliability when the position of the image of the person in the image is a position at which only a part of the whole body is photographed.

６．前記信頼度は、前記画像内で、前記人物の像の位置を規定する一の人物矩形が、他の人物の像の位置を規定する他の人物矩形と重なっているか否かに基づいて設定されていることを特徴とする前記３に記載の属性決定装置。 6. The attribute determination device according to 3 above, wherein the reliability is set based on whether or not, in the image, one person rectangle defining the position of the image of the person overlaps another person rectangle defining the position of the image of another person.

７．前記画像内で、前記一の人物矩形が前記他の人物矩形と重なっている場合の前記信頼度は、前記一の人物矩形が前記他の人物矩形から離れている場合の前記信頼度よりも低く設定されていることを特徴とする前記６に記載の属性決定装置。 7. The attribute determination device according to 6 above, wherein the reliability when the one person rectangle overlaps the other person rectangle in the image is set lower than the reliability when the one person rectangle is separated from the other person rectangle.

８．前記事象は、前記画像内における前記人物の像から把握される前記人物の行動を含み、
前記信頼度は、前記人物の行動に基づいて設定されていることを特徴とする前記２から７のいずれかに記載の属性決定装置。
8. The attribute determination device according to any one of 2 to 7 above, wherein the event includes the behavior of the person ascertained from the image of the person in the image, and
the reliability is set based on the behavior of the person.

９．前記信頼度は、前記人物の行動が、動きを伴う行動であるか否かに基づいて設定されていることを特徴とする前記８に記載の属性決定装置。 9. The attribute determination device according to 8 above, wherein the reliability is set based on whether or not the behavior of the person is a behavior involving movement.

１０．前記人物の行動が動きを伴う行動である場合の前記信頼度は、前記人物の行動が滞留行動である場合の前記信頼度よりも低く設定されていることを特徴とする前記９に記載の属性決定装置。 10. The attribute determination device according to 9 above, wherein the reliability when the behavior of the person is a behavior involving movement is set lower than the reliability when the behavior of the person is a staying behavior.

１１．前記事象は、前記画像内における前記人物の像から把握される前記人物の姿勢を含み、
前記信頼度は、前記人物の姿勢に基づいて設定されていることを特徴とする前記２から１０のいずれかに記載の属性決定装置。 11. The event includes the posture of the person grasped from the image of the person in the image,
11. The attribute determination device according to any one of 2 to 10, wherein the reliability is set based on the posture of the person.

１２．前記信頼度は、前記画像内における前記人物の姿勢が、全身の一部のみが撮影された姿勢であるか否かに基づいて設定されていることを特徴とする前記１１に記載の属性決定装置。 12. 12. The attribute determination apparatus according to 11 above, wherein the reliability is set based on whether or not the posture of the person in the image is a posture in which only a part of the whole body is photographed. .

13. The attribute determination device according to 12 above, wherein the reliability when the posture of the person in the image is a posture in which only a part of the whole body is photographed is set lower than the reliability when the posture of the person in the image is a posture in which the whole body is photographed.

14. The attribute determination device according to any one of 1 to 13 above, wherein the attribute of the person is at least one of the age and sex of the person.

15. The attribute determination device according to any one of 1 to 14 above, further comprising a storage unit that stores the attribute determined by the attribute determination unit.

16. An attribute determination system comprising:
the attribute determination device according to any one of 1 to 15 above; and
a management server connected to the attribute determination device via a communication line,
wherein the management server includes a storage unit that stores information sent from the attribute determination device, and
the information includes the attribute determined by the attribute determination unit of the attribute determination device.

17. An attribute determination method for determining an attribute of a person based on the image of each frame in which the person is photographed from above, the method comprising:
a person recognition step of recognizing, for each frame and based on the image of that frame, person information indicating information on the image of the person in the image, an attribute of the person, and an event affecting recognition of the attribute;
a person identification step of determining, based on the person information of each frame, whether the images of the person in different frames are images of the same person; and
an attribute determination step of, for the person whose images are determined to be images of the same person across frames, obtaining for each frame, and for each class of the recognized attribute, attribute information in which the recognition result of the event is factored into the recognition result of the attribute, and determining the attribute of the person based on the result of integrating the attribute information over a plurality of frames for each class.

18. The attribute determination method according to 17 above, wherein
in the person recognition step, a score indicating the likelihood of the recognition result of the attribute is calculated based on the image of each frame, and
in the attribute determination step, a reliability of the score is set corresponding to the recognition result of the event, and the attribute information is obtained for each class based on the score calculated by the person recognition unit and the reliability.

19. The attribute determination method according to 18 above, wherein
the event includes the position of the image of the person within the image, and
the reliability is set based on the position of the image of the person.

20. The attribute determination method according to 19 above, wherein the reliability is set based on whether the position of the image of the person in the image is a position where the whole body is photographed.

21. The attribute determination method according to 20 above, wherein the reliability when the position of the image of the person in the image is a position where the whole body is photographed is set higher than the reliability when the position of the image of the person in the image is a position where only a part of the whole body is photographed.

22. The attribute determination method according to 19 above, wherein the reliability is set based on whether one person rectangle, which defines the position of the image of the person in the image, overlaps another person rectangle, which defines the position of the image of another person.

23. The attribute determination method according to 22 above, wherein the reliability when the one person rectangle overlaps the other person rectangle in the image is set lower than the reliability when the one person rectangle is separated from the other person rectangle.

24. The attribute determination method according to any one of 18 to 23 above, wherein
the event includes the behavior of the person grasped from the image of the person in the image, and
the reliability is set based on the behavior of the person.

25. The attribute determination method according to 24 above, wherein the reliability is set based on whether the behavior of the person is a behavior involving movement.

26. The attribute determination method according to 25 above, wherein the reliability when the behavior of the person is a behavior involving movement is set lower than the reliability when the behavior of the person is a staying behavior.

27. The attribute determination method according to any one of 18 to 26 above, wherein
the event includes the posture of the person grasped from the image of the person in the image, and
the reliability is set based on the posture of the person.

28. The attribute determination method according to 27 above, wherein the reliability is set based on whether the posture of the person in the image is a posture in which only a part of the whole body is photographed.

29. The attribute determination method according to 28 above, wherein the reliability when the posture of the person in the image is a posture in which only a part of the whole body is photographed is set lower than the reliability when the posture of the person in the image is a posture in which the whole body is photographed.

30. The attribute determination method according to any one of 17 to 29 above, wherein the attribute of the person is at least one of the age and sex of the person.

31. The attribute determination method according to any one of 17 to 30 above, further comprising a storage step of storing the attribute determined in the attribute determination step.

32. An attribute determination program for causing a computer to execute the attribute determination method according to any one of 17 to 31 above.

33. A computer-readable recording medium on which the attribute determination program according to 32 above is recorded.
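
As an illustration of how the reliability could be set from recognized events (items 6, 7, 9, 10, 12 and 13 for the device, and items 22, 23, 25, 26, 28 and 29 for the method), the following is a minimal Python sketch, not the implementation of the embodiment. The `Rect` type, the function names, the event labels, and all numeric reliability values are hypothetical; only the ordering of the values (overlapping below separated, moving below staying, partial body below whole body) follows the text.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rect:
    """Axis-aligned person rectangle in pixel coordinates (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float


def rects_overlap(a: Rect, b: Rect) -> bool:
    """Return True when the two rectangles share a region of non-zero area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


# Hypothetical reliability values; only their ordering follows the text.
OVERLAP_RELIABILITY = {True: 0.5, False: 1.0}
BEHAVIOR_RELIABILITY = {"moving": 0.6, "staying": 1.0}
POSTURE_RELIABILITY = {"partial_body": 0.4, "whole_body": 1.0}


def overlap_reliability(target: Rect, others: Sequence[Rect]) -> float:
    """Lower the reliability when the target overlaps any other person rectangle."""
    overlapped = any(rects_overlap(target, other) for other in others)
    return OVERLAP_RELIABILITY[overlapped]
```

A lookup table per event keeps the mapping from recognition result to reliability explicit and easy to tune. How reliabilities from several events would be combined for one frame is not stated in these items, so the sketch leaves them separate.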

Although embodiments of the present invention have been described above, the scope of the present invention is not limited to them, and the invention can be implemented with extensions or modifications that do not depart from its gist.

The present invention is applicable to devices, systems, and methods that determine attributes of a person based on the image of each frame in which the person is photographed from above.

1 attribute determination system
3 attribute determination device
4 management server
11a person recognition unit
11b person identification unit
11c attribute determination unit
12 storage unit
21 storage unit

Claims

1. An attribute determination device that determines an attribute of a person based on the image of each frame in which the person is photographed from above, the device comprising:
a person recognition unit that recognizes, for each frame and based on the image of that frame, person information indicating information on the image of the person in the image, an attribute of the person, and an event affecting recognition of the attribute;
a person identification unit that determines, based on the person information of each frame, whether the images of the person in different frames are images of the same person; and
an attribute determination unit that, for the person whose images are determined to be images of the same person across frames, obtains for each frame, and for each class of the recognized attribute, attribute information in which the recognition result of the event is factored into the recognition result of the attribute, and determines the attribute of the person based on the result of integrating the attribute information over a plurality of frames for each class,
wherein the person recognition unit calculates, based on the image of each frame, a score indicating the likelihood of the recognition result of the attribute, and
the attribute determination unit sets a reliability of the score corresponding to the recognition result of the event, obtains the attribute information for each class by multiplying the score calculated by the person recognition unit by the reliability, and determines the attribute of the person based on the magnitude relation of per-class evaluation values obtained by integrating the attribute information for each class over a plurality of frames.

2. The attribute determination device according to claim 1, wherein
the event includes the position of the image of the person within the image, and
the reliability is set based on the position of the image of the person.

3. The attribute determination device according to claim 2, wherein the reliability is set based on whether the position of the image of the person in the image is a position where the whole body is photographed.

4. The attribute determination device according to claim 3, wherein the reliability when the position of the image of the person in the image is a position where the whole body is photographed is set higher than the reliability when the position of the image of the person in the image is a position where only a part of the whole body is photographed.

5. The attribute determination device according to claim 2, wherein the reliability is set based on whether one person rectangle, which defines the position of the image of the person in the image, overlaps another person rectangle, which defines the position of the image of another person.

6. The attribute determination device according to claim 5, wherein the reliability when the one person rectangle overlaps the other person rectangle in the image is set lower than the reliability when the one person rectangle is separated from the other person rectangle.

7. The attribute determination device according to any one of claims 1 to 6, wherein
the event includes the behavior of the person grasped from the image of the person in the image, and
the reliability is set based on the behavior of the person.

8. The attribute determination device according to claim 7, wherein the reliability is set based on whether or not the action of the person involves movement.

9. The attribute determination device according to claim 8, wherein the reliability when the behavior of the person is a behavior involving movement is set lower than the reliability when the behavior of the person is a staying behavior.

10. The attribute determination device according to any one of claims 1 to 9, wherein
the event includes the posture of the person grasped from the image of the person in the image, and
the reliability is set based on the posture of the person.

11. The attribute determination device according to claim 10, wherein the reliability is set based on whether the posture of the person in the image is a posture in which only a part of the whole body is photographed.

12. The attribute determination device according to claim 11, wherein the reliability when the posture of the person in the image is a posture in which only a part of the whole body is photographed is set lower than the reliability when the posture of the person in the image is a posture in which the whole body is photographed.

13. The attribute determination device according to any one of claims 1 to 12, wherein the attribute of the person is at least one of age and sex of the person.

14. The attribute determination device according to any one of claims 1 to 13, further comprising a storage unit that stores the attribute determined by the attribute determination unit.

15. An attribute determination system comprising:
the attribute determination device according to any one of claims 1 to 14; and
a management server connected to the attribute determination device via a communication line,
wherein the management server includes a storage unit that stores information sent from the attribute determination device, and
the information includes the attribute determined by the attribute determination unit of the attribute determination device.

16. An attribute determination method for determining an attribute of a person based on the image of each frame in which the person is photographed from above, the method comprising:
a person recognition step in which a person recognition unit recognizes, for each frame and based on the image of that frame, person information indicating information on the image of the person in the image, an attribute of the person, and an event affecting recognition of the attribute;
a person identification step in which a person identification unit determines, based on the person information of each frame, whether the images of the person in different frames are images of the same person; and
an attribute determination step in which an attribute determination unit, for the person whose images are determined to be images of the same person across frames, obtains for each frame, and for each class of the recognized attribute, attribute information in which the recognition result of the event is factored into the recognition result of the attribute, and determines the attribute of the person based on the result of integrating the attribute information over a plurality of frames for each class,
wherein, in the person recognition step, the person recognition unit calculates, based on the image of each frame, a score indicating the likelihood of the recognition result of the attribute, and
in the attribute determination step, the attribute determination unit sets a reliability of the score corresponding to the recognition result of the event, obtains the attribute information for each class by multiplying the score calculated in the person recognition step by the reliability, and determines the attribute of the person based on the magnitude relation of per-class evaluation values obtained by integrating the attribute information for each class over a plurality of frames.

17. The attribute determination method according to claim 16, wherein said attribute of said person is at least one of age and sex of said person.

18. The attribute determination method according to claim 16, further comprising a storage step of storing the attribute determined by the attribute determination step.

19. An attribute determination program for causing a computer to execute the attribute determination method according to any one of claims 16 to 18.
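
The attribute determination recited in claims 1 and 16 (multiply each per-class score by the reliability set for that frame, integrate per class over a plurality of frames, then compare the evaluation values) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the claims say only that the attribute information is "integrated", so the use of a plain sum is an assumption, and the names `FrameResult` and `determine_attribute`, the class labels, and the numbers in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One frame of one identified person:
# (per-class scores from recognition, reliability set for that frame).
FrameResult = Tuple[Dict[str, float], float]


def determine_attribute(frames: Iterable[FrameResult]) -> str:
    """Return the class with the largest reliability-weighted evaluation value."""
    totals: Dict[str, float] = defaultdict(float)
    for scores, reliability in frames:
        for cls, score in scores.items():
            # Attribute information for this class in this frame.
            totals[cls] += score * reliability  # integration by summation (assumption)
    if not totals:
        raise ValueError("no frames to integrate")
    # Decide by the magnitude relation of the per-class evaluation values.
    return max(totals, key=totals.get)
```

For example, with the frames `({"male": 0.9, "female": 0.1}, 0.5)`, `({"male": 0.3, "female": 0.7}, 1.0)` and `({"male": 0.4, "female": 0.6}, 1.0)`, the weighted totals are 1.15 for "male" and 1.35 for "female", so the function returns "female". With every reliability set to 1.0 the totals would be 1.6 and 1.4 and the result would be "male", which shows how a low-reliability frame is kept from dominating the decision.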