JP2020135369A

JP2020135369A - Information processing device, system, information processing method, and program

Info

Publication number: JP2020135369A
Application number: JP2019027497A
Authority: JP
Inventors: 潔考高橋; Kiyotaka Takahashi; 英生野呂; Hideo Noro; 俊亮中野; Toshiaki Nakano; 貴久山本; Takahisa Yamamoto; 将由山▲崎▼; Masayoshi Yamazaki; 孝嗣牧田; Takatsugu Makita
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-02-19
Filing date: 2019-02-19
Publication date: 2020-08-31

Abstract

To allow for outputting more appropriate attribute information. SOLUTION: An information processing method is provided, comprising: extracting feature quantities of an object in an input image; identifying a class of the object based on the extracted feature quantities; determining reliability of an identification result; and outputting preset attribute information associated with the class of the object provided that the reliability meets a predefined condition. SELECTED DRAWING: Figure 7

Description

The present invention relates to an information processing device, a system, an information processing method, and a program.

Attribute recognition technology determines attribute information, such as age and gender, of a person in an image. Its applications include marketing, for example grasping which age groups purchase a product, and security. In these applications the technology is used in a wide variety of scenes, and images are captured under various shooting conditions depending on the scene. Here, shooting conditions refer to the state of the object, such as face orientation and facial expression, and to the capture state of the object, such as lighting conditions and occlusion. Under shooting conditions in which occlusion, blown-out highlights, or crushed shadows can occur, attribute recognition results are more likely to degrade than under conditions in which they do not occur.
To compensate for this degradation, attribute recognition techniques combined with object authentication, such as face authentication, have been proposed. Patent Document 1 discloses a technique that estimates an age group from an input image and, when the estimated age group is one for which it is difficult to determine whether the person is subject to an age restriction, performs face authentication processing to make that determination. Patent Document 2 discloses a technique that compares a feature amount representing individual differences of a face extracted from an input image with reference feature amounts stored in advance for each age group, and determines the age group with the highest similarity to be the age of the subject in the input image.

Patent Document 1: Japanese Patent No. 4910717
Patent Document 2: Japanese Patent No. 4521086

However, in the prior art, the accuracy of attribute estimation and object authentication on a captured image of an object may deteriorate depending on the shooting conditions of the image, and inappropriate attribute information may be output.
An object of the present invention is to output more appropriate attribute information.

The information processing apparatus of the present invention includes: extraction means for extracting a feature amount of an object included in an input image; identifying means for identifying a class of the object based on the feature amount extracted by the extraction means; determination means for determining a reliability of the identification result produced by the identifying means; and output means for outputting predetermined attribute information relating to the class of the object identified by the identifying means when the reliability determined by the determination means satisfies a predetermined condition.

According to the present invention, more appropriate attribute information can be output.

FIG. 1 is a diagram showing an example of the system configuration of the information processing system.
FIG. 2 is a diagram showing an example of the hardware configuration of the image processing device.
FIG. 3 is a diagram showing an example of the functional configuration of the image processing device.
FIG. 4 is a flowchart showing an example of the registration process.
FIG. 5 is a diagram showing an example of the registration information.
FIG. 6 is a flowchart showing an example of the collation process.
FIG. 7 is a flowchart showing an example of the output process.
FIG. 8 is a diagram showing an example of the functional configuration of the image processing device.
FIG. 9 is a flowchart showing an example of the output process.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

<Embodiment 1>
The information processing system according to the present embodiment outputs attribute information of an object in an image. Attribute information is information about a corresponding attribute.
In the present embodiment, an example is described in which the information processing system estimates the age of a person in front of a product shelf based on an image captured by a camera installed near the product shelf in a store, and outputs the person's age information to an analysis processing device that analyzes purchasing age groups. Age information is attribute information about the age attribute.
FIG. 1 is a diagram showing an example of the system configuration of the information processing system of the present embodiment. In the present embodiment, the information processing system estimates age as an attribute of a person who approaches the product shelf and outputs the estimated age information. The information processing system includes an image processing device 1, an illumination distribution acquisition device 3, a photographing device 4, and an analysis processing device 5. These elements are communicably connected to one another via a network 2. FIG. 1 shows a person subject to age estimation walking up to the product shelf.
The image processing device 1 is an information processing device, such as a personal computer (PC), a server device, or a tablet device, that performs face authentication processing on the person whose age is to be estimated. Based on the result of the face authentication processing, the image processing device 1 outputs age information to the analysis processing device 5, which analyzes purchasing age groups.
The network 2 is a local area network used for communication between the devices. As another example, the network 2 may be another communication network such as the Internet.

The illumination distribution acquisition device 3 is a camera fitted with a fisheye lens that acquires the illumination distribution in the surrounding three-dimensional space, and it outputs the acquired illumination distribution to the image processing device 1 via the network 2. The illumination distribution is data representing the distribution of illumination around the illumination distribution acquisition device 3 in three-dimensional space, that is, the distribution of illumination direction, illumination intensity, and illumination color temperature. More specifically, the illumination distribution is a spherical or dome-shaped two-dimensional image showing the ambient illumination. In places affected by outside light, it is preferable to use a camera capable of HDR (High Dynamic Range) imaging in order to capture more accurately the distribution of illumination that includes sources with a relatively wide intensity range, such as sunlight. For that reason, the illumination distribution acquisition device 3 is a camera capable of HDR imaging. In the present embodiment, the illumination distribution acquisition device 3 is a camera fitted with a fisheye lens. However, it may be another device as long as it can acquire the ambient illumination distribution. For example, it may be a device that captures a plurality of images using an ordinary camera and a motorized pan head and acquires the ambient illumination distribution by stitching the images together.
The illumination distribution acquisition device 3 may also be a device that acquires the ambient illumination distribution by combining an ordinary camera with a mirror ball, a reflector, or the like. When color temperature is not used in the subsequent processing, the illumination distribution acquisition device 3 only needs to be able to acquire at least the distribution of illumination direction and intensity.

The photographing device 4 is a network camera that includes an optical lens, an image sensor, and a communication unit, and it photographs the person whose age is to be estimated. The photographing device 4 then outputs the captured image to the image processing device 1.
The analysis processing device 5 analyzes the age information output from the image processing device 1.
In the present embodiment there is one photographing device 4, but there may be more than one.
In the present embodiment, the image processing device 1, the illumination distribution acquisition device 3, the photographing device 4, and the analysis processing device 5 are separate devices, but two or more of them may be configured as a single device. For example, the illumination distribution acquisition device 3 and the photographing device 4 may be configured as one device having both functions. Likewise, the image processing device 1 and the analysis processing device 5 may be configured as one device having both functions.

<Hardware configuration>
FIG. 2 shows an example of the hardware configuration of the image processing device 1. The image processing device 1 includes a CPU 11, a ROM 12, a RAM 13, a secondary storage device 14, a communication device 15, a video output device 16, an operation input device 17, and a connection bus 18. These elements are communicably connected to one another via the connection bus 18.
The CPU 11 is a Central Processing Unit that controls the image processing device 1 and executes processing according to programs stored in the ROM 12 or the secondary storage device 14. The ROM 12 is a Read Only Memory, a non-volatile memory that stores control programs and various parameter data. The RAM 13 is a Random Access Memory, a volatile memory that temporarily stores images, control programs, processing results, and the like. The secondary storage device 14 is a rewritable secondary storage device such as a hard disk or flash memory, and stores various programs, images, illumination distributions, age information, and the like.

The communication device 15 is a wired communication unit used for communication with other devices via the network. The communication device 15 may instead be a wireless communication unit. The video output device 16 is a monitor such as a CRT or a TFT liquid crystal display. The CPU 11 displays images acquired from the RAM 13, execution results of control programs, and the like on the video output device 16.
The operation input device 17 is an input device, such as a keyboard or a mouse, that accepts operations from a user. The connection bus 18 is a bus that connects the elements of the image processing device 1.
When the CPU 11 executes processing according to programs stored in the ROM 12, the secondary storage device 14, or the like, the functions of the image processing device 1 described later with reference to FIGS. 3 and 8 and the processing of the flowcharts of FIGS. 4, 6, 7, and 9 are realized.

<Functional configuration>
FIG. 3 shows an example of the functional configuration of the image processing device 1. The image processing device 1 includes an acquisition unit 101, a detection unit 102, a face feature extraction unit 103, a registration unit 104, an age information acquisition unit 105, a collation unit 106, an illumination feature extraction unit 107, a discrimination unit 108, a reliability calculation unit 109, a reliability determination unit 110, a generation unit 111, and an output unit 112.
The acquisition unit 101 acquires a two-dimensional image captured by the photographing device 4 and outputs it to the detection unit 102. In the present embodiment, the acquisition unit 101 also captures, via the illumination distribution acquisition device 3, a two-dimensional image for discriminating the illumination distribution, and outputs that image to the illumination feature extraction unit 107.
The detection unit 102 detects an object in the image acquired from the acquisition unit 101. In the present embodiment, the detection unit 102 detects a human face as the object and outputs an image of the detected face region to the face feature extraction unit 103.
The face feature extraction unit 103 extracts a feature amount for face authentication from the face region image acquired from the detection unit 102, and outputs the extracted feature amount to the registration unit 104 or the collation unit 106.

The registration unit 104 stores the feature amount for face authentication acquired from the face feature extraction unit 103 in the secondary storage device 14 as registration data, and outputs it to the collation unit 106. The registration unit 104 associates the feature amount with person information indicating whose face the feature amount corresponds to, and stores the pair in the secondary storage device 14 as registration data. The registration unit 104 acquires the person information based on a user operation on the operation input device 17.
The age information acquisition unit 105 displays a registered person on the video output device 16 and accepts input of that person's age information based on a user operation via the operation input device 17. The age information acquisition unit 105 then stores the accepted age information in the secondary storage device 14 in association with the feature amount and person information stored as registration data by the registration unit 104.

The collation unit 106 performs collation based on the feature amount for face authentication acquired from the face feature extraction unit 103 and the registration data registered by the registration unit 104, and outputs the collation result to the generation unit 111.
The illumination feature extraction unit 107 extracts a feature amount used for discriminating the lighting condition from the image captured by the illumination distribution acquisition device 3 and acquired from the acquisition unit 101, and outputs it to the discrimination unit 108. Here, the illumination feature extraction unit 107 associates the two-dimensional image captured by the photographing device 4 with the feature amount for lighting condition discrimination. More specifically, it stores information on the correspondence between that two-dimensional image and the feature amount for lighting condition discrimination in the RAM 13 or the like.

The discrimination unit 108 discriminates the lighting condition based on the feature amount for lighting condition discrimination acquired from the illumination feature extraction unit 107, and outputs the discriminated lighting condition to the reliability calculation unit 109 and the reliability determination unit 110.
The reliability calculation unit 109 calculates the reliability of the face authentication (class determination) result based on the lighting condition discrimination result acquired from the discrimination unit 108, and outputs the reliability to the reliability determination unit 110.
The reliability determination unit 110 determines whether the reliability of the class determination acquired from the reliability calculation unit 109 satisfies a predetermined condition, and outputs the determination result to the generation unit 111.
Based on the collation result acquired from the collation unit 106, the generation unit 111 acquires the age information associated with the registration data of the person indicated by the collation result. The generation unit 111 then generates output information indicating the age information based on the determination result acquired from the reliability determination unit 110, and outputs it to the output unit 112. The output unit 112 outputs the output information acquired from the generation unit 111 to the analysis processing device 5 via the communication device 15.
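The path from lighting condition to reliability to output can be sketched as follows. This is an illustration only: the function names, the lighting condition labels, the reliability values, and the threshold of 0.5 are all hypothetical, because the passage above states only that the reliability must satisfy a predetermined condition.

```python
from typing import Optional

# Hypothetical mapping from a discriminated lighting condition to the
# reliability of face authentication under that condition (assumed values).
RELIABILITY_BY_LIGHTING = {"frontal": 0.9, "side": 0.6, "backlit": 0.2}

# Assumed form of the "predetermined condition": reliability >= threshold.
RELIABILITY_THRESHOLD = 0.5


def calculate_reliability(lighting_condition: str) -> float:
    """Role of the reliability calculation unit 109 (values are assumptions)."""
    return RELIABILITY_BY_LIGHTING.get(lighting_condition, 0.0)


def is_reliable(reliability: float) -> bool:
    """Role of the reliability determination unit 110."""
    return reliability >= RELIABILITY_THRESHOLD


def generate_output(registered_age: int, lighting_condition: str) -> Optional[int]:
    """Role of the generation unit 111: pass on the registered age only
    when the class determination is judged reliable."""
    if is_reliable(calculate_reliability(lighting_condition)):
        return registered_age
    # What is output in the unreliable case is not specified in this passage.
    return None
```

Returning `None` in the unreliable case is a placeholder choice for the sketch, not something the description prescribes.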

In the present embodiment, the information processing system performs a registration process that registers in advance persons who may use the store, a collation process that collates a person to be collated, and an output process that outputs age information. Each process is described in detail below.
The registration process will be described with reference to FIG. 4.
In S1001, the acquisition unit 101 acquires a two-dimensional image captured by the photographing device 4.
In the present embodiment, the acquisition unit 101 acquires, in S1001, an image captured by the photographing device 4. As another example, the acquisition unit 101 may acquire an image prepared in advance, may acquire an image of a person synthetically generated by an image processing program, or may acquire an image of a person by combining these approaches.

In S1002, the detection unit 102 detects a person's face in the image acquired in S1001. The detection unit 102 detects the face using a known technique, for example the one described in Reference 1 below.
Reference 1: Rapid object detection using a boosted cascade of simple features: P. Viola, M. Jones: 2001
The detection unit 102 obtains the face detection result as the coordinates of a rectangular area representing the face region in the image. The detection unit 102 applies normalization to the detected face region so that the face size and the face tilt become constant. As another example, the detection unit 102 may extract feature points such as the eyes and nose and output partial regions of the face. In that case, the detection unit 102 extracts the facial feature points using a known technique, for example the one described in Reference 2.
Reference 2: Japanese Patent Application Laid-Open No. 2009-211177

In S1003, the face feature extraction unit 103 extracts a feature amount for face authentication from the image of the person's face detected in S1002. The face feature extraction unit 103 extracts a known feature amount as the feature amount of the face image. Known feature amounts include, for example, LBP (Local Binary Pattern) features, HOG (Histogram of Oriented Gradients) features, and SIFT (Scale-Invariant Feature Transform) features.
As another example, the face feature extraction unit 103 may extract a feature amount that combines two or more known feature amounts such as LBP, HOG, and SIFT features. The face feature extraction unit 103 may also extract the feature amount with a neural network, and may reduce the dimensionality of the extracted feature amount using a method such as PCA (Principal Component Analysis).
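As one concrete illustration of a known feature amount, the sketch below computes a basic 3x3 LBP histogram from a grayscale face image. It is a minimal, assumed variant (8 neighbors, whole-image histogram, no cell division or uniform-pattern mapping); the description does not say which LBP variant is used, and the function name is hypothetical.

```python
from typing import List


def lbp_histogram(gray: List[List[int]]) -> List[float]:
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes.

    `gray` is a grayscale image given as rows of pixel intensities.
    Border pixels are skipped because they lack a full neighborhood.
    """
    # Neighbor offsets (dy, dx), clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(gray), len(gray[0])
    hist = [0] * 256
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            center = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if gray[y + dy][x + dx] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    # Normalize so that faces of different sizes give comparable vectors.
    return [count / total for count in hist] if total else [0.0] * 256
```

The resulting 256-dimensional vector is the kind of vector-form feature amount that could then be compared between a registered face and a detected face.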

In S1004, the registration unit 104 accepts input of information on the person detected in S1002, based on a user operation via the operation input device 17. The registration unit 104 then registers the pair consisting of the feature amount extracted in S1003 and the accepted information on the person corresponding to that feature amount in the secondary storage device 14 as registration data.
In S1005, the age information acquisition unit 105 registers the age information input by the user via the operation input device 17 in the secondary storage device 14 in association with the registration data registered in S1004. In the present embodiment, the age information acquisition unit 105 accepts date of birth as the age information. As another example, it may accept the actual age, a designated age, or the like as the age information.
FIG. 5 shows an example of a management table that manages the registration data and age information registered in S1004 and S1005. The management table is stored in the secondary storage device and includes the items person information, feature, and date of birth. The person information item stores information indicating who the person is. The feature item stores the feature amount extracted from the corresponding person's face. The date of birth item stores the age information registered in S1005.
This concludes the details of the registration process.
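One row of the management table can be modeled as a small record type, sketched below. The class and field names are hypothetical, and deriving an age in full years from the stored date of birth is an assumption about how the registered age information might later be used; the description only states that the date of birth is stored.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class RegistrationRecord:
    """One row of the management table (field names are illustrative)."""
    person_info: str       # indicates who the person is
    feature: List[float]   # feature amount extracted from the person's face
    date_of_birth: date    # age information registered in S1005


def age_on(record: RegistrationRecord, today: date) -> int:
    """Return the person's age in full years on the given day."""
    had_birthday = (today.month, today.day) >= (
        record.date_of_birth.month, record.date_of_birth.day)
    return today.year - record.date_of_birth.year - (0 if had_birthday else 1)
```

Storing the date of birth rather than a fixed age means the derived age stays correct as time passes after registration.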

The collation process performed by the information processing system will be described in detail with reference to FIG. 6.
In S1101, the acquisition unit 101 acquires an image captured by the photographing device 4. The image acquired in S1101 is an example of an input image.
In S1102, the detection unit 102 detects a person's face in the image acquired in S1101. In the present embodiment, if a person has come in front of the product shelf shown in FIG. 1, that person's face is detected by the detection unit 102 in S1102 and collated in later processing. In the present embodiment, the detection unit 102 detects faces not in the entire image but within a predetermined range of the image. This prevents a passerby who merely walks past the product shelf from being mistakenly treated as a person looking at the shelf and made subject to age authentication.
In S1103, the detection unit 102 determines whether one or more faces were detected in S1102. If one or more faces were detected in S1102, the detection unit 102 advances the process to S1104; if no face was detected, it returns the process to S1101.
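The range restriction in S1102 and the branch in S1103 can be sketched as follows. The sketch assumes that faces arrive as rectangles and that a face counts as inside the predetermined range when its center lies in that range; both the rectangle convention and the center test are assumptions, and the function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def faces_in_region(faces: List[Rect], region: Rect) -> List[Rect]:
    """Keep only the face rectangles whose center lies inside the region."""
    left, top, right, bottom = region
    kept = []
    for face in faces:
        cx = (face[0] + face[2]) / 2
        cy = (face[1] + face[3]) / 2
        if left <= cx <= right and top <= cy <= bottom:
            kept.append(face)
    return kept


def next_step(faces: List[Rect], region: Rect) -> str:
    """S1103: continue to S1104 if any face remains, otherwise return to S1101."""
    return "S1104" if faces_in_region(faces, region) else "S1101"
```

Filtering after detection is used here only for simplicity; restricting the detector's search window to the range would have the same effect on which faces reach S1104.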

In S1104, the face feature extraction unit 103 extracts a feature amount from the image of each face detected in S1102, using the same method as described for S1003. When a plurality of faces are detected in S1102, the face feature extraction unit 103 extracts a feature amount for each of them. In that case, the information processing system executes the processing of S1105 on the feature amounts in order, starting with the face closest to the product shelf.
As another example, the face feature extraction unit 103 may set in advance a coordinate range in the image that is to be processed preferentially, and may select the faces from which to extract features based on the face coordinates and the set coordinate range. For example, when the coordinate range is set to the area where a person looking at the product shelf can be present, the information processing system can process the persons who are showing more interest in the product shelf. Likewise, when the coordinate range is set to the area where a person looking at a particular product can be present, the information processing system can determine which product the person is interested in and can appropriately analyze the purchasing age group of that product.
The photographing device 4 may also be installed with its angle of view adjusted so that the product shelf of greatest interest and the related product shelves are captured. The information processing system may extract features from all faces detected in the image captured by the photographing device 4. This allows the information processing system to analyze the purchasing age groups of the product shelf and the related product shelves together. By additionally tracking faces, the information processing system can also include movement routes at the time of purchase in the analysis.
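The coordinate-range selection described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_faces`, the `(x, y, w, h)` box layout, and the rule that a face counts as inside the range when its center falls inside it are all hypothetical and are not specified in the description.

```python
def select_faces(face_boxes, region):
    """Keep the faces whose center lies inside a preset coordinate range.

    face_boxes: list of (x, y, w, h) boxes in image coordinates (assumed layout).
    region: (x_min, y_min, x_max, y_max) range set in advance (assumed layout).
    """
    x_min, y_min, x_max, y_max = region
    selected = []
    for (x, y, w, h) in face_boxes:
        # Use the box center as the face coordinate (an assumption).
        cx, cy = x + w / 2, y + h / 2
        if x_min <= cx <= x_max and y_min <= cy <= y_max:
            selected.append((x, y, w, h))
    return selected


if __name__ == "__main__":
    faces = [(10, 10, 20, 20), (200, 200, 20, 20)]
    print(select_faces(faces, (0, 0, 100, 100)))
```

Only the selected faces would then be passed on to feature extraction and collation.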

In S1105, the collation unit 106 collates the feature amount of each person registered as registration data with the feature amounts extracted in S1104. In the present embodiment, the feature amount extracted from the image of the face region is assumed to be in vector form. The collation unit 106 calculates the collation degree (similarity) between feature amounts (vectors) using a known method. For example, the collation unit 106 obtains the cosine similarity as the collation degree. Denoting the person-to-person collation degree by S, the collation unit 106 obtains the cosine similarity between the feature amounts A and B to be compared using Equation 1 below.
S = (A · B) / (|A| |B|) ... (Equation 1)
Here, A ∈ R^D and B ∈ R^D, where D is the number of dimensions of the feature amounts A and B.
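As a minimal sketch of Equation 1, the cosine similarity of two D-dimensional feature vectors can be computed as the dot product divided by the product of the vector norms. The function name `cosine_similarity` and the use of plain Python lists are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return S = (A . B) / (|A| |B|) for two vectors of the same dimension D."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension D")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```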

The collation unit 106 takes one of the feature amounts extracted in S1104 as A and, taking each of the feature amounts registered as registration data as B in turn, obtains the collation degree S. The collation unit 106 then treats A as matching the B that gives the largest S. That is, the collation unit 106 authenticates the person corresponding to A (the person whose face was detected in S1102) as the person corresponding to the matched B. The collation unit 106 thereby identifies the person whose face was detected in S1102 as the person corresponding to B. In other words, the collation unit 106 identifies the class of the person whose face was detected in S1102 as the class of the person corresponding to B.
As another example, the collation unit 106 may use an index other than the cosine similarity as the collation degree between feature amounts. For example, the collation unit 106 may use the normalized cross-correlation between feature amounts, or may obtain an index of the degree of similarity between feature amounts using a metric learning technique in which the distance function is obtained by learning.
The collation unit 106 may also normalize the obtained collation degree so that it falls within a predetermined range of values.
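The identification step, in which the registered feature amount with the largest collation degree is taken as the match, might look like the sketch below. The dictionary-based registration data and the names `identify` and `cosine_similarity` are illustrative assumptions, not structures defined in the description.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(query, registered):
    """Return (person_id, S) for the registered feature with the largest S.

    registered: dict mapping a person identifier to a feature vector
    (a hypothetical stand-in for the registration data).
    """
    best_id, best_s = None, -math.inf
    for person_id, feature in registered.items():
        s = cosine_similarity(query, feature)
        if s > best_s:
            best_id, best_s = person_id, s
    return best_id, best_s


if __name__ == "__main__":
    registered = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
    print(identify([0.9, 0.1], registered))
```

Note that this always returns the best-scoring registrant, which mirrors the description: whether the match is trustworthy is judged separately through the reliability.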

In S1106, the collation unit 106 determines whether the processing of S1105 has been completed for all the facial feature amounts extracted in S1104. If it has, the collation unit 106 advances the process to S1107; if any feature amount remains unprocessed, it returns the process to S1105.
In S1107, the collation unit 106 outputs the result of the processing of S1105 (information indicating which person each face detected in S1102 belongs to) to the generation unit 111.
This concludes the collation processing flow.

The details of the output processing executed by the information processing system will be described with reference to FIG. 7.
In S1201, the acquisition unit 101 acquires the image that the illumination distribution acquisition device 3 captured when the image acquired in S1101 was captured.
In S1202, the illumination feature extraction unit 107 extracts a feature amount for determining the lighting condition from the image acquired in S1201. In the present embodiment, the illumination feature extraction unit 107 extracts luminance as this feature amount.

In S1203, the discrimination unit 108 determines, based on the feature amount extracted in S1202, the lighting condition of the environment in which the image acquired in S1101 was captured. In the present embodiment, the discrimination unit 108 determines the lighting condition of the space in which a person's face can be present. The lighting condition indicates how illumination (light) was cast in the shooting environment. In the present embodiment, the lighting condition includes the direction of the illumination (front light, oblique light, or backlight) and the average illuminance of the illumination. Front light is the state in which the subject person is illuminated from the front (from the photographing device 4 side). Oblique light is the state in which the subject person is illuminated from the side. Backlight is the state in which the subject person is illuminated from behind.
The discrimination unit 108 determines the lighting condition using a known technique, such as the one in Reference 3 below.
Reference 3: Japanese Patent No. 5726792

In Reference 3, an illuminance estimation model is generated that defines the relationship between the luminance acquired from a captured image and the illuminance estimated from that luminance. In the present embodiment, the discrimination unit 108 generates such an illuminance estimation model and, when determining the lighting condition, uses an illuminance estimator based on the generated model to determine, from the illuminance distribution of the shooting environment, whether the direction of the illumination is front light, backlight, or oblique light.
The discrimination unit 108 converts the two-dimensional illuminance distribution in the rectangular image acquired in S1201 into a one-dimensional illuminance vector. A lighting condition discriminator that determines the lighting condition from this illuminance vector is prepared in advance, and the discrimination unit 108 uses it to determine the lighting condition. The discrimination unit 108 also obtains the average illuminance from this illuminance vector.
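The conversion from a two-dimensional illuminance distribution to a one-dimensional illuminance vector, and the averaging step, can be illustrated as below. The row-by-row flattening order and the function names are assumptions; the description does not state how the vector is ordered, and the lighting condition discriminator itself is not reproduced here.

```python
def to_illuminance_vector(illuminance_2d):
    """Flatten a 2D grid of illuminance values (lux) row by row (assumed order)."""
    return [value for row in illuminance_2d for value in row]


def average_illuminance(vector):
    """Return the mean illuminance of a non-empty illuminance vector."""
    if not vector:
        raise ValueError("illuminance vector must not be empty")
    return sum(vector) / len(vector)


if __name__ == "__main__":
    grid = [[100, 200], [300, 400]]
    vec = to_illuminance_vector(grid)
    print(vec, average_illuminance(vec))
```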

In S1204, the reliability calculation unit 109 obtains the reliability of the face authentication result (class identification result) of S1105 based on the lighting condition determined in S1203, and outputs the obtained reliability to the reliability determination unit 110.
More specifically, the reliability calculation unit 109 sets the face authentication reliability to 100% if the direction of illumination indicated by the determined lighting condition is front light, 80% if it is oblique light, and 50% if it is backlight. The reliability calculation unit 109 then lowers the face authentication reliability by 20 percentage points if the average illuminance indicated by the lighting condition is outside the range of 100 lux to 1500 lux. For example, when the lighting condition is determined to be front light and the average illuminance is 67 lux, the reliability calculation unit 109 sets the face authentication reliability to 100% - 20% = 80%.
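The reliability rule of S1204 can be written out as a short sketch. The string keys for the lighting directions, the function name `face_auth_reliability`, and the treatment of exactly 100 lux and 1500 lux as inside the range are assumptions; the percentages and the 20-point reduction come from the description.

```python
# Base reliability (percent) per lighting direction, as given for S1204.
BASE_RELIABILITY = {"front": 100, "oblique": 80, "back": 50}


def face_auth_reliability(direction, average_lux):
    """Return the face authentication reliability in percent."""
    reliability = BASE_RELIABILITY[direction]
    # Outside 100 to 1500 lux the reliability is lowered by 20 points.
    # Treating the endpoints as inside the range is an assumption.
    if not (100 <= average_lux <= 1500):
        reliability -= 20
    return reliability


if __name__ == "__main__":
    print(face_auth_reliability("front", 67))    # 100 - 20
    print(face_auth_reliability("oblique", 50))  # 80 - 20
```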

In S1205, the reliability determination unit 110 determines whether the face authentication reliability obtained in S1204 satisfies a predetermined condition. In the present embodiment, the condition is that the reliability is equal to or higher than a predetermined threshold, which is set to 70%.
If the face authentication reliability obtained in S1204 is equal to or higher than the threshold (70%), the reliability determination unit 110 advances the process to S1206; if it is below the threshold (70%), the processing of FIG. 7 ends.
For example, when the direction of illumination indicated by the determined lighting condition is oblique light and the average illuminance indicated by the lighting condition is 50 lux, the face authentication reliability is 60%. The reliability determination unit 110 therefore judges that this reliability is below the threshold (70%) and ends the processing of FIG. 7.
As another example, the reliability determination unit 110 may determine whether to output the output information indicating the age information of the authenticated face based on the mean and standard deviation, over a fixed period, of the face authentication reliability for the face tracked by the tracking processing performed on the face to be authenticated.
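The S1205 decision is a simple threshold comparison, sketched below together with one possible reading of the tracking-based alternative. The description does not say how the mean and standard deviation are combined, so the rule "mean minus one standard deviation must reach the threshold" in `should_output_age_tracked` is purely an assumption for illustration, as are the function names.

```python
from statistics import mean, pstdev

THRESHOLD = 70.0  # percent, as set in the embodiment


def should_output_age(reliability, threshold=THRESHOLD):
    """S1205: proceed to S1206 only when reliability >= threshold."""
    return reliability >= threshold


def should_output_age_tracked(reliabilities, threshold=THRESHOLD):
    """Hypothetical variant using reliabilities collected while tracking a face.

    Assumed rule (not from the description): require that the mean minus
    one population standard deviation still reaches the threshold.
    """
    if not reliabilities:
        return False
    return mean(reliabilities) - pstdev(reliabilities) >= threshold


if __name__ == "__main__":
    print(should_output_age(80), should_output_age(60))
    print(should_output_age_tracked([80, 80, 80]))
    print(should_output_age_tracked([100, 40, 100]))
```

A rule of this kind would suppress output when the reliability fluctuates strongly during tracking, even if its average is acceptable.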

In S1206, the generation unit 111 generates, based on the result of the face authentication in S1105, output information indicating the age information of the person indicated by the authentication result. In the present embodiment, the generation unit 111 acquires, from the registration data registered by the registration unit 104, the registration data of the person indicated by the result of the face authentication in S1105. It then acquires the age information associated with that registration data and generates, as output information, information indicating the person's current age.
In the present embodiment, the age information is a date of birth. The generation unit 111 therefore identifies the shooting time of the image acquired in S1101 from the EXIF information of that image, and determines the person's actual age from the identified shooting time and the date of birth indicated by the age information. The generation unit 111 then generates output information indicating the determined actual age. For example, suppose the result of the face authentication in S1105 is "Mr. A" in the registration database of FIG. 5 and the shooting time of the image acquired in S1101 is "March 1, 2017". From the shooting time "March 1, 2017" and Mr. A's date of birth "January 1, 1980", the generation unit 111 determines Mr. A's age at the time of shooting to be "37 years old" and generates output information indicating "37 years old".
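The age calculation in this example can be reproduced with a few lines. The function name `age_on` is hypothetical, and reading the shooting time from EXIF is omitted; the sketch assumes the shooting date has already been obtained as a `datetime.date`.

```python
from datetime import date


def age_on(birth_date, shot_date):
    """Return the age in full years on shot_date for a given date of birth."""
    years = shot_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the shot year.
    if (shot_date.month, shot_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


if __name__ == "__main__":
    # Mr. A: born January 1, 1980; image shot March 1, 2017.
    print(age_on(date(1980, 1, 1), date(2017, 3, 1)))
```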

In S1207, the output unit 112 outputs the output information generated in S1206 to the analysis processing device 5. In the present embodiment, the output unit 112 outputs only the output information indicating the age generated in S1206. As another example, the output unit 112 may output, together with the output information, the reliability of the face authentication in S1105 as reference information indicating how reliable the output information is. The generation unit 111 may also output information on shooting conditions, such as the lighting condition, together with the output information.
In the present embodiment, the output unit 112 outputs the output information to the analysis processing device 5. However, the output unit 112 may instead output the output information by displaying it on the video output device 16.
When outputting the output information, the output unit 112 performs control so that information indicating who the corresponding person is (the result of the face authentication in S1105) is not output. This reduces the possibility that personal information leaks. For example, the information processing system may be applied to a use case in which the register screen of a cashier at a store selling alcoholic beverages displays whether the purchaser is a minor. In this case, the purchaser's face authentication is performed internally and its result is not displayed on the screen, so personal information is not inadvertently disclosed to the cashier.

As described above, in the present embodiment, the information processing system obtains the face authentication result and the face authentication reliability for the person to be processed and, based on the obtained reliability, generates and outputs output information indicating the person's age attribute. Because the information processing system takes the face authentication reliability into account when generating the output information indicating the age of the person to be authenticated, it can reduce the possibility of outputting inappropriate age information.
In addition, by performing control so that the face authentication result is not output, the information processing system can prevent inadvertent leakage of personal information.

(Modification 1)
In the present embodiment, the acquisition unit 101 acquires, in S1001 and S1101, a two-dimensional image captured by the photographing device 4. As another example, the acquisition unit 101 may acquire a two-dimensional image rendered from three-dimensional data captured by the photographing device 4. The acquisition unit 101 may also acquire, via a network, an image captured by another device.
(Modification 2)
In the present embodiment, the information processing system obtains and outputs age information as the attribute information of a person. However, the information processing system may instead obtain and output information on other attributes, such as gender or race, as the attribute information of the person.

(Modification 3)
In the present embodiment, the information processing system determines the lighting condition of the shooting environment of the photographing device 4 based on an image captured by the illumination distribution acquisition device 3, which is a single camera. As another example, the information processing system may use a plurality of illumination distribution acquisition devices 3 as follows. That is, the information processing system may determine the lighting condition around each of the plurality of illumination distribution acquisition devices 3 based on the image captured by each device, and may determine the lighting condition of the shooting environment of the photographing device 4 based on the plurality of lighting conditions thus determined.
The information processing system may also determine the lighting condition of the shooting environment of the photographing device 4 based not on an image captured by the illumination distribution acquisition device 3 but on signals from a sensing device, such as an illumination sensor, arranged around the photographing device 4. Alternatively, the information processing system may determine a lighting condition from each of the image captured by the illumination distribution acquisition device 3 and the signal of the sensing device arranged around the photographing device 4, and may obtain the final lighting condition based on the plurality of lighting conditions thus determined.

(Modification 4)
In the present embodiment, the reliability calculation unit 109 obtains the reliability of the face authentication in S1105 based on the lighting condition determined in S1204. As another example, the reliability calculation unit 109 may obtain the face authentication reliability based on a shooting condition other than the lighting condition.
For example, the reliability calculation unit 109 may obtain the face authentication reliability based on the posture of the person in the image acquired in S1101. For example, the reliability calculation unit 109 may set the face authentication reliability to 100% when the person in the image acquired in S1101 is standing upright and to 50% when the person is crouching.
The reliability calculation unit 109 may also obtain the face authentication reliability based on the orientation of the person in the image acquired in S1101. For example, the reliability calculation unit 109 may set the face authentication reliability to 100% when the person in the image acquired in S1101 is facing forward and to 50% when the person is facing sideways.
The reliability calculation unit 109 may also obtain the face authentication reliability based on the position of the person in the image acquired in S1101. For example, the reliability calculation unit 109 may set the face authentication reliability to 100% when the position of the person in the image acquired in S1101 is within a predetermined area and to 50% when it is in any other area.

The reliability calculation unit 109 may also obtain the face authentication reliability based on the degree of occlusion of the person in the image acquired in S1101. For example, the reliability calculation unit 109 may use, as the face authentication reliability, the percentage obtained by subtracting from 100% a value corresponding to the degree of occlusion of the person in the image acquired in S1101.
The reliability calculation unit 109 may also obtain the face authentication reliability based on two or more of the following: the posture, orientation, and position of the person in the image acquired in S1101, the degree of occlusion of that person, and the lighting condition determined in S1204.

As another example, the reliability calculation unit 109 may obtain the face authentication reliability based on the stability of the tracking processing performed on the face to be authenticated. The stability of the tracking processing is an index of how stably the tracking is being performed; one example is the proportion of frames, among a plurality of consecutive frames, in which the face of the tracked person could be detected. For example, the reliability calculation unit 109 sets the face authentication reliability to a higher value the more stable the tracking processing performed on the face to be authenticated is.
As another example, the reliability calculation unit 109 may obtain the face authentication reliability based on the collation degree obtained during the face authentication processing in S1105. For example, the reliability calculation unit 109 sets the face authentication reliability to a higher value the higher the collation degree is.
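The tracking-stability index given as an example above, the proportion of consecutive frames in which the tracked face was detected, can be sketched as follows. The description only requires that higher stability yield higher reliability; the linear mapping in `reliability_from_stability` and both function names are assumptions chosen for illustration.

```python
def tracking_stability(detected_flags):
    """Return the fraction of frames in which the tracked face was detected.

    detected_flags: one boolean per consecutive frame (assumed representation).
    """
    if not detected_flags:
        raise ValueError("at least one frame is required")
    return sum(1 for flag in detected_flags if flag) / len(detected_flags)


def reliability_from_stability(stability):
    """Map stability in [0, 1] to a reliability in percent (assumed linear)."""
    return 100.0 * stability


if __name__ == "__main__":
    frames = [True, True, False, True]
    s = tracking_stability(frames)
    print(s, reliability_from_stability(s))
```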

<Embodiment 2>
In the present embodiment, the information processing system changes the information it outputs according to the face authentication reliability.
The system configuration of the information processing system of the present embodiment is the same as in Embodiment 1. The hardware configuration of each element of the information processing system is also the same as in Embodiment 1.
FIG. 8 is a diagram showing an example of the functional configuration of the image processing device 1 of the present embodiment. In the present embodiment, the image processing device 1 includes an attribute feature extraction unit 201 and an attribute recognition unit 202 in addition to the functions shown in FIG. 3.
The attribute feature extraction unit 201 extracts, from the image of a face captured by the photographing device 4, a feature amount relating to an attribute of the person with that face. In the present embodiment, this attribute is age.
The attribute recognition unit 202 recognizes the attribute of the person whose face was detected by the detection unit 102, based on the feature amount extracted by the attribute feature extraction unit 201.

As in Embodiment 1, the information processing system of the present embodiment performs registration processing, collation processing, and output processing. Of these, the registration processing and the collation processing are the same as in Embodiment 1, whereas the output processing differs from that of Embodiment 1.
The details of the output processing of the information processing system of the present embodiment will be described with reference to FIG. 9. Of the processes of FIG. 9, S1201 to S1204 and S1206 are the same as in FIG. 7.
In S2201, the reliability determination unit 110 determines whether the face authentication reliability obtained in S1204 is equal to or higher than a predetermined first threshold. If the reliability determination unit 110 determines that the reliability is equal to or higher than the first threshold, it advances the process to S1206; if it determines that the reliability is below the first threshold, it advances the process to S2202. In the present embodiment, the first threshold is 85% (0.85).

In S2202, the reliability determination unit 110 determines whether the face authentication reliability obtained in S1204 is equal to or higher than a predetermined second threshold, which is smaller than the first threshold. If the reliability determination unit 110 determines that the reliability is equal to or higher than the second threshold, it advances the process to S2203; if it determines that the reliability is below the second threshold, it advances the process to S2204. In the present embodiment, the second threshold is 60% (0.60).
For example, when the direction of illumination indicated by the lighting condition obtained in S1203 is front light and the average illuminance is 300 lux, the face authentication reliability obtained in S1204 is 80%. The reliability determination unit 110 therefore passes through S2201 and S2202 and advances the process to S2203.
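The two-threshold branching of S2201 and S2202 can be summarized in a short sketch. The function name `next_step` and the use of step labels as return values are illustrative assumptions; the thresholds of 85% and 60% are those of the embodiment.

```python
FIRST_THRESHOLD = 85   # percent, checked in S2201
SECOND_THRESHOLD = 60  # percent, checked in S2202


def next_step(reliability):
    """Return the label of the step that follows S2201/S2202."""
    if reliability >= FIRST_THRESHOLD:
        return "S1206"  # output the age from face authentication as is
    if reliability >= SECOND_THRESHOLD:
        return "S2203"  # output the age together with an error range
    return "S2204"      # fall back to attribute (age group) recognition


if __name__ == "__main__":
    for r in (90, 80, 50):
        print(r, next_step(r))
```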

In S2203, the generation unit 111 obtains the age of the person authenticated in S1105 by the same processing as in S1206. The generation unit 111 also obtains the error of the obtained age as follows. The generation unit 111 identifies the persons corresponding to a predetermined number, here three, of the collation degrees obtained by the collation unit 106 in S1105, taken in descending order from the largest. The generation unit 111 acquires the date of birth associated with the registration data of each identified person and, based on the acquired dates of birth, obtains the ages of these three persons. The generation unit 111 then obtains the range between the minimum and the maximum of the three ages as the age error information.
For example, if the ages of the persons corresponding to the top three collation degrees are, in descending order of collation degree, "28", "22", and "32", the age error is "22 to 32 years old".
The generation unit 111 then generates, as output information, age information that includes the information indicating the obtained age and the information indicating the age error.
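A minimal sketch of the error-range calculation in S2203 is given below. It assumes, for illustration only, that the candidates are available as `(collation_degree, age)` pairs; the function name `age_with_error` is hypothetical.

```python
def age_with_error(candidates, top_n=3):
    """Return (age of best match, (min age, max age)) over the top_n matches.

    candidates: list of (collation_degree, age) pairs (assumed representation).
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)[:top_n]
    ages = [age for _, age in ranked]
    return ages[0], (min(ages), max(ages))


if __name__ == "__main__":
    # Top three collation degrees correspond to ages 28, 22 and 32.
    candidates = [(0.9, 28), (0.8, 22), (0.7, 32), (0.1, 60)]
    print(age_with_error(candidates))
```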

In S2204, the attribute feature extraction unit 201 extracts a feature amount for attribute recognition from the image acquired in S1101. In the present embodiment, the attribute feature extraction unit 201 extracts a luminance gradient histogram as this feature amount, using the technique of Reference 4 below.
Reference 4: A Pedestrian Detector Using Histograms of Oriented Gradients and a Support Vector Machine Classifier, M. Bertozzi, A. Broggi, M. Del Rose, M. Felisa, A. Rakotomamonjy and F. Suard, IEEE Intelligent Transportation Systems Conference, 2007

In S2205, the attribute recognition unit 202 recognizes the age attribute of the person in the image acquired in S1101, based on the feature amount for attribute recognition extracted in S2204. In the present embodiment, the attribute recognition unit 202 performs age-group recognition: it inputs the luminance gradient histogram extracted in S2204 into age-group estimators prepared in advance, one per age group, and obtains for each age group a likelihood indicating how probable it is that the target person belongs to that age group. The attribute recognition unit 202 obtains these likelihoods using the Support Vector Machine (hereinafter, SVM) disclosed in Reference 4. The attribute recognition unit 202 determines the age group with the highest likelihood as the age information of the person in the image acquired in S1101.
For example, an estimator that obtains the likelihood that the target person is in their teens is trained by assigning the learning label +1 to feature amounts of images of persons in their teens and the learning label -1 to feature amounts of images of persons in all other age groups.
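The labeling scheme and the final selection step of S2205 can be sketched as below. This is a minimal sketch under stated assumptions: the helper names and the age-group strings are hypothetical, the likelihood values are made up for illustration, and the SVM training itself is omitted (only the +1 / -1 label construction and the highest-likelihood selection are shown).

```python
def one_vs_rest_labels(sample_groups, target_group):
    """Learning labels for the estimator of `target_group`:
    +1 for samples of that age group, -1 for all other age groups."""
    return [1 if group == target_group else -1 for group in sample_groups]


def recognize_age_group(likelihoods):
    """Pick the age group whose estimator returned the highest likelihood."""
    return max(likelihoods, key=likelihoods.get)


# Age group of each training image (hypothetical data).
sample_groups = ["10s", "30s", "10s", "20s"]
labels_for_teens = one_vs_rest_labels(sample_groups, "10s")
print(labels_for_teens)  # [1, -1, 1, -1]

# Per-age-group likelihoods as they might be returned by the estimators.
likelihoods = {"10s": 0.1, "20s": 0.3, "30s": 0.8, "40s": 0.6}
print(recognize_age_group(likelihoods))  # 30s
```

One estimator is trained per age group with its own label vector, so the same training images are relabeled once for each group.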

Here, the attribute recognition unit 202 may obtain an error range or a reliability of the attribute recognition result together with the attribute recognition result. For example, the attribute recognition unit 202 may obtain the difference between the likelihoods of the two age groups with the highest likelihoods and, when the difference is equal to or greater than a predetermined threshold, output the second-ranked age group as the error range information. Suppose, for example, that the likelihood for each age group ranges from a minimum of 0 to a maximum of 1 and that the difference threshold is 0.15. If the likelihood for the 30s is the highest at 0.8 and the likelihood for the 40s is the next highest at 0.6, the difference is 0.2, which exceeds the threshold. In this case, the attribute recognition unit 202 may obtain "40s" as the error range information of the age information.
In the present embodiment, the attribute recognition unit 202 performs age-group recognition in consideration of the possibility that the accuracy of age recognition is degraded when the face authentication reliability is low. As another example, however, it may perform recognition of the age itself.
The information processing system may also obtain age information based on the collation result in S1105 and use the obtained age information in the attribute recognition process in S2205 or in correcting the attribute recognition result.
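The error-range rule described above (difference between the two highest likelihoods compared against a threshold) can be sketched as follows. The function name and dictionary layout are assumptions for illustration. The comparison direction follows the text as written: the second-ranked age group is reported when the difference is equal to or greater than the threshold.

```python
def age_group_with_error(likelihoods, diff_threshold=0.15):
    """Return (best age group, error-range age group or None).

    likelihoods: dict mapping age group -> likelihood in [0, 1],
    containing at least two age groups.
    """
    ranked = sorted(likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_l), (second, second_l) = ranked[0], ranked[1]
    diff = best_l - second_l
    # Rule as stated in the text: report the runner-up when diff >= threshold.
    error_range = second if diff >= diff_threshold else None
    return best, error_range


likelihoods = {"20s": 0.3, "30s": 0.8, "40s": 0.6}
print(age_group_with_error(likelihoods))  # ('30s', '40s')
```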

In S2206, the generation unit 111 generates output information indicating the age determined in S2205. The generation unit 111 may also include the age error information obtained in S2205 in the output information.
In S2207, the output unit 112 outputs the output information generated in any one of S1206, S2203, and S2206 to the analysis processing device 5. As another example, the output unit 112 may output the output information by displaying it on the video output device 16.
When outputting the output information, the output unit 112 performs control so that information indicating who the person corresponding to the output information is (the information on the face authentication result in S1105) is not output.
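The output control in S2207, which withholds the identity obtained by face authentication, can be sketched as a simple field filter. The record layout and the field names (`person_id`, `name`, `age`, `error_range`) are hypothetical and are not taken from the embodiment.

```python
# Fields that identify the person (hypothetical names); never sent downstream.
IDENTITY_FIELDS = {"person_id", "name"}


def strip_identity(record):
    """Return a copy of the record without any identity fields."""
    return {key: value for key, value in record.items()
            if key not in IDENTITY_FIELDS}


record = {
    "person_id": "registered-0042",  # face authentication result (S1105)
    "name": "example person",
    "age": 28,
    "error_range": (22, 32),
}
print(strip_identity(record))  # {'age': 28, 'error_range': (22, 32)}
```

Filtering at the output stage keeps the identity available internally for looking up attribute information while ensuring that only the attribute information leaves the device.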

As another example, the output unit 112 may output the age together with average error information obtained by averaging the differences between the age and the upper and lower limits of the age error. In that case, the generation unit 111 obtains this average error in any one of S1206, S2203, and S2206 using Equation 2 below and includes it in the output information.
$$\mathrm{Err}_{\mathrm{Ave}} = \frac{(\mathrm{Err}_{\max} - \mathrm{Age}) + (\mathrm{Age} - \mathrm{Err}_{\min})}{2} \qquad \text{(Equation 2)}$$
$\mathrm{Err}_{\mathrm{Ave}}$ is the average error to be obtained. $\mathrm{Age}$ is the age information acquired in any one of S1206, S2203, and S2206. $\mathrm{Err}_{\max}$ is the upper limit of the age error indicated by the age error information, and $\mathrm{Err}_{\min}$ is the lower limit.
For example, when the age error is 22 to 32 years old, $\mathrm{Err}_{\max}$ is 32 and $\mathrm{Err}_{\min}$ is 22. If the age information indicates 28 years old, Equation 2 gives an average age error of {(32 - 28) + (28 - 22)} / 2 = 5. The generation unit 111 generates information indicating "28 years old (error range: ±5)" as output information.
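The average-error calculation of Equation 2 can be checked with a short sketch. The function names and the output string format are assumptions for illustration; the numbers are those of the example above.

```python
def average_error(age, err_min, err_max):
    """Equation 2: mean of the distances from the age to the upper and lower limits."""
    return ((err_max - age) + (age - err_min)) / 2


def format_age_output(age, err_min, err_max):
    """Build a display string such as '28 years old (error range: ±5)'."""
    err = average_error(age, err_min, err_max)
    return f"{age} years old (error range: ±{err:g})"


print(average_error(28, 22, 32))      # 5.0
print(format_age_output(28, 22, 32))  # 28 years old (error range: ±5)
```

Because the two Age terms cancel, the result equals half the width of the error range, (Err_max - Err_min) / 2, regardless of where the age falls within that range.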

As described above, in the present embodiment, the information processing system adjusts the information to be output according to the face authentication reliability. This allows the information processing system to output more appropriate information.
In the present embodiment, when the information processing system determines in S2201 that the face authentication reliability obtained in S1204 is less than a predetermined first threshold, it advances the processing to S2202. As another example, however, when the information processing system determines that the face authentication reliability obtained in S1204 is less than the predetermined first threshold, it may advance the processing to S2204.
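The branching on face authentication reliability can be sketched as below. This is a sketch under explicit assumptions, not the embodiment's flow: it assumes that a reliability at or above the first threshold leads to S1206, and that S2202 compares the reliability against a second, lower threshold to choose between S2203 and S2204. The second threshold, the function name, and the `skip_s2202` flag (modeling the alternative of proceeding directly to S2204) are introduced only for this example.

```python
def select_output_path(reliability, first_threshold, second_threshold,
                       skip_s2202=False):
    """Return the label of the step that generates the output information.

    Assumed flow (see lead-in): 'S1206' = age from registration data,
    'S2203' = age with an error range, 'S2204' = attribute recognition path.
    """
    if reliability >= first_threshold:
        return "S1206"
    if skip_s2202:
        # Alternative described in the text: go straight to S2204.
        return "S2204"
    if reliability >= second_threshold:  # assumed role of S2202
        return "S2203"
    return "S2204"


print(select_output_path(0.9, 0.8, 0.5))                   # S1206
print(select_output_path(0.6, 0.8, 0.5))                   # S2203
print(select_output_path(0.6, 0.8, 0.5, skip_s2202=True))  # S2204
print(select_output_path(0.3, 0.8, 0.5))                   # S2204
```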

<Other Embodiments>
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

For example, part or all of the functional configuration of the above-described information processing system may be implemented as hardware in the image processing device 1. Although examples of embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments. For example, the above-described embodiments may be combined in any manner.

1 Image processing device
4 Imaging device
11 CPU

Claims

An information processing device comprising:
an extraction means for extracting a feature amount of an object included in an input image;
a specifying means for specifying a class of the object based on the feature amount extracted by the extraction means;
a determination means for determining the reliability of the specifying result obtained by the specifying means; and
an output means for outputting predetermined attribute information regarding the class of the object specified by the specifying means when the reliability determined by the determination means satisfies a predetermined condition.

The information processing device according to claim 1, wherein the determination means determines the reliability of the specifying result obtained by the specifying means based on the shooting conditions of the input image.

The information processing device according to claim 2, wherein the shooting conditions include at least one of a posture, an orientation, a position, a degree of occlusion, and lighting conditions in the shooting environment of the input image.

The information processing device according to claim 1, wherein the determination means determines the reliability of the specifying result obtained by the specifying means based on the stability of a tracking process for the object.

The information processing device according to claim 1, wherein the determination means determines the reliability of the specifying result obtained by the specifying means based on the degree of collation between the feature amount corresponding to the class specified by the specifying means and the feature amount extracted by the extraction means.

The information processing device according to any one of claims 1 to 5, wherein, when the reliability determined by the determination means does not satisfy the condition, the output means outputs the attribute information regarding the class of the object specified by the specifying means and information indicating an error of the attribute information.

The information processing device according to any one of claims 1 to 5, further comprising a generation means for generating, when the reliability determined by the determination means does not satisfy the condition, the attribute information regarding the class of the object specified by the specifying means based on a feature amount extracted from the input image,
wherein the output means outputs the attribute information generated by the generation means when the reliability determined by the determination means does not satisfy the condition.

The information processing device according to any one of claims 1 to 5, further comprising a generation means for generating, when the reliability determined by the determination means does not satisfy the condition, the attribute information regarding the class of the object specified by the specifying means based on the attribute information of each class corresponding to a predetermined number of collation degrees, counted from the highest, among the collation degrees between the feature amounts corresponding to a plurality of predetermined classes and the feature amount extracted by the extraction means,
wherein the output means outputs the attribute information generated by the generation means when the reliability determined by the determination means does not satisfy the condition.

The information processing device according to any one of claims 1 to 8, wherein the attribute information includes at least one of age, gender, and race.

The information processing device according to any one of claims 1 to 9, wherein the object is a face of a person.

The information processing device according to any one of claims 1 to 10, further comprising a control means for controlling the output means so as not to output information indicating the class of the object specified by the specifying means.

A system comprising:
an imaging device that captures the input image; and
the information processing device according to any one of claims 1 to 11.

An information processing method executed by an information processing device, the method comprising:
an extraction step of extracting a feature amount of an object included in an input image;
a specifying step of specifying a class of the object based on the feature amount extracted in the extraction step;
a determination step of determining the reliability of the specifying result obtained in the specifying step; and
an output step of outputting predetermined attribute information regarding the class of the object specified in the specifying step when the reliability determined in the determination step satisfies a predetermined condition.

A program for causing a computer to function as each means of the information processing device according to any one of claims 1 to 11.