WO2012132418A1 - Characteristic estimation device - Google Patents

Characteristic estimation device

Info

Publication number
WO2012132418A1
WO2012132418A1 (PCT/JP2012/002128)
Authority
WO
WIPO (PCT)
Prior art keywords
estimation
unit
attribute
attribute estimation
learning
Prior art date
Application number
PCT/JP2012/002128
Other languages
French (fr)
Japanese (ja)
Inventor
純 西村
宏明 由雄
Original Assignee
Panasonic Corporation (パナソニック株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Corporation (パナソニック株式会社)
Publication of WO2012132418A1 publication Critical patent/WO2012132418A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/50: Maintenance of biometric data or enrolment thereof

Definitions

  • The present invention relates to an attribute estimation apparatus that estimates age, sex, and the like from a face image.
  • As a system for identifying attributes of a subject included in an image, there is, for example, the one described in Patent Document 1.
  • The system described in Patent Document 1 identifies a person's attributes (age, gender, etc.) based on a face image: a computer constituting an offline training system generates an attribute identification dictionary, and a computer forming an online operation system uses that dictionary to determine the attributes of an unknown person from his or her face image.
  • The computer constituting the offline training system generates the attribute identification dictionary, which identifies a person's attributes from a face image, using learning sample data in which a plurality of sample images, each containing the face of a person whose attributes are known, are associated with those persons' attributes.
  • the present invention has been made in view of such circumstances, and an object thereof is to provide an attribute estimation device that can be relearned with a small number of on-site samples.
  • The attribute estimation apparatus of the present invention comprises: an image input unit that inputs a face image; an estimation model holding unit that holds an estimation model for performing attribute estimation; an attribute estimation unit that estimates attributes of the face image input by the image input unit, using the estimation model held in the estimation model holding unit; an estimation result accumulation unit that accumulates the face image in association with the attribute estimation result produced by the attribute estimation unit; a sample extraction unit that extracts sample data for each group of attribute estimation results produced by the attribute estimation unit; and a relearning unit that updates the estimation model using data obtained by adding correct answers to the sample data extracted by the sample extraction unit.
  • With this configuration, since sample data is obtained for each group of attribute estimation results, on-site sample data can be extracted evenly and relearning is possible with a small amount of on-site sample data.
  • A feature amount distribution calculation unit computes the feature amount distribution of the face images for each group of attribute estimation results produced by the attribute estimation unit, and the sample extraction unit extracts sample data based on the feature amount distribution determined for each group by the feature amount distribution calculation unit.
  • on-site sample data can be extracted evenly.
  • the feature amount distribution calculation unit clusters face image feature amounts for each group of the attribute estimation results, and the sample extraction unit extracts sample data based on a position in the cluster.
  • on-site sample data can be extracted evenly.
  • the sample extraction unit extracts data having a certain distance from the center of the cluster as sample data.
  • on-site sample data can be extracted evenly.
  • the sample data to which the correct answer is given at the time of re-learning in the re-learning unit is weighted.
  • the on-site sample data can be balanced with the initial learning sample data, and even a small amount of on-site sample data can be effectively relearned.
  • the sample data to which the correct answer is given at the time of re-learning by the re-learning unit is weighted according to the position in the cluster.
  • the on-site sample data can be balanced with the initial learning sample data, and even a small amount of on-site sample data can be effectively relearned.
  • A relearning start unit is provided that starts relearning by the relearning unit when a set condition is satisfied.
  • the estimated model at the time of shipment can be accurately applied to the site.
  • the attribute is age.
  • The re-learning sample extraction device of the present invention comprises a sample extraction unit that, for data in which face images are associated with their attribute estimation results, extracts sample data for relearning an estimation model for each group of attribute estimation results.
  • The attribute estimation method of the present invention comprises: an image input step of inputting a face image; an estimation model holding step of holding an estimation model for performing attribute estimation; an attribute estimation step of estimating attributes of the face image input in the image input step, using the estimation model held in the estimation model holding step; an estimation result accumulation step of accumulating the face image in association with the attribute estimation result produced by the attribute estimation step; a sample extraction step of extracting sample data for each group of attribute estimation results produced by the attribute estimation step; and a relearning step of updating the estimation model using data obtained by adding correct answers to the sample data extracted in the sample extraction step.
  • According to the above method, sample data is obtained for each group of attribute estimation results, so on-site sample data can be extracted evenly and relearning is possible with a small amount of on-site sample data.
  • According to the present invention, sample data is obtained for each group of attribute estimation results, so on-site sample data can be extracted evenly and relearning is possible with a small amount of on-site sample data.
  • FIG. 1 is a block diagram showing the schematic configuration of an attribute estimation apparatus according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing the methods and effects of the fixed-cluster-number and estimated-cluster-number clustering approaches.
  • FIG. 3 is a diagram schematically showing the basic operation of the attribute estimation apparatus of FIG. 1.
  • FIG. 4 is a diagram showing an example of the correct-answer assignment screen in the professional mode of the attribute estimation apparatus of FIG. 1.
  • FIG. 5 is a flowchart for explaining the operation of the attribute estimation apparatus of FIG. 1.
  • FIG. 6 is a flowchart for explaining the operation of the learning sample extraction unit of the attribute estimation apparatus of FIG. 1.
  • FIG. 1 is a block diagram showing a schematic configuration of an attribute estimation apparatus according to an embodiment of the present invention.
  • The attribute estimation apparatus 1 includes an image input unit 12 (comprising a video input unit 10 and a face detection unit 11), a feature amount extraction unit 13, an estimation model holding unit 14, a face attribute estimation unit 15, an on-site sample estimation result DB (estimation result accumulation unit) 16, a relearning start unit 17, a feature amount distribution calculation unit 18, a learning sample extraction unit (sample extraction unit) 19, a relearning data DB 20, and a relearning unit 21.
  • An on-site sample DB 22 is built from the data produced by the image input unit 12 (the video input unit 10 and face detection unit 11) and the feature amount extraction unit 13.
  • the video input unit 10 inputs video from the camera 2.
  • the face detection unit 11 extracts a face image from the video input by the video input unit 10.
  • the feature amount extraction unit 13 extracts the feature amount of the face image extracted by the face detection unit 11.
  • the feature quantity extraction unit 13 detects and normalizes facial parts such as eyes and nose from the facial image, and extracts facial features from the normalized image using Gabor features, LBP features, Haar features, and the like.
  • the feature quantity extracted by the feature quantity extraction unit 13 is multidimensional data.
  • the estimation model holding unit 14 holds an estimation model for performing attribute estimation.
  • the attribute is age.
  • The estimation model can be expressed by the following two functions:
  • a mapping X = G(Y) that converts the face feature amount Y into a feature amount suitable for estimating face attributes
  • a function F(X) that estimates age, sex, and the like from the face attribute feature amount X
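  • The two-function model above can be sketched as follows. This is a hypothetical illustration only: the patent does not specify the concrete forms of G and F, so a linear projection and a linear regressor are assumed here.

```python
# Hypothetical sketch of the two-stage estimation model: G maps a raw face
# feature vector Y to an attribute-oriented feature vector X, and F
# estimates an attribute value (here treated as age) from X. The linear
# forms below are assumptions for illustration, not the patent's model.

def G(Y, W):
    """Map face feature Y to attribute feature X via projection matrix W."""
    return [sum(w * y for w, y in zip(row, Y)) for row in W]

def F(X, coeffs, bias):
    """Estimate an attribute value (e.g. age) from attribute feature X."""
    return bias + sum(c * x for c, x in zip(coeffs, X))

def estimate_attribute(Y, W, coeffs, bias):
    """Full pipeline: attribute_estimate = F(G(Y))."""
    return F(G(Y, W), coeffs, bias)
```

Relearning, described later, corresponds to replacing G and F with newly learned G′ and F′ while keeping this structure.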
  • the face attribute estimation unit 15 estimates the attribute of the face image input by the image input unit 12 using the estimation model held in the estimation model holding unit 14 and displays the result on the display terminal 3.
  • the on-site sample estimation result DB 16 accumulates the face image input by the image input unit 12 and the attribute estimation result by the face attribute estimation unit 15 in association with each other. That is, the face image collected on site and the face attribute estimation result estimated by the model are stored as a set.
  • the relearning start unit 17 starts relearning when the set condition is satisfied.
  • The feature amount distribution calculation unit 18 obtains the feature amount distribution of the face images for each group of attribute estimation results accumulated in the on-site sample estimation result DB 16. To do so, it clusters the face image feature amounts for each group; in this embodiment, clustering is performed for each estimated age. As shown in FIG. 2, the clustering methods include a fixed-cluster-number type and an estimated-cluster-number type.
  • The learning sample extraction unit 19 extracts sample data based on the feature amount distribution of the face images for each group obtained by the feature amount distribution calculation unit 18. Specifically, sample data is extracted based on position within the cluster; for example, items lying at a certain distance from the cluster center are extracted as sample data. Since entering correct answers for all on-site samples would be too costly, the learning sample extraction unit 19 extracts only a small number of samples.
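  • The per-group extraction step can be sketched as follows. For brevity, a single cluster per group (its mean) stands in for the full clustering step, and the function and variable names are illustrative; the selection shown corresponds to picking the sample nearest the cluster center.

```python
from collections import defaultdict
import math

# Sketch of per-group sample extraction: samples are grouped by their
# estimated attribute, each group's feature vectors are reduced to a
# cluster centre (here simply the group mean, standing in for a full
# clustering step), and the sample nearest the centre is extracted.

def extract_samples(estimates):
    """estimates: list of (estimated_group, feature_vector) pairs.
    Returns one representative feature vector per group."""
    groups = defaultdict(list)
    for group, feat in estimates:
        groups[group].append(feat)
    picked = {}
    for group, feats in groups.items():
        dim = len(feats[0])
        centre = [sum(f[d] for f in feats) / len(feats) for d in range(dim)]
        # Take the single sample closest to the centre of the cluster.
        picked[group] = min(feats, key=lambda f: math.dist(f, centre))
    return picked
```

Because one representative is taken per group, every estimated age group contributes a sample, which is what makes the extraction "even".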
  • the re-learning data DB 20 accumulates data in which the correct answer is added to the sample data extracted by the learning sample extraction unit 19.
  • The correct answer is given by the user. That is, the user uses the display terminal 3 to input the correct face attributes for the small number of on-site samples extracted by the learning sample extraction unit 19.
  • the relearning data DB 20 gives the weight corresponding to the position in the cluster to the sample data to which the correct answer is given.
  • the relearning unit 21 updates the estimation model using data accumulated in the relearning data DB 20 (data obtained by adding a correct answer to the sample data extracted by the learning sample extraction unit 19).
  • FIG. 3 is a diagram schematically illustrating the basic operation of the attribute estimation apparatus 1 according to the present embodiment.
  • the face attribute estimation unit 15 performs attribute estimation of the face image obtained from the on-site sample DB 22 using the estimation model held in the estimation model holding unit 14.
  • the on-site sample is a group of face images detected from an image acquired by a camera (not shown) installed on the site.
  • In FIG. 3, the horizontal axis represents the estimated age group (teens, 20s, 30s, 40s, 50s, 60s), and the vertical axis represents the samples.
  • the learning sample extraction unit 19 collects samples for each age of the estimation results, performs clustering for each age, and extracts sample data closest to the cluster center 24 for each age.
  • the cluster center 24 is an average position of data belonging to each cluster.
  • the sample data extracted by the learning sample extraction unit 19 is on the order of several tens. In the example shown in FIG. 3, six pieces of sample data are extracted, but the number to be extracted may be determined in advance or may be determined according to the distribution situation.
  • the learning sample extraction unit 19 has two modes, a general mode and a professional mode.
  • In the general mode, only the single sample closest to the cluster center 24 is extracted.
  • sample data in the vicinity of the cluster center can be extracted.
  • In the professional mode, a user knowledgeable about face recognition is presented with a larger set of samples rather than a small one, and selects those that appear effective for relearning.
  • Here, "a larger set" means extracting not just the single sample nearest the cluster center but a plurality of samples (several tens).
  • FIG. 4 shows an example of the correct-answer assignment screen in the professional mode.
  • each sample data extracted by the learning sample extraction unit 19 is given a correct answer by the correct answer giving unit 3a of the display terminal 3.
  • The correct answer giving unit 3a is operated by the user. For example, if the sample data 25a of the teens group 50 shown in FIG. 3 actually shows a person in their 30s, the user designates "30s" (60a) among the teens-to-60s choices displayed for the corresponding sample data 25a in the correct-answer-assignment on-site sample data 60 on the display terminal 3. By this designation, the correct answer is given to the sample data 25a.
  • Likewise, "50s" (60b) is designated among the teens-to-60s choices displayed for the corresponding sample data 25b in the correct-answer-assignment on-site sample data 60.
  • Thereby, the correct answer is given to the sample data 25b.
  • In this way, correct answers are assigned to all the sample data extracted by the learning sample extraction unit 19.
  • The sample data 25 given correct answers is stored in the relearning data DB 20.
  • The relearning data DB 20 weights the accumulated correct-answer-assigned sample data 25, and then combines the weighted sample data 25 with the initial learning sample data accumulated in the initial learning sample DB 30.
  • The sample data 25 given correct answers is weighted because its quantity is overwhelmingly smaller than that of the initial learning sample data stored in the initial learning sample DB 30, so simply adding it to the initial learning sample data as-is would not be effective.
  • For example, the default weight is 0.5, and if the difference between the estimated attribute and the correct attribute entered by the user is large, a value close to 1 is set (note that too large a weight reduces generalization ability).
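  • The weighting rule above (a base weight of 0.5, raised toward 1 when the user's correct answer differs strongly from the estimate) can be sketched as follows. The exact scaling is not specified, so a capped linear ramp is assumed here purely for illustration.

```python
# Hedged sketch of the sample-weighting rule: corrected on-site samples get
# a base weight of 0.5, raised toward 1.0 as the gap between the model's
# estimate and the user-supplied correct age grows. The 0.05 slope and the
# 0.95 cap are assumptions; the cap reflects the warning that too large a
# weight hurts generalization.

def sample_weight(estimated_age, correct_age, base=0.5, max_weight=0.95):
    """Weight for a corrected on-site sample during re-learning."""
    diff = abs(estimated_age - correct_age)
    # Ramp from `base` toward `max_weight` as the estimation error grows.
    return min(max_weight, base + 0.05 * diff)
```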
  • Since the sample data 25 given correct answers is accumulated in the relearning data DB 20, the relearning unit 21 relearns using, as its learning samples, the combination of that data and the initial learning sample data accumulated in the initial learning sample DB 30.
  • The relearning unit 21 performs the following learning: it learns a new mapping from the face feature amount Y to the face attribute feature amount X, and it learns a new function to estimate age, sex, and the like from the face attribute feature amount X.
  • The estimation model holding unit 14 updates the estimation model with the relearned model 40 generated by the relearning unit 21.
  • The relearned model includes a new function F′(X) that estimates age, sex, and the like from the face attribute feature amount X.
  • FIG. 5 is a flowchart for explaining the operation of the attribute estimation apparatus 1 according to the present embodiment.
  • an initial model is first generated (step S1). This initial model is generated in the laboratory.
  • Next, model evaluation, that is, attribute estimation, is performed (step S2).
  • After attribute estimation, it is determined whether the number of relearning rounds has reached N (a predetermined integer, for example 3 or 4) or whether the difference from the previous model is below a predetermined threshold (step S3). If either condition holds ("Yes"), relearning is considered finished and the process ends.
  • The following can serve as the timing for ending relearning: after a predetermined number of rounds; when the difference from the previous model is sufficiently small; when the estimation results of the previous model and of the relearned model barely differ; or when model accuracy, evaluated against the correct answers entered by the user, has saturated (or has improved beyond a certain level).
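  • The termination conditions above can be sketched as a simple check. The threshold values and parameter names are assumptions; the patent describes the conditions only qualitatively.

```python
# Illustrative stop check for the re-learning loop: stop after a fixed
# number of rounds (e.g. N = 3 or 4), when the model barely changed since
# the previous round, or when the estimation results barely changed.

def should_stop(iteration, max_iterations, model_diff, diff_threshold,
                changed_ratio, change_threshold=0.01):
    """Return True when re-learning should end."""
    if iteration >= max_iterations:       # predetermined number of rounds
        return True
    if model_diff < diff_threshold:       # difference from previous model is small
        return True
    if changed_ratio < change_threshold:  # estimation results barely changed
        return True
    return False
```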
  • FIG. 6 is a flowchart for explaining the operation of the learning sample extraction unit 19.
  • First, the learning sample extraction unit 19 divides the samples into attribute groups (step S40).
  • the sample is mapped to the feature space for each attribute group (step S41), and clustering is performed for each attribute group (step S42).
  • a sample near the center of each cluster is extracted (step S43).
  • After the correct answers are given by user operation (step S5), the relearning unit 21 generates relearning data (step S6), and relearning is performed based on the generated relearning data (step S7).
  • steps S2 to S7 are repeatedly performed at the timing of starting the relearning.
  • The following can serve as the timing for starting relearning (i.e., when to relearn):
  • (1) When the distribution of the samples used to create the shipping-time model and the distribution of the on-site samples differ by more than a certain amount at each estimated age.
  • (2) When the per-age averages at the time the shipping model was created deviate by more than a certain amount from the per-age averages of the on-site samples' estimates.
  • (3) When (1) and (2) occur simultaneously (the most common case).
  • (4) When a temporary model, generated by trusting the estimated ages of the on-site samples, diverges strongly from the shipping model in its estimation results; for example, by comparing the difference between values estimated with the temporary model and with the shipping model, or by comparing the per-age ratios (what percentage of samples are in their teens, what percentage in their 20s, and so on).
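  • The distribution-comparison trigger above can be sketched as follows. The L1 distance between the two per-age-group ratio histograms and the 0.2 threshold are assumed concrete choices; the text only states that relearning starts when the distributions differ by more than a certain level.

```python
# Sketch of a re-learning trigger: compare the per-age-group ratios of the
# shipping-time samples with those of the on-site samples, and start
# re-learning when the ratio histograms diverge by more than a threshold.

def should_relearn(ship_counts, site_counts, threshold=0.2):
    """counts: dict mapping age group (e.g. '20s') to a sample count."""
    groups = set(ship_counts) | set(site_counts)
    ship_total = sum(ship_counts.values()) or 1
    site_total = sum(site_counts.values()) or 1
    divergence = sum(
        abs(ship_counts.get(g, 0) / ship_total
            - site_counts.get(g, 0) / site_total)
        for g in groups)
    return divergence > threshold
```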
  • As described above, the estimation model holding unit 14 holds an estimation model for performing attribute estimation, the face attribute estimation unit 15 estimates attributes using that model, and the learning sample extraction unit 19 extracts sample data for each group of attribute estimation results produced by the face attribute estimation unit 15, so on-site sample data can be extracted evenly and the model can be relearned with a small amount of on-site sample data.
  • In other words, a device that relearns an estimation model, using relearning sample data obtained by computing the feature amount distribution and assigning correct answers to samples extracted based on that distribution, is applied to the output (corresponding to the on-site sample estimation result DB) of an apparatus that estimates face attributes from camera images. This can be used, for example, in a service in which a shop collects on-site samples and sends them to a center, where the samples are extracted, given correct answers, and used for relearning.
  • Moreover, the feature amount distribution calculation unit 18 clusters the face image feature amounts for each group of attribute estimation results, and the learning sample extraction unit 19 extracts sample data based on position within the cluster, so on-site sample data can be extracted evenly.
  • Furthermore, the sample data given correct answers is weighted during relearning by the relearning unit 21, so the on-site sample data can be balanced against the initial learning sample data and relearning is effective even with a small amount of on-site sample data.
  • In addition, the relearning start unit 17 starts relearning when a set condition is satisfied, so the shipping-time estimation model can be adapted to the site with high accuracy.
  • Each process shown in FIGS. 5 and 6 of the present embodiment can be described as a program and stored and distributed on a storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • The present invention has the effect of enabling relearning with a small number of on-site samples, and can be applied to apparatuses that estimate a person's age and sex.

Abstract

An estimation model for carrying out a characteristic estimation is retained in an estimation model retention unit (14). A characteristic estimation of an inputted facial image is carried out with a characteristic estimation unit (15), using the estimation model which is retained in the estimation model retention unit (14). Sample data is extracted with a learning sample extraction unit (19) for each group of characteristic estimation results by the characteristic estimation unit (15). Sample data is thus obtained for each group of characteristic estimation results, thereby allowing uniform extraction of onsite sample data, and re-learning with little onsite sample data.

Description

Attribute estimation device
 The present invention relates to an attribute estimation apparatus that estimates age, sex, and the like from a face image.
 As a system for identifying attributes of a subject included in an image, there is, for example, the one described in Patent Document 1. The system described in Patent Document 1 identifies a person's attributes (age, gender, etc.) based on a face image: a computer constituting an offline training system generates an attribute identification dictionary, and a computer forming an online operation system uses that dictionary to determine the attributes of an unknown person from his or her face image. The computer constituting the offline training system generates the attribute identification dictionary, which identifies a person's attributes from a face image, using learning sample data in which a plurality of sample images, each containing the face of a person whose attributes are known, are associated with those persons' attributes.
Japanese Unexamined Patent Publication No. 2006-323507
 However, the conventional attribute estimation system has the problem that learning samples prepared in the laboratory alone do not yield sufficient accuracy in the field; that is, estimation does not work well on images captured on site. Accuracy can be improved by adding on-site samples, but this raises a new problem: the effort of entering the correct data required for those samples is enormous and costly.
 The present invention has been made in view of these circumstances, and its object is to provide an attribute estimation device that can be relearned with a small number of on-site samples.
 The attribute estimation apparatus of the present invention comprises: an image input unit that inputs a face image; an estimation model holding unit that holds an estimation model for performing attribute estimation; an attribute estimation unit that estimates attributes of the face image input by the image input unit, using the estimation model held in the estimation model holding unit; an estimation result accumulation unit that accumulates the face image in association with the attribute estimation result produced by the attribute estimation unit; a sample extraction unit that extracts sample data for each group of attribute estimation results produced by the attribute estimation unit; and a relearning unit that updates the estimation model using data obtained by adding correct answers to the sample data extracted by the sample extraction unit.
 According to this configuration, since sample data is obtained for each group of attribute estimation results, on-site sample data can be extracted evenly and relearning is possible with a small amount of on-site sample data.
 In the above configuration, a feature amount distribution calculation unit is provided that computes the feature amount distribution of the face images for each group of attribute estimation results produced by the attribute estimation unit, and the sample extraction unit extracts sample data based on the feature amount distribution determined for each group by the feature amount distribution calculation unit.
 According to this configuration, on-site sample data can be extracted evenly.
 In the above configuration, the feature amount distribution calculation unit clusters the face image feature amounts for each group of attribute estimation results, and the sample extraction unit extracts sample data based on position within the cluster.
 According to this configuration, on-site sample data can be extracted evenly.
 In the above configuration, the sample extraction unit extracts, as sample data, items lying at a certain distance from the cluster center.
 According to this configuration, on-site sample data can be extracted evenly.
 In the above configuration, the sample data to which correct answers are given is weighted during relearning by the relearning unit.
 According to this configuration, the on-site sample data can be balanced against the initial learning sample data, and relearning is effective even with a small amount of on-site sample data.
 In the above configuration, the sample data to which correct answers are given is weighted, during relearning by the relearning unit, according to its position within the cluster.
 According to this configuration, the on-site sample data can be balanced against the initial learning sample data, and relearning is effective even with a small amount of on-site sample data.
 In the above configuration, a relearning start unit is provided that starts relearning by the relearning unit when a set condition is satisfied.
 According to this configuration, the shipping-time estimation model can be adapted to the site with high accuracy.
 In the above configuration, the attribute is age.
 The re-learning sample extraction device of the present invention comprises a sample extraction unit that, for data in which face images are associated with their attribute estimation results, extracts sample data for relearning an estimation model for each group of attribute estimation results.
 According to this configuration, since sample data is obtained for each group of attribute estimation results, on-site sample data can be extracted evenly.
 The attribute estimation method of the present invention comprises: an image input step of inputting a face image; an estimation model holding step of holding an estimation model for performing attribute estimation; an attribute estimation step of estimating attributes of the face image input in the image input step, using the estimation model held in the estimation model holding step; an estimation result accumulation step of accumulating the face image in association with the attribute estimation result produced by the attribute estimation step; a sample extraction step of extracting sample data for each group of attribute estimation results produced by the attribute estimation step; and a relearning step of updating the estimation model using data obtained by adding correct answers to the sample data extracted in the sample extraction step.
 According to this method, since sample data is obtained for each group of attribute estimation results, on-site sample data can be extracted evenly and relearning is possible with a small amount of on-site sample data.
 According to the present invention, since sample data is obtained for each group of attribute estimation results, on-site sample data can be extracted evenly and relearning is possible with a small amount of on-site sample data.
 FIG. 1 is a block diagram showing the schematic configuration of an attribute estimation device according to an embodiment of the present invention. FIG. 2 is a diagram showing two example clustering approaches, a fixed-cluster-count type and a cluster-count-estimating type, together with their effects. FIG. 3 is a diagram schematically showing the basic operation of the attribute estimation device of FIG. 1. FIG. 4 is a diagram showing an example correct-answer assignment screen in the professional mode of the attribute estimation device of FIG. 1. FIG. 5 is a flowchart for explaining the operation of the attribute estimation device of FIG. 1. FIG. 6 is a flowchart for explaining the operation of the learning sample extraction unit of the attribute estimation device of FIG. 1.
 Hereinafter, preferred embodiments of the present invention are described in detail with reference to the drawings.
 FIG. 1 is a block diagram showing the schematic configuration of an attribute estimation device according to an embodiment of the present invention. The attribute estimation device 1 of this embodiment comprises an image input unit 12 made up of a video input unit 10 and a face detection unit 11, a feature amount extraction unit 13, an estimation model holding unit 14, a face attribute estimation unit 15, a field sample estimation result DB (estimation result accumulation unit) 16, a relearning start unit 17, a feature amount distribution calculation unit 18, a learning sample extraction unit (sample extraction unit) 19, a relearning data DB 20, and a relearning unit 21. A field sample DB 22 is built from the data produced by the image input unit 12 and the feature amount extraction unit 13.
 The video input unit 10 receives video from the camera 2. The face detection unit 11 extracts face images from the video received by the video input unit 10. The feature amount extraction unit 13 extracts feature amounts from the face images extracted by the face detection unit 11: it detects facial parts such as the eyes and nose, normalizes the face image, and extracts facial features from the normalized image using Gabor features, LBP features, Haar features, and the like. The extracted feature amount is multidimensional data. The estimation model holding unit 14 holds an estimation model for performing attribute estimation. In this embodiment, the attribute is age.
 Here, the estimation model is expressed by the following two functions:
 - a mapping G(Y) = X that converts a face feature amount Y into a feature suitable for estimating face attributes; and
 - a function F(X) that estimates age, gender, and the like from the face attribute feature amount X.
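As an illustration only, the two functions above can be sketched in code. This is a minimal stand-in, not the patent's implementation: the linear projection used for G, the linear regressor used for F, and all names are assumptions.

```python
import numpy as np

# Hypothetical two-stage estimation model: G maps a raw face feature
# vector Y to an attribute feature X; F maps X to an age estimate.
class EstimationModel:
    def __init__(self, projection, weights, bias):
        self.projection = projection  # matrix standing in for G(Y) = X
        self.weights = weights        # vector standing in for F(X)
        self.bias = bias

    def G(self, Y):
        # Project the raw feature into the attribute-feature space.
        return self.projection @ Y

    def F(self, X):
        # Estimate age from the attribute feature (linear stand-in).
        return float(self.weights @ X + self.bias)

    def estimate(self, Y):
        return self.F(self.G(Y))

model = EstimationModel(projection=np.eye(4) * 0.5,
                        weights=np.ones(4), bias=10.0)
Y = np.array([10.0, 10.0, 10.0, 10.0])
print(model.estimate(Y))  # 0.5 * 10 * 4 + 10 = 30.0
```

Relearning (described later) amounts to replacing `projection` and the regressor parameters with newly learned ones, G'(Y) and F'(X).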
 The face attribute estimation unit 15 estimates the attributes of the face image input by the image input unit 12 using the estimation model held in the estimation model holding unit 14, and displays the result on the display terminal 3. The field sample estimation result DB 16 accumulates each face image input by the image input unit 12 in association with the attribute estimation result produced by the face attribute estimation unit 15; that is, each face image collected in the field is stored as a pair with the face attribute the model estimated for it. The relearning start unit 17 starts relearning when a set condition is satisfied.
 The feature amount distribution calculation unit 18 obtains the feature amount distribution of the face images for each group of attribute estimation results accumulated in the field sample estimation result DB 16. To do so, it clusters the face image feature amounts within each attribute estimation result group; in this embodiment, clustering is performed for each estimated age bracket. As shown in FIG. 2, clustering methods include a fixed-cluster-count type and a cluster-count-estimating type. Returning to FIG. 1, at relearning time the learning sample extraction unit 19 extracts sample data based on the per-group feature amount distributions obtained by the feature amount distribution calculation unit 18. Sample data is selected by its position within a cluster; for example, samples at a fixed distance from the cluster center are extracted. Since entering correct answers for all field samples would be far too costly, the learning sample extraction unit 19 extracts only a small number of samples.
 The relearning data DB 20 accumulates the sample data extracted by the learning sample extraction unit 19 together with the correct answers assigned to it. The correct answers are assigned by a user: using the display terminal 3, the user enters the correct face attribute for each of the small number of field samples extracted by the learning sample extraction unit 19. The display terminal 3 applies the user's correct answers to the sample data, and the relearning data DB 20 assigns each correct-answer-labeled sample a weight according to its position within its cluster. The relearning unit 21 updates the estimation model using the data accumulated in the relearning data DB 20 (the sample data extracted by the learning sample extraction unit 19, with correct answers assigned).
 Next, the operation of the attribute estimation device 1 of this embodiment is described.
 FIG. 3 schematically shows the basic operation of the attribute estimation device 1. The face attribute estimation unit 15 estimates the attributes of the face images obtained from the field sample DB 22, using the estimation model held in the estimation model holding unit 14. The field samples are face images detected in images captured by a camera (not shown) installed in the field. The attribute estimation results 23 for the face images (held in the field sample estimation result DB 16) are displayed on the display terminal 3 as a histogram, with age bracket (10s, 20s, 30s, 40s, 50s, 60s) on the horizontal axis and sample count on the vertical axis. The learning sample extraction unit 19 groups the samples by estimated age bracket, clusters each bracket, and for each bracket extracts the sample data closest to the cluster center 24. The cluster center 24 is the mean position of the data belonging to the cluster. The learning sample extraction unit 19 extracts on the order of tens of samples. In the example shown in FIG. 3, six samples are extracted; the number extracted may be fixed in advance or determined according to the distribution.
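The per-bracket extraction described above can be sketched as follows. This is a simplified illustration that treats each age bracket as a single cluster whose center is the mean of the bracket's feature vectors; the function name and the toy two-dimensional features are assumptions.

```python
import numpy as np

def extract_nearest_to_center(samples_by_bracket):
    """For each estimated age bracket, return the index of the sample
    closest to the cluster center (the mean of the bracket's features)."""
    selected = {}
    for bracket, features in samples_by_bracket.items():
        X = np.asarray(features, dtype=float)
        center = X.mean(axis=0)                     # cluster center 24
        dists = np.linalg.norm(X - center, axis=1)  # Euclidean distance
        selected[bracket] = int(dists.argmin())
    return selected

samples = {
    "10s": [[0.0, 0.0], [1.0, 1.0], [0.4, 0.6]],
    "20s": [[5.0, 5.0], [7.0, 7.0]],
}
print(extract_nearest_to_center(samples))  # {'10s': 2, '20s': 0}
```

In the "10s" bracket the third sample lies nearest the mean, so it is the one presented to the user for correct-answer assignment.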
 The learning sample extraction unit 19 has two modes: a general mode and a professional mode. Extracting only the single sample closest to the cluster center 24, as described above, is the general mode; alternatively, several samples in the neighborhood of the cluster center may be extracted. The professional mode, in contrast, is for users with knowledge of face recognition: instead of a small number of samples it presents a large number, several dozen from the neighborhood of each cluster center rather than just one, and lets the user pick out those that look effective for relearning. FIG. 4 shows an example correct-answer assignment screen in the professional mode.
 Returning to FIG. 3, each sample extracted by the learning sample extraction unit 19 is assigned a correct answer by the correct answer assignment unit 3a of the display terminal 3, which is operated by the user. For example, if sample 25a in the teens group 50 shown in FIG. 3 is actually a person in their 30s, the user selects "30s" (60a) from the 10s to 60s choices displayed for that sample in the correct-answer-assignment field sample data 60 on the display terminal 3; this selection assigns the correct answer to sample 25a. Likewise, if sample 25b is a person in their 50s, the user selects "50s" (60b) from the choices displayed for sample 25b, assigning it its correct answer. In this way, correct answers are assigned to all the samples extracted by the learning sample extraction unit 19.
 The correct-answer-labeled sample data 25 is accumulated in the relearning data DB 20. The relearning data DB 20 weights the accumulated correct-answer-labeled samples 25 and then combines them with the initial learning sample data accumulated in the initial learning sample DB 30. The weighting is necessary because the correct-answer-labeled samples 25 are vastly outnumbered by the initial learning samples (which are on the order of thousands); simply adding them to the initial learning samples unweighted would have no effect.
 Here, the weighting applied to each attribute group is outlined.
 Weighting per attribute group: for each attribute group, let c denote the attribute group, let i (i = 1 to N(c)) index the samples used when learning the previous model, and let j (j = 1 to M(c)) index the added field samples. The weights w(c)i and w(c)j for the samples of attribute group c are set so as to satisfy the following expression (1).
 [Expression (1) is rendered only as an image in the original publication (JPOXMLDOC01-appb-M000001); it constrains the weights w(c)i and w(c)j via the parameter α.]
 - α is basically 0.5.
 - If the deviation from the correct attributes entered by the user is large, α is set to a value closer to 1 (note that setting it too large hurts generalization).
 When the samples are not divided into attribute groups: let i (i = 1 to N) index the samples used when learning the previous model, let j (j = 1 to M) index the added field samples, and let Wi and Wj be the weights for each sample; they are set so as to satisfy the following expression (2).
 [Expression (2) is rendered only as an image in the original publication (JPOXMLDOC01-appb-M000002); it constrains the weights Wi and Wj analogously to expression (1).]
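Because expressions (1) and (2) survive only as images, the following sketch rests on an assumed reading: the added field samples of a group share a total weight of α while the previous learning samples share 1 − α, so with α = 0.5 a handful of field samples collectively counterbalances thousands of initial samples. Treat the formula itself, not just the names, as a guess.

```python
def weight_samples(n_previous, m_field, alpha=0.5):
    """Assumed weighting scheme: the previous-model samples share a
    total weight of (1 - alpha) equally, and the added field samples
    share alpha equally, so the per-group weights sum to 1."""
    w_prev = (1.0 - alpha) / n_previous
    w_field = alpha / m_field
    return [w_prev] * n_previous + [w_field] * m_field

weights = weight_samples(n_previous=1000, m_field=5, alpha=0.5)
total = sum(weights)
print(round(total, 10), weights[0], weights[-1])  # 1.0 0.0005 0.1
```

Under this reading, each of the 5 field samples carries 200 times the weight of an initial sample, which matches the stated motivation: unweighted, the few field samples would have no effect.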
 Returning to FIG. 3, once the correct-answer-labeled sample data 25 has accumulated in the relearning data DB 20, the relearning unit 21 combines it with the initial learning sample data accumulated in the initial learning sample DB 30 and performs relearning on the combined learning samples.
 The relearning unit 21 learns the following:
 - a new mapping from the face feature amount Y to the face attribute feature amount X; and
 - a new function that estimates age, gender, and the like from the face attribute feature amount X.
 The estimation model holding unit 14 then replaces the estimation model with the relearned model 40 generated by the relearning unit 21. Here, the relearned model consists of:
 - a new mapping G'(Y) = X from the face feature amount Y to the face attribute feature amount X; and
 - a new function F'(X) that estimates age, gender, and the like from the face attribute feature amount X.
 FIG. 5 is a flowchart for explaining the operation of the attribute estimation device 1 of this embodiment. First, an initial model is generated (step S1); this initial model is generated in the laboratory. After the initial model is generated, the model is evaluated, that is, attribute estimation is performed (step S2). After attribute estimation, it is determined whether the relearning count has reached N (a predetermined integer, e.g., 3 or 4) or the difference from the previous model has fallen below a predetermined threshold (step S3). If either holds ("Yes"), relearning ends and the process terminates. Relearning may be ended at any of the following points:
 - after a predetermined number of iterations;
 - when the difference from the previous model is sufficiently small;
 - when the estimation results of the previous model and the relearned model barely differ; or
 - when the model's accuracy, evaluated against the correct answers entered by the user, has saturated (or has improved by at least a certain amount).
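The stopping test of step S3 can be sketched as follows. The combination of an iteration cap and a model-difference test comes from the flowchart; the specific difference measure (mean absolute change of the estimates on a fixed sample set) is an assumption, since the text does not fix one.

```python
def should_stop(iteration, max_iterations, prev_estimates, new_estimates,
                threshold):
    """Stop relearning when the iteration cap N is reached, or when the
    previous and relearned models barely differ on the same samples."""
    if iteration >= max_iterations:
        return True
    diffs = [abs(a - b) for a, b in zip(prev_estimates, new_estimates)]
    return sum(diffs) / len(diffs) < threshold

# Model outputs barely changed: mean |diff| = 0.2 < 1.0, so stop.
print(should_stop(1, 3, [24.0, 31.0, 55.0], [24.1, 31.3, 55.2], 1.0))  # True
# Iteration cap reached: stop regardless of the difference.
print(should_stop(3, 3, [24.0], [40.0], 1.0))  # True
```

Either branch returning `True` corresponds to the "Yes" exit of step S3.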
 If, in step S3, the relearning count has not reached N and the difference from the previous model is at or above the threshold ("No"), the learning sample extraction unit 19 extracts sample data for correct-answer assignment (step S4). FIG. 6 is a flowchart for explaining the operation of the learning sample extraction unit 19: it divides the samples by attribute group (step S40), maps each group's samples into the feature space (step S41), clusters each group (step S42), and then extracts the samples nearest to each cluster center (step S43).
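Steps S40 to S43 can be sketched as follows, here with a fixed cluster count per attribute group (cf. FIG. 2). The feature-space mapping of step S41 is taken as already done (the inputs are feature vectors), the tiny k-means routine is a stand-in for whichever clustering method is actually used, and all names are illustrative.

```python
import numpy as np

def kmeans(X, k, iters=10, seed=0):
    # Minimal Lloyd's algorithm: returns centers and per-point labels.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        centers = np.array([X[labels == c].mean(axis=0) for c in range(k)])
    return centers, labels

def extract_per_group(groups, k=2):
    # S40: samples are already divided by attribute group (dict input).
    picked = {}
    for name, feats in groups.items():
        X = np.asarray(feats, dtype=float)   # S41: feature-space vectors
        centers, labels = kmeans(X, k)       # S42: cluster the group
        picked[name] = []
        for c in range(k):                   # S43: nearest to each center
            idx = np.where(labels == c)[0]
            d = np.linalg.norm(X[idx] - centers[c], axis=1)
            picked[name].append(int(idx[d.argmin()]))
    return picked

groups = {"20s": [[0, 0], [1, 0], [0.4, 0], [5, 5], [6, 5], [5.4, 5]]}
print(extract_per_group(groups))
```

With this toy data the two natural clusters are the points near the origin and the points near (5, 5); one representative sample is picked from each, which is then shown to the user for correct-answer assignment.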
 Returning to the flowchart of FIG. 5, after the learning sample extraction unit 19 extracts the samples for correct-answer assignment, the user assigns the correct answers (step S5). The relearning unit 21 then generates relearning data (step S6) and performs relearning based on it (step S7). Steps S2 to S7 are repeated each time relearning is triggered.
 Relearning may be triggered (i.e., a new learning round started) at any of the following points:
 (1) when the per-age-bracket distribution of the samples used to build the factory-shipped model differs from the per-estimated-age-bracket distribution of the field samples by at least a certain amount;
 (2) when the per-bracket averages at the time the factory-shipped model was built deviate from the per-estimated-bracket averages of the field samples by at least a certain amount;
 (3) when (1) and (2) occur simultaneously (the most common case); or
 (4) when a provisional model built by trusting the estimated age brackets of the field samples diverges substantially from the factory-shipped model, judged by:
 - comparing the estimates themselves, i.e., the difference between the attribute estimates produced using the field samples and those produced using the factory-shipped model; or
 - comparing the per-bracket ratios, i.e., what percentage of samples fall in the 10s, the 20s, and so on.
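Trigger condition (1) can be sketched as follows. Comparing the per-bracket distributions comes from the text; the choice of total variation distance and the threshold value are assumptions.

```python
def bracket_ratios(counts):
    # Convert per-bracket counts into per-bracket ratios.
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def should_relearn(train_counts, field_counts, dist_threshold=0.2):
    """Condition (1): trigger relearning when the age-bracket distribution
    of the training samples and that of the field estimates differ by more
    than the threshold (measured here by total variation distance)."""
    p = bracket_ratios(train_counts)
    q = bracket_ratios(field_counts)
    brackets = set(p) | set(q)
    tv = 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in brackets)
    return tv > dist_threshold

train = {"10s": 100, "20s": 100, "30s": 100, "40s": 100}
field = {"10s": 10, "20s": 20, "30s": 30, "40s": 140}   # skewed site
print(should_relearn(train, field))  # True: distributions diverge
```

A site whose visitors skew far older than the factory training set trips the check, prompting the sample extraction and relearning loop of FIG. 5.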
 As described above, according to the attribute estimation device 1 of this embodiment, the estimation model holding unit 14 holds an estimation model for performing attribute estimation, the face attribute estimation unit 15 estimates the attributes of input face images using that model, and the learning sample extraction unit 19 extracts sample data for each group of attribute estimation results produced by the face attribute estimation unit 15; field sample data can therefore be extracted evenly, and the model can be relearned from a small amount of field sample data.
 Other configurations can achieve the same effect as the attribute estimation device of this embodiment. For example, a device that obtains the feature amount distribution, extracts samples based on that distribution, and relearns the estimation model using the extracted samples labeled with correct answers could be applied to the output (corresponding to the field sample estimation result DB) of a separate device that estimates face attributes from camera images. This could support a service in which a store only collects field samples, and when those samples are sent to a center, samples are extracted there, labeled with correct answers, and used for relearning.
 Further, according to the attribute estimation device 1 of this embodiment, the feature amount distribution calculation unit 18 clusters the face image feature amounts for each attribute estimation result group and the learning sample extraction unit 19 extracts sample data based on position within each cluster, so field sample data can be extracted evenly.
 Further, according to the attribute estimation device 1 of this embodiment, the sample data labeled with correct answers is weighted during relearning by the relearning unit 21, so the field sample data can be balanced against the initial learning sample data and relearning is effective even with a small amount of field sample data.
 Further, according to the attribute estimation device 1 of this embodiment, the relearning start unit 17 starts relearning when a set condition is satisfied, so the factory-shipped estimation model can be adapted accurately to the field.
 Each of the processes shown in FIGS. 5 and 6 of this embodiment can also be written as a program and stored and distributed on a storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
 Although the present invention has been described in detail and with reference to specific embodiments, it will be apparent to those skilled in the art that various changes and modifications can be made without departing from the spirit and scope of the invention.
 This application is based on Japanese Patent Application No. 2011-073443, filed March 29, 2011, the contents of which are incorporated herein by reference.
 The present invention has the effect of enabling relearning from a small number of field samples, and is applicable to devices that estimate a person's age, gender, and the like.
 DESCRIPTION OF SYMBOLS
 1 attribute estimation device
 2 camera
 3 display terminal
 3a correct answer assignment unit
 10 video input unit
 11 face detection unit
 12 image input unit
 13 feature amount extraction unit
 14 estimation model holding unit
 15 face attribute estimation unit
 16 field sample estimation result DB
 17 relearning start unit
 18 feature amount distribution calculation unit
 19 learning sample extraction unit
 20 relearning data DB
 21 relearning unit
 22 field sample DB
 23 attribute estimation results for face images
 24 cluster center
 25, 25a, 25b sample data
 30 initial learning sample DB
 40 relearned model
 50 teens group
 60, 60a, 60b field sample data for correct-answer assignment

Claims (10)

  1.  An attribute estimation device comprising:
      an image input unit that inputs a face image;
      an estimation model holding unit that holds an estimation model for performing attribute estimation;
      an attribute estimation unit that estimates attributes of the face image input by the image input unit, using the estimation model held in the estimation model holding unit;
      an estimation result accumulation unit that accumulates the face image in association with the attribute estimation result of the attribute estimation unit;
      a sample extraction unit that extracts sample data for each group of attribute estimation results produced by the attribute estimation unit; and
      a relearning unit that updates the estimation model using data obtained by assigning correct answers to the sample data extracted by the sample extraction unit.
  2.  The attribute estimation device according to claim 1, further comprising a feature amount distribution calculation unit that obtains a feature amount distribution of face images for each group of attribute estimation results produced by the attribute estimation unit,
      wherein the sample extraction unit extracts sample data based on the per-group face image feature amount distributions obtained by the feature amount distribution calculation unit.
  3.  The attribute estimation device according to claim 2, wherein the feature amount distribution calculation unit clusters the face image feature amounts for each group of attribute estimation results,
      and the sample extraction unit extracts sample data based on position within a cluster.
  4.  The attribute estimation device according to claim 3, wherein the sample extraction unit extracts, as sample data, samples at a fixed distance from the center of a cluster.
  5.  The attribute estimation device according to any one of claims 1 to 4, wherein sample data to which correct answers have been assigned is weighted when the relearning unit performs relearning.
  6.  The attribute estimation device according to claim 5, wherein sample data to which correct answers have been assigned is weighted according to its position within a cluster when the relearning unit performs relearning.
  7.  The attribute estimation device according to any one of claims 1 to 6, further comprising a relearning start unit that starts relearning by the relearning unit when a set condition is satisfied.
  8.  The attribute estimation device according to any one of claims 1 to 7, wherein the attribute is age.
  9.  A relearning sample extraction device comprising a sample extraction unit that, for data in which face images are associated with their attribute estimation results, extracts sample data for relearning an estimation model for each group of attribute estimation results.
  10.  An attribute estimation method comprising:
      an image input step of inputting a face image;
      an estimation model holding step of holding an estimation model for performing attribute estimation;
      an attribute estimation step of estimating attributes of the face image input in the image input step, using the estimation model held in the estimation model holding step;
      an estimation result accumulation step of accumulating the face image in association with the attribute estimation result of the attribute estimation step;
      a sample extraction step of extracting sample data for each group of attribute estimation results produced by the attribute estimation step; and
      a relearning step of updating the estimation model using data obtained by assigning correct answers to the sample data extracted in the sample extraction step.
PCT/JP2012/002128 2011-03-29 2012-03-27 Characteristic estimation device WO2012132418A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011073443A JP2012208710A (en) 2011-03-29 2011-03-29 Characteristic estimation device
JP2011-073443 2011-03-29

Publications (1)

Publication Number Publication Date
WO2012132418A1 true WO2012132418A1 (en) 2012-10-04

Family

ID=46930192

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/002128 WO2012132418A1 (en) 2011-03-29 2012-03-27 Characteristic estimation device

Country Status (2)

Country Link
JP (1) JP2012208710A (en)
WO (1) WO2012132418A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014155639A1 (en) * 2013-03-29 2014-10-02 株式会社日立製作所 Video monitoring system and image retrieval system
WO2014178105A1 (en) * 2013-04-30 2014-11-06 Necソリューションイノベータ株式会社 Attribute estimation device
WO2015001856A1 (en) * 2013-07-01 2015-01-08 Necソリューションイノベータ株式会社 Attribute estimation system
JP2017117493A (en) * 2017-03-15 2017-06-29 東芝テック株式会社 Merchandise sales data processing apparatus and program

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6004084B2 (en) 2013-03-29 2016-10-05 富士通株式会社 Model updating method, apparatus, and program
WO2016006090A1 (en) * 2014-07-10 2016-01-14 株式会社東芝 Electronic apparatus, method, and program
WO2016103651A1 (en) * 2014-12-22 2016-06-30 日本電気株式会社 Information processing system, information processing method and recording medium
US10210464B2 (en) * 2015-03-11 2019-02-19 Qualcomm Incorporated Online training for object recognition system
JP6567488B2 (en) * 2016-12-22 2019-08-28 日本電信電話株式会社 Learning data generation device, development data generation device, model learning device, method thereof, and program
EP3451219A1 (en) 2017-08-31 2019-03-06 KBC Groep NV Improved anomaly detection
JP7463052B2 (en) * 2018-09-19 2024-04-08 キヤノン株式会社 Information processing device, information processing system, information processing method, and program
JP7306933B2 (en) 2018-09-21 2023-07-11 古河電気工業株式会社 Image determination device, image inspection device, and image determination method
JP7262232B2 (en) * 2019-01-29 2023-04-21 東京エレクトロン株式会社 Image recognition system and image recognition method
JP7351413B2 (en) 2020-05-08 2023-09-27 富士通株式会社 Identification method, generation method, identification program and identification device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006323507A (en) * 2005-05-17 2006-11-30 Yamaha Motor Co Ltd Attribute identifying system and attribute identifying method
JP2009093334A (en) * 2007-10-05 2009-04-30 Seiko Epson Corp Identification method and program

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014155639A1 (en) * 2013-03-29 2014-10-02 株式会社日立製作所 Video monitoring system and image retrieval system
JP5982557B2 (en) * 2013-03-29 2016-08-31 株式会社日立製作所 Video surveillance system and image search system
WO2014178105A1 (en) * 2013-04-30 2014-11-06 Necソリューションイノベータ株式会社 Attribute estimation device
JP5965057B2 (en) * 2013-04-30 2016-08-03 Necソリューションイノベータ株式会社 Attribute estimation device
WO2015001856A1 (en) * 2013-07-01 2015-01-08 Necソリューションイノベータ株式会社 Attribute estimation system
JPWO2015001856A1 (en) * 2013-07-01 2017-02-23 Necソリューションイノベータ株式会社 Attribute estimation system
US10296845B2 (en) 2013-07-01 2019-05-21 Nec Solution Innovators, Ltd. Attribute estimation system
JP2017117493A (en) * 2017-03-15 2017-06-29 東芝テック株式会社 Merchandise sales data processing apparatus and program

Also Published As

Publication number Publication date
JP2012208710A (en) 2012-10-25

Similar Documents

Publication Publication Date Title
WO2012132418A1 (en) Characteristic estimation device
CN108229322B (en) Video-based face recognition method and device, electronic equipment and storage medium
US9626551B2 (en) Collation apparatus and method for the same, and image searching apparatus and method for the same
CN108229314B (en) Target person searching method and device and electronic equipment
JP5506722B2 (en) Method for training a multi-class classifier
CN108288051B (en) Pedestrian re-recognition model training method and device, electronic equipment and storage medium
CN108280477B (en) Method and apparatus for clustering images
US20160132815A1 (en) Skill estimation method in machine-human hybrid crowdsourcing
CN105069424B (en) Quick face recognition system and method
JP5214760B2 (en) Learning apparatus, method and program
US20110103695A1 (en) Image processing apparatus and image processing method
CN108985190B (en) Target identification method and device, electronic equipment and storage medium
CN110503000B (en) Teaching head-up rate measuring method based on face recognition technology
CN112232241A (en) Pedestrian re-identification method and device, electronic equipment and readable storage medium
US20190138852A1 (en) Information processing apparatus, information processing method, and storage medium for generating teacher information
JP5214679B2 (en) Learning apparatus, method and program
CN111814846B (en) Training method and recognition method of attribute recognition model and related equipment
CN111814821A (en) Deep learning model establishing method, sample processing method and device
CN114842343A (en) ViT-based aerial image identification method
JP5746550B2 (en) Image processing apparatus and image processing method
CN115984930A (en) Micro expression recognition method and device and micro expression recognition model training method
TW202125323A (en) Processing method of learning face recognition by artificial intelligence module
CN113378852A (en) Key point detection method and device, electronic equipment and storage medium
CN114463798A (en) Training method, device and equipment of face recognition model and storage medium
JP7293658B2 (en) Information processing device, information processing method and program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 12765214

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 12765214

Country of ref document: EP

Kind code of ref document: A1