JP7426922B2

JP7426922B2 - Program, device, and method for artificially generating a new teacher image with an attachment worn on a person's face

Info

Publication number: JP7426922B2
Application number: JP2020197902A
Authority: JP
Inventors: 剣明呉; 博楊; 元服部
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2020-11-30
Filing date: 2020-11-30
Publication date: 2024-02-02
Anticipated expiration: 2040-11-30
Also published as: JP2022086086A

Description

The present invention relates to a technique for artificially generating new teacher images. It is particularly suited to generating, as teacher images, face images in which a person wears a mask or a similar item.

Machine learning engines that recognize people and objects in captured images continue to advance. In particular, the accuracy of face recognition, which identifies a person from a face image, has improved rapidly with the development of deep learning. For example, Facebook announced that DeepFace (registered trademark), its face recognition technology based on deep learning, reached an accuracy of 97.35% (see, for example, Non-Patent Document 1).
Training the learning model inside a machine learning engine requires a large number of teacher images. For example, Affectiva has realized more accurate emotion recognition by using approximately 7 billion emotion features collected from more than 87 countries around the world (see, for example, Non-Patent Document 2).

Conventionally, there is a technique that learns the features of a large number of face images for each emotion and recognizes emotions based on those features (see, for example, Patent Document 1). Specific examples include the Ekman seven-class facial expression model (neutral, joy, disgust, anger, surprise, sadness, fear) and a three-class emotion model (positive, negative, neutral).

There is also a technique that defines a plurality of recognition modes based on the state of the target person, provides a recognizer for each recognition mode, and applies the one recognizer that matches the recognition mode at face recognition time (see, for example, Patent Document 2). The current state of the target person's face includes whether a mask, glasses, sunglasses, a hat, or the like is worn. According to this technique, a recognition mode is selected from the currently occluded region of the face, and the recognizer for that mode determines whether authentication succeeds. In other words, each recognizer is trained on teacher images with a different occluded region.

Furthermore, there is the "facial expression recognition AI (Artificial Intelligence)" technology developed by the applicant of the present application (see, for example, Non-Patent Document 3).
Face recognition technology has conventionally been used not only for unlocking devices upon successful face authentication but also for marketing purposes, such as automatic photo capture triggered by smile detection and acceptance surveys based on analysis of television viewers' facial expressions. However, because such technology relies on many parts of the human face as cues, it could handle only frontal faces in which both eyes are clearly visible.
In contrast, the machine learning "multi-angle adaptive model control technology" described in Non-Patent Document 3 can analyze facial expressions with high accuracy at any face orientation (angle-free). It can recognize expressions accurately even for a face turned fully sideways.
According to this technique, using the de facto standard face image dataset LFW (Labeled Faces in the Wild), expressions can be recognized with very high accuracy even in face images where the face is turned 45 degrees or more and only one eye is visible.

Patent Document 1: Japanese Patent Application Publication No. 2011-150381
Patent Document 2: Japanese Patent Application Publication No. 2018-165983

Non-Patent Document 1: Taigman, Yaniv, et al. "Deepface: Closing the gap to human-level performance in face verification." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
Non-Patent Document 2: affectiva, [online], [searched on November 8, 2020], Internet <URL: https://affectiva.jp/reason.html>
Non-Patent Document 3: Angle-free facial expression recognition AI, [online], [searched on November 8, 2020], Internet <URL: https://www.kddi-research.jp/newsrelease/2018/080201.html>
Non-Patent Document 4: shape_predictor_68_face_landmarks.dat, [online], [searched on November 8, 2020], Internet <URL: http://dlib.net/files/>
Non-Patent Document 5: Facial point annotations, [online], [searched on November 8, 2020], Internet <URL: https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/>

Since the outbreak of the novel coronavirus infection, wearing masks and goggles on the face has become commonplace. When such items are worn, up to 70% of the facial area is covered. As a result, faces and expressions can no longer be recognized adequately, because general face recognition algorithms need to take in as many features as possible from the face image.

Of course, a face recognition machine learning engine could simply be trained on a large number of teacher images of faces and expressions with masks or goggles worn. At present, however, it is extremely difficult to collect such teacher images in large quantities.

None of the techniques described in Non-Patent Documents 1 and 2 and Patent Documents 1 and 2 takes the orientation or tilt of the face into account. The technique described in Non-Patent Document 3 does consider the orientation and tilt of the face, but it cannot adequately recognize faces and expressions when a mask or goggles are worn.

In view of this, the inventors of the present application considered whether a teacher image showing a person's face could be taken as input and an image in which an attachment is worn on that face could be generated artificially as a new teacher image.

Accordingly, an object of the present invention is to provide a program, an artificial teacher image generation device, and a method that take as input a teacher image showing a person's face and artificially generate, as a new teacher image, an image in which an attachment is worn on the person's face.

According to the present invention, there is provided a program that takes as input a teacher image showing a person's face and artificially generates, as a new teacher image, an image in which an attachment is worn on the person's face, the program causing a computer to function as:
attachment image storage means for storing in advance a plurality of attachment images with different orientations;
face area detection means for detecting a person's face area from the input teacher image;
feature point detection means for detecting feature points from the detected face area and detecting the orientation and tilt of the face from the feature points;
attachment image selection means for selecting, from the attachment image storage means, an attachment image that matches the orientation of the face; and
artificial teacher image generation means for generating an artificial teacher image in which the selected attachment image is rotated according to the tilt of the face and superimposed on the person's face area in the teacher image.

According to another embodiment of the program of the present invention, it is also preferable that
the attachment is a mask, glasses, goggles, or sunglasses.

According to another embodiment of the program of the present invention, it is also preferable that the computer further functions as
attachment determination means for estimating, for the face area output from the feature point detection means, whether an attachment is already worn at its feature points, and for outputting only face areas on which no attachment is worn.

According to another embodiment of the program of the present invention, it is also preferable that
the attachment image selection means selects a plurality of attachment images that have the same orientation but differ from one another, and
the artificial teacher image generation means generates a plurality of artificial teacher images by superimposing each of the plurality of attachment images on the person's face area in the teacher image.

According to another embodiment of the program of the present invention, it is also preferable that
the artificial teacher image generation means applies image processing to a single generated artificial teacher image so as to vary its illuminance, color, or contrast, thereby generating a plurality of artificial teacher images.

According to another embodiment of the program of the present invention, it is also preferable that
the artificial teacher image generation means, when superimposing the attachment image on the person's face area in the teacher image, cuts away the portion of the attachment image that would not be visible given the orientation and tilt of the face.

According to the present invention, there is provided an artificial teacher image generation device that takes as input a teacher image showing a person's face and artificially generates, as a new teacher image, an image in which an attachment is worn on the person's face, the device comprising:
attachment image storage means for storing in advance a plurality of attachment images with different orientations;
face area detection means for detecting a person's face area from the input teacher image;
feature point detection means for detecting feature points from the detected face area and detecting the orientation and tilt of the face from the feature points;
attachment image selection means for selecting, from the attachment image storage means, an attachment image that matches the orientation of the face; and
artificial teacher image generation means for generating an artificial teacher image in which the selected attachment image is rotated according to the tilt of the face and superimposed on the person's face area in the teacher image.

According to the present invention, there is provided an artificial teacher image generation method for a device that takes as input a teacher image showing a person's face and artificially generates, as a new teacher image, an image in which an attachment is worn on the person's face, wherein
the device
has an attachment image storage unit that stores in advance a plurality of attachment images with different orientations, and executes:
a first step of detecting a person's face area from the input teacher image;
a second step of detecting feature points from the detected face area and detecting the orientation and tilt of the face from the feature points;
a third step of selecting, from the attachment image storage means, an attachment image that matches the orientation of the face; and
a fourth step of generating an artificial teacher image in which the selected attachment image is rotated according to the tilt of the face and superimposed on the person's face area in the teacher image.

According to the program, artificial teacher image generation device, and method of the present invention, a teacher image showing a person's face can be taken as input, and an image in which an attachment is worn on the person's face can be generated artificially as a new teacher image.

FIG. 1 is a functional configuration diagram of the artificial teacher image generation device according to the present invention.
FIG. 2 is an explanatory diagram of the face area detection unit.
FIG. 3 is an explanatory diagram of the feature point detection unit.
FIG. 4 is an explanatory diagram showing the orientation and tilt of the face.
FIG. 5 is an explanatory diagram of the attachment image storage unit.
FIG. 6 is an explanatory diagram showing the orientation and tilt of the attachment.
FIG. 7 is an explanatory diagram of the artificial teacher image generation unit.
FIG. 8 is an explanatory diagram of the attachment determination unit.
FIG. 9 is a first functional configuration diagram of a face authentication device that uses artificial teacher images.
FIG. 10 is a second functional configuration diagram of a face authentication device that uses artificial teacher images.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a functional configuration diagram of the artificial teacher image generation device according to the present invention.

The artificial teacher image generation device 1 takes as input a teacher image showing a person's face and artificially generates, as a new teacher image, an image in which an attachment is worn on the person's face.
According to FIG. 1, the artificial teacher image generation device 1 has a teacher image storage unit 10, a face area detection unit 11, a feature point detection unit 12, an attachment image storage unit 130, an attachment image selection unit 13, and an artificial teacher image generation unit 14. It may optionally also have an attachment determination unit 121. These functional components are realized by executing a program that causes a computer installed in the device to function. The processing flow of these functional components can also be understood as the artificial teacher image generation method of the device.

[Teacher image storage unit 10]
The teacher image storage unit 10 stores teacher images showing people's faces. As a rule, the faces in the teacher images are assumed not to be wearing an attachment such as a mask, glasses, goggles, or sunglasses.
According to FIG. 1, each time one teacher image is input from the teacher image storage unit 10, the artificial teacher image generation device 1 ultimately outputs one or more artificial teacher images. By repeating this, a large number of artificial face images with an attachment can be generated from face images without one.
In the embodiments of the present invention, the attachment is described as a "mask", but it may of course be glasses, goggles, or sunglasses.

As noted above, the teacher images are assumed to show faces without an attachment. When the optional attachment determination unit 121 is active, however, the teacher images may include faces that already wear an attachment. Teacher images showing a face with an attachment are excluded from artificial teacher image generation by the attachment determination unit 121.
In this way, a large number of artificial face images with an attachment can be generated solely from face images without one.

[Face area detection unit 11]
The face area detection unit 11 detects a person's face area (for example, a bounding box) from the input teacher image. The detected face area is output to the feature point detection unit 12.

FIG. 2 is an explanatory diagram of the face area detection unit.
According to FIG. 2(a), the faces of two women appear in the teacher image.
According to FIG. 2(b), the face area detection unit 11 detects the face area of each of the two people.

Specifically, the face area detection unit 11 uses R-CNN (Regions with Convolutional Neural Networks) or SSD (Single Shot Multibox Detector).
R-CNN combines rectangular face regions with convolutional neural network features. It first detects a subset of candidate face regions (region proposals), then extracts CNN features from each region proposal, and finally adjusts the bounding box of the region proposal using a support vector machine trained in advance on CNN features.
SSD is a general object detection algorithm based on machine learning that works with rectangular bounding boxes called default boxes. A large number of default boxes of different sizes are laid over a single image, and a prediction is computed for each box. For each default box, the algorithm predicts how far the box is from the object and how much its size differs from the object's.
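The statement that each default box predicts how far it is from the object and how much its size differs can be made concrete with the offset encoding commonly used by SSD-style detectors. The following is only a minimal sketch and is not taken from the patent: the `(cx, cy, w, h)` box format, the function name `encode_offsets`, and the omission of any variance scaling are assumptions made for illustration.

```python
import math

def encode_offsets(default_box, object_box):
    """Encode an object box relative to a default box.

    Both boxes are (cx, cy, w, h) tuples in normalised image coordinates.
    Returns (t_cx, t_cy, t_w, t_h): centre shifts measured in units of the
    default box size, and log ratios of width and height.
    """
    d_cx, d_cy, d_w, d_h = default_box
    g_cx, g_cy, g_w, g_h = object_box
    t_cx = (g_cx - d_cx) / d_w   # horizontal distance to the object
    t_cy = (g_cy - d_cy) / d_h   # vertical distance to the object
    t_w = math.log(g_w / d_w)    # how much wider or narrower the object is
    t_h = math.log(g_h / d_h)    # how much taller or shorter the object is
    return t_cx, t_cy, t_w, t_h

# Hypothetical boxes: the object sits slightly to the right and is twice as wide.
offsets = encode_offsets((0.5, 0.5, 0.2, 0.2), (0.6, 0.5, 0.4, 0.2))
```

A detector trained this way regresses these four numbers for every default box, so a box that already matches the object yields offsets close to zero.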

[Feature point detection unit 12]
The feature point detection unit 12 detects "feature points" (for example, 68 key points) from the detected face area and detects the "orientation and tilt of the face" from the feature points.

FIG. 3 is an explanatory diagram of the feature point detection unit.

(Detection of feature points)
First, each facial part (eyes, nose, mouth, eyebrows, chin) is detected from the face area, and feature points are extracted from the contour of each part.
For example, the official Dlib site publishes a trained model for facial feature points (see, for example, Non-Patent Document 4). This trained model was trained on an annotated dataset of 3,837 images (train: 3,148 / test: 689) (see, for example, Non-Patent Document 5).
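For readers unfamiliar with the 68-point annotation, the sketch below lists how the points are conventionally grouped by facial part (0-based indices, with left and right taken from the subject's point of view). The names `LANDMARK_GROUPS` and `points_for` are illustrative, and the patent does not state which indices it uses for the nose center or the jaw end points.

```python
# 0-based index ranges of the 68-point facial landmark annotation.
LANDMARK_GROUPS = {
    "jaw": range(0, 17),            # jaw line from ear to ear via the chin
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),          # nose bridge and nostrils
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),         # outer and inner lip contours
}

def points_for(landmarks, part):
    """Return the (x, y) points of one facial part from a 68-point list."""
    return [landmarks[i] for i in LANDMARK_GROUPS[part]]
```

Given a list of 68 `(x, y)` tuples from any landmark detector, `points_for(landmarks, "jaw")` returns the 17 jaw-line points from which end points such as those used in the orientation test could be taken.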

(Detection of face orientation and tilt)
According to FIG. 3(a), the face is "right-facing" relative to the frontal direction, and its tilt is "30 degrees downward" relative to the frontal vertical line.
According to FIG. 3(b), the face is frontal, and its tilt coincides with the frontal vertical line.

FIG. 4 is an explanatory diagram showing the orientation and tilt of the face.

According to FIG. 4(a), the distances from the coordinates of the nose center point to the "left cheek end point" and to the "right cheek end point" are calculated.
If the distance between the right jaw end point and the nose center is longer than the distance between the left jaw end point and the nose center by a first predetermined threshold or more, the face is determined to be "left-facing".
Left-facing: distance between the left jaw end point and the nose center < distance between the right jaw end point and the nose center
Similarly, if the distance between the left jaw end point and the nose center is longer than the distance between the right jaw end point and the nose center by the first predetermined threshold or more, the face is determined to be "right-facing".
Right-facing: distance between the left jaw end point and the nose center > distance between the right jaw end point and the nose center
If the difference between the two distances is at or below a second predetermined threshold (that is, the two distances are nearly equal), the face is determined to be "frontal".
In another embodiment, the face orientation need not be limited to "right-facing", "left-facing", and "frontal" and may instead be expressed as an angle. In that case, the angle may be determined from the ratio of the distance between the right jaw end point and the nose center to the distance between the left jaw end point and the nose center. For example, a "right-facing" face can be detected in stages, such as "slightly right", "45 degrees right", and "fully right".
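The three-way orientation test described above can be sketched as follows. This is a minimal illustration under stated assumptions: points are `(x, y)` pixel coordinates, the function and parameter names are hypothetical, and the `"undetermined"` result for differences that fall between the two thresholds is a choice made here, since the text does not say how that band is handled.

```python
import math

def classify_face_direction(nose, left_end, right_end,
                            side_threshold, front_threshold):
    """Classify the face orientation from three (x, y) landmark points."""
    d_left = math.dist(nose, left_end)    # left end point to nose center
    d_right = math.dist(nose, right_end)  # right end point to nose center
    if abs(d_left - d_right) <= front_threshold:
        return "front"                    # the two distances nearly match
    if d_right - d_left >= side_threshold:
        return "left"                     # right side is clearly longer
    if d_left - d_right >= side_threshold:
        return "right"                    # left side is clearly longer
    return "undetermined"                 # band not covered by the text

direction = classify_face_direction((50, 50), (20, 50), (90, 50),
                                    side_threshold=5, front_threshold=2)
```

Using absolute pixel thresholds keeps the sketch close to the wording of the text; a practical system might instead normalise the difference by the face width so that the result does not depend on image resolution.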

According to FIG. 4(b), the tilt is expressed as the angle between the frontal vertical line and the nose center line. Here, it is a tilt of 30 degrees clockwise relative to the frontal vertical line.
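The tilt angle can be computed from two landmarks, as in the sketch below. It assumes that the nose center line runs from the chin to the nose center, that points are given in image coordinates with the y axis pointing downwards, and that a positive result means a clockwise tilt as seen in the image. The function name is hypothetical.

```python
import math

def face_tilt_degrees(nose_center, chin):
    """Angle between the image vertical and the nose center line, in degrees.

    Points are (x, y) in image coordinates (y grows downwards).
    Positive values mean a clockwise tilt as seen in the image.
    """
    dx = nose_center[0] - chin[0]
    dy = chin[1] - nose_center[1]   # positive when the nose is above the chin
    return math.degrees(math.atan2(dx, dy))

# An upright face: the nose center is directly above the chin.
upright = face_tilt_degrees((100, 100), (100, 200))
```

`atan2` is used rather than a plain arctangent of the slope so that an exactly upright face (zero horizontal offset) does not cause a division by zero.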

[Attachment image storage unit 130]
The attachment image storage unit 130 stores in advance a plurality of attachment images corresponding to face orientations.

FIG. 5 is an explanatory diagram of the attachment image storage unit.

According to FIG. 5, images of the attachment "mask" are stored in the attachment image storage unit 130. Here, a front-facing mask image and a right-facing mask image are stored. It is of course also preferable to store a plurality of mask images that share an orientation but differ in color or shape. Masks come in many varieties, for example cloth, non-woven fabric, flat, pleated, and three-dimensional types.
As another embodiment, mask images of the same type may be stored in stages, such as "slightly right", "45 degrees right", and "fully right".

[Attachment image selection unit 13]
The attachment image selection unit 13 selects, from the attachment image storage unit 130, an attachment image that matches the orientation of the face.
The attachment image selection unit 13 may also select a plurality of different attachment (mask) images for the same face orientation.
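The selection step amounts to filtering the stored images by orientation. The sketch below shows one way to express that; the `MaskImage` record, its fields, the orientation labels, and the file names are all hypothetical and only stand in for whatever the storage unit actually holds.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MaskImage:
    path: str        # hypothetical file name of the stored image
    direction: str   # "front", "left", or "right"
    style: str       # for example "cloth" or "pleated"

def select_masks(store, face_direction, limit=None):
    """Return the stored images whose orientation matches the face."""
    matches = [m for m in store if m.direction == face_direction]
    return matches if limit is None else matches[:limit]

store = [
    MaskImage("mask_front_cloth.png", "front", "cloth"),
    MaskImage("mask_right_cloth.png", "right", "cloth"),
    MaskImage("mask_right_pleated.png", "right", "pleated"),
]
selected = select_masks(store, "right")
```

Returning every match rather than a single image corresponds to the embodiment in which several different masks are selected for one face orientation.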

[Artificial teacher image generation unit 14]
The artificial teacher image generation unit 14 generates an artificial teacher image in which the selected attachment image is rotated according to the tilt of the face and superimposed on the person's face area in the teacher image.

FIG. 6 is an explanatory diagram showing the orientation and tilt of the attachment.

According to FIG. 6(a), the face is "right-facing" with "no tilt". In this case, a right-facing mask image is selected from the attachment image storage unit 130. The right-facing mask image is then rescaled (enlarged or reduced) according to the left-side distance of the face (the distance between the left cheek feature point and the nose center) and the right-side distance (the distance between the right cheek feature point and the nose center). In this way, the width and height of the attachment image are adjusted to match the width and height of the wearing area of the face (the area enclosed by the nose center and the lower, left, and right jaw end points).

According to FIG. 6(b), the face is "right-facing" with a "tilt of 30 degrees clockwise". Unlike FIG. 6(a), the face here is tilted. The mask image is therefore additionally tilted 30 degrees clockwise with respect to the nose center line (the straight line between the nose center and the chin).
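Rescaling and tilting the mask can be combined into a single affine transform that maps mask pixels to teacher-image pixels. The sketch below is one possible formulation, not the patent's stated procedure: it assumes the wearing area is given as an axis-aligned box `(x_min, y_min, x_max, y_max)` measured before tilting, that the mask is rotated about its own center, and that positive angles are clockwise in image coordinates. The function names are hypothetical.

```python
import math

def mask_affine(mask_size, region_box, tilt_deg):
    """Return a 2x3 affine matrix mapping mask pixels to image pixels."""
    mask_w, mask_h = mask_size
    x0, y0, x1, y1 = region_box
    sx = (x1 - x0) / mask_w               # match the wearing-area width
    sy = (y1 - y0) / mask_h               # match the wearing-area height
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    t = math.radians(tilt_deg)
    # Scale first, then rotate (clockwise on screen because y points down).
    a, b = sx * math.cos(t), -sy * math.sin(t)
    c, d = sx * math.sin(t), sy * math.cos(t)
    # Translate so that the mask center lands on the wearing-area center.
    tx = cx - (a * mask_w / 2 + b * mask_h / 2)
    ty = cy - (c * mask_w / 2 + d * mask_h / 2)
    return [[a, b, tx], [c, d, ty]]

def apply_affine(matrix, point):
    """Map one (x, y) mask coordinate into the teacher image."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])

matrix = mask_affine((200, 100), (40, 60, 140, 110), tilt_deg=30.0)
center_in_image = apply_affine(matrix, (100, 50))
```

A 2x3 matrix in this form is the kind of input that common image libraries accept for warping, so the same numbers could drive the actual pixel resampling and alpha blending.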

FIG. 7 is an explanatory diagram of the artificial teacher image generation unit.

According to FIG. 7, the artificial teacher image generation unit 14 superimposes the attachment image on the face area of the teacher image so that the orientation and tilt of the attachment image (for example, a mask image) match the orientation and tilt given by the feature points of the person's face in the teacher image.

As another embodiment, when the attachment image selection unit 13 selects a plurality of attachment images, the artificial teacher image generation unit 14 may superimpose each attachment image on the person's face in a single teacher image to generate a plurality of artificial teacher images. For example, by superimposing mask images of different types, colors, and shapes on the person's face area, a large, augmented set of artificial teacher images can be generated.

As a further embodiment, the artificial teacher image generation unit 14 may apply image processing to a single generated artificial teacher image so as to vary its illuminance, color, or contrast (including segmentation processing), thereby generating a plurality of artificial teacher images. This makes it possible to generate a diverse set of artificial teacher images.
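Varying illuminance and contrast can be illustrated with a simple per-pixel transform. The sketch below is a deliberately small example under stated assumptions: the image is a nested list of 8-bit grey values, contrast is scaled around the mid-grey value 128, and color changes and segmentation are not covered. The function names and the settings are hypothetical.

```python
def adjust(pixels, brightness=0, contrast=1.0):
    """Return a copy of a grey image with shifted brightness and contrast.

    pixels is a list of rows, each a list of values in the range 0-255.
    """
    def clamp(value):
        return max(0, min(255, int(round(value))))
    return [[clamp((p - 128) * contrast + 128 + brightness) for p in row]
            for row in pixels]

def make_variants(pixels, settings):
    """Generate one variant per (brightness, contrast) setting."""
    return [adjust(pixels, b, c) for b, c in settings]

image = [[100, 200], [50, 128]]
variants = make_variants(image, [(20, 1.0), (-20, 1.0), (0, 1.5)])
```

Each input image yields as many variants as there are settings, which is how a single artificial teacher image can be multiplied into several.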

更に、他の実施形態として、人工教師画像生成部１４は、装着物画像を教師画像の人の顔領域に重畳する際に、顔の向き及び傾きに応じて視認されない当該装着物画像をカットするものであってもよい。例えば現実的な利用場面を想定するべく、左右向けのマスクに対して、顔向き側の片方の耳に掛かるフックベルトをカットする。 Furthermore, as another embodiment, when superimposing the wearable object image on the person's face area of the teacher image, the artificial teacher image generation unit 14 may cut away the part of the wearable object image that is not visible given the orientation and inclination of the face. For example, to reflect realistic usage, for a mask image facing left or right, the ear loop that hooks over the one ear on the side toward which the face is turned is cut.
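A minimal sketch of this cutting step follows. It assumes the mask image is a nested list in which `None` marks a transparent pixel, and that the ear loop occupies a fixed number of columns at each edge. Which side is hidden for a given face orientation is left to the caller. The name `cut_hidden_strap` and its parameters are hypothetical.

```python
def cut_hidden_strap(mask, strap_width, hidden_side):
    """Return a copy of `mask` with the ear-loop columns on
    `hidden_side` ('left' or 'right') made transparent."""
    if hidden_side not in ("left", "right"):
        raise ValueError("hidden_side must be 'left' or 'right'")
    out = [row[:] for row in mask]
    width = len(out[0])
    if hidden_side == "left":
        cols = range(strap_width)
    else:
        cols = range(width - strap_width, width)
    for row in out:
        for x in cols:
            row[x] = None  # transparent: nothing is drawn here
    return out

mask = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
cut = cut_hidden_strap(mask, strap_width=1, hidden_side="right")
```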

尚、他の実施形態として、装着物画像選択部１３によって選択された装着物画像が、既に顔の傾きに応じたものであれば、人工教師画像生成部１４は、その装着物画像を教師画像の人の顔領域に重畳させるだけである。
例えば装着物画像蓄積部１３０に蓄積された装着物画像が３次元モデルであって、装着物画像選択部１３が、その３次元モデルを顔の向き及び傾きに応じて回転させ、その位置で撮影した２次元の装着物画像を出力するものであってもよい。 As another embodiment, if the wearable object image selected by the wearable object image selection unit 13 already matches the inclination of the face, the artificial teacher image generation unit 14 simply superimposes that wearable object image on the person's face area of the teacher image.
For example, the wearable object image stored in the wearable object image storage unit 130 may be a three-dimensional model, and the wearable object image selection unit 13 may rotate the three-dimensional model according to the orientation and inclination of the face and output a two-dimensional wearable object image captured at that position.
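The three-dimensional variant can be sketched as follows, under strong simplifying assumptions: the model is a list of 3D points, face orientation is treated as a yaw angle about the vertical axis, inclination as a roll angle about the viewing axis, and "capturing" the model is reduced to an orthographic projection that drops the depth coordinate. The function names are hypothetical, and a real system would use a proper renderer.

```python
import math

def rotate_yaw(p, deg):
    """Rotate a 3D point about the vertical (y) axis."""
    x, y, z = p
    a = math.radians(deg)
    return (x * math.cos(a) + z * math.sin(a), y,
            -x * math.sin(a) + z * math.cos(a))

def rotate_roll(p, deg):
    """Rotate a 3D point about the viewing (z) axis."""
    x, y, z = p
    a = math.radians(deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a), z)

def render_2d(points, yaw_deg, roll_deg):
    """Rotate the model, then project orthographically by dropping z."""
    projected = []
    for p in points:
        x, y, _ = rotate_roll(rotate_yaw(p, yaw_deg), roll_deg)
        projected.append((round(x, 6), round(y, 6)))
    return projected

model = [(1.0, 0.0, 0.0)]
side_view = render_2d(model, yaw_deg=90, roll_deg=0)
tilted_view = render_2d(model, yaw_deg=0, roll_deg=90)
```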

［装着物判定部１２１］
前述した実施形態によれば、顔領域検出部１１に入力される教師画像に映り込む人の顔には、装着物が着用されていないことを前提としている。そのために、特徴点検出部１２は、顔に向き及び傾きがあっても、顔全体の特徴点が取得できる。
これに対し、装着物判定部１２１は、オプション的なものであって、特徴点検出部１２から出力された顔領域について、その特徴点に既に装着物が着用されているか否かを推定し、装着物が着用されていない顔領域のみを出力する。即ち、既に装着物が着用されている顔領域を除外する。 [Wearable object determination unit 121]
The embodiments described above assume that no wearable object is worn on the person's face appearing in the teacher image input to the face area detection unit 11. For that reason, the feature point detection unit 12 can obtain feature points for the entire face even when the face is turned or tilted.
In contrast, the wearable object determination unit 121 is optional. For each face area output from the feature point detection unit 12, it estimates whether a wearable object is already worn over the feature points and outputs only the face areas on which no wearable object is worn. That is, it excludes face areas on which a wearable object is already worn.

図８は、装着物判定部の説明図である。 FIG. 8 is an explanatory diagram of the attached object determination section.

図８によれば、例えばマスク着用済みの人が映り込む教師画像が、顔領域検出部１１に入力されたとする。顔領域検出部１１は、マスクを着用した顔領域を検出し、特徴点検出部１２は、マスクの閉域領域以外の顔領域についてのみ特徴点を検出する。マスクの場合、口元部分の特徴点が検出できない。
装着物判定部１２１は、具体的には、マスクを着用しない顔の特徴点（68個）と、マスクを着用した顔の特徴点（＜68個）とを教師データとして予め訓練した、サポートベクタマシンであってもよい。これによって、装着物判定部１２１は、特徴点検出部１２から出力された顔領域に、既に装着物が着用されているか否かを分類して推定することができる。そして、装着物が着用されていない顔領域のみを出力する。
尚、装着物判定部１２１は、サポートベクタマシンに限られず、装着物有無の２分類の機械学習エンジンであればよく、又は、単純な特徴点群座標の分岐判定機能であってもよい。 According to FIG. 8, suppose, for example, that a teacher image showing a person who is already wearing a mask is input to the face area detection unit 11. The face area detection unit 11 detects the face area wearing the mask, and the feature point detection unit 12 detects feature points only in the part of the face area outside the closed region of the mask. With a mask, the feature points around the mouth cannot be detected.
Specifically, the wearable object determination unit 121 may be a support vector machine trained in advance using, as training data, the feature points of faces without a mask (68 points) and the feature points of faces wearing a mask (fewer than 68 points). The wearable object determination unit 121 can thereby classify and estimate whether a wearable object is already worn on the face area output from the feature point detection unit 12. It then outputs only the face areas on which no wearable object is worn.
Note that the wearable object determination unit 121 is not limited to a support vector machine. It may be any machine learning engine that performs two-class classification of the presence or absence of a wearable object, or it may be a simple branch determination function over the coordinates of the feature point group.
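The simplest variant mentioned above, a branch determination over the feature points, might look like the following sketch. It assumes the detector returns a fixed-length list of 68 entries in which undetected points are `None`; that representation and the names `has_wearable` and `filter_bare_faces` are assumptions, not part of the patent.

```python
FULL_LANDMARK_COUNT = 68  # feature points of a face without a mask

def has_wearable(landmarks):
    """Treat a face as wearing an object when fewer than 68 points
    were detected (for example, the mouth points hidden by a mask)."""
    detected = [p for p in landmarks if p is not None]
    return len(detected) < FULL_LANDMARK_COUNT

def filter_bare_faces(faces):
    """Keep only the faces on which no wearable object is detected."""
    return [f for f in faces if not has_wearable(f)]

bare = [(i, i) for i in range(68)]
masked = bare[:48] + [None] * 20  # the last 20 points are occluded
kept = filter_bare_faces([bare, masked])
```

A trained two-class classifier such as the support vector machine described above would replace `has_wearable` while leaving `filter_bare_faces` unchanged.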

図９は、人工教師画像を用いた顔認証装置の第１の機能構成図である。 FIG. 9 is a first functional configuration diagram of a face authentication device using an artificial teacher image.

図９によれば、顔認識装置２は、認識対象画像を入力し、認識したユーザＩＤを出力する。顔認識装置２に入力される認識対象画像には、人の顔が映り込んでおり、その顔には、装着物を着用していても、していなくてもよい。
顔認識エンジン１５は、図１で前述したように、教師画像と、水増しされた人工教師画像との両方が混在したものによって、予め訓練されたものである。
顔認識エンジン１５としては、例えばGoogle（登録商標）のFacenet（登録商標）を用いることができる。これによって、顔領域から変換された多次元ベクトル（128/256/521バイト）によって学習する。多次元ベクトルのユークリッド距離が最も近いユーザＩＤを推定することができる。 According to FIG. 9, the face recognition device 2 receives a recognition target image and outputs the recognized user ID. The recognition target image input to the face recognition device 2 shows a person's face, and that face may or may not be wearing a wearable object.
As described above with reference to FIG. 1, the face recognition engine 15 is trained in advance on a mixture of the teacher images and the augmented artificial teacher images.
As the face recognition engine 15, for example, Facenet (registered trademark) of Google (registered trademark) can be used. It learns from multidimensional vectors (128/256/521 bytes) converted from face areas, and the user ID whose multidimensional vector is closest in Euclidean distance can then be estimated.
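The nearest-neighbor lookup described here can be sketched as below. The sketch assumes that each registered user already has one embedding vector and that the embedding of the query face has been computed elsewhere; the two-dimensional vectors and the names `nearest_user` and `gallery` are purely illustrative.

```python
import math

def nearest_user(query, gallery):
    """Return the user ID whose embedding has the smallest
    Euclidean distance to `query`."""
    return min(gallery, key=lambda uid: math.dist(query, gallery[uid]))

gallery = {
    "alice": [0.0, 0.0],
    "bob": [3.0, 4.0],
}
user_id = nearest_user([2.5, 3.5], gallery)
```

In practice a distance threshold would usually be added so that faces far from every registered user are rejected rather than assigned to the closest one.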

顔認識エンジン１５は、例えば前述した非特許文献３に記載の表情認識ＡＩであってもよい。表情認識ＡＩは、あらゆる顔の向きでもその表情を高精度に分析することができるが、更に装着物を着用した場合でも表情認識が可能となる。表情認識ＡＩは、顔の向きと表情認識の推定を分離した２段階で訓練されている。特に、顔の向き（上・下・左・右・中）及び傾きに応じて、３種類の表情認識モデル（ポジ、ネガ、ニュートラル）を適用している。本発明によれば、顔の向き及び傾きに応じて着用物を着用させた人工教師画像を生成することができる。これによって、対象ユーザが顔に装着物を着用していても、可能な限り、顔表情を推定することができる。 The face recognition engine 15 may be, for example, the facial expression recognition AI described in Non-Patent Document 3 mentioned above. The facial expression recognition AI can analyze facial expressions with high accuracy for any face orientation, and expression recognition additionally becomes possible even when a wearable object is worn. The facial expression recognition AI is trained in two stages that separate the estimation of face orientation from the estimation of facial expression. In particular, three types of facial expression recognition models (positive, negative, and neutral) are applied according to the orientation (up, down, left, right, center) and inclination of the face. According to the present invention, an artificial teacher image in which a wearable object is worn can be generated according to the orientation and inclination of the face. As a result, even if the target user is wearing a wearable object on the face, the facial expression can be estimated as far as possible.

図１０は、人工教師画像を用いた顔認証装置の第２の機能構成図である。 FIG. 10 is a second functional configuration diagram of a face authentication device using an artificial teacher image.

図１０も、図９と同様に、顔認識装置２は、認識対象画像を入力し、認識したユーザＩＤを出力する。顔認識装置２に入力される認識対象画像には、人の顔が映り込んでおり、その顔には、装着物を着用していても、していなくてもよい。
図１０によれば、顔認識エンジン１５１は、装着物を着用していない顔が映り込む教師画像によって予め訓練されたものであり、顔認識エンジン１５２は、装着物を着用した顔が映り込む画像に水増しされた人工教師画像によって予め訓練されたものである。 In FIG. 10, as in FIG. 9, the face recognition device 2 receives a recognition target image and outputs the recognized user ID. The recognition target image input to the face recognition device 2 shows a person's face, and that face may or may not be wearing a wearable object.
According to FIG. 10, the face recognition engine 151 is trained in advance on teacher images showing faces without a wearable object, and the face recognition engine 152 is trained in advance on artificial teacher images augmented into images showing faces wearing a wearable object.

顔認識装置２は、顔領域検出部１１と、特徴点検出部１２と、装着物判定部１２１と、顔認識エンジン１５１と、顔認識エンジン１５２とを有する。
顔領域検出部１１は、前述したものと同様に、入力された認識対象画像から、人の顔領域を検出する。
特徴点検出部１２は、前述したものと同様に、検出された顔領域から特徴点を検出する。
装着物判定部１２１も、前述したものと同様に、顔領域の特徴点に既に装着物が着用されているか否かを推定する。装着物無しの顔画像は、顔認識エンジン１５１へ入力され、装着物有りの顔画像は、顔認識エンジン１５２へ入力される。装着物無しの場合と装着物有りの場合とを別々に推定することによって、認識精度を高めることができる。 The face recognition device 2 includes a face area detection unit 11, a feature point detection unit 12, a wearable object determination unit 121, a face recognition engine 151, and a face recognition engine 152.
The face area detection unit 11 detects a person's face area from the input recognition target image, as described above.
The feature point detection unit 12 detects feature points from the detected face area, as described above.
The wearable object determination unit 121 likewise estimates, as described above, whether a wearable object is already worn over the feature points of the face area. A face image without a wearable object is input to the face recognition engine 151, and a face image with a wearable object is input to the face recognition engine 152. Estimating the two cases separately can improve recognition accuracy.
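The routing in FIG. 10 can be sketched as a small dispatcher. The sketch assumes the two engines are passed in as callables and that the presence of a wearable object is decided by counting detected feature points against 68; the patent leaves the determination method open, and all names here are hypothetical.

```python
def recognize(landmarks, face_image, engine_without, engine_with,
              full_count=68):
    """Send the face image to the engine that matches whether a
    wearable object appears to be worn, and return its user ID."""
    detected = sum(1 for p in landmarks if p is not None)
    engine = engine_with if detected < full_count else engine_without
    return engine(face_image)

# Stand-in engines that only report which path was taken.
engine_151 = lambda image: "id-from-151"
engine_152 = lambda image: "id-from-152"

bare_landmarks = [(i, i) for i in range(68)]
masked_landmarks = bare_landmarks[:48] + [None] * 20

result_bare = recognize(bare_landmarks, "img", engine_151, engine_152)
result_masked = recognize(masked_landmarks, "img", engine_151, engine_152)
```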

以上、詳細に説明したように、本発明のプログラム、人工教師画像生成装置及び方法によれば、人の顔が映り込む教師画像を入力し、人の顔に装着物が着用された画像を新たな教師画像として人工的に生成することができる。
これによって、人の顔に装着物が着用された教師画像を更に大量に収集することなく、本発明によって水増し的に生成された人工教師画像で既存の顔認識エンジンを訓練するだけで、装着物が着用された顔画像の認識が可能となる。 As described in detail above, according to the program, artificial teacher image generation device, and method of the present invention, a teacher image showing a person's face is input, and an image in which a wearable object is worn on the person's face can be artificially generated as a new teacher image.
As a result, without collecting a further large number of teacher images in which a wearable object is worn on a person's face, face images with a wearable object can be recognized simply by training an existing face recognition engine on the artificial teacher images generated by the present invention as augmentation.

前述した本発明の種々の実施形態について、本発明の技術思想及び見地の範囲の種々の変更、修正及び省略は、当業者によれば容易に行うことができる。前述の説明はあくまで例であって、何ら制約しようとするものではない。本発明は、特許請求の範囲及びその均等物として限定するものにのみ制約される。 Regarding the various embodiments of the present invention described above, various changes, modifications, and omissions within the scope of the technical idea and viewpoint of the present invention can be easily made by those skilled in the art. The above description is merely an example and is not intended to be limiting in any way. The invention is limited only by the claims and their equivalents.

１人工教師画像生成装置
１０教師画像蓄積部
１１顔領域検出部
１２特徴点検出部
１２１装着物判定部
１３０装着物画像蓄積部
１３装着物画像選択部
１４人工教師画像生成部
１５顔認識エンジン
１５１装着物無し用の顔認識エンジン
１５２装着物有り用の顔認識エンジン
２顔認識装置

1 Artificial teacher image generation device
10 Teacher image storage unit
11 Face area detection unit
12 Feature point detection unit
121 Wearable object determination unit
130 Wearable object image storage unit
13 Wearable object image selection unit
14 Artificial teacher image generation unit
15 Face recognition engine
151 Face recognition engine for faces without a wearable object
152 Face recognition engine for faces with a wearable object
2 Face recognition device

Claims

1. A program for causing a computer to function so as to receive a teacher image showing a person's face and artificially generate, as a new teacher image, an image in which a wearable object is worn on the person's face, the program causing the computer to function as:
wearable object image storage means for storing in advance a plurality of wearable object images with different orientations;
face area detection means for detecting a person's face area from the input teacher image;
feature point detection means for detecting feature points from the detected face area and detecting the orientation and inclination of the face from the feature points;
wearable object image selection means for selecting, from the wearable object image storage means, a wearable object image according to the orientation of the face; and
artificial teacher image generation means for generating an artificial teacher image in which the selected wearable object image is rotated according to the inclination of the face and superimposed on the person's face area of the teacher image.

2. The program according to claim 1, wherein the computer is caused to function such that the wearable object is a mask, glasses, goggles, or sunglasses.

3. The program according to claim 1 or 2, further causing the computer to function as wearable object determination means for estimating whether a wearable object is already worn over the feature points of the face area output from the feature point detection means, and for outputting only the face areas on which no wearable object is worn.

4. The program according to any one of the preceding claims, wherein the wearable object image selection means selects a plurality of different wearable object images having the same orientation, and
the artificial teacher image generation means causes the computer to function so as to generate a plurality of artificial teacher images in which each of the plurality of wearable object images is superimposed on the person's face area of the teacher image.

5. The program according to any one of claims 1 to 4, wherein the artificial teacher image generation means causes the computer to function so as to apply image processing to a single generated artificial teacher image such that illuminance, color, or contrast differs, thereby generating a plurality of artificial teacher images.

6. The program according to any one of claims 1 to 5, wherein, when superimposing the wearable object image on the person's face area of the teacher image, the artificial teacher image generation means causes the computer to function so as to cut away the part of the wearable object image that is not visible given the orientation and inclination of the face.

7. An artificial teacher image generation device that receives a teacher image showing a person's face and artificially generates, as a new teacher image, an image in which a wearable object is worn on the person's face, the device comprising:
wearable object image storage means for storing in advance a plurality of wearable object images with different orientations;
face area detection means for detecting a person's face area from the input teacher image;
feature point detection means for detecting feature points from the detected face area and detecting the orientation and inclination of the face from the feature points;
wearable object image selection means for selecting, from the wearable object image storage means, a wearable object image according to the orientation of the face; and
artificial teacher image generation means for generating an artificial teacher image in which the selected wearable object image is rotated according to the inclination of the face and superimposed on the person's face area of the teacher image.

8. An artificial teacher image generation method for a device that receives a teacher image showing a person's face and artificially generates, as a new teacher image, an image in which a wearable object is worn on the person's face,
the device having a wearable object image storage unit that stores in advance a plurality of wearable object images with different orientations,
the method comprising:
a first step of detecting a person's face area from the input teacher image;
a second step of detecting feature points from the detected face area and detecting the orientation and inclination of the face from the feature points;
a third step of selecting, from the wearable object image storage unit, a wearable object image according to the orientation of the face; and
a fourth step of generating an artificial teacher image in which the selected wearable object image is rotated according to the inclination of the face and superimposed on the person's face area of the teacher image.