JP7218769B2 - Image generation device, image generation method, and program

Info

Publication number: JP7218769B2
Application number: JP2021016375A
Authority: JP
Inventors: 昭裕早坂
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2021-02-04
Filing date: 2021-02-04
Publication date: 2023-02-07
Anticipated expiration: 2036-11-25
Also published as: JP2021073619A

Description

The present disclosure relates to a technique for generating a composite image.

There is a growing need for techniques that composite an image showing a person's face (hereinafter also referred to as a "face image") with another image.

For example, in face authentication technology, which identifies individuals using face images, one conceivable way to improve authentication performance is to prepare a large number of face images of each individual. However, it is difficult to collect images of the same person taken under various conditions, or images of the same person in a variety of appearances (for example, poses, clothing, and hairstyles). Also, to improve recognition (or authentication) performance for people wearing special (or rare) items such as ethnic costumes, it may be preferable to have face images of various people wearing those items. However, because few people wear such special items, it is extremely difficult to collect a large number of face images of various people wearing them. A technique that composites a face image with another image to obtain images in which individuals and worn items are combined in various ways can therefore be useful. That is, generating a wide variety of composite images and using them as training data is expected to improve the accuracy of face authentication.

Also, for example, generating an image that shows how a photographed person would look with a different hairstyle or with some accessory is convenient because nothing has to be changed in reality: the person can easily enjoy the result or find hairstyles and accessories that suit them.

Several methods have been proposed for generating a new image by compositing onto a person shown in an image.

Patent Document 1 discloses an image synthesis device that selects, from face images for synthesis prepared in advance, the one corresponding to the face orientation and expression of a photographed subject, and generates an image in which the subject's face portion is replaced with that face image for synthesis.

Patent Document 2 discloses a method of generating training data for crowd state recognition. In this method, for example, a device composites a background image containing no person with a person image, based on an operator's instructions, so that the result is training data with an appropriate label.

Patent Document 3 discloses a synthesis method in which image data of hairstyles, facial parts, worn items, and the like are prepared, and image data appropriately deformed for the input image are composited with it, generating an image in which the person in the input image is given a different hairstyle or the like.

Patent Document 4 discloses an image generation device that uses a three-dimensional face model and a plurality of light models to generate face images for various assumed angles and lighting environments.

A further document related to the present disclosure is Patent Document 5, which discloses a posture recognition device that recognizes the posture of an object.

Patent Document 1: JP 2010-86178 A
Patent Document 2: WO 2014/207991
Patent Document 3: JP H08-96111 A
Patent Document 4: JP 2016-062225 A
Patent Document 5: JP 2011-209116 A

When compositing a hairstyle, worn items, or the like onto a face image, there is a demand to composite one image onto the other without visible breakdown or unnaturalness. There is also a demand to preserve, as far as possible, the facial features of the person shown in the original face image.

Patent Document 1 is a technique for replacing the face portion of a subject with a face image for synthesis. Because the original face image for synthesis is deformed before compositing, the generated composite image does not preserve the facial features of the original image. In addition, because the technique is mainly intended for purposes such as privacy protection and is not required to produce a high-quality (natural) composite image, it does not disclose a technique for generating one.

The technique disclosed in Patent Document 2 composites a person image with a background image that contains no person. Patent Document 2 therefore does not disclose a technique for compositing a face image with worn items or the like without unnaturalness.

The synthesis method disclosed in Patent Document 3 involves manual editing. It is therefore difficult either to generate a high-quality composite image immediately after a face image is input or to generate a large number of composite images in a short time. Moreover, the only methods disclosed for deforming facial parts and the like to best fit the face are scaling, translation, and rotation. No processing procedure is disclosed for the case where a device performs the compositing automatically.

The synthesis method disclosed in Patent Document 4 is a technique for generating face images under changed lighting conditions and the like. It is not a technique for compositing the face portion of a face image with other portions.

One object of the present disclosure is to provide a device, a method, and the like that can generate, for an input face image, an image in which the portions other than the face are composited naturally without impairing the facial features contained in that face image. Here, "naturally" means with little sense of incongruity, that is, not unnatural.

An image generation device according to one aspect of the present disclosure includes: image transformation means for cutting out the face region of a second face image stored in advance so that it fits the face region of an input first face image, and for deforming the region of the second face image other than the face region, using outer peripheral points of a face image that are generated based on facial feature points contained in the first face image and facial feature points contained in the second face image; and image generation means for generating a third face image in which the face region of the first face image is composited with the region, other than the face region, of the deformed second face image.

An image generation method according to one aspect of the present disclosure includes: cutting out the face region of a second face image stored in advance so that it fits the face region of an input first face image, and deforming the region of the second face image other than the face region, using outer peripheral points of a face image that are generated based on facial feature points contained in the first face image and facial feature points contained in the second face image; and generating a third face image in which the face region of the first face image is composited with the region, other than the face region, of the deformed second face image.

A program according to one aspect of the present disclosure causes a computer to execute: image transformation processing that cuts out the face region of a second face image stored in advance so that it fits the face region of an input first face image, and deforms the region of the second face image other than the face region, using outer peripheral points of a face image that are generated based on facial feature points contained in the first face image and facial feature points contained in the second face image; and image generation processing that generates a third face image in which the face region of the first face image is composited with the region, other than the face region, of the deformed second face image.

According to the present disclosure, it is possible to generate, for an input face image, an image in which the portions other than the face are composited naturally without impairing the facial features contained in that face image.

Block diagram showing the configuration of the image generation device according to the first embodiment.
Block diagram showing the configuration of the image transformation unit according to the first embodiment.
Flowchart showing the flow of operation of the image generation device according to the first embodiment.
Diagram showing an example of feature points extracted from a target face image.
Diagram showing an example of a target face image with its face region identified, a mask, and the masked image generated by multiplying them.
Diagram showing the concept of the process that identifies the correspondence between feature points in the target face image and feature points in a material image.
Diagram showing the outer peripheral points of the target face image and the concept of the process that identifies the projected outer peripheral points corresponding to them.
Diagram showing the concept of the process that deforms a material image to generate a material image for synthesis.
Diagram showing the concept of the process that generates an inverted mask.
Diagram showing an example in which the portion other than the face region of the material image for synthesis is extracted from the material image for synthesis and the inverted mask.
Diagram showing the concept of generating a composite image by compositing the masked image with the portion other than the face region of the material image for synthesis.
Diagram showing the concept of deformation of a material image by the precise transformation unit according to Modification 3 of the first embodiment.
Diagram showing the concept of compositing by the image synthesis unit according to Modification 3 of the first embodiment.
Block diagram showing the configuration of the image generation device according to the second embodiment.
Flowchart showing the flow of operation of the image generation device according to the second embodiment.
Block diagram showing the configuration of an image generation device according to one embodiment.
Flowchart showing the flow of operation of the image generation device according to one embodiment.
Block diagram showing the configuration of a face matching device according to one embodiment.
Block diagram showing an example of the hardware constituting each unit of each embodiment.

Embodiments are described in detail below with reference to the drawings. However, the components described in the following embodiments are merely examples, and the technical scope of the present invention is not limited to them.

<<First Embodiment>>
First, the first embodiment is described.

[Description of Configuration]
FIG. 1 shows the configuration of an image generation device 11 according to the first embodiment. The image generation device 11 includes an input unit 110, a feature point detection unit 111, a face region extraction unit 112, a posture estimation unit 113, an image selection unit 114, an image transformation unit 115, an image synthesis unit 116, and a storage unit 117. As shown in FIG. 2, the image transformation unit 115 includes a parameter estimation unit 1151, a projected outer peripheral point generation unit 1152, and a precise transformation unit 1153.

The input unit 110 accepts an image showing a person's face as input; that is, it takes in such an image. Hereinafter, an image taken in by the image generation device 11 as input is also called a "target face image". The target face image may be acquired by capture with an imaging device such as a camera, or may be read from a storage medium, storage device, or the like that holds images showing a person's face. The target face image may be corrected as appropriate, for example by trimming.

The feature point detection unit 111 detects the face of the person shown in the target face image and the feature points of that face. A facial feature point is a point on the person's face (including the contour) that can represent a characteristic of that person. One method for detecting a face and facial feature points is, for example, the Viola-Jones method. However, the Viola-Jones method is only an example, and the feature point detection unit 111 may detect the face and its feature points using other known methods.

When the target face image shows more than one person, the feature point detection unit 111 may detect the facial feature points of every person or of only a specific person. The specific person may be, for example, the person whose face is largest or the person whose face has the highest luminance. The specific person may also be a person identified, by matching against a face image designated in advance, as the person in that designated face image. Hereinafter, a person whose facial feature points have been extracted is referred to as the "target person".

The feature point detection unit 111 extracts at least the feature points of the main parts of the face, such as the eyes, nose, and mouth. Feature points of a part are, for example, the end points of the part (such as the inner and outer corners of an eye) and the dividing points that split the span between end points on the part into P equal segments (P is a natural number). The feature point detection unit 111 may also detect feature points located outside these main parts on the face (for example, feature points on the facial contour or the eyebrows).
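
As an illustration of the dividing points mentioned above, the following minimal Python sketch computes the points that split the span between two end points into P equal segments. It assumes the part is approximated by a straight segment between its end points; the function name `dividing_points` and the sample coordinates are hypothetical and do not come from the disclosure.

```python
def dividing_points(start, end, p):
    """Return the P-1 interior points that split the segment start-end into P equal parts.

    Assumption: the part is approximated by a straight segment between its end points.
    """
    if p < 1:
        raise ValueError("p must be a natural number (>= 1)")
    (x0, y0), (x1, y1) = start, end
    return [(x0 + (x1 - x0) * k / p, y0 + (y1 - y0) * k / p) for k in range(1, p)]


# Hypothetical inner and outer eye corners in pixel coordinates.
inner_corner, outer_corner = (40.0, 60.0), (80.0, 64.0)
eye_points = [inner_corner] + dividing_points(inner_corner, outer_corner, 4) + [outer_corner]
```

With P = 4 the eye is described by five points: the two corners and three dividing points.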

The face region extraction unit 112 extracts the face region in the target face image. The face region of the target face image is a region of the target face image that contains at least the feature points detected by the feature point detection unit 111. It is, for example, the region inside the target person's face (possibly including the outline) that encompasses the main parts of the face.

The face region extraction unit 112 extracts the face region based on at least one of the feature points detected by the feature point detection unit 111. For example, it may extract as the face region the closed region formed by connecting those detected feature points that lie outside the main parts of the face. A specific example of the face region extraction process is given in [Description of Operation]. In the following description, the feature points on which the extraction of the face region is based are sometimes called "points that define the face region".

The face region extraction unit 112 may then generate a mask in which, for example, points inside the face region are assigned the mask value "1" and points in all other regions the mask value "0". By multiplying the pixel value of each pixel of the target face image by the mask value of the corresponding mask pixel, the face region extraction unit 112 may generate an image (hereinafter, "masked image") in which the pixel values outside the face region are zero and only the pixel values of the face region remain.
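
The mask generation and multiplication described above can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: it assumes a grayscale image stored as nested lists, a face region given as a polygon of "points that define the face region", and an even-odd ray casting test evaluated at pixel centres. All function names and sample values are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test: True if (x, y) lies inside the closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y0 > y) != (y1 > y):
            # x coordinate where the edge crosses that line
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def make_mask(width, height, polygon):
    """Mask value 1 for pixels whose centre is inside the face region, 0 elsewhere."""
    return [[1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
             for col in range(width)] for row in range(height)]


def apply_mask(image, mask):
    """Multiply each pixel value by the mask value at the same position."""
    return [[pixel * m for pixel, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


# Hypothetical 6x6 grayscale image and a square "face region" polygon.
image = [[100 + row * 6 + col for col in range(6)] for row in range(6)]
face_polygon = [(1, 1), (5, 1), (5, 5), (1, 5)]
mask = make_mask(6, 6, face_polygon)
masked_image = apply_mask(image, mask)
```

Pixels outside the polygon become zero in `masked_image`, while pixels inside keep their original values.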

Information on the extracted feature points may be stored, for example, in the storage unit 117, in such a way that each unit of the image generation device 11 can refer to it. The feature point information is, for example, information indicating the position of each feature point and which part of the face each feature point corresponds to.

The posture estimation unit 113 estimates the orientation of the target person's face (also called the "face posture"). The face posture is, for example, the direction in which the person's face points in the three-dimensional space captured by the face image. It can be described, for example, by three angles (pitch, yaw, and roll) measured relative to the frontal direction of the face image, that is, the direction of a face looking straight at the imaging device. The pitch angle is the rotation about the left-right axis, the yaw angle the rotation about the vertical axis, and the roll angle the rotation about the front-back axis. As another example, the face posture can be described by three-dimensional vectors. For instance, the posture estimation unit 113 may describe the normal direction of the face using three mutually orthogonal reference vectors, one of which is parallel to the frontal direction of the face image. Hereinafter, information that specifies the face posture is called posture information.

The posture estimation unit 113 may estimate the face posture using, for example, the method disclosed in Patent Document 5 (cited above). However, the method disclosed in that document is only an example, and the posture estimation unit 113 may estimate the posture of the person's face using other known methods.

The storage unit 117 stores material images. A material image is a face image containing a person's face. It serves as the material for everything other than the face portion in the composite image generated by the image generation device 11. For example, a user of the image generation device 11 stores in the storage unit 117, as material images, face images of people who have the appearance or hairstyle, or who wear the items, that the user wants to use for compositing. Each stored material image is associated with information on the feature points of the face it contains and with posture information. The feature point information and posture information may be detected by the image generation device 11 or the like. At least one of the facial feature points in a material image is associated with a feature point that the feature point detection unit 111 detects in the target face image. For example, every facial feature point in the material image may be associated with a corresponding feature point detected in the target face image. The facial feature points in the material image may, for example, be detected by the same method that the feature point detection unit 111 uses.

The image selection unit 114 selects, from the material images stored in the storage unit 117, a material image suitable for compositing. A material image suitable for compositing is one onto which the target person's face can easily be composited without looking unnatural.

Such a material image is, for example, one showing a face in a posture close to that of the target person's face. In other words, a material image that is easy to composite is one associated with posture information close to the posture information of the target face image. Two pieces of posture information are "close" when, for example, the values of the parameters describing them are close to each other.

That is, the image selection unit 114 selects a material image suitable for compositing based on the face postures in the material images and the face posture of the target person. For example, when posture information is described by the pitch, yaw, and roll angles of the face, the image selection unit 114 may compute, over this triple of values, the Euclidean distance between the target person's posture information and the posture information associated with each material image. The Euclidean distance is the square root of the sum of the squared per-parameter differences. The image selection unit 114 may then select a material image whose associated posture information has a small Euclidean distance to the target person's posture information as a material image showing a face in a posture close to the target person's. Alternatively, when posture information is expressed as a three-dimensional vector, the image selection unit 114 may compute the inner product of the unit vector representing the target person's posture and the unit vector representing the posture associated with each material image. It may then select a material image whose associated vector has a large inner product with the vector representing the target person's face posture as a material image showing a face in a posture close to the target person's.
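
The two closeness measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: posture information is held as a (pitch, yaw, roll) tuple in degrees or as a three-dimensional vector, and the dictionary of material images, its keys, and all function names are hypothetical.

```python
import math


def pose_distance(pose_a, pose_b):
    """Euclidean distance between two (pitch, yaw, roll) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pose_a, pose_b)))


def pose_similarity(vec_a, vec_b):
    """Inner product of two pose vectors after normalizing each to unit length."""
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    return sum(a * b for a, b in zip(vec_a, vec_b)) / (norm_a * norm_b)


def select_closest(target_pose, materials):
    """Return the name of the material image whose pose is nearest to the target."""
    return min(materials, key=lambda name: pose_distance(target_pose, materials[name]))


# Hypothetical material images with (pitch, yaw, roll) in degrees.
materials = {
    "material_a": (0.0, 30.0, 0.0),
    "material_b": (5.0, -10.0, 2.0),
    "material_c": (-3.0, 12.0, 1.0),
}
target_pose = (-2.0, 10.0, 0.0)
best = select_closest(target_pose, materials)
```

A smaller `pose_distance` or a larger `pose_similarity` both indicate a closer posture; which measure applies depends on how the posture information is represented.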

The material images stored in the storage unit 117 may be stored with posture information associated with them in advance.

The image selection unit 114 may select only the single material image whose associated posture information is closest to that of the target person's face. Alternatively, it may identify N face images (N is an integer of 2 or more) whose posture information is within a predetermined criterion of closeness to the target person's, and select M of them (M is an integer from 1 to N) at random or according to a specific criterion. The image selection unit 114 may also decide on the material image after a step that narrows down suitable material images based on skin color, luminance, and the like. For example, it may compare the luminance distribution within the face region of the target face image with that within the face region of each material image, and select a material image whose face-region luminance distribution is similar to that of the target face image. This is expected to yield a more natural composite image.
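
The narrowing steps above might look like the following sketch. It makes several assumptions that are not in the disclosure: the "predetermined criterion" is a maximum Euclidean distance between pose angles, the luminance distribution is a coarse normalized histogram, and histogram intersection is used as the similarity measure. All names and values are hypothetical.

```python
import math
import random


def candidates_within(target_pose, materials, max_distance):
    """Names of material images whose pose distance to the target is within the limit."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sorted(name for name, pose in materials.items()
                  if dist(target_pose, pose) <= max_distance)


def luminance_histogram(pixels, bins=4, max_value=256):
    """Normalized histogram of luminance values in the range [0, max_value)."""
    counts = [0] * bins
    for value in pixels:
        counts[min(int(value * bins / max_value), bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


materials = {"m1": (0.0, 8.0, 0.0), "m2": (1.0, 12.0, 0.0), "m3": (0.0, 45.0, 0.0)}
candidates = candidates_within((0.0, 10.0, 0.0), materials, max_distance=5.0)  # N images
rng = random.Random(0)                 # seeded only to make the sketch repeatable
chosen = rng.sample(candidates, k=1)   # M images picked at random, 1 <= M <= N
```

In practice the histograms would be computed over the face-region pixels of the target face image and of each candidate, and candidates with a higher intersection score would be preferred.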

When the image selection unit 114 selects two or more material images, the composite image generation process described below may be performed for each selected material image.

The image transformation unit 115 deforms the material image selected by the image selection unit 114 based on the feature point information of that material image and on the target face image and its feature point information. The image produced by this deformation is hereinafter called the material image for synthesis. The material image for synthesis is the image used to composite the selected material image with the target face image.

The image transformation unit 115 deforms the material image selected by the image selection unit 114 into an image suitable for compositing, using the procedure below. The parameter estimation unit 1151, the projected outer peripheral point generation unit 1152, and the precise transformation unit 1153 are the units contained in the image transformation unit 115.

First, based on the feature point information of the target person's face and that of the face in the selected material image, the parameter estimation unit 1151 estimates geometric deformation parameters that relate the coordinate system of the target face image to the coordinate system of the material image. The geometric deformation performed here need not be exact. For example, the parameter estimation unit 1151 may obtain geometric deformation parameters with about the same degrees of freedom as an affine transformation. It obtains, for example, parameters such that the positions to which the target person's facial feature points are projected are each as close as possible to the positions of the associated feature points in the material image. To find such parameters, the parameter estimation unit 1151 may use, for example, the least squares method.
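
A least-squares estimate of affine parameters from corresponding feature points can be sketched as below. This is a minimal illustration that assumes a six-parameter affine model and solves the normal equations with a small Gaussian elimination; the function names, parameter ordering (a, b, c, d, e, f), and sample points are hypothetical.

```python
def solve_3x3(m, v):
    """Solve m @ p = v for a 3x3 system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("feature points are degenerate (for example, collinear)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def estimate_affine(src_points, dst_points):
    """Least-squares affine parameters (a, b, c, d, e, f) such that
    x' ~ a*x + b*y + c and y' ~ d*x + e*y + f for each src (x, y) -> dst (x', y')."""
    # Normal equations: (A^T A) p = A^T t, where each row of A is [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    atx = [0.0] * 3
    aty = [0.0] * 3
    for (x, y), (xd, yd) in zip(src_points, dst_points):
        row = (x, y, 1.0)
        for i in range(3):
            atx[i] += row[i] * xd
            aty[i] += row[i] * yd
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return tuple(solve_3x3(ata, atx) + solve_3x3(ata, aty))


# Hypothetical feature points: the material face is twice as large and shifted by (5, 3).
target_points = [(0, 0), (10, 0), (0, 10), (10, 10)]
material_points = [(5, 3), (25, 3), (5, 23), (25, 23)]
params = estimate_affine(target_points, material_points)
```

Here `params` comes out close to (2, 0, 5, 0, 2, 3). With noisy correspondences the same code returns the parameters that minimize the sum of squared position errors.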

The feature points of the target person's face and of the face in the material image that are used to obtain the geometric transformation parameters may be all of the facial feature points or only some of them (for example, only the feature points of the main parts).

Note that the parameter estimation unit 1151 may estimate parameters of a geometric transformation with more degrees of freedom than an affine transformation, but it is desirable that the transformation based on the estimated parameters does not impair the features of the target person's face.

Then, the projected peripheral point generation unit 1152 uses the geometric transformation parameters estimated by the parameter estimation unit 1151 to project points on the periphery of the target face image onto the material image. That is, the projected peripheral point generation unit 1152 identifies the points on the material image that correspond to the points on the periphery of the target face image. In this embodiment, the points on the material image identified by this projection are called projected peripheral points. The points on the periphery of the target face image that are projected onto the material image (hereinafter, "peripheral points") are, for example, when the target face image is rectangular, a plurality of points including the four vertices of the rectangle.
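
The projection of peripheral points can be illustrated with a short sketch. It assumes a rectangular target face image, affine parameters laid out as ((a, b, tx), (c, d, ty)), and eight peripheral points (the four corners plus the four edge midpoints). The choice of edge midpoints and the function names are assumptions made for illustration; the text only requires a set of points that includes the four vertices.

```python
def peripheral_points(width, height):
    """Four corners and four edge midpoints of a width x height image."""
    w, h = width - 1, height - 1
    return [(0, 0), (w / 2, 0), (w, 0), (w, h / 2),
            (w, h), (w / 2, h), (0, h), (0, h / 2)]


def project_points(params, pts):
    """Apply affine parameters ((a, b, tx), (c, d, ty)) to each point."""
    (a, b, tx), (c, d, ty) = params
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in pts]
```

The list returned by `project_points` plays the role of the projected peripheral points on the material image.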

However, an identified projected peripheral point may lie outside the material image. This can happen, for example, when the position of the target person's face differs significantly from the position of the face in the material image. It can also happen when the face in the material image is larger than the target person's face, because the estimated geometric transformation parameters then include a transformation that enlarges the image. In such cases, the projected peripheral point generation unit 1152 may trim the target face image. For example, it may trim the target face image so that the peripheral points of the trimmed target face image are projected onto the material image (possibly onto its boundary). When the target face image has been trimmed, the image transformation unit 115 performs the processing of the parameter estimation unit 1151 and the projected peripheral point generation unit 1152 again, and the trimmed target face image is used in place of the original target face image in the subsequent processing.

The projection by the projected peripheral point generation unit 1152 described above makes it possible to identify the region on the material image that corresponds to the region of the entire target face image. That is, the line formed by connecting the projected peripheral points corresponds to the outline (that is, the outer boundary, or frame) of the target face image.

Once the projected peripheral points have been identified, the precise transformation unit 1153 transforms the region of the material image corresponding to the entire target face image so that the face region of the material image matches the face region of the target face image.

The face region of the material image is the region in the material image that corresponds to the face region of the target face image. That is, it is the region extracted based on the feature points in the material image that are associated with the feature points defining the face region of the target face image (the feature points on which the extraction of the face region was based). For example, when the face region of the target face image was extracted by connecting feature points located outside the main parts, the face region of the material image is the region formed by connecting the feature points on the material image that are associated with those feature points. In this case, as a result of the transformation by the precise transformation unit 1153, the positions of the feature points in the transformed material image that are associated with the feature points defining the face region of the target face image coincide with the positions of those defining feature points.

In this embodiment, the region to be transformed by the precise transformation unit 1153 is the region of the material image that corresponds to the entire target face image. That is, the precise transformation unit 1153, for example, cuts out the region of the material image enclosed by the line connecting the projected peripheral points, and transforms the cut-out region. The outline of the cut-out region corresponds, on the material image, to the outline of the target face image.

The precise transformation unit 1153 transforms the cut-out region so that, for example, the region enclosed between the outline of the face region of the material image and the outline of the cut-out region coincides with the region enclosed between the outline of the face region of the target face image and the outline of the target face image.

The transformation by the precise transformation unit 1153 may be one in which not only the feature points in the material image associated with the feature points defining the face region of the target face image, but also the feature points in the material image associated with other feature points of the target face image, coincide with the positions of their associated feature points in the target face image. The transformation may or may not also be based on feature points inside the face region of the material image. The image inside the face region of the material image need not be transformed by the precise transformation unit 1153, because it is replaced with the face of the target face image by the synthesis processing of the image synthesizing unit 116, described later.

To achieve this transformation, the precise transformation unit 1153 performs, for example, a nonlinear geometric transformation with a high degree of freedom. For example, it uses a method that applies an affine transformation to each triangular patch formed by connecting feature points and peripheral points, or the thin plate spline method.
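
As an illustration of the per-patch affine approach, the sketch below maps a point from a source triangle mesh to a destination mesh using barycentric coordinates, which is equivalent to applying the affine transformation defined by each triangle's three vertex pairs. How the triangles are formed from the feature points and peripheral points (for example, by a Delaunay triangulation) is not specified in the text and is left to the caller; the function names are hypothetical.

```python
def barycentric(p, tri):
    """Barycentric coordinates of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    l2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return l1, l2, 1.0 - l1 - l2


def warp_point(p, src_tris, dst_tris, eps=1e-9):
    """Map p through the affine transform of the source triangle containing it.

    src_tris and dst_tris are parallel lists of triangles (three (x, y)
    vertices each). Returns None when p lies in no source triangle.
    """
    for s, d in zip(src_tris, dst_tris):
        weights = barycentric(p, s)
        if min(weights) >= -eps:
            x = sum(w * v[0] for w, v in zip(weights, d))
            y = sum(w * v[1] for w, v in zip(weights, d))
            return (x, y)
    return None
```

Because each triangle is transformed affinely and adjacent triangles share vertices, the overall mapping is continuous but nonlinear across the mesh.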

However, the precise transformation unit 1153 need not apply a nonlinear geometric transformation to the entire region to be transformed. It suffices that the region subjected to the nonlinear geometric transformation includes at least the outline of the face region (possibly as its boundary). The precise transformation unit 1153 may apply a linear geometric transformation to partial regions that do not include the outline of the face region.

The material image for synthesis is generated by the above processing of the image transformation unit 115.

The image synthesizing unit 116 generates a composite image. Specifically, it generates an image in which the portion other than the face region extracted by the face region extraction unit 112 is replaced with the material image for synthesis generated by the image transformation unit 115.

When generating the composite image, the image synthesizing unit 116 may use the mask if the face region extraction unit 112 has generated one. That is, the image synthesizing unit 116 inverts the mask value set for each pixel of the mask (assigning "1" where the value was "0" and "0" where it was "1") to generate an inverted mask that extracts only the portion other than the face region. By multiplying the inverted mask and the material image for synthesis pixel by pixel, the image synthesizing unit 116 can extract the portion of the material image for synthesis other than the face region. The image synthesizing unit 116 then combines the extracted portion with the face region of the target face image (that is, the masked image generated by the face region extraction unit 112). As described above, the material image for synthesis has been transformed so that its face region coincides with the face region of the target face image, so the masked image of the target face image and the non-face portion of the material image for synthesis can be combined simply by adding them pixel by pixel.
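
The mask inversion, pixel-wise multiplication, and pixel-wise addition described above can be sketched as follows. The sketch assumes single-channel images stored as nested lists of equal size and a binary mask of 0s and 1s; these representations and the function names are illustrative assumptions.

```python
def invert_mask(mask):
    """Swap 0 and 1 in a binary mask."""
    return [[1 - m for m in row] for row in mask]


def apply_mask(image, mask):
    """Multiply an image by a mask, pixel by pixel."""
    return [[p * m for p, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]


def add_images(a, b):
    """Add two images of the same size, pixel by pixel."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def composite(target, material, mask):
    """Face pixels come from target, all other pixels from material."""
    masked_face = apply_mask(target, mask)                 # masked image
    background = apply_mask(material, invert_mask(mask))   # non-face portion
    return add_images(masked_face, background)
```

The simple addition is valid only because the two masked images are non-zero on complementary sets of pixels.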

However, this synthesis may produce an unnatural edge at the boundary between the two combined images. To address this, the image synthesizing unit 116 may adjust the hue, saturation, and brightness of one or both of the images to be added, or may process the colors of pixels near the boundary. Near the boundary, for example, the image synthesizing unit 116 may blend the two images to be added by taking a weighted average based on the mask values. The image synthesizing unit 116 may also use a technique such as Poisson Image Editing.
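
One possible reading of the weighted blending near the boundary is sketched below: the binary mask is smoothed into fractional weights, and the two images are mixed with those weights. The box-filter smoothing, the `radius` parameter, and the function names are assumptions made for illustration; the text does not specify how the weights are obtained.

```python
def soften_mask(mask, radius=1):
    """Box-filter a binary mask so that values ramp between 0 and 1 at edges."""
    h, w = len(mask), len(mask[0])
    soft = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [mask[j][i] for j in ys for i in xs]
            row.append(sum(vals) / len(vals))
        soft.append(row)
    return soft


def blend(target, material, weights):
    """Per-pixel weighted average: weight w for target, 1 - w for material."""
    return [[w * t + (1.0 - w) * m for t, m, w in zip(tr, mr, wr)]
            for tr, mr, wr in zip(target, material, weights)]
```

Far from the boundary the weights stay at 0 or 1, so the result matches the simple addition of the two masked images; only pixels within `radius` of the boundary are mixed.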

[Description of Operation]
Next, an example of the operation of the image generation device 11 according to the first embodiment is described using a specific example. FIG. 3 is a flowchart showing the flow of processing of the image generation device 11. The flow shown in FIG. 3 is only an example, and the steps need not be performed in the order shown in FIG. 3.

In step S31, the feature point detection unit 111 detects a face and facial feature points in the target face image captured by the input unit 110. FIG. 4(a) is an example of a target face image. In this example, the feature point detection unit 111 is assumed to detect a total of 18 feature points at the locations indicated by small white circles in FIG. 4(b), namely the outer corners of both eyes, the inner corners of both eyes, the point below the nose, the corners of the mouth, the eyebrows, and the contour (face line).

In step S32, the face region extraction unit 112 extracts the face region of the input image based on the facial feature points detected in step S31. For example, the face region extraction unit 112 extracts, as the face region, the closed region formed by connecting with line segments the feature points on the facial contour and the feature points of the eyebrows, among the facial feature points detected in step S31, so as to enclose the eyes, nose, and mouth (FIG. 5(a)).

The face region extraction unit 112 may extract the face region by a method other than the above, provided that the method can extract a region containing the main parts of the face based on the feature points. For example, the face region extraction unit 112 may take as control points the end points obtained by extending, by a predetermined length, the line segments connecting a feature point on the nose to the feature points at the corners of the eyes and mouth, and use as the face region the region enclosed by a smooth curve defined by those control points (for example, an interpolating spline curve).

The face region extraction unit 112 then generates a mask for extracting only the face region of the target person. For example, it generates a binary mask in which the mask value "1" is assigned to points inside the face region of the target face image and "0" to the remaining region (FIG. 5(b)).

The face region extraction unit 112 then uses the mask to generate a masked image. For example, it multiplies each pixel of the target face image by the corresponding pixel of the mask to generate a masked image in which only the pixels of the face region are extracted (FIG. 5(c)).
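
The generation of the binary mask and the masked image can be sketched as follows, assuming that the face region is given as a closed polygon of (x, y) feature points and that the image is a single-channel nested list. The even-odd ray-casting test and the function names are illustrative choices, not taken from the disclosure.

```python
def point_in_polygon(x, y, poly):
    """Even-odd rule: count polygon edges crossed by a ray going right."""
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def face_mask(width, height, poly):
    """Binary mask: 1 for pixels inside the polygon, 0 elsewhere."""
    return [[1 if point_in_polygon(x, y, poly) else 0 for x in range(width)]
            for y in range(height)]


def masked_image(image, mask):
    """Keep face-region pixels and zero out everything else."""
    return [[p * m for p, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]
```

For color images the same multiplication would be applied to each channel.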

In step S33, the posture estimation unit 113 estimates the posture of the target person's face. For example, the posture estimation unit 113 estimates the posture of the face using a method such as the one disclosed in Patent Document 5. The posture estimation unit 113 generates posture information based on the estimated posture.

In step S34, the image selection unit 114 selects a material image suitable for synthesis from the material images stored in the storage unit 117. In this specific example, the image selection unit 114 selects the single material image that minimizes the Euclidean distance between the posture information of the target face image and the posture information associated with the material image, that is, the sum of the squared differences of the angle values that determine the posture of the face.
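
The selection rule of this specific example can be sketched as follows. The sketch assumes that posture information is a tuple of angles such as (yaw, pitch, roll) in degrees and that each stored material image is represented by a (name, posture) pair; both representations and the function names are hypothetical.

```python
def pose_distance_sq(pose_a, pose_b):
    """Sum of squared differences between corresponding angle values."""
    return sum((a - b) ** 2 for a, b in zip(pose_a, pose_b))


def select_material(target_pose, materials):
    """Return the (name, pose) pair whose pose is closest to target_pose."""
    return min(materials,
               key=lambda item: pose_distance_sq(target_pose, item[1]))
```

Minimizing the sum of squares selects the same image as minimizing the Euclidean distance itself, since the square root is monotonic.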

It is assumed that the feature points in the selected material image that are associated with the feature points of the target face image can each be identified. For the subsequent processing to be executable, it suffices that at least the feature points of the material image associated with the feature points defining the face region of the target face image can be identified.

In step S35, the image transformation unit 115 transforms the material image selected in step S34 into an image suitable for synthesis. As a specific example, the units within the image transformation unit 115 perform the following processing.

First, the parameter estimation unit 1151 estimates geometric transformation parameters that associate the coordinate system of the target face image with that of the material image, based on information on the facial feature points of the target face image and of the material image. For example, the parameter estimation unit 1151 estimates affine transformation parameters by the least squares method, by comparing the positions of the feature points at the eyes, nose, and mouth of the face in the target face image with the positions of the associated feature points in the material image (FIG. 6).

Then, the projected peripheral point generation unit 1152 uses the estimated affine transformation parameters to project points on the periphery of the target face image (eight points in the example of FIG. 7) onto the material image. The points identified on the material image by this projection are the projected peripheral points.

Then, the precise transformation unit 1153 transforms the region of the material image corresponding to the entire target face image so that the face region of the material image matches the face region of the target face image. The region of the material image corresponding to the entire target face image is the region enclosed by the line formed by connecting the projected peripheral points, and the precise transformation unit 1153 cuts it out, for example, along that line. It then transforms the cut-out region by the thin plate spline method so that the shape of the region enclosed between the outline of the face region and the outline of the cut-out region (the hatched region in FIG. 8(a)) becomes the shape of the region enclosed between the outline of the face region of the target face image and the outline of the target face image (the hatched region in FIG. 8(b)). In doing so, the precise transformation unit 1153 may refer to the transformation parameters estimated by the parameter estimation unit 1151. For example, it may first transform the region of the material image corresponding to the entire target face image using the parameters of the inverse of the transformation estimated by the parameter estimation unit 1151, and then perform the precise transformation.
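
A thin plate spline mapping between two sets of control points can be sketched as follows. This is a generic textbook formulation with kernel U(r) = r^2 log r and no regularization, not necessarily the form used in the embodiment. In this setting, `src_pts` would hold the material image's face-region feature points and projected peripheral points, and `dst_pts` the associated feature points and peripheral points of the target face image. All function names are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def tps_kernel(p, q):
    """U(r) = r^2 log r, written in terms of r^2; U(0) is defined as 0."""
    r2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return 0.0 if r2 == 0 else 0.5 * r2 * math.log(r2)


def tps_fit(src_pts, dst_pts):
    """Fit one spline per output coordinate; returns two coefficient lists."""
    n = len(src_pts)
    size = n + 3
    system = [[0.0] * size for _ in range(size)]
    for i, p in enumerate(src_pts):
        for j, q in enumerate(src_pts):
            system[i][j] = tps_kernel(p, q)
        for k, val in enumerate((1.0, p[0], p[1])):
            system[i][n + k] = val   # affine part
            system[n + k][i] = val   # side conditions on the weights
    coeffs = []
    for axis in range(2):
        rhs = [d[axis] for d in dst_pts] + [0.0, 0.0, 0.0]
        coeffs.append(solve_linear(system, rhs))
    return coeffs


def tps_apply(src_pts, coeffs, pt):
    """Evaluate the fitted mapping at an arbitrary point."""
    n = len(src_pts)
    out = []
    for c in coeffs:
        val = c[n] + c[n + 1] * pt[0] + c[n + 2] * pt[1]
        val += sum(c[i] * tps_kernel(pt, p) for i, p in enumerate(src_pts))
        out.append(val)
    return tuple(out)
```

The fitted mapping sends every control point in `src_pts` exactly to its counterpart in `dst_pts` and interpolates smoothly in between. To resample an image, one would typically fit the mapping in the opposite direction and look up a source pixel for each output pixel.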

Through this processing, the image transformation unit 115 generates a material image for synthesis whose face region coincides with the face region of the target face image (FIG. 8(c)).

In step S36, the image synthesizing unit 116 combines the masked image generated in step S32 with the material image for synthesis generated in step S35 to generate a new composite image. The image synthesizing unit 116 first inverts the mask values of the mask generated in step S32 (assigning 1 where the value was 0 and 0 where it was 1) to generate an inverted mask that extracts the region other than the face (FIG. 9). Next, it multiplies the material image for synthesis generated in step S35 by the inverted mask to extract the region of the material image for synthesis other than the face (FIG. 10). It then combines the extracted region with the masked image generated in step S32 using the Poisson Image Editing method (FIG. 11).

Through the above flow of processing, a composite image is generated in which the target person appears to have the appearance or hairstyle shown in the material image, or to be wearing the accessories shown in it.

[Effects]
The image generation device 11 according to the first embodiment can replace the area around the face with a different texture while preserving the facial features of the target person shown in the target face image.

The image generation device 11 does not transform the face region extracted from the target face image, so the features of the target person are preserved. Meanwhile, because the image transformation unit 115 precisely transforms the material image to fit the target face image, the image generated by the synthesis looks natural. Because the image selection unit 114 selects a material image associated with posture information close to that of the face in the target face image, an even more natural composite image is generated.

The generated composite image is thus an image synthesized so that it looks natural and the facial features are not impaired. The generated composite image is therefore, for example, highly reliable as training data for face recognition and face authentication.

By storing a plurality of material images in the storage unit 117, the image generation device 11 can automatically perform everything from selection of a material image to transformation and synthesis. In other words, the image generation device 11 can quickly generate a variety of composite images without manual effort.

By generating a large number of varied composite images, for example, a device that performs face matching using those composite images as training data can perform accurate matching.

In addition, because an image in which various textures are naturally combined with the person in the input face image is generated immediately, it is easy, for example, to consider hairstyles and outfits that suit the person while taking the person's features into account.

<Modification>
A material image need not contain the face of a specific person. That is, no face need appear in the face portion of the material image (the portion replaced with the face of the target face image). Instead, as long as information indicating the positions of the main parts of the face is associated with the material image, each process of the image generation device 11 can be executed.

<Modification 2>
Further, for example, when calculating the closeness between the posture information of the target person and the posture information associated with a material image, the image selection unit 124 need not take into account the rotation angle in the plane parallel to the image (that is, the roll angle). In other words, the image selection unit 124 may calculate the closeness of the two pieces of posture information based only on the yaw angle and the pitch angle.

If the geometric transformation parameters estimated by the parameter estimation unit 1151 are parameters of a transformation that includes rotation, such as affine transformation parameters, a difference in roll angle between the selected material image and the target face image does not affect the quality of the composite image, because such a difference is accounted for when the coordinate system of the material image is associated with that of the target face image.

If the geometric transformation parameters estimated by the parameter estimation unit 1151 do not account for rotation, the image generation device 11 may use a rotated version of the material image as a new material image. For example, the image selection unit 124 rotates the selected material image based on the difference between the roll angles of the material image and the target face image so that the two roll angles match. The image selection unit 124 then sends the image newly generated by the rotation to the image transformation unit 115 as the material image, and this new material image is used in the processing from step S35 onward. When the material image has been rotated, the image selection unit 124 also corrects the feature point information in the target face image. That is, the image selection unit 124 also rotates the coordinates of the feature points using the rotation parameter used to rotate the target face image, and updates the feature point position information.
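
Updating feature point coordinates with the same rotation that was applied to an image can be sketched as follows. The sketch assumes rotation about a given center by an angle in degrees, using the standard rotation matrix; in image coordinates with the y axis pointing down, a positive angle therefore appears clockwise. The function name and the choice of center are illustrative assumptions.

```python
import math


def rotate_points(points, angle_deg, center):
    """Rotate (x, y) points about center by angle_deg degrees."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    cx, cy = center
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy)) for x, y in points]
```

The angle passed in would be the roll angle difference between the two faces, and the center would be whatever point the image rotation itself used (for example, the image center).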

With this modification, a material image containing a face whose roll angle differs from that of the target person's face can also be a candidate for synthesis. That is, the number of selectable material images is not limited by closeness in roll angle. More material images therefore become usable as material images with a posture close to that of the target person.

<Modification 3>
In the above embodiment, the image synthesizing unit 116 generates an image in which the portion other than the face region extracted by the face region extraction unit 112 is replaced with the material image for synthesis generated by the image transformation unit 115. As a modification, the image synthesizing unit 116 may generate an image in which the face region portion of the material image is replaced with the face region of the target face image. That is, the full extent of the material image may be used for synthesis. This form is described below as Modification 3.

In Modification 3, the processing of the precise transformation unit 1153 in step S35 and the processing of the image synthesizing unit 116 in step S36 differ from the processing already described, as follows.

The precise transformation unit 1153 identifies the position of the face region of the target face image when it is projected onto the material image based on the geometric transformation parameters estimated by the parameter estimation unit 1151. Here, the geometric transformation parameters estimated by the parameter estimation unit 1151 are assumed to be parameters of a transformation that, like an affine transformation, introduces no distortion (that is, one that can be undone by a linear geometric transformation alone). The precise transformation unit 1153 then transforms the material image so that the feature points in the material image associated with the feature points defining the face region of the target face image fit the face region whose position was identified. That is, the precise transformation unit 1153 transforms part or all of the material image so that the face region of the target face image can be combined naturally while retaining its features.

For example, the precise transformation unit 1153 transforms the material image so that the shape of the face region of the material image becomes the shape of the region formed by projecting the face region of the target face image onto the material image using the geometric transformation parameters estimated by the parameter estimation unit 1151 (FIG. 12). In the example of FIG. 12, the image in FIG. 12(a) is the material image before transformation and the image in FIG. 12(c) is the material image after transformation. The face region of the material image before transformation has a shape into which the face region of the target face image (FIG. 12(b)) cannot be fitted without nonlinear transformation. In contrast, the face region of the material image after transformation has a shape into which the face region of the target face image can be fitted through linear geometric transformation alone.

精密変形部１１５３による変形の対象となる領域は、素材画像の全体でもよいし、少なくとも顔の特徴点をすべて含む領域であってもよい。たとえば、精密変形部１１５３による変形の対象となる領域は、素材画像における投影外周点に囲まれる領域、すなわち、対象顔画像全体の領域に相当する領域でもよい。 The region to be deformed by the precision deformation unit 1153 may be the entire material image, or may be a region including at least all the feature points of the face. For example, the area to be transformed by the precise transformation unit 1153 may be an area surrounded by the projection outer peripheral points in the material image, that is, an area corresponding to the entire target face image.

このような変形を経て生成した画像を、合成用素材画像とする。これにより、対象顔画像の顔領域は、幾何変形パラメータによる変形のみを経て、合成用素材画像に自然にはめ込まれることが可能となる。すなわち、対象顔画像の顔領域は、合成用素材画像に非線形な変形なしに当てはめられることが可能である。 An image generated through such deformation is used as a composite material image. As a result, the face area of the target face image can be naturally fitted into the synthesis material image only through deformation using the geometric deformation parameters. That is, the face area of the target face image can be applied to the material image for synthesis without non-linear deformation.

In step S36, the image synthesizing unit 116 combines the portion of the material image for synthesis other than the face region (FIG. 13(a)) with the face region of the target face image (FIG. 13(b)). In doing so, the image synthesizing unit 116 geometrically transforms the target face image based on the geometric transformation parameters estimated by the parameter estimation unit 1151 and combines it with the material image for synthesis. Because the face region of the material image for synthesis has been deformed so that the face region of the geometrically transformed target face image fits it, the synthesis is straightforward. This synthesis produces a composite image in which the face of the target person is combined naturally with the material image (FIG. 13(c)).
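
The synthesis in step S36 can be pictured as a per-pixel selection. The sketch below assumes that the target face image has already been geometrically transformed into the coordinate system of the material image for synthesis and that a binary mask marks the face region. The function name `composite` and the list-of-rows image representation are hypothetical and chosen only to keep the example self-contained.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values (assumed representation)


def composite(material: Image, face: Image, mask: Image) -> Image:
    """Take the face pixel where the mask is 1 and the material pixel elsewhere."""
    return [
        [f if m else s for s, f, m in zip(mat_row, face_row, mask_row)]
        for mat_row, face_row, mask_row in zip(material, face, mask)
    ]


# Hypothetical 3x3 images: the mask selects only the center pixel as face region.
material = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
face = [[9, 9, 9], [9, 9, 9], [9, 9, 9]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
result = composite(material, face, mask)
```

Only the masked pixels come from the target face image, so every pixel outside the face region is taken unchanged from the material image for synthesis.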

With the processing described above, a composite image that uses the whole of the selected material image can be created. In this composite image, the face of the target person has undergone no more than an affine-level geometric transformation, so the individual's features are less likely to be lost than when a non-linear deformation is applied. In particular, if the geometric transformation applied to the target person's face is a combination of rotation and enlargement or reduction that leaves the aspect ratio unchanged, the individual's features are not impaired.
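
The last point can be checked numerically. The sketch below, which uses hypothetical landmark coordinates and helper names, compares a ratio of distances between facial feature points before and after two transformations: a rotation combined with uniform scaling, and a stretch that changes the aspect ratio.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def similarity(scale: float, angle_rad: float) -> Affine:
    """Rotation by angle_rad combined with uniform scaling (aspect ratio kept)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((scale * c, -scale * s, 0.0), (scale * s, scale * c, 0.0))


def apply(params: Affine, p: Point) -> Point:
    (a, b, tx), (c, d, ty) = params
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


def eye_to_mouth_ratio(left_eye: Point, right_eye: Point, mouth: Point) -> float:
    """Eye-to-eye distance divided by left-eye-to-mouth distance."""
    return math.dist(left_eye, right_eye) / math.dist(left_eye, mouth)


# Hypothetical landmark positions.
landmarks = [(30.0, 40.0), (70.0, 40.0), (50.0, 80.0)]
before = eye_to_mouth_ratio(*landmarks)

sim = similarity(1.5, math.radians(30))
after_similarity = eye_to_mouth_ratio(*(apply(sim, p) for p in landmarks))

stretch: Affine = ((2.0, 0.0, 0.0), (0.0, 1.0, 0.0))  # widens the face only
after_stretch = eye_to_mouth_ratio(*(apply(stretch, p) for p in landmarks))
```

Under the rotation with uniform scaling the ratio is unchanged up to floating-point error, whereas the horizontal stretch changes it, which is one way to see why the former leaves facial proportions intact.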

As a further modification of Modification 3, in step S35 the precise transformation unit 1153 may geometrically transform the entire material image based on the geometric transformation parameters estimated by the parameter estimation unit 1151. If the precise transformation unit 1153 then deforms the geometrically transformed material image to generate the material image for synthesis, the image synthesizing unit 116 can combine the material image for synthesis with the face region without geometrically transforming the face region of the target face image. In this case, a composite image that uses the whole of the material image can be generated without deforming the target person's face at all.

<<Second Embodiment>>
An image generation device 12 according to the second embodiment will be described. FIG. 14 is a block diagram showing the configuration of the image generation device 12. The image generation device 12 differs from the image generation device 11 in that it includes an image selection unit 124, which extends the function of the image selection unit 114, and an image reversing unit 128. The functions and operations of the units other than the image selection unit 124 and the image reversing unit 128 are the same as those of the corresponding units in the image generation device 11, so detailed descriptions are omitted below.

[Description of configuration]
The image selection unit 124 selects a material image suitable for synthesis from among a plurality of material images. In making the selection, the image selection unit 124 calculates the closeness between the posture information of the target person and the posture information associated with each material image. In doing so, the image selection unit 124 may also use, as posture information for calculating the closeness, the posture information obtained by horizontally reversing one of the two pieces of posture information.

For example, the image selection unit 124 also compares the posture information of the target person with the posture information obtained by horizontally reversing the posture information associated with the material image. For instance, the image selection unit 124 also compares the posture information of the target person with the posture information obtained by inverting the sign of the yaw angle in the posture information associated with the material image. Conversely, the image selection unit 124 may invert the sign of the yaw angle in the posture information of the target person and compare the result with the posture information associated with each material image. In this way, the image selection unit 124 may select a material image based also on the posture information that results when the target face image or the material image is horizontally reversed.

If the face posture associated with the selected material image would become close to the face posture of the target person when horizontally reversed, the image selection unit 124 sends the image reversing unit 128 a reversal instruction meaning "horizontally reverse the material image." In other words, when the image selection unit 124 selects a material image whose associated posture information becomes close to the posture of the face in the target face image once horizontally reversed, it sends a reversal instruction to the image reversing unit 128.
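
The selection with horizontal reversal can be sketched as follows. This is a minimal illustration under stated assumptions: posture is reduced to a yaw and a pitch angle in degrees, closeness is measured as a Euclidean distance, and a fixed threshold decides whether a material image is usable. The names `Pose`, `mirrored`, and `select_material` are hypothetical, and the embodiment does not prescribe this particular distance measure.

```python
import math
from typing import List, NamedTuple, Optional, Tuple


class Pose(NamedTuple):
    yaw: float    # rotation about the vertical axis, in degrees (assumed unit)
    pitch: float  # rotation about the left-right axis, in degrees


def mirrored(pose: Pose) -> Pose:
    """Posture after a horizontal reversal: only the sign of the yaw changes."""
    return Pose(-pose.yaw, pose.pitch)


def distance(a: Pose, b: Pose) -> float:
    return math.hypot(a.yaw - b.yaw, a.pitch - b.pitch)


def select_material(
    target: Pose, materials: List[Pose], threshold: float
) -> Optional[Tuple[int, bool]]:
    """Return (index, needs_reversal) of the closest usable material, or None."""
    best: Optional[Tuple[float, int, bool]] = None
    for index, pose in enumerate(materials):
        for needs_reversal, candidate in ((False, pose), (True, mirrored(pose))):
            d = distance(target, candidate)
            if d <= threshold and (best is None or d < best[0]):
                best = (d, index, needs_reversal)
    return None if best is None else (best[1], best[2])


# Hypothetical data: the only close match is material 0 after reversal.
choice = select_material(Pose(20.0, 5.0), [Pose(-18.0, 4.0), Pose(40.0, 0.0)], 10.0)
```

The boolean in the returned pair plays the role of the reversal instruction: when it is true, the selected material image would be passed to the image reversing unit 128 before further processing.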

On receiving a reversal instruction, the image reversing unit 128 horizontally reverses the material image indicated by that instruction. The image generated by this reversal becomes the material image used in the processing from step S35 onward.

[Description of operation]
FIG. 15 is a flowchart showing the flow of processing of the image generation device 12.

The flowchart shown in FIG. 15 differs from the flowchart shown in FIG. 3 in that it includes steps S34-2 and S34-3 in place of step S34.

In step S34-2, the image selection unit 124 selects a material image suitable for synthesis from among the plurality of material images. In making the selection, the image selection unit 124 calculates the closeness between the posture information of the target person and the posture information associated with each material image. In doing so, the image selection unit 124 also compares the posture information of the target person with the posture information obtained by horizontally reversing (inverting the sign of the yaw angle of) the posture information associated with the material image. That is, the image selection unit 124 may select the material image based on the horizontally reversed posture information.

When the image selection unit 124 has selected a material image because its horizontally reversed posture information is close to the posture information of the target face image, it generates a reversal instruction and sends the selected material image and the reversal instruction to the image reversing unit 128.

In step S34-3, the image reversing unit 128 horizontally reverses the material image if it has received a reversal instruction. For example, the image reversing unit 128 may take as the axis the straight line that is parallel to the vertical direction of the material image and bisects it, and swap the values associated with the coordinates of each pair of pixels that are mirror images of each other about that axis.
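
For an image stored as rows of pixel values, swapping each mirror-image pair of pixels about the vertical center line amounts to reversing every row. The sketch below assumes that list-of-rows representation; the function name `flip_horizontal` is hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values (assumed representation)


def flip_horizontal(image: Image) -> Image:
    """Mirror the image about the vertical line that bisects it."""
    return [row[::-1] for row in image]


# Hypothetical 2x3 image.
flipped = flip_horizontal([[1, 2, 3], [4, 5, 6]])
```

Applying the function twice returns the original image, which is a quick sanity check for any implementation of the reversal.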

The image reversing unit 128 also corrects the feature point information of the material image. That is, the image reversing unit 128 horizontally reverses the coordinates of the feature points. It also corrects the correspondence between those feature points and the feature points in the target face image. For example, the image reversing unit 128 rewrites the information so that a feature point originally extracted as a left-eye feature point is treated as a right-eye feature point.
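
The correction of the feature point information can be sketched as two steps: mirroring each x coordinate and exchanging left and right labels. The example assumes integer pixel coordinates with x running from 0 to width - 1, and the label names (`left_eye`, `right_eye`, `nose`) are hypothetical rather than taken from the embodiment.

```python
from typing import Dict, Tuple

Landmarks = Dict[str, Tuple[int, int]]

# Assumed label pairs that trade places under a horizontal reversal.
SWAPPED_LABELS = {"left_eye": "right_eye", "right_eye": "left_eye"}


def flip_landmarks(landmarks: Landmarks, width: int) -> Landmarks:
    """Mirror the x coordinates and exchange left/right labels."""
    return {
        SWAPPED_LABELS.get(name, name): (width - 1 - x, y)
        for name, (x, y) in landmarks.items()
    }


# Hypothetical feature points in a material image that is 100 pixels wide.
original = {"left_eye": (30, 40), "right_eye": (70, 40), "nose": (50, 60)}
corrected = flip_landmarks(original, 100)
```

After this correction, the point labeled as the left eye in the reversed material image again corresponds to the left eye in the target face image, so the correspondence used in step S35 and later remains valid.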

The reversal performed by the image reversing unit 128 is not limited to the method described above. For example, the reversal need not be a horizontal reversal. Depending on the shape of the material image and the posture of the face, the image reversing unit 128 may reverse the image about a line that is not vertical. The line used as the axis need not be a line that bisects the material image.

The image reversing unit 128 may also make adjustments to the reversed image, such as rotating it.

The material image reversed as described above and the corrected feature point information are used in step S35 and the subsequent steps.

[Effect]
The image generation device 12 according to the second embodiment can produce a greater variety of composite images than the image generation device 11 according to the first embodiment. Put another way, the image generation device 12 can generate high-quality composite images even when the material images offer few variations in posture.

For example, in the first embodiment, if the target person's face is turned to the right, the material images usable for synthesis are limited to those containing a right-facing face. In the second embodiment, by contrast, the image reversing unit 128 reverses the material image, so material images containing a left-facing face can also be candidates for synthesis. That is, the selectable material images are not limited to those containing a right-facing face.

<<Main configuration>>
The main configuration of one embodiment will be described. FIG. 16 is a block diagram showing the configuration of an image generation device 10 according to one embodiment. The image generation device 10 includes an image selection unit 104, an image transformation unit 105, and an image generation unit 106.

The function of each unit of the image generation device 10 and the flow of processing will be described with reference to the flowchart of FIG. 17.

The image selection unit 104 selects a second face image from a plurality of face images stored in advance, based on the postures of the faces included in those face images and the posture of the face included in an input first face image (step S121).

An example of the image selection unit 104 is the image selection unit 114 in each of the above embodiments. The first face image corresponds to the target face image in each of the above embodiments. An example of the plurality of face images stored in advance is the material images in each of the above embodiments.

Based on the facial feature points included in the first face image and the facial feature points included in the second face image, the image transformation unit 105 transforms the second face image so that it fits the face region of the first face image (step S122). The "face region of the first face image" is, for example, a region defined by a plurality of feature points in the first face image. "So that it fits the face region of the first face image" means, for example, so that the face region of the first face image can be composited naturally while retaining its features. For example, the image transformation unit 105 transforms part or all of the second face image into a shape into which the face region of the first face image can be fitted without non-linear deformation. As a specific example, the image transformation unit 105 transforms part or all of the second face image so that, in the transformed second face image, the positions of the feature points associated with the plurality of feature points defining the face region of the first face image coincide with the positions of those feature points. A part of the second face image may be trimmed by this transformation.

An example of the image transformation unit 105 is the image transformation unit 115 in each of the above embodiments.

The image generation unit 106 combines the face region of the first face image with the region other than the face region of the second face image transformed by the image transformation unit 105 (step S123). The face image generated by this synthesis is a third face image.

An example of the image generation unit 106 is the image synthesizing unit 116 in each of the above embodiments.

With the image generation device 10, an image in which the parts other than the face are naturally combined with an input face image can be generated without impairing the facial features included in that face image. This is because the second face image, selected by the image selection unit 114 based on the face posture, is transformed by the image transformation unit 105 so as to fit the face region of the input face image, and is then combined by the image generation unit 106 with the face region of the input face image.

<<Second main configuration>>
FIG. 18 is a block diagram showing the configuration of a face matching device 20 according to one embodiment. The face matching device 20 includes an input unit 201 and a matching unit 202.

The input unit 201 receives a face image as input.

The matching unit 202 matches the third face image described above against the input face image.

The third face image does not necessarily have to be generated by the image generation device 10. That is, the third face image may be any image generated through the following steps.
- A step in which a second face image is selected from a plurality of face images based on the posture of the face included in a first face image
- A step in which the second face image is transformed, based on the facial feature points included in the first face image and the facial feature points included in the second face image, so that the face region of the second face image fits the face region of the first face image
- A step in which the face region of the first face image is combined with the region other than the face region of the transformed second face image, producing the third face image as a result
The face matching device 20 can perform face matching with higher accuracy. This is because it performs face matching using the third face image, in which the second face image has been combined while the features of the first face image are retained.

(Hardware configuration for realizing each unit of the embodiments)
In each of the embodiments of the present invention described above, each component of each device represents a block in functional units.

The processing of each component may be realized, for example, by a computer system reading and executing a program that is stored on a computer-readable storage medium and causes the computer system to perform that processing. A "computer-readable storage medium" is, for example, a portable medium such as an optical disc, a magnetic disk, a magneto-optical disk, or a nonvolatile semiconductor memory, or a storage device such as a ROM (Read Only Memory) or a hard disk built into a computer system. The term "computer-readable storage medium" also covers media that hold a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program temporarily, such as the volatile memory inside a computer system serving as a server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already stored in the computer system.

A "computer system" is, for example, a system including a computer 900 having the following components.
- CPU (Central Processing Unit) 901
- ROM 902
- RAM (Random Access Memory) 903
- Program 904A and stored information 904B loaded into the RAM 903
- Storage device 905 that stores the program 904A and the stored information 904B
- Drive device 907 that reads from and writes to a storage medium 906
- Communication interface 908 that connects to a communication network 909
- Input/output interface 910 that inputs and outputs data
- Bus 911 that connects the components
For example, each component of each device in each embodiment is realized by the CPU 901 loading the program 904A that implements the function of that component into the RAM 903 and executing it. The program 904A that implements the function of each component of each device is stored in advance, for example, in the storage device 905 or the ROM 902. The CPU 901 reads the program 904A as needed. The storage device 905 is, for example, a hard disk. The program 904A may be supplied to the CPU 901 via the communication network 909, or may be stored in advance on the storage medium 906, read by the drive device 907, and supplied to the CPU 901. The storage medium 906 is, for example, a portable medium such as an optical disc, a magnetic disk, a magneto-optical disk, or a nonvolatile semiconductor memory.

Each device can be realized in various ways. For example, each device may be realized by any combination of a computer 900 and a program that are separate for each component. Alternatively, a plurality of components of a device may be realized by any combination of a single computer 900 and a program.

Part or all of the components of each device may also be realized by other general-purpose or dedicated circuits, computers, or the like, or by a combination of these. These may be configured as a single chip or as a plurality of chips connected via a bus.

When part or all of the components of each device are realized by a plurality of computers, circuits, or the like, these may be arranged centrally or in a distributed manner. For example, the computers, circuits, or the like may be realized in a form in which they are connected to one another via a communication network, such as a client-server system or a cloud computing system.

The present invention is not limited to the embodiments described above. Various changes that those skilled in the art can understand may be made to the configuration and details of the present invention within its scope.

Some or all of the above embodiments may also be described as in the following appendices, but are not limited to them.
[Appendix 1]
An image for selecting a second face image from the plurality of face images based on the face posture included in the plurality of pre-stored face images and the face posture included in the input first face image. a selection means;
The facial area of the second facial image is determined based on the facial feature points included in the first facial image and the facial feature points included in the second facial image. an image transforming means for transforming the second facial image so as to match the
image generating means for generating a third facial image in which the facial area of the first facial image and the area other than the facial area of the deformed second facial image are synthesized;
An image generation device having
[Appendix 2]
The image transformation means transforms the second facial image so that at least the face area portion of the second facial image has a shape that allows fitting the facial area of the first facial image without non-linear transformation. transform face images,
1. The image generation device according to appendix 1.
[Appendix 3]
The image generation device according to Appendix 2, wherein the image deformation means comprises:
parameter estimation means for estimating, based on the correspondence between the feature points of the first face image and the feature points of the second face image, geometric deformation parameters for projecting points of the first face image onto points of the second face image;
projected outer peripheral point generation means for projecting outer peripheral points on the outer periphery of the first face image onto the second face image using the geometric deformation parameters estimated by the parameter estimation means; and
precise deformation means for deforming the second face image so that the shape of the region enclosed by the outer peripheral line of the face region of the second face image and a line formed from the outer peripheral points projected onto the second face image by the projected outer peripheral point generation means becomes the shape of the region enclosed by the outer peripheral line of the face region of the first face image and a line formed from the outer peripheral points of the first face image.
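The parameter estimation and projection steps of Appendix 3 can be illustrated with a small sketch. The appendix does not fix the form of the geometric deformation parameters, so this example assumes a 2D similarity transform (scale, rotation, translation) fitted by least squares, expressed with complex numbers. The function names `estimate_similarity` and `project` are hypothetical.

```python
def estimate_similarity(src_pts, dst_pts):
    """Least-squares 2D similarity transform mapping src_pts onto dst_pts.

    Returns complex (a, b) such that dst is approximately a * src + b,
    where each point (x, y) is treated as the complex number x + yj."""
    if len(src_pts) != len(dst_pts) or len(src_pts) < 2:
        raise ValueError("need at least two corresponding points")
    z = [complex(x, y) for x, y in src_pts]
    w = [complex(x, y) for x, y in dst_pts]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Closed-form solution of min sum |w_i - a*z_i - b|^2.
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("source points are all identical")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def project(points, a, b):
    """Apply the estimated transform to points, e.g. outer peripheral points."""
    projected = []
    for x, y in points:
        p = a * complex(x, y) + b
        projected.append((p.real, p.imag))
    return projected
```

Here `src_pts` would be feature points of the first face image and `dst_pts` the corresponding feature points of the second face image; `project` would then be called with points on the border of the first face image. An affine or perspective model could be substituted without changing the overall flow.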
[Appendix 4]
The image generation device according to any one of Appendices 1 to 3, wherein the image selection means selects the second face image from among those of the plurality of face images for which the closeness of the value of a parameter representing the face posture to the value of the parameter representing the posture of the face included in the first face image is within a predetermined criterion.
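The selection criterion of Appendix 4 can be sketched as a simple threshold test. This example assumes that a posture is a tuple of angles in degrees, that closeness is measured by Euclidean distance, and that the predetermined criterion is a fixed maximum distance. The function name `select_by_pose` and the default threshold are hypothetical.

```python
import math


def select_by_pose(target_pose, stored_poses, max_distance_deg=10.0):
    """Return indices of stored poses whose distance to target_pose is
    within max_distance_deg, ordered from closest to farthest."""
    scored = []
    for index, pose in enumerate(stored_poses):
        distance = math.dist(target_pose, pose)
        if distance <= max_distance_deg:
            scored.append((distance, index))
    return [index for _, index in sorted(scored)]
```

Any of the returned candidates could serve as the second face image; taking the first one picks the closest posture.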
[Appendix 5]
The image generation device according to any one of Appendices 1 to 4, wherein the image selection means selects the second face image from the plurality of face images based also on the luminance distribution within the face region of the first face image and the luminance distribution within the face region of each of the plurality of face images.
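One way to compare luminance distributions, as in Appendix 5, is sketched below. The appendix does not say how the distributions are represented or compared, so this example assumes normalized histograms of 8-bit luminance values compared by histogram intersection. The function names and the bin count are hypothetical.

```python
def luminance_histogram(pixels, bins=8):
    """Normalized histogram of 8-bit luminance values (0 to 255) taken
    from the pixels inside a face region."""
    if not pixels:
        raise ValueError("face region contains no pixels")
    counts = [0] * bins
    for value in pixels:
        counts[min(value * bins // 256, bins - 1)] += 1
    return [count / len(pixels) for count in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the two distributions are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

A stored face image whose face-region histogram has a high intersection with that of the first face image is likely to have similar lighting, which tends to make the composite look more natural.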
[Appendix 6]
The image generation device according to any one of Appendices 1 to 5, wherein, in selecting the second face image, the image selection means compares not only the posture of the face included in each of the plurality of face images but also the posture of the face obtained when each of the plurality of face images is horizontally flipped with the posture of the face included in the input first face image, and, based on the comparison result, selects at least one of the plurality of face images and the plurality of horizontally flipped face images as the second face image.
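The comparison against horizontally flipped candidates in Appendix 6 does not require flipping the pixels first: mirroring a face left to right leaves the rotation about the left-right axis unchanged and negates the rotations about the up-down and front-back axes. The sketch below relies on that fact and assumes postures are `(pitch, yaw, roll)` tuples in degrees compared by Euclidean distance. The function names are hypothetical.

```python
import math


def mirrored_pose(pose):
    """Posture of the same face after a horizontal flip of the image."""
    pitch, yaw, roll = pose
    return (pitch, -yaw, -roll)


def best_match_with_flips(target_pose, stored_poses):
    """Return (index, flipped) for the stored pose, original or mirrored,
    that is closest to target_pose."""
    best = None
    for index, pose in enumerate(stored_poses):
        for flipped, candidate in ((False, pose), (True, mirrored_pose(pose))):
            distance = math.dist(target_pose, candidate)
            if best is None or distance < best[0]:
                best = (distance, index, flipped)
    if best is None:
        raise ValueError("no stored poses to compare against")
    return best[1], best[2]
```

If `flipped` is true, the selected stored image would be mirrored before being used as the second face image, which effectively doubles the pool of candidate postures.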
[Appendix 7]
The image generation device according to any one of Appendices 1 to 6, wherein the posture of the face is defined, with the orientation of a front-facing face as the reference, by three rotation angles: the rotation angle about the left-right axis, the rotation angle about the up-down axis, and the rotation angle about the front-back axis, and
the image selection means selects the second face image from the plurality of face images by comparing, of the three rotation angles, the rotation angle about the left-right axis and the rotation angle about the up-down axis between the face included in each of the plurality of face images and the face included in the first face image.
[Appendix 8]
The image generation device according to any one of Appendices 1 to 7, wherein the face region is a closed region that includes the eyes, nose, and mouth and is formed by connecting feature points located outside the eyes, nose, and mouth.
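The closed face region of Appendix 8 can be turned into a pixel mask once the outer feature points are connected into a polygon. The sketch below assumes the feature points are already given in order around the face and uses a standard ray-casting test at each pixel center. The function names `point_in_polygon` and `face_mask` are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the closed polygon
    given as an ordered list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def face_mask(width, height, polygon):
    """Boolean mask (row-major nested lists) of pixels whose centers fall
    inside the face-region polygon."""
    return [
        [point_in_polygon(col + 0.5, row + 0.5, polygon) for col in range(width)]
        for row in range(height)
    ]
```

A mask built this way marks which pixels belong to the face region and which belong to the surrounding region that is taken from the other image.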
[Appendix 9]
An image generation method comprising:
selecting a second face image from a plurality of pre-stored face images, based on the postures of the faces included in the plurality of face images and the posture of the face included in an input first face image;
deforming the second face image, based on feature points of the face included in the first face image and feature points of the face included in the second face image, so that the face region of the second face image fits the face region of the first face image; and
generating a third face image in which the face region of the first face image is combined with the region other than the face region of the deformed second face image.
[Appendix 10]
The image generation method according to Appendix 9, wherein the second face image is deformed so that at least the face-region portion of the second face image has a shape into which the face region of the first face image can be fitted without non-linear deformation.
[Appendix 11]
The image generation method according to Appendix 10, wherein deforming the second face image comprises:
estimating, based on the correspondence between the feature points of the first face image and the feature points of the second face image, geometric deformation parameters for projecting points of the first face image onto points of the second face image;
projecting outer peripheral points on the outer periphery of the first face image onto the second face image using the estimated geometric deformation parameters; and
deforming the second face image so that the shape of the region enclosed by the outer peripheral line of the face region of the second face image and a line formed from the outer peripheral points projected onto the second face image becomes the shape of the region enclosed by the outer peripheral line of the face region of the first face image and a line formed from the outer peripheral points of the first face image.
[Appendix 12]
The image generation method according to any one of Appendices 9 to 11, wherein, in selecting the second face image, the second face image is selected from among those of the plurality of face images for which the closeness of the value of a parameter representing the face posture to the value of the parameter representing the posture of the face included in the first face image is within a predetermined criterion.
[Appendix 13]
The image generation method according to any one of Appendices 9 to 12, wherein, in selecting the second face image, the second face image is selected from the plurality of face images based also on the luminance distribution within the face region of the first face image and the luminance distribution within the face region of each of the plurality of face images.
[Appendix 14]
The image generation method according to any one of Appendices 9 to 12, wherein, in selecting the second face image, not only the posture of the face included in each of the plurality of face images but also the posture of the face obtained when each of the plurality of face images is horizontally flipped is compared with the posture of the face included in the input first face image, and, based on the comparison result, at least one of the plurality of face images and the plurality of horizontally flipped face images is selected as the second face image.
[Appendix 15]
The image generation method according to any one of Appendices 9 to 13, wherein the posture of the face is defined, with the orientation of a front-facing face as the reference, by three rotation angles: the rotation angle about the left-right axis, the rotation angle about the up-down axis, and the rotation angle about the front-back axis, and
in selecting the second face image, the second face image is selected from the plurality of face images by comparing, of the three rotation angles, the rotation angle about the left-right axis and the rotation angle about the up-down axis between the face included in each of the plurality of face images and the face included in the first face image.
[Appendix 16]
The image generation method according to any one of Appendices 9 to 14, wherein the face region is a closed region that includes the eyes, nose, and mouth and is formed by connecting feature points located outside the eyes, nose, and mouth.
[Appendix 17]
A program that causes a computer to execute:
image selection processing for selecting a second face image from a plurality of pre-stored face images, based on the postures of the faces included in the plurality of face images and the posture of the face included in an input first face image;
image deformation processing for deforming the second face image, based on feature points of the face included in the first face image and feature points of the face included in the second face image, so that the face region of the second face image fits the face region of the first face image; and
image generation processing for generating a third face image in which the face region of the first face image is combined with the region other than the face region of the deformed second face image.
[Appendix 18]
The program according to Appendix 17, wherein the image deformation processing deforms the second face image so that at least the face-region portion of the second face image has a shape into which the face region of the first face image can be fitted without non-linear deformation.
[Appendix 19]
The program according to Appendix 18, wherein the image deformation processing includes:
parameter estimation processing for estimating, based on the correspondence between the feature points of the first face image and the feature points of the second face image, geometric deformation parameters for projecting points of the first face image onto points of the second face image;
projected outer peripheral point generation processing for projecting outer peripheral points on the outer periphery of the first face image onto the second face image using the geometric deformation parameters estimated by the parameter estimation processing; and
precise deformation processing for deforming the second face image so that the shape of the region enclosed by the outer peripheral line of the face region of the second face image and a line formed from the outer peripheral points projected onto the second face image by the projected outer peripheral point generation processing becomes the shape of the region enclosed by the outer peripheral line of the face region of the first face image and a line formed from the outer peripheral points of the first face image.
[Appendix 20]
The program according to any one of Appendices 17 to 19, wherein the image selection processing selects the second face image from among those of the plurality of face images for which the closeness of the value of a parameter representing the face posture to the value of the parameter representing the posture of the face included in the first face image is within a predetermined criterion.
[Appendix 21]
The program according to any one of Appendices 17 to 20, wherein the image selection processing selects the second face image from the plurality of face images based also on the luminance distribution within the face region of the first face image and the luminance distribution within the face region of each of the plurality of face images.
[Appendix 22]
The program according to any one of Appendices 17 to 21, wherein, in selecting the second face image, the image selection processing compares not only the posture of the face included in each of the plurality of face images but also the posture of the face obtained when each of the plurality of face images is horizontally flipped with the posture of the face included in the input first face image, and, based on the comparison result, selects at least one of the plurality of face images and the plurality of horizontally flipped face images as the second face image.
[Appendix 23]
The program according to any one of Appendices 17 to 22, wherein the posture of the face is defined, with the orientation of a front-facing face as the reference, by three rotation angles: the rotation angle about the left-right axis, the rotation angle about the up-down axis, and the rotation angle about the front-back axis, and
the image selection processing selects the second face image from the plurality of face images by comparing, of the three rotation angles, the rotation angle about the left-right axis and the rotation angle about the up-down axis between the face included in each of the plurality of face images and the face included in the first face image.
[Appendix 24]
The program according to any one of Appendices 17 to 23, wherein the face region is a closed region that includes the eyes, nose, and mouth and is formed by connecting feature points located outside the eyes, nose, and mouth.
[Appendix 25]
A computer-readable storage medium storing the program according to any one of appendices 17 to 24.
[Appendix 26]
A face matching device comprising:
input means for accepting a face image as an input; and
matching means for matching the face image accepted as the input against a third face image, the third face image being generated by selecting a second face image from a plurality of pre-stored face images based on the postures of the faces included in the plurality of face images and the posture of the face included in a first face image, deforming the second face image so as to fit the face region of the first face image based on feature points of the face included in the first face image and feature points of the face included in the second face image, and combining the face region of the first face image with the region other than the face region of the deformed second face image.

10 to 12 image generation device
20 face matching device
104 image selection unit
105 image transformation unit
106 image generation unit
110 input unit
111 feature point extraction unit
112 face area extraction unit
113 posture estimation unit
114 image selection unit
115 image transformation unit
1151 parameter estimation unit
1152 projection outer peripheral point generation unit
1153 precise transformation unit
116 image synthesis unit
117 storage unit
124 image selection unit
128 image inversion unit
201 input unit
202 matching unit
900 computer
901 CPU
902 ROM
903 RAM
904A program
904B storage information
905 storage device
906 storage medium
907 drive device
908 communication interface
909 communication network
910 input/output interface
911 bus

Claims

An image generation device comprising:
image deformation means for, using outer peripheral points of a face image generated based on feature points of the face included in an input first face image and feature points of the face included in a pre-stored second face image, cutting out the face region of the second face image so as to match the face region of the first face image, and deforming the region other than the face region of the second face image; and
image generation means for generating a third face image in which the region other than the face region of the first face image is replaced with the region other than the face region of the deformed second face image,
wherein the image deformation means includes:
parameter estimation means for estimating, based on the correspondence between the feature points of the first face image and the feature points of the second face image, geometric deformation parameters for projecting points of the first face image onto points of the second face image;
projected outer peripheral point generation means for projecting outer peripheral points on the outer periphery of the first face image onto the second face image using the geometric deformation parameters estimated by the parameter estimation means; and
precise deformation means for deforming the second face image so that the shape of the region enclosed by the outer peripheral line of the face region of the second face image and a line formed from the outer peripheral points projected onto the second face image by the projected outer peripheral point generation means becomes the shape of the region enclosed by the outer peripheral line of the face region of the first face image and a line formed from the outer peripheral points of the first face image.

The image generation device according to claim 1, wherein the image deformation means deforms the second face image so that at least the face-region portion of the second face image has a shape into which the face region of the first face image can be fitted without non-linear deformation.

The image generation device according to claim 1 or 2, further comprising image selection means for selecting the second face image from a plurality of pre-stored face images, based on the postures of the faces included in the plurality of face images and the posture of the face included in the first face image.

The image generation device according to claim 3, wherein the image selection means selects the second face image from among those of the plurality of face images for which the closeness of the value of a parameter representing the face posture to the value of the parameter representing the posture of the face included in the first face image is within a predetermined criterion.

The image generation device according to claim 3 or 4, wherein the image selection means selects the second face image from the plurality of face images based also on the luminance distribution within the face region of the first face image and the luminance distribution within the face region of each of the plurality of face images.

The image generation device according to any one of claims 3 to 5, wherein, in selecting the second face image, the image selection means compares not only the posture of the face included in each of the plurality of face images but also the posture of the face obtained when each of the plurality of face images is horizontally flipped with the posture of the face included in the first face image, and, based on the comparison result, selects at least one of the plurality of face images and the plurality of horizontally flipped face images as the second face image.

The image generation device according to any one of claims 3 to 6, wherein the posture of the face is defined, with the orientation of a front-facing face as the reference, by three rotation angles: the rotation angle about the left-right axis, the rotation angle about the up-down axis, and the rotation angle about the front-back axis, and
the image selection means selects the second face image from the plurality of face images by comparing, of the three rotation angles, the rotation angle about the left-right axis and the rotation angle about the up-down axis between the face included in each of the plurality of face images and the face included in the first face image.

The image generation device according to any one of claims 1 to 6, wherein the face region is a closed region that includes the eyes, nose, and mouth and is formed by connecting feature points located outside the eyes, nose, and mouth.

An image generation method in which a computer:
using outer peripheral points of a face image generated based on feature points of the face included in an input first face image and feature points of the face included in a pre-stored second face image, cuts out the face region of the second face image so as to match the face region of the first face image, and deforms the region other than the face region of the second face image; and
generates a third face image in which the region other than the face region of the first face image is replaced with the region other than the face region of the deformed second face image,
wherein, in the deforming, the computer:
estimates, based on the correspondence between the feature points of the first face image and the feature points of the second face image, geometric deformation parameters for projecting points of the first face image onto points of the second face image;
projects outer peripheral points on the outer periphery of the first face image onto the second face image using the estimated geometric deformation parameters; and
deforms the second face image so that the shape of the region enclosed by the outer peripheral line of the face region of the second face image and a line formed from the outer peripheral points projected onto the second face image becomes the shape of the region enclosed by the outer peripheral line of the face region of the first face image and a line formed from the outer peripheral points of the first face image.

A program that causes a computer to execute:
image deformation processing for, using outer peripheral points of a face image generated based on feature points of the face included in an input first face image and feature points of the face included in a pre-stored second face image, cutting out the face region of the second face image so as to match the face region of the first face image, and deforming the region other than the face region of the second face image; and
image generation processing for generating a third face image in which the region other than the face region of the first face image is replaced with the region other than the face region of the deformed second face image,
wherein the image deformation processing includes:
parameter estimation processing for estimating, based on the correspondence between the feature points of the first face image and the feature points of the second face image, geometric deformation parameters for projecting points of the first face image onto points of the second face image;
projected outer peripheral point generation processing for projecting outer peripheral points on the outer periphery of the first face image onto the second face image using the estimated geometric deformation parameters; and
precise deformation processing for deforming the second face image so that the shape of the region enclosed by the outer peripheral line of the face region of the second face image and a line formed from the outer peripheral points projected onto the second face image becomes the shape of the region enclosed by the outer peripheral line of the face region of the first face image and a line formed from the outer peripheral points of the first face image.