KR102288001B1 - Device for generating job image having face to which age transformation is applied and photo booth including the same

Info

Publication number: KR102288001B1
Application number: KR1020200017927A
Authority: KR
Inventors: 김익재; 최성은; 홍유진
Original assignee: 한국과학기술연구원
Priority date: 2020-02-13
Filing date: 2020-02-13
Publication date: 2021-08-11

Abstract

Embodiments of the present invention relate to a device for generating a job image having an age-converted face, and to a photo booth including the device. The device is configured to perform the steps of: acquiring an original image including a subject's face at an original age; obtaining original age information of the subject; determining a background image, in which a job is depicted, for use in generating a job image; converting the face of the subject at the original age into the face of the subject at a target age; and generating the job image of the subject by compositing the age-converted face into the face region of the determined background image. Accordingly, an age-converted image having an adult-age face may be generated based on a generated texture and shape.

Description

Occupational image generating device with age-converted face and photo booth including the same

Embodiments of the present invention relate to image processing technology and, more particularly, to a device that generates, from a face image of a subject at a specific age, an age-converted image having the face the subject is expected to have upon reaching a target age, and that uses the subject's face at the target age to generate a job image depicting the subject holding a job at that age, and to a system (e.g., a photo booth) including the device.

An occupation is a social activity that carries varied and important meaning in a person's life, from its economic significance to self-realization through working life.

In recent years, occupations in every field have become more advanced and more specialized, so it is advantageous to begin preparing for a career choice early. A career choice should be made carefully, based on ample knowledge of and information about technological and social trends. Because young children and adolescents generally have a relatively limited ability to analyze such trends, parental advice has a large influence on their career choices.

However, the person making the career choice is the child. A child who actively decides which occupation to pursue is therefore more likely to make the effort, while growing up, that is needed to attain that occupation.

Ultimately, the best career education a parent can give is to provide young children with diverse and specific job information so that they can make an active career choice.

Imagining their own future selves in a given occupation is the surest motivation for a child's career choice. Typically, however, the job information that parents provide covers only the occupation itself or other people who hold it, which limits how well the child can accept and relate to that information.

Patent Laid-Open Publication No. 10-1998-0065049
Patent Laid-Open Publication No. 10-2002-0007744

Goodfellow, Ian J.; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). "Generative Adversarial Networks"
Mirza, Mehdi; Osindero, Simon (2014). "Conditional Generative Adversarial Nets"

According to one aspect of the present invention, there is provided a device that generates an age-converted image by converting the face of a subject in an original image to a target age, and that composites the face at the target age from the age-converted image into a background image depicting a job, thereby generating a job image with an age-converted face that shows the subject holding that job at the target age.

In addition, a system to which the device is applied, such as a photo booth or a kiosk, may be provided.

A device for generating a job image having an age-converted face according to one aspect of the present invention includes a processor and may be configured to perform the steps of: acquiring an original image including the face of a subject at an original age; obtaining original age information of the subject; determining a background image, in which a job is depicted, for use in generating a job image; converting the face of the subject at the original age into the face of the subject at a target age; and generating a job image of the subject by compositing the age-converted face into the face region of the determined background image.
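The steps listed above can be sketched as a small orchestration function. This is a minimal illustration, not the patented implementation: the names `JobImageRequest`, `generate_job_image`, `convert_age`, and `composite` are hypothetical, and the two processing stages are passed in as plain callables so the sketch does not depend on any particular model or image library.

```python
from dataclasses import dataclass
from typing import Any, Callable

Image = Any  # stand-in for whatever image type a real implementation uses


@dataclass
class JobImageRequest:
    original_image: Image  # contains the subject's face at the original age
    original_age: int      # original age information of the subject
    target_age: int        # chosen by the user or preset in the device
    background: Image      # background image depicting the selected job


def generate_job_image(
    request: JobImageRequest,
    convert_age: Callable[[Image, int, int], Image],
    composite: Callable[[Image, Image], Image],
) -> Image:
    """Run age conversion, then composite the result into the background."""
    aged_face = convert_age(
        request.original_image, request.original_age, request.target_age
    )
    return composite(aged_face, request.background)


# Toy stand-ins so the sketch runs end to end (hypothetical behaviour).
def fake_convert_age(image, original_age, target_age):
    return f"{image}@{target_age}"


def fake_composite(face, background):
    return f"{background}[{face}]"


job_image = generate_job_image(
    JobImageRequest("child_face", 7, 30, "doctor_scene"),
    fake_convert_age,
    fake_composite,
)
```

Passing the stages in as callables keeps the ordering of the steps visible while leaving the age conversion and compositing methods open, which matches the way the embodiments below vary those stages.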

In one embodiment, the device may be further configured to perform the step of providing the user with a career guidance service image that includes at least the job image from among the original image, the background image, the age-converted image, and the job image.

In one embodiment, determining the background image for the job image may include: providing an interface screen that displays at least some of the pre-stored candidate background images and prompts the user to select a background image; and, when an input selecting a background image is received, determining the candidate background image corresponding to the input as the background image for the subject's job image. Here, each candidate background image is configured to express the characteristics of the job and includes at least part of the face of a person, other than the subject, who holds that job.

In one embodiment, determining the background image for the job image may further include, before providing the interface screen that prompts the user to select a background image, providing a selection screen that displays a job-related list from which a desired job can be selected. Here, the job-related list includes at least one of one or more job items (listings) and one or more job category items.

In one embodiment, when gender information of the subject has been obtained, the interface screen includes at least some of the candidate background images associated with the subject's gender.
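The selection logic described in the preceding embodiments (choose a job first, then show only matching candidate backgrounds, narrowed by gender when it is known) might look like the following. The class `CandidateBackground`, the function `candidates_for_screen`, and the sample entries are all hypothetical; the patent does not specify how candidates are stored or tagged.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CandidateBackground:
    name: str
    job: str
    gender: Optional[str] = None  # None means the image is not tied to a gender


def candidates_for_screen(
    candidates: Iterable[CandidateBackground],
    job: Optional[str] = None,
    gender: Optional[str] = None,
) -> List[CandidateBackground]:
    """Return the candidates to display for the chosen job and known gender."""
    shown = []
    for candidate in candidates:
        if job is not None and candidate.job != job:
            continue  # a job was selected and this candidate depicts another one
        if gender is not None and candidate.gender not in (None, gender):
            continue  # gender is known and this candidate is tagged differently
        shown.append(candidate)
    return shown


stored = [
    CandidateBackground("police_f_01", "police officer", "female"),
    CandidateBackground("police_m_01", "police officer", "male"),
    CandidateBackground("doctor_f_01", "doctor", "female"),
]
shown = candidates_for_screen(stored, job="police officer", gender="female")
```

When no gender information has been obtained, the filter is simply skipped, so the screen falls back to every candidate for the selected job.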

In one embodiment, converting to the face of the subject at the target age may include: extracting landmarks from the subject's face in the original image; generating a facial texture of the subject at the target age from the subject's face in the original image from which the landmarks were extracted; generating a face shape of the subject at the target age from that face; and generating the age-converted face of the subject based on the facial texture and the face shape at the target age.

In one embodiment, generating the facial texture at the target age may include: generating a shape-free facial texture from the subject's face in the original image from which the landmarks were extracted; and applying the shape-free facial texture to a pre-trained texture transformation model to generate a shape-free facial texture of the subject at the target age.

In one embodiment, the texture transformation model may include: a first generator that applies noise to input data of a first domain and outputs converted data of a second domain; and a second generator that applies noise to input data of the first domain and outputs converted data of a third domain. Here, each generator is configured such that, when its converted data is converted back to the first domain, the result is the input data of the first domain.
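The reconstruction property described above resembles what the image-translation literature calls cycle consistency. One way to write it down, offered as an assumption rather than as the loss the patent actually uses, introduces a reverse mapping $F_k$ for each generator $G_k$ and penalizes the reconstruction error with an L1 norm (the reverse mappings, the norm, and the symbols are all illustrative):

```latex
\mathcal{L}_{\mathrm{cyc}}
  = \sum_{k \in \{1, 2\}}
    \mathbb{E}_{x \sim p_{1}(x),\; z \sim p(z)}
    \left[ \left\lVert F_k\!\left( G_k(x, z) \right) - x \right\rVert_{1} \right]
```

Here $x$ is an input drawn from the first domain, $z$ is the noise applied by the generator, $G_1$ and $G_2$ map to the second and third domains respectively, and $F_k$ maps the converted data back to the first domain. The loss is zero exactly when every converted sample is restored to its original input.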

In one embodiment, the texture transformation model may include a generator that applies noise and condition information to input data of a first domain and outputs converted data of a plurality of other domains, including a second domain and a third domain.

In one embodiment, the texture transformation model is a model pre-trained with a plurality of training samples to output data corresponding to the facial texture at the target age, and each training sample may include a facial texture of a training subject at the target age and label data including the training subject's gender.

In one embodiment, the facial texture of the training subject may be a shape-free facial texture obtained from the training subject's face.

In one embodiment, generating the face shape of the subject at the target age may include: extracting facial shape features of the subject in the original image based on the landmarks of the face in the original image; applying those facial shape features to a pre-trained shape transformation model to generate facial shape features of the subject at the target age; and reconstructing the face shape of the subject at the target age from the facial shape features at the target age.

In one embodiment, the shape transformation model is generated by modeling the relationship between age and the facial shape features at that age, and may be modeled to compute the facial shape features of the subject at the target age from (i) the difference between the age function value at the target age and the age function value at the original age and (ii) the facial shape features of the subject in the original image.

In one embodiment, the shape transformation model is a model pre-trained, using a plurality of training samples and label data indicating the target age, to output facial shape features at the target age, and the training samples in each set may include facial shape features of training subjects at the corresponding age.

In one embodiment, when the shape transformation model is configured to output facial shape features at any one of a plurality of ages, the shape transformation model is a model pre-trained with a plurality of training sample sets. Each set includes a plurality of training samples at a specific age among the plurality of ages and label data indicating that age, and the training samples in each set may include facial shape features of training subjects at that age.

In one embodiment, when the facial shape feature is N-dimensional (where N is an integer of 1 or more), the shape transformation model may be modeled based on an age function for each facial shape feature.
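One plausible reading of the age-function-based shape model is additive: each of the N shape features is shifted by the change in its own age function between the original age and the target age. The sketch below assumes that reading. The function name `shift_shape_features` and the two linear age functions are hypothetical placeholders, since the patent obtains its age functions from training data and does not give their form here.

```python
from typing import Callable, List, Sequence

AgeFunction = Callable[[float], float]


def shift_shape_features(
    features: Sequence[float],
    age_functions: Sequence[AgeFunction],
    original_age: float,
    target_age: float,
) -> List[float]:
    """Shift each shape feature by f_i(target_age) - f_i(original_age)."""
    if len(features) != len(age_functions):
        raise ValueError("need exactly one age function per shape feature")
    return [
        value + (f(target_age) - f(original_age))
        for value, f in zip(features, age_functions)
    ]


# Hypothetical age functions for N = 2 shape features (illustrative only).
age_functions = [
    lambda age: 0.5 * age,         # feature 1 grows with age
    lambda age: 2.0 - 0.25 * age,  # feature 2 shrinks with age
]
aged = shift_shape_features(
    [1.0, 3.0], age_functions, original_age=8, target_age=28
)
```

Because only the difference between two age function values is applied, the subject's individual deviation from the average shape at the original age carries over to the target age, and converting to the same age leaves the features unchanged.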

In one embodiment, generating the age-converted face of the subject may include warping the facial texture at the target age onto the face shape of the subject at the target age, thereby generating the face at the target age as the age-converted face.
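The patent does not say which warping method is used. A common choice for landmark-driven warping is a piecewise affine warp, in which the face is split into landmark triangles and each pixel of a target triangle is looked up in the matching texture triangle through its barycentric coordinates. The sketch below shows that per-triangle lookup under this assumption; the function names and the sample triangles are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Triangle = Sequence[Point]


def barycentric(p: Point, tri: Triangle) -> Tuple[float, float, float]:
    """Barycentric weights of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    w2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def source_position(p: Point, target_tri: Triangle, source_tri: Triangle) -> Point:
    """Where to sample the texture for pixel p of the target-shape triangle."""
    weights = barycentric(p, target_tri)
    x = sum(w * vx for w, (vx, _) in zip(weights, source_tri))
    y = sum(w * vy for w, (_, vy) in zip(weights, source_tri))
    return x, y


# A landmark triangle in the target-age shape and its match in the texture.
target_tri = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
texture_tri = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
sample_at = source_position((1.0, 1.0), target_tri, texture_tri)
```

A full warp would repeat this lookup for every pixel of every triangle and interpolate the texture at the returned position.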

In one embodiment, generating the job image of the subject may include: extracting landmarks from the face in the background image and from the age-converted face, respectively; mapping the landmarks of the age-converted face to the face region of the background image based on the extracted landmarks; warping the face texture of the age-converted image based on the positions of the landmarks mapped to the background image; warping the face texture of the background image based on the positions of the landmarks mapped to the background image; generating a compositing-region mask that filters the interior region based on the moved landmarks of the face in the background image; and filtering the subject's face region in the warped age-converted image with the compositing-region mask and transplanting the filtered face region into the face region of the background image.
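The last two steps above (build a mask from the landmark outline, then transplant only the masked face pixels) can be illustrated on a tiny grid. This sketch assumes a hard binary mask built with a ray-casting point-in-polygon test; the function names are hypothetical, and the patent does not state how the mask is rasterized or whether its boundary is softened.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Grid = List[List[int]]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def make_region_mask(width: int, height: int, polygon: Sequence[Point]) -> Grid:
    """1 for pixels whose centre lies inside the landmark outline, else 0."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


def transplant(face: Grid, background: Grid, mask: Grid) -> Grid:
    """Take face pixels where the mask is 1 and background pixels elsewhere."""
    return [
        [f if m else b for f, b, m in zip(face_row, bg_row, mask_row)]
        for face_row, bg_row, mask_row in zip(face, background, mask)
    ]


outline = [(1, 1), (3, 1), (3, 3), (1, 3)]    # moved landmarks (toy values)
mask = make_region_mask(4, 4, outline)
warped_face = [[9] * 4 for _ in range(4)]     # warped age-converted face
background = [[0] * 4 for _ in range(4)]      # warped background image
job_image = transplant(warped_face, background, mask)
```

With real images the same selection would run per colour channel, and a practical system would likely feather or blend the mask edge to avoid a visible seam.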

In one embodiment, the mapping may be performed based on the anatomical facial feature that each landmark represents.

In one embodiment, when the selected background image is a video composed of a plurality of frames, generating the job image may include compositing the age-converted face with at least one of the plurality of frames.

In one embodiment, generating the job image may further include generating a job video made up of the frames composited to have the age-converted face.
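For a video background, the two embodiments above amount to applying the single-image compositing step to selected frames and collecting the result. A minimal sketch follows, with the hypothetical names `make_job_video` and `composite_frame`; the assumption that uncomposited frames are kept unchanged is also illustrative, since the patent only requires that at least one frame be composited.

```python
from typing import Callable, Iterable, List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def make_job_video(
    frames: Sequence[Frame],
    composite_frame: Callable[[Frame], Frame],
    frame_indices: Optional[Iterable[int]] = None,
) -> List[Frame]:
    """Composite the age-converted face into the selected frames.

    When frame_indices is None every frame is composited; otherwise frames
    outside the selection are passed through unchanged.
    """
    if frame_indices is None:
        selected = set(range(len(frames)))
    else:
        selected = set(frame_indices)
    return [
        composite_frame(frame) if index in selected else frame
        for index, frame in enumerate(frames)
    ]


frames = ["f0", "f1", "f2"]
job_video = make_job_video(frames, lambda frame: frame + "+face", frame_indices=[0, 2])
```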

In one embodiment, to provide the career guidance service image, the device is configured to perform one or more of: displaying the career guidance service image, printing the career guidance service image, and transmitting the career guidance service image.

A photo booth according to another aspect of the present invention may include: the device according to the embodiments described above; and a booth at least partially enclosed by walls to form a space in which the subject can be positioned.

A device for generating a job image according to one aspect of the present invention can generate an adult-age texture from a child's face image using a pre-trained texture transformation model, generate an adult-age shape from the child's face image using a pre-trained age function, and then generate an age-converted image having an adult-age face based on the generated texture and shape.

In addition, when the age-converted image is composited with the job image, the face region of the age-converted image is blended naturally into the face region of the job image, so that a more realistic, user-customized image can be provided.

As a result, the device can provide specific and realistic images that help a child imagine pursuing a desired career and that support career decisions.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the claims.

To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings are intended to illustrate the embodiments of this specification and not to limit them. In addition, for clarity of description, some elements in the drawings may be shown with various modifications such as exaggeration or omission.
FIGS. 1A and 1B are diagrams conceptually illustrating the operation of a job image generating device according to an embodiment of the present invention.
FIG. 2 is a conceptual configuration diagram of a job image generating device according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an interface screen for selecting a background image, according to an embodiment.
FIG. 4 is a diagram for explaining the operation of the age conversion unit according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining a process of determining an age function for a facial shape feature according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the age function for a second shape feature trained by the process of FIG. 5.
FIG. 7 is a conceptual diagram of a photo booth system according to an embodiment of the present invention.
FIG. 8 is a flowchart of a method for generating a job image according to an embodiment of the present invention.
FIG. 9 is a diagram showing an example result of the age conversion operation according to an embodiment of the present invention.
FIG. 10 is a flowchart of an image compositing process according to an embodiment of the present invention.
FIG. 11 is a diagram illustrating landmark extraction results for an age-converted image and a background image according to an embodiment of the present invention.
FIG. 12 is a diagram illustrating the landmarks of an age-converted image mapped to a background image according to an embodiment of the present invention.
FIG. 13 is a diagram illustrating the result of moving the landmarks of the face in the background image according to an embodiment of the present invention.
FIG. 14 is a diagram illustrating the result of moving the landmarks of the face in the age-converted image according to an embodiment of the present invention.
FIG. 15 is a diagram illustrating a compositing-region mask according to an embodiment of the present invention.
FIG. 16 is a diagram illustrating a composite image when the target job is a police officer, according to an embodiment of the present invention.
FIG. 17 is a diagram illustrating a composite image when the target job is a doctor, according to an embodiment of the present invention.
FIGS. 18A and 18B are diagrams illustrating career guidance service images according to embodiments of the present invention.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, singular forms include plural forms unless the wording clearly indicates otherwise. As used in this specification, "comprising" specifies a particular characteristic, region, integer, step, operation, element, component, and/or part, and does not exclude the presence or addition of other characteristics, regions, integers, steps, operations, elements, components, and/or parts.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries are further interpreted as having meanings consistent with the related technical literature and the present disclosure, and are not interpreted in an idealized or overly formal sense unless so defined.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIGS. 1A and 1B are diagrams conceptually illustrating the operation of a job image generating device according to an embodiment of the present invention.

The job image generating device according to an embodiment of the present invention is configured to perform age conversion on the face of a subject included in an original image. Age conversion refers to an image processing operation that converts the subject's face at the subject's age (e.g., the subject's age when the image was captured) into the face the subject is expected to have upon reaching the age to which the user wants to convert. The device performs age conversion to generate an age-converted image having the subject's face at the target age. The device then composites the subject's face at the target age, taken from the age-converted image, into a background image depicting a job, generating a job image that shows the subject holding that job at the target age.

Here, the subject is the person whose face is to be age-converted. In one embodiment, the subject is a child, as shown in FIG. 1. In this specification, a child refers to a person who has not yet reached the age of majority, and includes infants, toddlers, and adolescents.

In this specification, a user is a person who uses the device 1 or a system including it, and is either the subject or a third party related to the subject. The third party may be, for example, a guardian of the subject.

The original image is an image showing the subject's face that has not yet undergone age conversion. The face in the original image is the subject's original face at the time the image was acquired. The subject's age corresponding to that face is the age before conversion and is referred to as the original age. For example, the age of the subject in the original image is the subject's age when the image was captured.

In this specification, the target age is the age for which the age-converted face desired by the user is generated. The target age may be an age desired, for the job image, by the subject (e.g., a child) or the user (e.g., a parent), or an age preset in the device 1.

In one embodiment, the target age may be an age different from the subject's age at the time of capture, such as the age the child will be upon growing into an adult, as shown in FIG. 1.

In this specification, an adult generally refers to a person who has reached the age of majority. The age corresponding to an adult is not limited to a specific age after majority and may be expressed as a range covering one or more ages. In certain embodiments, the age corresponding to an adult may also differ by occupation.

FIG. 2 is a conceptual configuration diagram of a job image generating device according to an embodiment of the present invention.

Referring to FIG. 2, the job image generating device 1 includes a control unit 50, an age conversion unit 60, and an image compositing unit 70. In some embodiments, the device 1 may further include at least some of an imaging device 9, an input device 10, a transceiver 20, a memory 30, a display device 40, and an image printing device 80.

The device 1 according to the embodiments may be entirely hardware, entirely software, or partly hardware and partly software. For example, a system may collectively refer to hardware with data processing capability and the operating software that drives it. In this specification, terms such as "unit," "module," "device," and "system" are intended to refer to a combination of hardware and the software driven by that hardware. For example, the hardware may be a data-processing computing device including a central processing unit (CPU), a graphics processing unit (GPU), or another processor. The software may refer to a running process, an object, an executable, a thread of execution, a program, and the like.

The imaging device 9 is configured to photograph a subject and acquire an image of the subject. The imaging device 9 includes various imaging apparatus, such as an image sensor, a still camera, or a video camera. The still camera may include an instant camera, such as a Polaroid camera.

The input device 10 is configured to receive data, information, commands, and the like related to the operation of generating a job image. The input device 10 includes, for example, a mouse, a keyboard, a microphone, a touch sensor, or a gesture sensor, but is not limited to these and may be implemented as any of various devices capable of receiving user input.

송수신장치(20)는 사용자의 모바일 기기와 유/무선 전기적 연결을 통해 데이터를 송/수신하도록 구성된다. 여기서, 상기 모바일 기기는 사용자가 휴대할 수 있는 전자 기기로서, 예를 들어, 스마트 폰, 셀룰러 폰, 스마트 글래스, 스마트 워치, 웨어러블 장치, 디지털 카메라, 태블릿 등을 포함한다. 상기 유/무선 전기적 연결은, 예를 들어 3G, 4G, 5G, LTE, WiFi, Bluetooth 등과 같은, 다양한 방식을 포함한다. The transceiver 20 is configured to transmit/receive data through a wired/wireless electrical connection with a user's mobile device. Here, the mobile device is an electronic device that a user can carry, and includes, for example, a smart phone, a cellular phone, smart glasses, a smart watch, a wearable device, a digital camera, a tablet, and the like. The wired/wireless electrical connection includes, for example, various methods, such as 3G, 4G, 5G, LTE, WiFi, Bluetooth, and the like.

메모리(30)는 나이 변환 또는 직업영상 생성을 위해 이용되는 데이터를 저장할 수 있다. 상기 나이 변환을 위해 이용되는 데이터는, 미리 획득한 상기 원본영상과 같은, 나이 변환의 기초 영상, 나이 변환을 위한 모델 등을 포함한다. 상기 직업영상 생성을 위해 이용되는 데이터는 직업별 배경영상, 직업영상 생성을 위한 모델 등을 포함한다. The memory 30 may store data used for age conversion or job image generation. The data used for age transformation includes a basic image of age transformation, a model for age transformation, and the like, such as the previously obtained original image. The data used for generating the job image includes a background image for each job, a model for generating the job image, and the like.

The display device 40 outputs information stored and/or processed by the device 1. In one embodiment, the display device 40 may display an image stored in the device 1 or an image processed by the device 1. It may also display a user interface that receives user input for generating a job image.

표시장치(40)는 LCD, OLED, 플렉서블 스크린 등과 같은 다양한 표시 장치를 포함할 수 있다. The display device 40 may include various display devices such as an LCD, an OLED, a flexible screen, and the like.

Although the input device 10 and the display device 40 are shown as separate in FIG. 2, in one embodiment they may be implemented as a single component that both receives input and outputs information. For example, the input device 10 and the display device 40 may be implemented as a touch screen in which a touch panel forms a layered structure with the screen.

The control unit 50 controls the overall operation of the device 1. For example, the control unit 50 provides information obtained through the user's interaction with the input device 10, and/or information obtained by transmitting and receiving data to and from an external device (e.g., the user's mobile device), to the age conversion unit 60 so that the device 1 generates an age-converted image, or to the image synthesis unit 70 so that the device 1 generates a job image.

The age conversion unit 60 is configured so that, when the job image generating device 1 receives the original image, the original age information, and the target age information, it performs age conversion processing on the original image to generate an age-converted image in which the face of the subject included in the original image is replaced with the face the subject is expected to have at the target age.

In addition, the image synthesis unit 70 is configured to generate, based on the age-converted image (e.g., generated by the age conversion unit 60) and the background image, a job image that has the face at the target age while expressing the corresponding job.

나이 변환부(60) 및 영상 합성부(70)의 동작에 대해서는 아래의 도 4 등을 참조하여 보다 상세하게 서술한다. The operation of the age converting unit 60 and the image synthesizing unit 70 will be described in more detail with reference to FIG. 4 below.

영상 인쇄기기(80)는 원본영상, 나이변환 영상, 배경영상, 직업영상 등을 포함한, 상기 장치(1)에서 획득된 영상을 인쇄를 통해 출력하도록 구성된다. The image printing device 80 is configured to output the image obtained by the device 1 through printing, including an original image, an age conversion image, a background image, a job image, and the like.

The printing includes printing on paper, printing as a photograph (i.e., photo printing), and printing in plastic. The image printing device 80 includes, for example, a black-and-white printer or a color printer that receives image data and prints it on paper, a 3D printer that prints in plastic, and/or a photo printing device (e.g., of the kind used in a Polaroid instant camera), but is not limited to these devices.

The device 1 acquires the original image to be age-converted by means of an image acquisition device (e.g., the photographing device 9 or the transceiver 20). For example, the device 1 acquires the original image by photographing the subject's face with the photographing device 9, or receives the original image through electrical communication from an external device (e.g., the user's mobile device or an external computer) via the transceiver 20.

상기 장치(1)는 대상의 원본나이를 결정하도록 구성된다. 또한, 상기 장치(1)는 대상의 성별을 더 결정할 수 있다. The device 1 is configured to determine the original age of the subject. Also, the device 1 may further determine the gender of the subject.

일 실시예에서, 상기 장치(1)는 (예컨대, 입력장치(10)에 의해 획득된) 원본나이 정보 및/또는 성별 정보를 포함한 사용자 입력에 기초하여 대상의 원본나이 및/또는 성별을 결정한다. 상기 사용자 입력은 대상의 세부 사항에 대한 입력일 수 있다. 일부 실시예에서, 상기 대상의 세부 사항에 대한 사용자 입력은 대상의 성명, 식별번호 등을 더 포함한다. In one embodiment, the device 1 determines the original age and/or gender of the subject based on a user input including original age information and/or gender information (eg, obtained by the input device 10 ). . The user input may be an input for details of an object. In some embodiments, the user input for details of the subject further includes the subject's name, identification number, and the like.

In another embodiment, the device 1 may receive only the original image, analyze the subject's face in the original image to calculate an age value for the subject, and generate a job image based on the calculated age value. The device 1 may further determine the subject's gender by analyzing the subject's face in the original image. The calculation of the age value and the determination of gender may be performed based on at least part of the analysis used for age conversion.

또 다른 일 실시예에서, 상기 장치(1)는 전술한 사용자 입력 및 원본영상 분석에 기초하여 대상의 원본나이와 성별을 결정할 수 있다. 예를 들어, 원본나이는 사용자 입력에 의해 획득하고, 성별은 원본영상 분석에 기초하여 획득된다. In another embodiment, the device 1 may determine the original age and gender of the subject based on the above-described user input and original image analysis. For example, the original age is obtained by a user input, and the gender is obtained based on the analysis of the original image.
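
The three embodiments above (user input only, image analysis only, or a mix of both) can be illustrated with a small sketch. The names `SubjectInfo` and `resolve_subject_info` are hypothetical, and the rule that a user-entered value takes priority over an estimated one is an assumption made for illustration; the text only states that either source may supply either value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubjectInfo:
    age: Optional[int] = None      # original age in years, if known
    gender: Optional[str] = None   # e.g. "male" or "female", if known


def resolve_subject_info(user_input: SubjectInfo,
                         image_estimate: SubjectInfo) -> SubjectInfo:
    """Combine user input with values estimated from the original image.

    Assumption: a value entered by the user wins; the image-based
    estimate is used only for fields the user left empty.
    """
    age = user_input.age if user_input.age is not None else image_estimate.age
    gender = (user_input.gender if user_input.gender is not None
              else image_estimate.gender)
    return SubjectInfo(age=age, gender=gender)
```

For instance, when the user enters only the age, the gender field falls back to the value obtained from image analysis, which matches the example given in the text.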

For example, the device 1 is configured to apply an input image (i.e., the original image) to a CNN-based deep learning network to calculate age and/or gender information from the face region. The CNN-based deep learning network has a structure that includes a plurality of convolutional layers and a fully connected layer, and recognizes age and/or gender using the extracted age features and a classifier or regressor.

상기 CNN 기반 딥러닝 네트워크는 얼굴 영상과 하나 이상의 레이블 데이터를 포함한 훈련 샘플의 세트에 의해 학습된다. 상기 훈련 샘플의 세트 내 각 훈련 샘플은 나이에 대한 레이블 데이터, 및/또는 성별에 대한 레이블 데이터를 각각 포함한다. The CNN-based deep learning network is trained on a set of training samples including face images and one or more label data. Each training sample in the set of training samples includes label data for age, and/or label data for gender, respectively.

일부 실시예에서, 상기 장치(1)는 입력 영상(즉, 원본영상)에서 얼굴 영역을 검출하는 동작 및/또는 정규화 동작을 더 수행할 수 있다. 이러한 전처리 과정에 의해 얼굴 영역만 상기 CNN 기반 딥러닝 네트워크에 적용될 수 있다. In some embodiments, the apparatus 1 may further perform an operation of detecting a face region in an input image (ie, an original image) and/or a normalization operation. Only the face region can be applied to the CNN-based deep learning network by this preprocessing process.
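
The final recognition stage described above (a classifier or regressor applied to the features produced by the network) might look like the following sketch. It covers only the output head, not the convolutional layers. Reading the age as the probability-weighted mean over age classes is one common choice and is an assumption here, as are the function names and the label order.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def age_from_class_scores(logits: Sequence[float],
                          class_ages: Sequence[float]) -> float:
    """Estimate age as the probability-weighted mean of the class ages."""
    probs = softmax(logits)
    return sum(p * a for p, a in zip(probs, class_ages))


def gender_from_scores(logits: Sequence[float],
                       labels: Sequence[str] = ("female", "male")) -> str:
    """Pick the gender label with the highest probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best]
```

A pure regressor would instead output a single age value directly; the choice between the two is left open in the text.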

또한, 상기 장치(1)는 직업이 표현된 배경영상을 선택하는 입력을 수신하도록 구성될 수 있다. 배경영상은 상기 직업의 특성을 표현하도록 구성되며, 상기 대상과 상이한, 해당 직업을 갖는 사람의 얼굴의 적어도 일부를 포함할 수 있다. 예를 들어, 배경영상은 다른 사람이 해당 직업의 유니폼(예컨대, 경찰 유니폼, 또는 의사 가운)을 입은 사진일 수 있다. Also, the device 1 may be configured to receive an input for selecting a background image in which a job is expressed. The background image is configured to express the characteristics of the job, and may include at least a part of a face of a person having a corresponding job, which is different from the target. For example, the background image may be a picture of another person wearing a uniform (eg, a police uniform or a doctor's gown) for a corresponding job.

일 실시예에서, 상기 장치(1)는 배경영상을 선택하는 입력에 기초하여 상기 대상의 직업영상을 위한 배경영상을 결정할 수 있다. 이를 위해, 상기 장치(1)는 직업이 표현된 하나 이상의 후보 배경영상 중에서 배경영상을 선택하게 하는 인터페이스 화면을 제공하도록 구성된다. In an embodiment, the device 1 may determine a background image for the job image of the target based on an input for selecting the background image. To this end, the device 1 is configured to provide an interface screen for selecting a background image from among one or more candidate background images in which a job is expressed.

도 3은, 일 실시예에 따른, 배경영상을 선택하게 하는 인터페이스 화면을 도시한 도면이다. 3 is a diagram illustrating an interface screen for selecting a background image, according to an embodiment.

도 3의 인터페이스 화면은 하나 이상의 후보 배경영상을 표시하여 배경영상 결정을 위한 사용자의 입력을 유도한다. 사용자 입력이 수신되면, 상기 장치(1)는 입력에 대응하는 후보 배경영상을 직업영상을 위한 배경영상으로 결정한다. The interface screen of FIG. 3 induces a user's input for determining the background image by displaying one or more candidate background images. When a user input is received, the device 1 determines a candidate background image corresponding to the input as a background image for a job image.

상기 인터페이스 화면은 표시장치(40)의 해상도, 표시 크기 등에 따라 일부가 사용자에게 표시될 수 있다. 사용자에게 표시되지 않은 다른 부분은 화면 이동 명령에 의해 사용자에게 더 제공된다. A part of the interface screen may be displayed to the user according to the resolution and display size of the display device 40 . Other parts that are not displayed to the user are further provided to the user by the screen moving command.

상기 후보 배경영상은 직업영상을 생성하기 위해 사용될 배경영상으로 선택될 수 있는 후보 영상이다. 상기 배경영상 후보는 특정 직업군(또는 직업)을 대표하는 적어도 하나의 인물 사진일 수 있다. 예를 들어, 도 3에 도시된 바와 같이, 배경영상 후보는 유명인, 세계 위인의 사진을 포함한다. 상기 장치(1)에서 상기 후보 배경영상은 특정 직업에 미리 연관된다. 또한, 상기 후보 배경영상은 특정 직업이 속한 직업군에 미리 더 연관될 수 있다. 또한, 상기 후보 배경영상은 연관된 직업의 주된 성별이 더 연관될 수 있다. 즉, 후보 배경영상은 직업, 직업군, 및 성별 중 적어도 직업에 연관되며, 연관된 요소에 기초하여 분류 또는 검색될 수 있다. The candidate background image is a candidate image that can be selected as a background image to be used to generate a job image. The background image candidate may be at least one portrait of a person representing a specific occupation group (or occupation). For example, as shown in FIG. 3 , the background image candidates include photos of celebrities and world greats. In the device 1, the candidate background image is previously associated with a specific job. Also, the candidate background image may be further related in advance to a job group to which a specific job belongs. In addition, the candidate background image may be further related to the main gender of the related job. That is, the candidate background image is related to at least a job among a job, a job group, and a gender, and may be classified or searched based on the related element.

상기 도 3과 같은, 배경영상을 위한 인터페이스 화면은 다양한 구조로 구성된다. As shown in FIG. 3, the interface screen for the background image has various structures.

일 예에서, 상기 배경영상을 위한 인터페이스 화면은 미리 저장된 후보 배경영상 중 적어도 일부를 포함하도록 구성된다. In one example, the interface screen for the background image is configured to include at least a portion of the candidate background images stored in advance.

다른 일 예에서, 상기 배경영상을 위한 인터페이스 화면은, 사용자의 직업(또는 직업군) 선택 이후, 선택된 해당 직업(또는 직업군)에 관련된 후보 배경영상 중 적어도 일부를 포함하도록 구성된다. In another example, the interface screen for the background image is configured to include at least some of the candidate background images related to the selected job (or job group) after the user's job (or job group) is selected.

In that case, the device 1 is further configured to provide, before providing the interface screen for determining the background image, a selection screen that displays a job-related list from which a desired job is selected. The job-related list includes one or more job items (listings) and/or one or more job group items. When the device 1 receives a selection input for a specific job item, it provides the interface screen displaying one or more candidate background images related to that job item. When the device 1 receives a selection input for a specific job group item, it either provides a sub-list of the job items included in that job group, or provides the interface screen displaying one or more candidate background images related to the jobs included in that job group.

또는, 상기 장치(1)는 상기 직업 관련 목록에서 사용자가 선택한 직업(또는 직업군)에 대하여 미리 결정된 대표 영상을 자동으로 배경영상으로 결정하도록 더 구성될 수 있다. Alternatively, the device 1 may be further configured to automatically determine a predetermined representative image for the occupation (or occupation group) selected by the user from the occupation-related list as the background image.
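
The selection flow above can be sketched as a small dispatch function. The catalogue contents, the identifiers `JOB_GROUPS` and `BACKGROUNDS`, and the rule that the first stored image serves as the representative image are all hypothetical and serve only to illustrate the branching.

```python
from typing import Dict, List, Tuple

# Hypothetical catalogue: job group -> jobs, and job -> candidate background ids.
JOB_GROUPS: Dict[str, List[str]] = {
    "public safety": ["police officer", "firefighter"],
}
BACKGROUNDS: Dict[str, List[str]] = {
    "police officer": ["police_01", "police_02"],
    "firefighter": ["fire_01"],
}


def handle_selection(item: str, expand_groups: bool = True) -> Tuple[str, List[str]]:
    """Return what to show next for a selected job item or job group item.

    A job group yields either a sub-list of its jobs or, when
    expand_groups is False, the candidates of all jobs in the group.
    A job item yields its candidate background images.
    """
    if item in JOB_GROUPS:
        jobs = JOB_GROUPS[item]
        if expand_groups:
            return ("sub_list", jobs)
        images = [img for job in jobs for img in BACKGROUNDS.get(job, [])]
        return ("candidates", images)
    return ("candidates", BACKGROUNDS.get(item, []))


def representative_background(job: str) -> str:
    """Automatic alternative: use the first stored image as the representative."""
    return BACKGROUNDS[job][0]
```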

또 다른 일 예에서, 상기 배경영상을 위한 인터페이스 화면은 성별에 기초하여 구성된다. 배경영상의 헤어스타일, 의상 등은 성별에 따라 상이할 수 있어, 대상의 성별과 동일한 성별이 나타난 배경영상을 선택하는 것이 자연스러운 합성 영상을 생성하는데 요구된다.In another example, the interface screen for the background image is configured based on gender. Hairstyles, clothes, etc. of the background image may be different depending on gender, so selecting a background image showing the same gender as the subject's gender is required to generate a natural synthetic image.

When the subject's gender information has been obtained before the interface screen for determining the background image is provided, the device 1 selects the candidate background images preset as being associated with that gender. The device 1 composes the interface screen from the candidate background images associated with that gender and provides it to the user. That is, the device 1 is further configured to filter the candidate background images by their preset association with the determined gender and to provide at least some of the filtered candidate background images. For example, when the subject's gender is determined to be male, the device 1 provides an interface screen including candidate background images related to jobs predetermined as being associated with males.

전술한 바와 같이, 대상의 성별은 사용자 입력 및/또는 원본영상 분석에 기초하여 획득될 수 있다. 상기 장치(1)가 대상의 성별 정보를 미리 획득한 경우, 배경영상의 성별을 선택하는 입력이 별도로 요구되지 않는다. As described above, the gender of the subject may be obtained based on a user input and/or analysis of an original image. When the device 1 has previously obtained the subject's gender information, an input for selecting the gender of the background image is not separately required.

일부 실시예에서, 배경영상 선택 이전에 대상의 성별이 미리 획득되지 않은 경우, 상기 장치(1)에서 배경영상을 선택하기 위한 사용자 입력은 성별을 선택하는 입력을 포함할 수 있다. 예를 들어, 상기 장치(1)는 배경영상의 성별을 선택하게 하는 인터페이스를 우선 제공한 이후에, 선택된 성별을 갖는 직업영상을 선택하게 하는 인터페이스를 제공할 수 있다. In some embodiments, when the gender of the subject is not previously obtained prior to selecting the background image, the user input for selecting the background image in the device 1 may include an input for selecting the gender. For example, the device 1 may provide an interface for selecting a job image having the selected gender after first providing an interface for selecting the gender of the background image.

또한, 상기 장치(1)는 전술한 예들의 조합에 따른 인터페이스 화면을 제공하도록 구성될 수 있다. 예를 들어, 상기 장치(1)는 직업(또는 직업군) 선택 및/또는 성별에 기초하여 후보 배경영상을 우선 필터링 한 이후, 필터링된 후보 배경영상 중에서 직업영상을 위한 배경영상을 선택하게 하는 인터페이스 화면을 제공하도록 구성될 수 있다. Also, the device 1 may be configured to provide an interface screen according to a combination of the above-described examples. For example, the device 1 first filters a candidate background image based on occupation (or occupation group) selection and/or gender, and then interfaces to select a background image for the occupation image from among the filtered candidate background images. It may be configured to provide a screen.
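
The combined filtering by job, job group, and gender can be sketched as follows. `CandidateBackground` and `filter_candidates` are hypothetical names, and keeping candidates that have no gender association regardless of the requested gender is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CandidateBackground:
    image_id: str
    job: str                          # every candidate is associated with at least a job
    job_group: Optional[str] = None   # optional further association
    gender: Optional[str] = None      # optional further association


def filter_candidates(candidates: Iterable[CandidateBackground],
                      job: Optional[str] = None,
                      job_group: Optional[str] = None,
                      gender: Optional[str] = None) -> List[CandidateBackground]:
    """Keep only the candidates matching every criterion that was supplied."""
    result = []
    for c in candidates:
        if job is not None and c.job != job:
            continue
        if job_group is not None and c.job_group != job_group:
            continue
        # Assumption: a candidate without a gender association fits any gender.
        if gender is not None and c.gender is not None and c.gender != gender:
            continue
        result.append(c)
    return result
```

The filtered list would then populate the interface screen from which the user picks the background image.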

다른 일 실시예에서, 상기 장치(1)는 원본영상 및/또는 아래의 나이변환 영상에 기초하여 자동으로 배경영상을 결정할 수 있다. In another embodiment, the device 1 may automatically determine the background image based on the original image and/or the age-converted image below.

배경영상은 원본영상 및/또는 나이변환 영상에 기초하여 자동으로 선택될 수 있다. 예를 들어, 배경영상은 원본영상의 영상 특성 및 나이변환 영상에서 얼굴 특성 중 하나 이상에 기초하여 선택될 수 있다. The background image may be automatically selected based on the original image and/or the age-converted image. For example, the background image may be selected based on at least one of image characteristics of the original image and facial characteristics in the age-converted image.

상기 원본영상의 영상 특성은 조명, 해상도, 선명도 등을 포함한다. 상기 장치(1)는 원본영상의 영상 특성을 산출하고, 유사한 특성을 갖는 배경영상을 직업영상을 위한 배경영상으로 선택한다. 예를 들어, 상기 장치(1)는 영상의 밝기 분석 기법, 영상 품질 평가(image quality assessment) 기법 등을 통해 원본영상의 특성을 산출할 수 있다. The image characteristics of the original image include lighting, resolution, sharpness, and the like. The device 1 calculates the image characteristics of the original image, and selects a background image having similar characteristics as a background image for the job image. For example, the device 1 may calculate the characteristics of the original image through an image brightness analysis technique, an image quality assessment technique, or the like.
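
As a rough illustration of selecting a background with similar image characteristics, the sketch below computes mean brightness for a grayscale image and picks the candidate whose feature vector is closest to that of the original image. The feature layout, the squared Euclidean distance, and the function names are assumptions; the text does not prescribe a particular similarity measure.

```python
from typing import Dict, Sequence


def mean_brightness(gray: Sequence[Sequence[int]]) -> float:
    """Average intensity of a grayscale image given as rows of 0-255 values."""
    total = sum(sum(row) for row in gray)
    count = sum(len(row) for row in gray)
    return total / count


def pick_most_similar(original_features: Sequence[float],
                      candidates: Dict[str, Sequence[float]]) -> str:
    """Return the candidate id whose features are nearest to the original's."""
    def squared_distance(name: str) -> float:
        feats = candidates[name]
        return sum((a - b) ** 2 for a, b in zip(original_features, feats))
    return min(candidates, key=squared_distance)
```

In practice the features would need to be scaled to comparable ranges so that no single characteristic dominates the distance.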

상기 나이변환 영상의 얼굴 특성은, 얼굴 형태(예컨대, 둥근형, 타원형 등), 헤어 특성(직모, 곱슬, 가르마 방향), 귀 모양, 이마 모양, 얼굴 포즈 등을 포함하나, 이에 제한되진 않는다. The facial characteristics of the age conversion image include, but are not limited to, a face shape (eg, a round shape, an oval shape, etc.), a hair characteristic (straight hair, curly hair, a parted direction), an ear shape, a forehead shape, a face pose, and the like.

일 실시예에서, 상기 장치(1)는 아래에서 서술될 나이변환 영상의 얼굴 특성을 산출하고, 유사한 특성을 갖는 배경영상을 직업영상을 위한 배경영상으로 선택하도록 구성될 수 있다. 상기 장치(1)는, 예를 들어, 나이 변환 시 사용된 얼굴의 특징점 정보를 기반으로 얼굴 형태 및 얼굴 포즈를 산출할 수 있다. 또한, 상기 장치(1)는 예를 들어 영상 분할(image segmentation) 기법을 통해 귀 모양, 이마 모양, 헤어 특성 등을 산출할 수 있다. In one embodiment, the device 1 may be configured to calculate the facial characteristics of an age conversion image to be described below, and to select a background image having similar characteristics as a background image for a job image. The apparatus 1 may calculate a face shape and a face pose based on, for example, information on feature points of a face used in age conversion. Also, the device 1 may calculate an ear shape, a forehead shape, a hair characteristic, etc. through, for example, an image segmentation technique.

상기 직업영상 생성 장치(1)는 상기 원본영상, 원본 나이 정보, 및 목표 나이 정보를 수신하면, 상기 원본영상을 나이변환 처리하여 상기 원본영상에 포함된 대상의 얼굴을 상기 대상이 목표 나이가 될 경우 가질 것으로 예상되는 얼굴을 갖는 나이변환 영상을 생성할 수 있다. When the occupational image generating device 1 receives the original image, the original age information, and the target age information, the age conversion process is performed on the original image so that the face of the target included in the original image becomes the target age. In this case, it is possible to generate an age-converted image having a face expected to have.

나이 변환부(60)는 입력영상에 연관된 나이(예컨대, 원본 나이) 및 목표 나이에 기초하여 입력영상을 나이변환 처리하는 동작을 수행하도록 구성된다. The age conversion unit 60 is configured to perform age conversion processing on the input image based on an age (eg, an original age) associated with the input image and a target age.

전술한 바와 같이, 상기 장치(1)에서 원본 나이는 (예컨대, 입력장치(10) 또는 송수신장치(20)를 통해) 획득된 대상의 나이 정보에 기초하여 결정되거나, 또는 원본 영상으로부터 자동으로 측정될 수 있다. As described above, the original age in the device 1 is determined based on the acquired age information of the subject (eg, through the input device 10 or the transceiver 20), or automatically measured from the original image. can be

일 실시예에서, 상기 장치(1)에서 목표 나이는 (예컨대, 입력장치(10) 또는 송수신장치(20)를 통해) 획득된 목표 나이 정보에 기초하여 결정될 수 있다. In an embodiment, the target age in the device 1 may be determined based on the acquired target age information (eg, through the input device 10 or the transceiver 20 ).

다른 일 실시예에서, 상기 장치(1)에서 목표 나이는 배경영상 별로 미리 설정될 수 있다. 예를 들어, 의사의 경우 변환될 나이가 40세, 경찰의 경우 25세와 같이, 직업별로 미리 설정될 수 있다. 직업군, 업무, 직종, 근속년수, 연봉, 복지 등 직업 관련 특성에 따라 해당 직업을 종사하는 사람의 나이는 상이하기 때문이다. 이 경우, 상기 배경영상을 선택하기 위한 입력이 수신되면, 상기 배경영상에 대하여 미리 설정된 변환 나이가 목표 나이로 결정된다. 이 경우, 목표 나이 정보에 대한 입력은 불필요하므로, 별도의 목표 나이에 대한 입력은 생략될 수 있다. In another embodiment, the target age in the device 1 may be preset for each background image. For example, the age to be converted may be preset for each occupation, such as 40 years old for a doctor and 25 years old for a police officer. This is because the age of a person engaged in the relevant occupation differs according to occupation-related characteristics such as occupation group, job, occupation, years of service, annual salary, and welfare. In this case, when an input for selecting the background image is received, a preset conversion age for the background image is determined as the target age. In this case, since input of target age information is unnecessary, a separate input of target age may be omitted.

상기 목표 나이는 각 직업의 평균 연령으로 설정될 수 있다. 상기 장치(1)는 직업 관련 특성을 내부 저장 장치에 저장하고 있거나, 외부로부터 상기 목표 나이를 수신함으로써, 목표 나이로의 얼굴 변환을 수행할 수 있다. The target age may be set as an average age of each job. The device 1 may perform face transformation into a target age by storing job-related characteristics in an internal storage device or by receiving the target age from the outside.
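
A minimal sketch of determining the target age follows. The table holds the two example values from the text (40 for a doctor, 25 for a police officer); letting an explicitly entered target age override the preset is an assumption that merges the two embodiments for illustration.

```python
from typing import Dict, Optional

# Preset target ages per job; 40 and 25 are the examples given in the text.
PRESET_TARGET_AGE: Dict[str, int] = {"doctor": 40, "police officer": 25}


def determine_target_age(job: str, user_target_age: Optional[int] = None) -> int:
    """Use the user's target age if one was entered, else the job's preset."""
    if user_target_age is not None:
        return user_target_age
    return PRESET_TARGET_AGE[job]
```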

In addition, a plurality of target ages may be set for each job, because one factor that greatly influences the choice of a job is whether the job can be pursued over the long term. To this end, for example, a first target age, a second target age, and a third target age may be set for each job. Here, the first target age represents the age of people with low rank or little experience in the job (e.g., the average age of new employees, or the average age of those with three years of experience or less); the third target age represents the age of people with very high rank or long experience (e.g., the average age of executives, or the average age of those with 20 or more years of experience); and the second target age represents the age of workers who fall between the first target age and the third target age.

In addition, when there are a plurality of target ages for a job, the device 1 may use the background image associated with each target age to generate the job image, because the work performed, the attire, the environment, and so on change with career stage even within the same job. For example, when police officer is selected as the job, the background associated with the first target age is configured to express the attire indicating the rank typical of the first target age, the place where duties are mainly performed at that rank, and the like. The job characteristic data for each job, including such places and ranks, uses the results of statistical analysis based on various job information.
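
The pairing of several target ages with their own background images could be represented as below. The stage names, the ages other than 25, and the background identifiers are placeholders for illustration only and are not taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CareerStage:
    name: str            # e.g. entry, mid, senior
    target_age: int      # target age for this career stage
    background_id: str   # background image associated with this target age


# Placeholder data: three stages for one job.
STAGES: Dict[str, List[CareerStage]] = {
    "police officer": [
        CareerStage("entry", 25, "police_patrol"),
        CareerStage("mid", 38, "police_station"),
        CareerStage("senior", 52, "police_hq"),
    ],
}


def plan_job_images(job: str) -> List[Tuple[int, str]]:
    """List the (target age, background id) pairs to render for a job."""
    return [(s.target_age, s.background_id) for s in STAGES[job]]
```

Each pair would drive one age conversion and one synthesis, yielding one job image per career stage.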

The original age and the target age may be expressed in various ways. In one example, both ages may be expressed as ages counted from birth (e.g., 7 years old and 25 years old). In another example, one of the two ages may serve as a reference age and the other may be expressed relative to it (e.g., 7 years old, and 18 years older than that).

Thus, the target age may be a value expressed specifically in units of one year, but the target age of the present invention is not limited thereto. In another embodiment, the target age may be set to an age corresponding to a group consisting of a plurality of different ages (e.g., an age band such as the 20s, 30s, or 40s). The age corresponding to the group may be the maximum value, the minimum value, the median value, or the mode within the group, a representative value defined by the user, or the like.
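
The two ways of expressing an age, and the representative value of an age group, reduce to simple arithmetic. The sketch below uses hypothetical function names; the 7 and 18 in the usage note are the example values from the text.

```python
import statistics
from typing import Sequence


def absolute_target_age(reference_age: int, years_older: int) -> int:
    """Convert a relative expression (reference age plus offset) to an age from birth."""
    return reference_age + years_older


def representative_age(ages: Sequence[int], kind: str = "median") -> float:
    """Representative value of an age group: max, min, median, or mode."""
    if kind == "max":
        return max(ages)
    if kind == "min":
        return min(ages)
    if kind == "median":
        return statistics.median(ages)
    if kind == "mode":
        return statistics.mode(ages)
    raise ValueError(f"unknown kind: {kind}")
```

With these helpers, `absolute_target_age(7, 18)` gives 25, and the ages 20 through 29 have a minimum of 20, a maximum of 29, and a median of 24.5.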

In one embodiment, the device 1 may determine a single target age or a plurality of target ages for the corresponding job before the age conversion unit 60 performs the face conversion, or after a job selection command is received (e.g., through the input device 10 or the transceiver 20).

일 실시예에서, 나이 변환부(60)는 얼굴 텍스쳐(texture) 및 모양(shape)에 기초하여 나이변환 동작을 수행한다. In an embodiment, the age conversion unit 60 performs an age conversion operation based on a facial texture and shape.

도 4는, 본 발명의 일 실시예에 따른, 나이 변환부의 동작을 설명하기 위한 도면이다. 4 is a view for explaining the operation of the age conversion unit, according to an embodiment of the present invention.

Referring to FIG. 4, the age conversion unit 60 is configured to perform: extracting a plurality of landmarks from the original face (S410); generating the texture at the target age from the texture at the original age using a pre-trained texture conversion model (S420); generating the shape at the target age using a pre-trained age function (S440); and generating the face at the target age based on the texture and the shape at the target age (S440).
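
The data flow between the four steps can be sketched as a function that chains them. The four callables are hypothetical stand-ins for the landmark extractor, the texture conversion model, the age function, and the final composition; their signatures are assumptions, and no model internals are implied.

```python
from typing import Any, Callable


def age_convert(original_face: Any,
                original_age: int,
                target_age: int,
                extract_landmarks: Callable[[Any], Any],
                transform_texture: Callable[[Any, Any, int, int], Any],
                generate_shape: Callable[[Any, int, int], Any],
                compose_face: Callable[[Any, Any], Any]) -> Any:
    """Chain the steps: landmarks, target texture, target shape, final face."""
    landmarks = extract_landmarks(original_face)
    target_texture = transform_texture(original_face, landmarks,
                                       original_age, target_age)
    target_shape = generate_shape(landmarks, original_age, target_age)
    return compose_face(target_texture, target_shape)
```

Passing the steps in as callables keeps the sketch independent of any particular model or library.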

나이 변환부(60)는 나이변환 얼굴 영상을 생성하기 위해 원본영상에 포함된 원본얼굴로부터 랜드마크를 추출한다(S410). The age conversion unit 60 extracts a landmark from the original face included in the original image to generate an age-converted face image (S410).

단계(S410)에서, 얼굴의 랜드마크의 추출은 원본영상 내 원본얼굴에 해당하는 얼굴 영역을 결정한 이후에 수행될 수 있다. 이 경우, Haar, NN(Neural Network), SVM(Support Vector Machine), Gabor, SIFT 등과 같은, 영상으로부터 특정 영역을 검출하는 다양한 영역 검출 기법을 이용할 수 있으나, 이에 제한되진 않는다. In step S410, the extraction of the landmark of the face may be performed after determining the face region corresponding to the original face in the original image. In this case, various region detection techniques for detecting a specific region from an image, such as Haar, Neural Network (NN), Support Vector Machine (SVM), Gabor, and SIFT, may be used, but the present invention is not limited thereto.

상기 랜드마크는 눈, 코, 입, 귀 등과 같은 얼굴 해부학적 특징에 연관된 정보이다. 상기 랜드마크는 얼굴 내에서 일정한 상대 위치를 가지며, 얼굴 포즈에 따른 기하학적 관계의 변함이 적다. 여기서, 포즈는 얼굴의 표정, 또는 얼굴의 회전 방향, 기울임 각도 등을 나타낸다. The landmark is information related to facial anatomical features such as eyes, nose, mouth, and ears. The landmark has a constant relative position within the face, and the geometric relationship according to the face pose is less changed. Here, the pose indicates a facial expression, a rotation direction of the face, an inclination angle, or the like.

일 실시예에서, 상기 랜드마크는 눈의 중심, 코의 중심, 양 입 끝점, 얼굴 윤곽 중 간격이 가장 넓은 위치의 점, 턱 윤곽의 중심 등과 같이, 얼굴을 구별하기 위한 특성을 나타내는 점으로 추출될 수 있다. 예를 들어, 도 4에 도시된 바와 같이, 68개의 랜드마크 점을 포함한 랜드마크 세트가 추출될 수 있다. In one embodiment, the landmark is extracted as a point indicating characteristics for distinguishing the face, such as the center of the eyes, the center of the nose, the end points of both mouths, the point at the position with the widest interval among the face contours, the center of the chin contour, etc. can be For example, as shown in FIG. 4 , a landmark set including 68 landmark points may be extracted.

In step S410, the landmarks may be extracted by any of various landmark extraction algorithms capable of detecting facial landmarks in the face region. Such algorithms include, for example, the Active Contour Model (ACM), Active Shape Model (ASM), Active Appearance Model (AAM), Supervised Descent Method (SDM), or a neural network, but are not limited thereto. The age conversion unit 60 may detect the facial landmarks in the face region, but is not limited thereto.

직업영상 생성 장치(1)는 나이 변환부(60)를 통해 원본얼굴의 랜드마크의 위치 정보 및 식별 정보를 더 획득할 수 있다. 여기서 식별 정보는 각 랜드마크가 정의하는 해부학적 얼굴 특징의 정의를 포함한다. 예를 들어, 제1 랜드마크의 식별정보는 눈의 중심을 의미하는 정보를 포함하고, 제2 랜드마크의 식별정보는 입의 왼쪽 끝 점을 의미하는 정보를 포함할 수 있다. The occupational image generating apparatus 1 may further acquire location information and identification information of the landmark of the original face through the age conversion unit 60 . Here, the identification information includes the definition of anatomical facial features defined by each landmark. For example, the identification information of the first landmark may include information indicating the center of the eye, and identification information of the second landmark may include information indicating the left end point of the mouth.

추출된 랜드마크로부터 2차 정보가 더 획득될 수 있다. 예를 들어, 나이 변환부(60)는 델로니 삼각형(Delaunay triangles)을 이용하여 랜드마크 사이의 거리 정보를 더 획득할 수 있다. Secondary information may be further obtained from the extracted landmark. For example, the age converter 60 may further acquire distance information between landmarks using Delaunay triangles.
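
A landmark carrying both position and identification information, and the distances derived from a triangulation, might be represented as in the sketch below. It assumes the triangles (triples of landmark indices) have already been computed elsewhere, for example by a Delaunay triangulation routine; the `Landmark` class and `edge_lengths` function are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Landmark:
    index: int
    label: str   # identification info, e.g. "center of the eye"
    x: float     # position info
    y: float


def edge_lengths(landmarks: Iterable[Landmark],
                 triangles: Iterable[Tuple[int, int, int]]
                 ) -> Dict[Tuple[int, int], float]:
    """Distance between each pair of landmarks that share a triangle edge."""
    by_index = {lm.index: lm for lm in landmarks}
    lengths: Dict[Tuple[int, int], float] = {}
    for a, b, c in triangles:
        for i, j in ((a, b), (b, c), (c, a)):
            key = (min(i, j), max(i, j))   # store each undirected edge once
            if key not in lengths:
                p, q = by_index[i], by_index[j]
                lengths[key] = math.hypot(p.x - q.x, p.y - q.y)
    return lengths
```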

나이 변환부(60)는 상기 랜드마크가 추출된 원본얼굴(즉, 단계(S410)의 얼굴)로부터 상기 목표 나이에서의 얼굴 텍스쳐(texture)를 생성한다(S420). The age conversion unit 60 generates a facial texture at the target age from the original face from which the landmark is extracted (that is, the face in step S410) (S420).

상기 목표 나이에서의 텍스쳐는 상기 대상이 목표 나이가 되었을 경우 예상되는 얼굴 텍스쳐이다. 얼굴 텍스쳐는 관심 영역의 질감을 나타내는 정보로서, 텍스쳐는 얼굴 구성요소 별로 고유하기 때문에, 나이 변환에 활용하기 적합한 요소이다. 일 실시예에서, 텍스쳐는 패턴 데이터로 표현될 수 있다. The texture at the target age is a facial texture expected when the subject reaches the target age. The facial texture is information representing the texture of the region of interest, and since the texture is unique for each facial component, it is a suitable factor to be used for age transformation. In an embodiment, the texture may be expressed as pattern data.

The age conversion unit 60 extracts the texture of the original face from which the landmarks were extracted (that is, the texture at the original age) (S421), and applies the texture at the original age to a pre-trained texture transformation model to generate the facial texture at the target age (S423).

일 실시예에서, 원본나이에서의 텍스쳐는 일반 텍스쳐(normal texture)일 수 있다. 일반 텍스쳐는 별도의 전처리가 적용되지 않은 영상으로부터 추출된 텍스쳐이다. In an embodiment, the texture at the original age may be a normal texture. A normal texture is a texture extracted from an image to which a separate preprocessing is not applied.

다른 일 실시예에서, 원본나이에서의 텍스쳐는 무모양 얼굴 텍스쳐일 수 있다. 나이 변환부(60)는 원본영상의 얼굴 모양을 추출하고, 상기 원본영상의 얼굴의 평균 모양을 산출하며, 상기 평균 모양을 갖는 얼굴의 텍스쳐를 무모양 얼굴 텍스쳐로 생성한다(S421). 모양에 대한 나이변환 처리는 아래의 단계(S440)에서 수행되기 때문이다. In another embodiment, the texture at the original age may be a shapeless face texture. The age converter 60 extracts the face shape of the original image, calculates the average shape of the face of the original image, and generates a face texture having the average shape as a shapeless face texture (S421). This is because the age conversion process for the shape is performed in the following step (S440).

In one embodiment, the texture transformation model includes a plurality of sub-models, each modeled to output a facial texture at a specific age. For example, when the device 1 is configured to convert to a single preset target age (e.g., 40 years old) in response to an age conversion request, the texture transformation model may be configured to output a facial texture at the age of 40. On the other hand, when the device 1 is configured to convert to the target age included in the request (e.g., 25 or 60 years old), the texture transformation model includes a sub-model configured to output a facial texture at that target age. In this case, the device 1 is further configured to provide the user with information on the convertible age range before receiving the age conversion request.

상기 텍스쳐 변환 모델(또는 서브 모델)은 영상을 생성하면서 하나의 클래스에 대하여 학습 가능한 기계 학습 모델이다. 상기 텍스쳐 변환 모델은, 예를 들어 도 4에 도시된 바와 같이 GAN(Generative Adversarial Network) 기반 모델일 수 있으나, 이에 제한되진 않는다. The texture transformation model (or sub-model) is a machine learning model capable of learning for one class while generating an image. The texture transformation model may be, for example, a Generative Adversarial Network (GAN)-based model as shown in FIG. 4 , but is not limited thereto.

For clarity, the texture transformation process is described below using a sub-model that has a GAN structure and is configured to transform a texture to a specific age.

상기 GAN 구조를 갖는 서브 모델은 생성기(generator) 및 판별기(discriminator)를 포함한다. 상기 생성기는 입력 데이터에 노이즈를 적용하여 새로운 데이터를 출력하도록 구성된다. 상기 생성기는 실제 데이터와 유사한 데이터를 생성함으로써 판별기를 속여 그 유사한 데이터를 실제 데이터로 판별하게 하는 것을 목표로 가진다. 판별기는 상기 실제 데이터와 생성기의 출력 데이터를 식별하는 것을 목표로 가진다. The sub-model having the GAN structure includes a generator and a discriminator. The generator is configured to output new data by applying noise to the input data. The generator aims to trick the discriminator into discriminating the similar data as real data by generating data similar to real data. The discriminator aims to identify the actual data and the output data of the generator.

학습이 진행되면, 생성기와 판별기는 각각의 목표를 달성하기 위해 모델 내 파라미터를 갱신한다. 상기 판별기는 실수할 확률을 낮추기 위해 학습하고, 생성기는 임의의 노이즈로부터 출력한 데이터에 대해서 판별기가 실수할 확률을 높이기 위해 학습한다. 즉, 생성기와 판별기는 전술한 minimax problem을 풀기 위해 학습된다. As training progresses, the generator and discriminator update the parameters in the model to achieve their respective goals. The discriminator learns to reduce the probability of making a mistake, and the generator learns to increase the probability that the discriminator makes a mistake with respect to data output from random noise. That is, the generator and discriminator are trained to solve the aforementioned minimax problem.
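The minimax problem referred to above is not written out in this passage. For reference, the value function of the original GAN formulation (Goodfellow et al., 2014) is commonly written as follows. The symbols D (discriminator), G (generator), x (a real sample), and z (the noise input) are not used elsewhere in this document and are introduced here only for illustration.

```latex
\min_{G}\max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator maximizes V by classifying correctly, while the generator minimizes it by making D(G(z)) approach 1, which matches the two learning goals described above.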

In this training process, the discriminator receives feedback from the ground truth of the input values (i.e., the training data), and the generator receives feedback from the discriminator. The process of training a model with such a GAN structure is disclosed in Non-Patent Document 1 (Goodfellow, Ian J.; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). "Generative Adversarial Nets"), so a detailed description is omitted.

상기 텍스쳐 변환 모델은 복수의 텍스쳐 훈련 샘플을 이용하여 기계 학습된다. 각 텍스쳐 훈련 샘플은 특정 나이에서의 훈련 대상의 얼굴 텍스쳐를 각각 포함한다. 또한, 각 텍스쳐 훈련 샘플은 상기 훈련 대상의 나이를 나타내는 제1 레이블 데이터 및 훈련 대상의 성별을 나타내는 제2 레이블 데이터를 더 포함할 수 있다.The texture transformation model is machine-learned using a plurality of texture training samples. Each texture training sample includes a facial texture of a training target at a specific age, respectively. In addition, each texture training sample may further include first label data indicating the age of the training target and second label data indicating the gender of the training target.

일 실시예에서, 상기 텍스쳐 훈련 샘플은 상기 특정 나이를 갖는 훈련 대상의 얼굴로부터 획득된 무모양 얼굴 텍스쳐를 적어도 일부 포함한다. In an embodiment, the texture training sample includes at least a part of a shapeless facial texture obtained from a face of a training target having the specific age.

일 실시예에서, 상기 텍스쳐 변환 모델이 복수의 서브 모델을 포함할 경우, 상기 텍스쳐 변환 모델은 복수의 텍스쳐 훈련 샘플 세트를 이용하여 기계 학습된다. 각각의 서브 모델별 텍스쳐 훈련 샘플 세트는 서브 모델에 해당하는 특정 나이에서의 훈련 대상의 얼굴 텍스쳐를 포함한다. 일부 실시예에서, 각 텍스쳐 훈련 샘플 세트는 해당 나이를 갖는 훈련 대상의 얼굴로부터 획득한 무모양 얼굴 텍스쳐를 적어도 일부 포함한다.In an embodiment, when the texture transformation model includes a plurality of sub-models, the texture transformation model is machine-learned using a plurality of texture training sample sets. The texture training sample set for each sub-model includes a facial texture of a training target at a specific age corresponding to the sub-model. In some embodiments, each set of texture training samples includes at least a portion of shapeless facial textures obtained from the faces of a training subject having a corresponding age.

또한, 상기 일부 실시예에서, 각 텍스쳐 훈련 샘플은 특정 서브 모델에 해당하는 특정 나이를 나타내는 제1 레이블 데이터 및 훈련 대상의 성별을 나타내는 제2 레이블 데이터를 더 포함할 수 있다. In addition, in some embodiments, each texture training sample may further include first label data indicating a specific age corresponding to a specific sub-model and second label data indicating the gender of the training target.

이와 같이 각 서브 모델별로 사용되는 각각의 텍스쳐 훈련 샘플 세트는 동일한 특정 나이를 갖는 훈련 대상의 정보를 포함하므로, 각 서브 모델은 동일한 특정 나이를 갖는 훈련 샘플을 통해 학습된다. 따라서, 각 서브 모델은 해당 특정 나이에 연관되며, 결국 상기 장치(1)는 목표 나이에 연관된 서브 모델을 검색할 수 있다. As described above, since each texture training sample set used for each sub-model includes information on a training target having the same specific age, each sub-model is learned through a training sample having the same specific age. Accordingly, each sub-model is associated with a corresponding specific age, and eventually the device 1 may search for a sub-model associated with the target age.

한편, 각 서브 모델은 소정 범위를 갖는 연령대의 나이에 연관되도록 학습될 수 있다. 이 경우, 각 서브 모델에 사용되는 각각의 훈련 샘플 세트는 해당 연령대에 속하는 훈련 대상의 정보를 사용하여 학습된다. Meanwhile, each sub-model may be trained to be related to the age of an age group having a predetermined range. In this case, each training sample set used for each sub-model is learned using information of a training target belonging to the corresponding age group.

상기 GAN 구조를 갖는 서브 모델이 특정 나이에서의 복수의 텍스쳐 훈련 샘플을 포함한 텍스쳐 훈련 샘플 세트로 학습되는 경우, 복수의 텍스쳐 훈련 샘플 중 적어도 일부가 실제 데이터로 사용될 수 있다. When the sub-model having the GAN structure is trained with a texture training sample set including a plurality of texture training samples at a specific age, at least some of the plurality of texture training samples may be used as real data.

Because a fully trained generator is configured to output data as close as possible to the input data, the pre-trained sub-model can output a texture at the target age. Here, "close" output data means data whose vector distance (e.g., between feature vectors) from the facial texture of a person who actually is the target age is minimal. The generator transforms the input data so that it follows the data distribution at the specific age, and outputs the transformed data.

따라서, 목표 나이가 특정 나이인 경우, 생성기에서 출력되는 특정 나이에서의 변환 텍스쳐를 목표 나이에서의 텍스쳐로 사용할 수 있다.Accordingly, when the target age is a specific age, the transformed texture at the specific age output from the generator may be used as the texture at the target age.

일부 실시예에서, 무모양(shape free) 얼굴 텍스쳐를 텍스쳐 훈련 샘플로 사용한 경우, 상기 생성기는 목표 나이에서의 무모양 얼굴 텍스쳐에 가까운, 변환된 무모양 얼굴 텍스쳐를 출력한다. In some embodiments, when a shape free face texture is used as a texture training sample, the generator outputs a transformed shapeless face texture close to the shape free face texture at a target age.

다른 일 실시예에서, 상기 텍스쳐 변환 모델(또는 서브 모델)은 cycleGAN 기반 모델일 수 있다. In another embodiment, the texture transformation model (or sub-model) may be a cycleGAN-based model.

cycleGAN은 쌍을 이루지 않는 데이터로 학습되는 모델이다. CycleGAN은 두 생성기를 포함하며, 각각의 생성기는 서로 다른 도메인으로 변환한 데이터를 출력한다. 이를 위해, 각 생성기의 노이즈는 서로 상이할 수 있다. cycleGAN의 판별기는 각 생성기가 출력한 상이한 도메인의 데이터를 각각 식별하는 것을 목적으로 한다. cycleGAN is a model that is trained on unpaired data. CycleGAN includes two generators, and each generator outputs data converted to different domains. To this end, the noise of each generator may be different from each other. The discriminator of cycleGAN aims to identify the data of different domains output by each generator, respectively.

Because a sub-model with the cycleGAN structure has two generators, it is additionally trained on the basis of cycle consistency. Cycle consistency requires that when data translated from the first domain to the second domain is translated back to the first domain, the result be restored to the image it started from in the first domain.
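The cycle consistency condition can be stated as a loss term. The form below follows the usual CycleGAN formulation and is given only as a sketch; the symbols G (generator from the first domain X to the second domain Y) and F (generator from Y back to X) are assumptions introduced here and do not appear in this document.

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\lVert F(G(x)) - x \rVert_{1}\big]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\big[\lVert G(F(y)) - y \rVert_{1}\big]
```

The first term penalizes a texture that fails to return to itself after a round trip from the first domain, and the second term does the same for the second domain.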

이와 같이, cycleGAN 기반 텍스쳐 변환 모델을 이용할 경우, 쌍을 이루는 데이터 세트를 준비할 필요가 없게 되어, 얼굴의 나이 변환 모델을 보다 쉽게 모델링할 수 있다. In this way, when using the cycleGAN-based texture transformation model, there is no need to prepare a paired data set, so it is easier to model the age transformation model of the face.

상기 cycleGAN 기반 텍스쳐 변환 모델(또는 서브 모델)의 학습 과정 및 학습에 사용되는 훈련 샘플은 상기 GAN 기반 텍스쳐 변환 모델(또는 서브 모델)과 유사하므로 자세한 설명은 생략한다. Since the training process of the cycleGAN-based texture transformation model (or sub-model) and the training sample used for learning are similar to the GAN-based texture transformation model (or sub-model), a detailed description will be omitted.

단계(S421) 이후, 나이 변환부(60)는 상기 목표 나이에 대하여 미리 학습된 텍스쳐 변환 모델에 상기 무모양 얼굴 텍스쳐를 입력 데이터로 적용하여 변환된 목표 나이에서의 무모양 얼굴 텍스쳐를 생성한다(S423). After step S421, the age conversion unit 60 applies the shapeless face texture as input data to the texture transformation model learned in advance for the target age to generate a shapeless face texture at the converted target age ( S423).

In one embodiment, after receiving a specific target age, the age conversion unit 60 searches the sub-models included in the pre-trained texture transformation model, each trained for a specific age, for the sub-model pre-trained for that target age.
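The sub-model search can be pictured as a lookup keyed by age. The sketch below is illustrative only: it assumes each sub-model is registered under an inclusive age range (covering both the single-age and the age-group variants described earlier), and the names `find_sub_model` and `sub_models` as well as the placeholder strings are hypothetical.

```python
def find_sub_model(sub_models, target_age):
    """Return the sub-model whose inclusive age range contains target_age."""
    for (low, high), model in sub_models.items():
        if low <= target_age <= high:
            return model
    raise ValueError(f"no sub-model covers age {target_age}")

# Hypothetical registry: each value stands in for a trained generator.
sub_models = {
    (20, 29): "generator_20s",
    (30, 39): "generator_30s",
    (40, 49): "generator_40s",
}

# Age ranges that could be shown to the user before a request is accepted.
supported = sorted(sub_models)
model = find_sub_model(sub_models, 35)
```

Raising an error for an uncovered age is one possible design; it corresponds to informing the user of the convertible age range in advance.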

그러면, 나이 변환부(60)는 검색된 서브 모델에 랜드마크가 추출된 원본영상의 무표정 얼굴 텍스쳐를 입력 데이터로 적용하여, 목표 나이에서의 무모양 얼굴 텍스쳐를 생성한다(S423).Then, the age conversion unit 60 applies the expressionless face texture of the original image from which the landmark is extracted to the searched sub-model as input data to generate a shapeless face texture at the target age (S423).

한편, 위에서는 복수의 서브 모델을 포함한 텍스쳐 변환 모델을 사용하는 실시예로 목표 나이에서의 얼굴 텍스쳐를 생성하는 과정을 설명하였으나, 본 발명의 목표 나이에서의 얼굴 텍스쳐를 생성하는 과정은 이에 제한되지 않는다. Meanwhile, in the above, the process of generating a facial texture at a target age has been described as an embodiment using a texture transformation model including a plurality of sub-models, but the process of generating a facial texture at a target age of the present invention is not limited thereto. does not

As described above, in some embodiments of the present invention the texture transformation model may be configured to include a plurality of GAN-based or cycleGAN-based sub-models, each capable of translating between two domains.

Meanwhile, in another embodiment, the texture transformation model may have a single-model structure configured to output facial textures at a plurality of ages. In some embodiments, the plurality of ages may be ages corresponding to a plurality of clusters, and the clusters may be age groups (e.g., 20s, 30s, 40s). The single-model texture transformation model can convert input data into facial textures at ages in the plurality of age groups. In this case, it can generate a facial texture at an age corresponding to the 20s (e.g., 25 years old), a facial texture at an age corresponding to the 30s (e.g., 35 years old), a facial texture at an age corresponding to the 40s (e.g., 45 years old), and so on.

상기 단일 모델 구조의 텍스쳐 변환 모델은 복수의 나이 각각에 대응하는 복수의 훈련 샘플 세트를 통해 학습된다. 각 훈련 샘플 세트는 특정 나이를 갖는 훈련 대상의 얼굴 영상, 및 상기 훈련 대상의 나이를 나타내는 제1 레이블 데이터 및 훈련 대상의 성별을 나타내는 제2 레이블 데이터를 포함할 수 있다. 상기 복수의 나이가 복수의 군집에 대응하는 나이일 경우, 각 훈련 샘플 세트는 해당 군집에 속하는 훈련 대상의 얼굴 영상을 포함한다. The texture transformation model of the single model structure is learned through a plurality of training sample sets corresponding to each of a plurality of ages. Each training sample set may include a face image of a training target having a specific age, and first label data indicating the age of the training target and second label data indicating the gender of the training target. When the plurality of ages correspond to the plurality of clusters, each training sample set includes a face image of a training target belonging to the corresponding cluster.

일 실시예에서, 상기 단일 모델 구조의 텍스쳐 변환 모델은 입력 데이터를 복수의 도메인으로 변환하는 기계 학습 모델로 구성되고, 이를 위한 알고리즘을 통해 학습된다. 예를 들어, 상기 단일 모델 구조의 텍스쳐 변환 모델은 conditional GAN(cGAN) 기반 모델일 수 있다. 상기 cGAN 기반 모델은 GAN 기반 모델과 유사하나, 생성기와 판별기에 특정 조건(condition)을 나타내는 정보(y)를 가해지는 점이 특징이다. GAN 구조에서 생성기는 노이즈를 적용하여 변환한다. cGAN 구조에서 생성기는 노이즈 및 정보(y)를 적용하여 변환한다. 여기서 상기 정보(y)는 복수의 나이 각각을 나타내는 클래스 라벨(class label)일 수 있다. 예를 들어, 연령대를 각각 나타내는 클래스 라벨일 수 있다. In an embodiment, the texture transformation model of the single model structure is configured as a machine learning model that transforms input data into a plurality of domains, and is learned through an algorithm for this. For example, the texture transformation model of the single model structure may be a conditional GAN (cGAN)-based model. The cGAN-based model is similar to the GAN-based model, but is characterized in that information (y) indicating a specific condition is added to the generator and the discriminator. In the GAN architecture, the generator applies noise to transform it. In the cGAN architecture, the generator applies noise and information (y) to transform it. Here, the information y may be a class label indicating each of a plurality of ages. For example, it may be a class label representing each age group.
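One common way to supply the condition information (y) to a generator is to encode the class label as a one-hot vector and concatenate it with the noise. The sketch below shows only that input construction, not a trained network; the label set `AGE_GROUPS`, the helper names, and the noise length are assumptions made for illustration.

```python
import random

AGE_GROUPS = ["20s", "30s", "40s"]  # hypothetical class labels (y)

def one_hot(label, classes):
    """Encode a class label as a one-hot list."""
    return [1.0 if c == label else 0.0 for c in classes]

def generator_input(noise, label, classes=AGE_GROUPS):
    """Concatenate the noise vector with the one-hot condition y."""
    return list(noise) + one_hot(label, classes)

rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(4)]  # noise
x = generator_input(z, "30s")                # noise followed by condition
```

The same one-hot vector would also be given to the discriminator, so that both networks are conditioned on the age class.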

The process of training the cGAN-based model is disclosed in Non-Patent Document 2 (Mehdi Mirza, Simon Osindero (2014). "Conditional Generative Adversarial Nets"), so a detailed description is omitted herein.

일 실시예에서, 나이 변환부(60)는 단계(S421)에서 랜드마크가 추출된 원본영상의 무표정 얼굴 텍스쳐를 입력 데이터로 단일 모델 구조의 텍스쳐 변환 모델에 적용한다. 상기 단일 모델 구조의 텍스쳐 변환 모델은 복수의 나이에서의 얼굴 텍스쳐를 생성한다. 나이 변환부(60)는 복수의 나이에서의 얼굴 텍스쳐 중 목표 나이에서의 얼굴 텍스쳐를 선택한다(S423). 이를 위해, 상기 장치(1)는 목표 나이를 미리 수신한다. 나이 변환부(60)는 상기 목표 나이에 매칭하는 나이에서의 얼굴 텍스쳐를 나이변환 영상을 생성하는데 사용한다. In one embodiment, the age conversion unit 60 applies the expressionless face texture of the original image from which the landmark is extracted in step S421 as input data to the texture conversion model of a single model structure. The texture transformation model of the single model structure generates facial textures at a plurality of ages. The age conversion unit 60 selects a facial texture at a target age from among facial textures at a plurality of ages (S423). To this end, the device 1 receives the target age in advance. The age conversion unit 60 uses a facial texture at an age matching the target age to generate an age conversion image.
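The selection in step S423 can be sketched as choosing, among the textures produced for several representative ages, the one that matches the target age. The matching rule is not specified here; the example below assumes "nearest representative age", and the name `select_texture` and the placeholder strings standing in for texture images are hypothetical.

```python
def select_texture(textures_by_age, target_age):
    """Pick the generated texture whose representative age is closest to target_age."""
    best_age = min(textures_by_age, key=lambda age: abs(age - target_age))
    return best_age, textures_by_age[best_age]

# Placeholder strings stand in for generated texture images.
outputs = {25: "texture_25", 35: "texture_35", 45: "texture_45"}
age, texture = select_texture(outputs, 37)
```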

또한, 나이 변환부(60)는 상기 랜드마크가 추출된 원본얼굴로부터 상기 목표 나이에서의 얼굴 모양(shape)을 생성한다(S440). Also, the age conversion unit 60 generates a face shape at the target age from the original face from which the landmark is extracted (S440).

이를 위해, 나이 변환부(60)는 상기 원본영상의 얼굴(즉, 원본얼굴)의 랜드마크에 기초하여 상기 원본영상의 얼굴 모양 특징을 추출한다(S431). To this end, the age converter 60 extracts the facial shape features of the original image based on the landmarks of the face (ie, the original face) of the original image (S431).

상기 얼굴 모양 특징은 얼굴 모양과 관련된 특징으로서, 영상으로부터 모양과 관련된 특징을 추출하기 위한 다양한 특징 추출 알고리즘을 이용하여 추출된다. 상기 특징 추출 알고리즘은, 예를 들어, PCA를 포함하나, 이에 제한되진 않는다. The face shape feature is a feature related to a face shape and is extracted using various feature extraction algorithms for extracting shape related features from an image. The feature extraction algorithm includes, but is not limited to, for example, PCA.

추출된 얼굴 모양 특징은 다양한 유형의 값으로 표현될 수 있다. 일 실시예에서, 상기 얼굴 모양 특징은 특징 벡터로 추출될 수 있다. N개의 얼굴 모양 특징이 추출되는 경우, 상기 특징 벡터는 N차원으로 구성된다. The extracted face shape feature may be expressed as various types of values. In an embodiment, the facial shape feature may be extracted as a feature vector. When N facial shape features are extracted, the feature vector is configured in N dimensions.

나이 변환부(60)는 상기 원본영상의 얼굴 모양 특징 및 미리 학습된 모양 변환 모델을 이용하여, 요청나이에서의 얼굴 모양 특징을 생성한다. 상기 모양 변환 모델은 원본나이에서의 얼굴 모양 특징을 적용하면 요청나이에서의 얼굴 모양 특징을 출력하도록 미리 학습된다.The age conversion unit 60 uses the facial shape features of the original image and a pre-learned shape transformation model to generate facial shape features at the requested age. The shape transformation model is pre-trained to output the facial shape features at the requested age when the facial shape features at the original age are applied.

모양 변환 모델은 나이와 해당 나이에서의 얼굴 모양 특징 간의 관계를 모델링하여 생성되었다. The shape transformation model was created by modeling the relationship between age and facial shape features at that age.

In one embodiment, the shape transformation model is modeled to calculate the subject's facial shape features at the target age based on (i) the difference between the age function value at the target age and the age function value at the original age and (ii) the facial shape features of the original image (e.g., the facial shape features of step S431).

For example, when N-dimensional facial shape features are extracted, the shape transformation model may be expressed by the following equation.

[Equation 1]

$$a_i^{new} = a_i^{org} + f_i^{ap}(age_{new}) - f_i^{ap}(age_{org})$$

Here, i is a natural number less than or equal to N; $a_i^{new}$ denotes the i-th facial shape feature (e.g., a feature vector value) at the target age; $a_i^{org}$ denotes the i-th facial shape feature (e.g., a feature vector value) at the original age; $age_{new}$ is the target age; $age_{org}$ is the original age; and $f_i^{ap}$ denotes the age function for the i-th facial shape feature.
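The shape transformation described above adds, to each original feature, the difference between the age function values at the target age and at the original age. The following sketch applies that rule per feature. The two age functions are arbitrary stand-ins chosen for the example, not learned functions, and the name `transform_shape_features` is hypothetical.

```python
def transform_shape_features(a_org, age_org, age_new, age_functions):
    """Shift each feature by f_i(age_new) - f_i(age_org)."""
    return [a + f(age_new) - f(age_org) for a, f in zip(a_org, age_functions)]

# Hypothetical age functions, one per shape feature.
age_functions = [
    lambda age: 0.5 * age,
    lambda age: 0.01 * age ** 2,
]

a_org = [1.0, 2.0]  # features extracted at the original age
a_new = transform_shape_features(a_org, age_org=10, age_new=40,
                                 age_functions=age_functions)

# Applying the same rule in the opposite direction returns the original features.
a_back = transform_shape_features(a_new, age_org=40, age_new=10,
                                  age_functions=age_functions)
```

Because only a difference of function values is added, the subject's individual offset from the population trend is carried over to the target age.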

도 5는, 본 발명의 일 실시예에 따른, 얼굴 모양 특징에 대한 나이함수를 결정하는 과정을 설명하기 위한 도면이다. 5 is a diagram for explaining a process of determining an age function for a facial shape feature, according to an embodiment of the present invention.

도 5를 참조하면, N차원의 얼굴 모양 특징이 추출된 경우, 각각의 얼굴 모양 특징에 대한 나이 함수는 해당 얼굴 모양 특징을 갖는 훈련 대상(training subject)의 나이와의 관계를 모델링하여 결정된다. Referring to FIG. 5 , when N-dimensional facial shape features are extracted, an age function for each facial shape feature is determined by modeling a relationship with the age of a training subject having the corresponding facial shape feature.

That is, the age functions for the plurality of facial shape features are each learned using a plurality of shape training sample sets. Each shape training sample set includes a plurality of shape training samples at a given age, each sample containing the facial shape features of a training subject and the subject's age as label data.

여기서, 복수의 모양 훈련 샘플 세트에 연관된 나이는 도 4에 도시된 바와 같이 1세 내지 80세 사이의 나이를 갖는 사람과 같이, 다양한 나이를 포함한다. Here, the age associated with the plurality of shape training sample sets includes various ages, such as a person having an age between 1 and 80 years as shown in FIG. 4 .

상기 훈련 대상의 얼굴 모양 특징은 해당 나이에서의 훈련 대상의 얼굴 영상으로부터 획득한 얼굴 모양 특징이다. 일 실시예에서, 상기 얼굴 모양 특징은 모양과 관련된 특징 벡터로서, N차원(여기서, N은 1 이상의 정수)의 특징 벡터를 포함할 수 있다. The facial shape feature of the training target is a facial shape feature obtained from a facial image of the training target at a corresponding age. In an embodiment, the facial shape feature is a feature vector related to a shape, and may include an N-dimensional feature vector (where N is an integer greater than or equal to 1).

The first shape training sample shown in FIG. 5 includes age information of a first training subject (3 years old in FIG. 5) and the facial shape features of the first training subject ({a₁, a₂, a₃, …, a_M}). The second shape training sample includes age information of a second training subject (3 years old in FIG. 4) and the facial shape features of the second training subject ({a₁, a₂, a₃, …, a_M}).

각 얼굴 모양 특징에 대한 나이 함수는 각각의 모양 특징에 있어서 훈련 대상의 나이 정보의 분포에 기초하여 결정된다. 따라서, 모양 변환 모델에 포함된 나이 함수는 모양 특징의 차원에 의존한다. The age function for each face shape feature is determined based on the distribution of age information of the training target for each shape feature. Therefore, the age function included in the shape transformation model depends on the dimension of the shape feature.

각 모양 특징 벡터에 대한 나이 함수는 규칙성에 대한 정보가 없는 분포된 정보에서 규칙성을 결정하는 다양한 적합 알고리즘(fitting algorithm)을 이용하여 결정된다. The age function for each shape feature vector is determined using various fitting algorithms that determine regularity from distributed information without information on regularity.

예를 들어, i번째 모양 특징에 대한 나이 함수(즉, 제i 나이 함수)는 3D 다항 적합(polynomial fitting) 알고리즘에 의해 근사화되나, 이에 제한되진 않는다. 근사화에 의해 결정된 각각의 나이 함수는 도 5에 도시된 바와 같이, 특징 구성요소-나이 그래프 도면(plot)에서 연속선으로 표현된다. For example, the age function for the i-th shape feature (ie, the i-th age function) is approximated by a 3D polynomial fitting algorithm, but is not limited thereto. Each age function determined by the approximation is represented by a continuous line in a feature component-age graph plot, as shown in FIG. 5 .
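A polynomial fit of one shape feature against age can be sketched with NumPy. The example assumes that the polynomial fitting mentioned above can be illustrated with a third-degree polynomial, which is an interpretation rather than something this document states; the data are synthetic and the names `fit_age_function` and `true_f` are hypothetical.

```python
import numpy as np

def fit_age_function(ages, feature_values, degree=3):
    """Fit a polynomial age function for one shape feature by least squares."""
    coeffs = np.polyfit(ages, feature_values, deg=degree)
    return np.poly1d(coeffs)

# Synthetic samples: one value of the i-th feature per training age (1 to 80).
ages = np.arange(1, 81, dtype=float)
true_f = lambda t: 1e-5 * t**3 - 1e-3 * t**2 + 0.05 * t - 1.0
values = true_f(ages)

f_i = fit_age_function(ages, values)  # callable: f_i(age) -> feature value
```

The returned object is callable and continuous in age, so it can be evaluated at any original or target age, in either direction.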

As a result, the age function for the first shape feature (a₁) shown in FIG. 5 is determined from the age distribution of the plurality of training subjects for the first shape feature (a₁). Likewise, the age function for the third shape feature (a₃) shown in FIG. 5 is determined from the age distribution of the plurality of training subjects for the third shape feature (a₃).

이와 같이, 전체 m개의 나이 함수가 목표 나이에서의 m개의 특징 구성요소를 위해 훈련된다.As such, a total of m age functions are trained for the m feature components at the target age.

도 6은, 도 5의 과정에 의해 훈련된 제2 모양 특징에 대한 나이 함수를 도시한 도면이다. FIG. 6 is a diagram illustrating an age function for a second shape feature trained by the process of FIG. 5 .

도 5의 나이 함수 생성 과정에 의해, 제2 모양 특징(a₂)에 대한 나이 함수가 결정될 수 있다. By the process of generating the age function of FIG. 5 , an age function for the second shape feature a ₂ may be determined.

Referring to FIG. 6 and Equation 1, when the age associated with the original image is 10 and the requested age is 80, the age function value for the second shape feature at age 80 is $f_2^{ap}(80)$, and the age function value for the second shape feature at age 10 is $f_2^{ap}(10)$.

Then, the pre-trained shape transformation model can calculate the output value ($a_2^{80}$) of the second shape feature at age 80 based on the age function ($f_2^{ap}$) for the second shape feature in FIG. 6 and the second shape feature ($a_2^{10}$) extracted from the face of the original image.

이와 같이, 나이 변환부(60)는 미리 학습된 모양 변환 모델을 이용하여 상기 목표 나이에서의 얼굴 모양 특징(예컨대, 제1 내지 제N 얼굴 모양 특징 세트)를 산출할 수 있다(S433). As such, the age conversion unit 60 may calculate the facial shape features (eg, first to Nth facial shape feature sets) at the target age by using the pre-learned shape transformation model ( S433 ).

나이 변환부(60)는 모양 변환 모델에 의해 출력된 목표 나이에서의 얼굴 모양 특징을 복원하여 상기 목표 나이에서의 얼굴 모양을 생성할 수 있다. The age converter 60 may restore the facial shape features at the target age output by the shape transformation model to generate the face shape at the target age.

예를 들어, 목표 나이에서의 얼굴 모양 특징 벡터 세트가 출력된 경우, 각각의 얼굴 모양 특징 벡터 값을 복원함으로써 상기 목표 나이에서의 얼굴 모양을 생성한다(S435). For example, when the facial shape feature vector set at the target age is output, the facial shape at the target age is generated by restoring each facial shape feature vector value ( S435 ).

나이 변환부(60)는 상기 목표 나이에서의 얼굴 모양을 생성하기 위해, 다양한 복원 알고리즘을 이용할 수 있다. 상기 복원 알고리즘은, 예를 들어 PCA(Principal Component Analysis) 기반 복원 알고리즘을 포함할 수 있으나, 이에 제한되진 않는다.The age converter 60 may use various restoration algorithms to generate the face shape at the target age. The restoration algorithm may include, for example, a principal component analysis (PCA)-based restoration algorithm, but is not limited thereto.
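A PCA-based restoration maps feature values back to landmark coordinates by adding a weighted sum of principal components to the mean shape. The sketch below shows the projection and its inverse with NumPy on random data. It is a generic PCA round trip under assumed dimensions, not the specific procedure of this document, and all function names are hypothetical.

```python
import numpy as np

def fit_pca(shapes, n_components):
    """Return the mean shape and the leading principal components (as rows)."""
    mean = shapes.mean(axis=0)
    _, _, vt = np.linalg.svd(shapes - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(shape, mean, components):
    """Shape vector -> shape feature values (PCA coefficients)."""
    return components @ (shape - mean)

def reconstruct(features, mean, components):
    """Shape feature values -> shape vector."""
    return mean + components.T @ features

rng = np.random.default_rng(0)
shapes = rng.normal(size=(50, 8))  # 50 hypothetical faces, 4 landmarks x (x, y)
mean, components = fit_pca(shapes, n_components=8)

features = project(shapes[0], mean, components)
restored = reconstruct(features, mean, components)
```

With fewer components than coordinates, the restoration becomes an approximation that keeps the dominant modes of shape variation.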

나이 변환부(60)는 목표 나이에서의 얼굴 모양에 목표 나이에서의 얼굴 텍스쳐를 와핑(warping)하여, 상기 목표 나이에서의 얼굴을 나이변환 얼굴로서 생성한다(S440). The age conversion unit 60 warps the face texture at the target age to the shape of the face at the target age to generate the face at the target age as an age conversion face (S440).

전술한 실시예들에서 상기 모양 변환 모델 및 텍스쳐 변환 모델은 대상의 나이 보다 목표 나이가 더 많은 방향의 나이 변환에 대해서만 서술되었으나, 이에 제한되지 않는다. In the above-described embodiments, the shape transformation model and the texture transformation model have been described only for the age transformation in a direction in which the target age is greater than the age of the object, but is not limited thereto.

Because the age function of the shape transformation model is continuous and is configured to allow age conversion in both directions, an age-converted face shape can also be obtained in the direction in which the target age is younger than the subject's age. In addition, when sub-models for the subject's age and the target age have been trained in advance, a facial texture age-converted in the younger direction can also be obtained.

전술한 바와 같이, 배경영상은 사용자의 입력에 기초하여 선택되거나, 또는 원본영상 및/또는 아래의 나이변환 영상에 기초하여 자동으로 결정된다.As described above, the background image is selected based on the user's input, or is automatically determined based on the original image and/or the age-converted image below.

The image synthesizing unit 70 synthesizes the subject's face at the target age onto the face region of the background image to generate a job image showing the subject holding the corresponding job at the target age. That is, the image synthesizing unit 70 may generate the job image of the subject based on the face image of the subject at the target age, which is included in the age-converted image, and the background image in which the job is expressed.

선택된 배경영상의 얼굴과 목표 나이에서의 얼굴은 크기, 골격과 같은 신체 구조 특성의 차이, 또는 얼굴 각도, 방향 등 포즈의 차이를 가질 수 있다. 이 경우, 배경영상의 얼굴에 목표 나이에서의 얼굴을 그대로 합성할 경우 사용자가 보기에 부자연스러운 직업영상이 생성된다. 영상 합성부(70)는 배경영상의 얼굴과 목표 나이에서의 얼굴을 매칭시킴으로써 자연스러운 직업영상을 생성하도록 구성된다. The face of the selected background image and the face at the target age may have differences in body structural characteristics such as size and skeleton, or differences in poses such as face angle and direction. In this case, when the face of the target age is synthesized as it is with the face of the background image, an unnatural job image is generated for the user. The image synthesizing unit 70 is configured to generate a natural job image by matching the face of the background image with the face at the target age.

In one embodiment, the image synthesizing unit 70 may extract landmarks from the face of the background image and/or the face at the target age. Through the image synthesizing unit 70, the job image generating device 1 may further acquire the position information and identification information of the landmarks of the face in the background image, and the position information and identification information of the landmarks of the face at the target age. Here, the identification information includes the anatomical facial feature that each landmark represents.

영상 합성부(70)의 랜드마크 추출 동작은 나이 변환부(60)의 랜드마크 추출 동작과 유사하게 수행되므로, 자세한 설명은 생략한다. Since the landmark extraction operation of the image synthesizing unit 70 is performed similarly to the landmark extraction operation of the age conversion unit 60 , a detailed description thereof will be omitted.

In addition, the image synthesizing unit 70 is configured to perform various operations for image synthesis, such as mapping between landmarks, warping of the face shape, and transplantation of the synthesis region.

The image synthesizing unit 70 is configured to map the landmarks of the face in the age-converted image to the face region of the background image.

In an embodiment, the image synthesizing unit 70 may perform the mapping onto the face region of the background image based on the anatomical facial features represented by the landmarks (shp_sim) of the face in the age-converted image. The image synthesizing unit 70 maps each landmark of the background image to the landmark of the face at the target age that represents the same anatomical facial feature. For example, the landmark corresponding to the left corner of the lips extracted from the face in the age-converted image (that is, the subject's face at the target age) is mapped to the landmark corresponding to the left corner of the lips extracted from the face in the background image.
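
The pairing by anatomical feature described above can be sketched as follows. This is a minimal illustration, assuming each landmark set is stored as a dictionary keyed by an anatomical label; the label names (such as "lip_left") and the function name are hypothetical and do not come from the specification.

```python
def pair_landmarks(shp_sim, shp_bg):
    """Pair landmarks that carry the same anatomical label.

    Both arguments map a label (hypothetical names such as "lip_left")
    to an (x, y) position. Labels present in only one face are skipped.
    """
    common = sorted(set(shp_sim) & set(shp_bg))
    return [(name, shp_sim[name], shp_bg[name]) for name in common]


# Illustrative coordinates only.
shp_sim = {"lip_left": (40.0, 80.0), "lip_right": (70.0, 81.0), "nose_tip": (55.0, 60.0)}
shp_bg = {"lip_left": (140.0, 95.0), "lip_right": (176.0, 94.0), "chin": (158.0, 130.0)}

pairs = pair_landmarks(shp_sim, shp_bg)
```

Only the labels found in both faces are paired, so each landmark of the age-converted face is matched with the background landmark of the same meaning.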

In an embodiment, the image synthesizing unit 70 may map the landmarks of the face in the age-converted image to the face region of the background image by at least partially minimizing the positional difference between corresponding landmarks. After the mapping, the face region of the background image includes the existing landmark (shp_bg) set and the mapped landmark (shp_sim_t) set.
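
The specification does not state how the positional difference is minimized. One common choice, shown here purely as an assumption, is a least-squares similarity transform (scale, rotation, translation) fitted between corresponding landmarks. The function names are hypothetical; treating 2D points as complex numbers keeps the sketch short.

```python
def fit_similarity(src, dst):
    """Least-squares 2D similarity transform z -> a*z + b (no reflection).

    src and dst are equal-length lists of (x, y) pairs in corresponding order.
    Points are treated as complex numbers, so `a` encodes scale and rotation
    and `b` encodes translation.
    """
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    n = len(p)
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den
    b = q_mean - a * p_mean
    return a, b


def apply_similarity(a, b, pts):
    """Apply z -> a*z + b to a list of (x, y) points."""
    out = []
    for x, y in pts:
        z = a * complex(x, y) + b
        out.append((z.real, z.imag))
    return out


# Illustrative landmark positions only.
shp_sim = [(40.0, 80.0), (70.0, 80.0), (55.0, 60.0), (55.0, 100.0)]
shp_bg = [(140.0, 96.0), (176.0, 94.0), (157.0, 70.0), (159.0, 119.0)]
a, b = fit_similarity(shp_sim, shp_bg)
shp_sim_t = apply_similarity(a, b, shp_sim)  # landmarks mapped onto the background
```

Under this assumption, shp_sim_t keeps the proportions of the subject's face while taking on the position, size, and in-plane rotation of the background face.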

Through this mapping operation, the positions of the facial components in the background image are matched with those in the age-converted image, which minimizes the unnaturalness in synthesizing the face regions caused by differences in pose and the like.

In addition, the image synthesizing unit 70 is configured to match the face of the age-converted image to the face of the background image.

In an embodiment, the image synthesizing unit 70 is configured to warp the facial texture of the age-converted image based on the mapped landmarks (that is, the positions of the mapped landmarks (shp_sim_t)). For example, the image synthesizing unit 70 may warp the facial texture of the age-converted image into the face shape defined by the mapped landmarks.

As described above with reference to FIG. 4, the facial texture features are extracted based on the landmarks, so the facial texture is expressed in association with the landmarks.

The image synthesizing unit 70 warps the facial texture of the age-converted image, which is constructed based on the landmarks (shp_sim) of the face in the age-converted image, to the mapped landmarks (shp_sim_t). This warping is performed based on the relative positions of the mapped landmarks in the background image and the relative positions of the landmarks of the subject's face in the age-converted image.
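
A landmark-driven warp of this kind is often implemented as a piecewise-affine warp over triangles whose vertices are landmarks. The specification does not name the warp, so the sketch below is an assumption. It shows the core step for one triangle: a point keeps its barycentric weights when moved from a triangle of shp_sim to the corresponding triangle of shp_sim_t. Function names and coordinates are hypothetical.

```python
def barycentric(pt, tri):
    """Barycentric coordinates of pt with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    x, y = pt
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def warp_point(pt, src_tri, dst_tri):
    """Map pt from src_tri into dst_tri, keeping its barycentric weights."""
    w = barycentric(pt, src_tri)
    x = sum(wi * vx for wi, (vx, _) in zip(w, dst_tri))
    y = sum(wi * vy for wi, (_, vy) in zip(w, dst_tri))
    return x, y


# One triangle of landmarks before (shp_sim) and after mapping (shp_sim_t).
src_tri = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
dst_tri = [(100.0, 100.0), (120.0, 100.0), (100.0, 130.0)]
moved = warp_point((5.0, 5.0), src_tri, dst_tri)
```

In a full image warp the same mapping is typically applied in the opposite direction, looking up a source pixel for every destination pixel, so that the output has no holes.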

By this warping, the facial texture and/or face shape of the background image and the facial texture and/or face shape of the age-converted image are matched based on the mapped landmarks (shp_sim_t).

The image synthesizing unit 70 may match the face shape of the background image with the face shape of the age-converted image through various image editing techniques capable of bending and warping an image. Such techniques include, but are not limited to, Poisson image editing.

Through this face shape matching, the unique texture and/or shape characteristics of the subject's face are matched to those of the face of the person in the background image, which minimizes the unnaturalness of synthesis between the non-face region of the background image (that is, the background region) and the face region of the age-converted image.

In addition, the image synthesizing unit 70 is further configured to warp the facial texture of the background image to the positions of the landmarks of the face of the age-converted image mapped onto the background image (that is, the mapped landmarks (shp_sim_t)). For example, the image synthesizing unit 70 warps the facial texture of the background image into the face shape defined by the mapped landmarks.

The image synthesizing unit 70 generates a synthesis region mask based on the landmark (shp_bg_t) set of the face in the warped background image.

The synthesis region mask is configured to filter the data of the region inside the mask. The image synthesizing unit 70 may apply the synthesis region mask to the age-converted image, which has been warped so that the face shapes match, to filter out the shape-matched face region of the subject at the target age, and may transplant the filtered face region into the face region of the background image. Here, transplantation refers to any operation that edits the background image so that it includes the filtered image, such as overlaying the filtered image on the image corresponding to the mask region in the background image, or replacing the image corresponding to the mask region in the background image.
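
The mask-and-transplant step can be sketched on small images as follows. This is an illustration under stated assumptions: images are nested lists of pixel values, the mask is the interior of a polygon traced through outline landmarks, and transplantation is done by plain replacement. The function names are hypothetical, and a real implementation would likely blend the boundary rather than copy pixels directly.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def make_mask(height, width, outline):
    """Binary mask that is True at pixel centers inside the landmark outline."""
    return [[point_in_polygon(c + 0.5, r + 0.5, outline) for c in range(width)]
            for r in range(height)]


def transplant(background, warped_face, mask):
    """Replace masked background pixels with the warped face pixels."""
    return [[f if m else b for b, f, m in zip(b_row, f_row, m_row)]
            for b_row, f_row, m_row in zip(background, warped_face, mask)]


# Outline landmarks (a stand-in for part of the shp_bg_t set) on a 4x4 image.
outline = [(1, 1), (3, 1), (3, 3), (1, 3)]
mask = make_mask(4, 4, outline)
background = [[0] * 4 for _ in range(4)]
warped_face = [[9] * 4 for _ in range(4)]
job_image = transplant(background, warped_face, mask)
```

Because the same mask is applied to both images, only the pixels inside the landmark outline are taken from the warped age-converted face, and everything else stays from the background image.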

Since the landmark (shp_bg_t) set and the landmark (shp_sim_t) set are identical, the synthesis region mask can filter the same region in the background image and in the age-converted image as the synthesis region.

In an alternative embodiment, since the landmark (shp_bg_t) set and the landmark (shp_sim_t) set are identical, the synthesis region mask may be generated based on the landmark (shp_sim_t) set.

As a result of the transplantation, the image synthesizing unit 70 generates the job image.

In this way, the image synthesizing unit 70 may generate the job image by synthesizing the age-converted face region into the background image using the warped age-converted image and the synthesis region mask.

The job image generating apparatus 1 described above may be applied to various systems configured to provide a job image supply service to users. For example, the job image generating apparatus 1 is applied to a photo booth system or a kiosk system.

FIG. 7 is a conceptual diagram of a photo booth system according to an embodiment of the present invention.

Referring to FIG. 7, the photo booth system 1000 includes the job image generating apparatus 1 and a booth 100 that is at least partially enclosed by walls serving as a backdrop and forms a space in which the subject can be positioned. In some embodiments, the photo booth system 1000 may further include one or more of a front light 200, a chair 300, and a shield 400.

In the system 1000, the job image generating apparatus 1 is a photo-booth-type apparatus for taking photographs and outputting job images. The apparatus 1 provides a camera 9 for photographing the face; computer equipment for image processing (e.g., the age conversion unit 60 and/or the image synthesizing unit 70); and a front touch screen (that is, the input device 10 and the display device 40).

The user is positioned in the booth 100 at a location where at least part of the face will be photographed by the apparatus 1. The booth 100 provides a space in which one or more people can stand, or in which a chair for sitting can be placed.

The booth 100 provides walls that can serve as a backdrop. As shown in FIG. 7, the booth 100 may be a partially open space, but is not limited thereto. In another embodiment, the booth 100 may be enclosed, in which case it may further include a door (not shown) through which the user enters and exits. The front light 200 is a device for properly illuminating the user, as the photographic subject, in the booth 100 during photographing. The chair 300 is configured to seat one or more people. For example, a child as the subject and a guardian accompanying the child may sit on the chair 300.

When the booth 100 is of the open type, the shield 400 is configured to block at least part of the open portion of the booth 100, so as to at least partially reduce the effect of light from outside the booth 100 on the acquisition of the original image. As shown in FIG. 7, the shield 400 may be configured as a curtain, but is not limited thereto.

It will be apparent to those skilled in the art that the job image generating apparatus 1 and a system including it may include other components not described herein. For example, they may include other hardware elements necessary for the operations described herein, including a storage device for storing data or information. In addition, the apparatus 1 or a system including the apparatus 1 (e.g., a photo booth) may further include a network, a network interface, a protocol, and the like.

A method of generating a job image having an age-converted face according to another aspect of the present invention may be performed by a computing device including a processor (e.g., the job image generating apparatus 1) and a system including it (e.g., the photo booth system 1000).

Hereinafter, for clarity of explanation, the job image generation method is described in more detail assuming that the subject is a 7-year-old child and the target age is 25. However, it will be apparent to those skilled in the art that the present invention is not limited to converting a 7-year-old child to age 25.

FIG. 8 is a flowchart of a job image generation method according to an embodiment of the present invention.

Referring to FIG. 8, the method of generating a job image having an age-converted face includes: acquiring an original image including the face of a subject at a specific age (S10); acquiring original age information of the subject (S15); determining a background image in which a job is expressed, to be used for generating the job image (S20); acquiring information on the target age to convert to (S25); converting the subject's face at the original age into the subject's face at the target age (S30); and generating the subject's job image by synthesizing the age-converted face into the face region of the background image (S50). In some embodiments, the method may further include providing the user with a career guidance service image that includes at least the job image among the original image of step S10, the background image of step S20, the age-converted image of step S30, and the job image of step S50 (S60). The method may also further include: after generating the job image (S50), requesting the user to confirm satisfaction with the job image (S51); and, when an input indicating dissatisfaction is received, returning to a previous step (e.g., S10, S15, S20, S25, S30, or S50) and performing that step and the subsequent steps again (S55).
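
The control flow of FIG. 8, including the satisfaction check (S51) and the return to an earlier step (S55), can be sketched as below. This is only an outline under assumptions: each step is a placeholder function, the shared state is a dictionary, and the `max_rounds` limit is a hypothetical safeguard not mentioned in the specification.

```python
def generate_job_image(steps, is_satisfied, max_rounds=3):
    """Run the steps in order; on dissatisfaction (S51), redo from a chosen step (S55).

    `steps` is an ordered list of (name, function) pairs; each function takes
    the shared state dict and returns the value stored under `name`.
    `is_satisfied(state)` returns None when the user is satisfied, or the
    name of the step to return to.
    """
    names = [name for name, _ in steps]
    state = {}
    start = 0
    for _ in range(max_rounds):
        for name, fn in steps[start:]:
            state[name] = fn(state)
        redo_from = is_satisfied(state)   # S51: ask for confirmation
        if redo_from is None:
            return state                  # satisfied: continue to S60
        start = names.index(redo_from)    # S55: go back to the chosen step
    return state


choices = iter(["police", "doctor"])      # stand-in for user input in S20
answers = iter(["S20", None])             # first answer: redo from S20

steps = [
    ("S10", lambda s: "original.png"),
    ("S15", lambda s: 7),
    ("S20", lambda s: next(choices)),
    ("S25", lambda s: 25),
    ("S30", lambda s: "face_at_%d" % s["S25"]),
    ("S50", lambda s: "%s+%s" % (s["S30"], s["S20"])),
]
result = generate_job_image(steps, lambda s: next(answers))
```

In this run the user rejects the first result and returns to S20, so only S20 and the later steps are repeated while the original image and age from S10 and S15 are reused.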

In steps S10, S15, S20, and S25, the image data and detailed information used to generate the age-converted image for the job image are acquired. For example, original image data including the face of the 7-year-old child is acquired in step S10; information on age 7, the subject's age, is acquired in step S15; a background image expressing the job the 7-year-old child desires is acquired in step S20; and information on age 25, the target age, is acquired in step S25.

In step S10, the original image data is acquired by the photographing device 9 of the apparatus 1, or from the user's mobile device via the transceiver 20.

In an embodiment, the information on the original age is acquired in step S15 through user input (e.g., input via the input device 10 or the transceiver 20 of the apparatus 1).

In another embodiment, the information on the original age is acquired in step S15 by automatically analyzing the original image.

In addition, gender information on the subject of the original image may further be acquired in step S15. The gender information is acquired through the user input described above or by analyzing the original image.

In step S20, the background image may be determined by user input, or automatically based on the original image of step S10 and/or the age-converted image of step S30. In some embodiments, when the background image is determined based on the age-converted image, step S20 is performed after step S30.

In one example, the background image expressing the job is selected based on user input. To this end, the apparatus 1, or the computing device performing the method, may be configured to provide a job selection screen and an interface for receiving input. For example, the apparatus 1 may provide the user with a selection menu including various job titles such as scientist, police officer, firefighter, doctor, and singer.

The apparatus 1 receives a job selection command for the selection menu through the input device and, in response, retrieves the background image associated with the selected job (S20). In step S20, the input for selecting the background image may be received together with the detailed information related to the original image, or through an interface for selecting the background image. The matched background image is then used to generate the job image (S50).

In step S25, the target age may be determined by user input, or acquired automatically according to the background image determined in step S20. For example, in step S25, a value preset in association with the background image determined in step S20 (e.g., the average age of people in the job shown in the background image) may be acquired as the target age.
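
The two ways of obtaining the target age in step S25 can be sketched as a simple lookup with a user override. The table name, job keys, and ages below are hypothetical values chosen for illustration, not figures from the specification.

```python
# Hypothetical preset ages per background image; the values are illustrative only.
PRESET_TARGET_AGE = {"police": 30, "doctor": 35, "singer": 25}


def target_age(job, user_age=None):
    """Return the user-supplied age if given, else the preset for the job."""
    if user_age is not None:
        return user_age
    return PRESET_TARGET_AGE[job]


auto_age = target_age("police")          # preset tied to the background image
chosen_age = target_age("police", 25)    # explicit user input takes priority
```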

In step S30, the subject's face in the original image is converted into the subject's face at the target age to generate the subject's age-converted face. For example, the child's face at age 7 is converted into an adult face at age 25, generating the age-converted face the child is expected to have at age 25.

In an embodiment, converting to the age-converted face (that is, the subject's face at the target age) includes: extracting landmarks from the subject's face in the original image; generating the subject's facial texture at the target age from the subject's face in the original image from which the landmarks were extracted; generating the subject's face shape at the target age from the subject's face in the original image from which the landmarks were extracted; and generating the subject's age-converted face based on the subject's facial texture and face shape at the target age.

In an embodiment, extracting the landmarks from the subject's face in the original image may be performed using a preset landmark extraction algorithm. In some embodiments, detecting the subject's face region in the original image may be performed before extracting the landmarks.

In an embodiment, generating the subject's facial texture at the target age includes: generating a shape-free facial texture from the subject's face (e.g., face image) in the original image from which the landmarks were extracted; and applying the shape-free facial texture to a pre-trained texture transformation model to generate the subject's shape-free facial texture at the target age.

In an embodiment, the shape-free facial texture may be data obtained by converting the face shape of the subject in the original image into an average shape.

In an embodiment, the texture transformation model is pre-trained to output a facial texture at the target age. For example, the texture transformation model is pre-trained to output a facial texture at age 25.

In an embodiment, the texture transformation model may be a GAN-based model. Here, the texture transformation model includes a generator pre-trained to apply noise to input data and output a transformed texture corresponding to the facial texture at the target age.

The generator was pre-trained using a discriminator that distinguishes between the transformed texture and reference data (that is, real data) similar to the transformed texture.
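
For reference, the adversarial training of a generator against a discriminator is commonly written as the minimax objective below. This is the textbook GAN formulation adapted to the notation of this passage, not necessarily the loss used in the embodiment. The symbols are assumptions: x is a real facial texture at the target age, t is the input texture, z is the noise, G is the generator, and D is the discriminator.

```latex
\min_{G}\,\max_{D}\;
\mathbb{E}_{x}\big[\log D(x)\big]
+ \mathbb{E}_{t,\,z}\big[\log\big(1 - D(G(t, z))\big)\big]
```

The discriminator is rewarded for telling real textures from generated ones, while the generator is rewarded when its transformed texture is judged to be real.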

Since the generator outputs a transformed texture that closely resembles a real facial texture at the target age, the transformed texture output from the generator may be used as the subject's facial texture at the target age. That is, the generator is configured to output data corresponding to the facial texture at the target age.

The texture transformation model (e.g., the generator) is generated using a plurality of training samples, each of which includes the facial texture of a training subject at age 25.

In some embodiments, the texture transformation model may include a sub-model for the target age, pre-trained to output a facial texture at the target age. For example, the texture transformation model includes a first sub-model pre-trained to output a facial texture at age 25. In this case, the sub-model is generated using a training sample set composed of a plurality of training samples, and each set includes the facial textures of training subjects at the corresponding age, first label data indicating that age, and second label data indicating the gender of the training subjects. The texture transformation model was pre-trained using a number of training sample sets that depends on the number of sub-models.

In another embodiment, the texture transformation model may be a CycleGAN-based model. Here, the texture transformation model includes a plurality of generators (e.g., two) that generate the subject's facial texture at the target age. The generators are configured to output data converted into different domains, and were pre-trained using a discriminator configured to identify the data of the different domains output by each generator.

For example, the two generators are: a first generator that applies noise to input data of a first domain and outputs transformed data of a second domain; and a second generator that applies noise to the input data of the first domain and outputs transformed data of a third domain. To satisfy cycle consistency, each generator is configured so that re-converting its transformed data back to the first domain yields the input data of the first domain.

In yet another embodiment, the pre-trained texture transformation model may have a single-model structure that generates facial textures at a plurality of ages from a single input. In this case, the facial texture at the target age is selected from the facial textures at the plurality of ages.

In some embodiments, the plurality of ages may be ages corresponding to a plurality of clusters. For example, the clusters may be age groups (20s, 30s, 40s, and so on).

The texture transformation model of the single-model structure generates facial textures at a plurality of ages by applying noise and condition information (y) to the input data. Here, the condition information (y) includes a class label indicating each of the plurality of ages (e.g., a class label indicating the age of each age group).
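
A class label of this kind is often encoded as a one-hot vector before it is fed to a conditional model. The sketch below assumes decade-based age groups; the group list and function name are hypothetical, and the encoding itself is an assumption rather than something the specification prescribes.

```python
AGE_GROUPS = [20, 30, 40, 50]   # hypothetical clusters: 20s, 30s, 40s, 50s


def condition_vector(age):
    """One-hot class label y for the age group that contains `age`."""
    decade = (age // 10) * 10
    if decade not in AGE_GROUPS:
        raise ValueError("age %d is outside the modeled age groups" % age)
    return [1.0 if g == decade else 0.0 for g in AGE_GROUPS]


y = condition_vector(25)   # a target age of 25 falls in the 20s group
```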

The texture transformation model of the single-model structure may be, for example, a conditional GAN-based model, but is not limited thereto.

The shape transformation model is generated by modeling the relationship between age and the face shape features of training subjects at that age, and is modeled based on the difference between the age function value at the target age and the age function value at the original age, together with the face shape features of the original image (e.g., the face shape features of step S431). The shape transformation model may be expressed, for example, by Equation 1 above.

When the face shape features are N-dimensional (where N is an integer of 1 or more), the shape transformation model may be modeled based on an age function for each face shape feature.
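
Equation 1 is not reproduced in this passage, so the following is only one possible reading of the description: each shape feature of the original image is shifted by the change in its own age function between the original age and the target age. The additive form, the function name, and the linear age functions are all assumptions made for illustration.

```python
def convert_shape(shape_orig, age_functions, age_orig, age_target):
    """Shift each shape feature by the change in its age function.

    Assumed additive form:
    s_target[i] = s_orig[i] + f_i(age_target) - f_i(age_orig).
    """
    return [s + f(age_target) - f(age_orig)
            for s, f in zip(shape_orig, age_functions)]


# Hypothetical per-feature age functions for N = 2 shape features.
age_functions = [lambda a: 0.5 * a, lambda a: -0.25 * a + 3.0]
shape_25 = convert_shape([1.0, 2.0], age_functions, age_orig=7, age_target=25)
```

With one age function per feature dimension, the subject-specific part of the shape (the original feature value) is preserved while only the age-dependent part changes.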

The shape transformation model is a model pre-trained, using a plurality of training samples and label data indicating the target age, to output face shape features at the target age, and each training sample may include the face shape features of a training subject at that age. For example, the shape transformation model is pre-trained to output face shape features at age 25 using the face shapes of a plurality of 25-year-old training subjects.

In some embodiments, when the shape transformation model is configured to output face shape features at any one of a plurality of ages, the shape transformation model is pre-trained using a plurality of training sample sets to output the face shape features for each of the plurality of ages. Each set includes a plurality of training samples at a specific age among the plurality of ages and label data indicating that age, and each training sample in a set may include the face shape features of a training subject at that age.

In an embodiment, generating the subject's age-converted face may include warping the facial texture at the target age onto the subject's face shape at the target age, thereby generating the face at the target age as the age-converted face.

FIG. 9 is a diagram illustrating an example result of the age conversion operation according to an embodiment of the present invention.

Referring to FIG. 9, an age-converted image may be generated in which the original image including the subject's face as a 7-year-old child is converted into the subject's face as a 25-year-old adult.

After step S30, the subject's job image is generated by synthesizing the age-converted face into the face region of the background image determined in step S20 (S50).

FIG. 10 is a flowchart of an image synthesis process according to an embodiment of the present invention.

Referring to FIG. 10, in the image synthesis process, landmarks are extracted from the subject's age-converted face of step S30 and from the face in the background image of step S20 (S510). The landmark extraction of step S510 is performed through a process identical or similar to the landmark extraction of step S410.

도 11은, 본 발명의 일 실시예에 따른, 나이변환 영상 및 배경 영상의 랜드마크 추출결과를 도시한 도면이다. 11 is a diagram illustrating a landmark extraction result of an age conversion image and a background image according to an embodiment of the present invention.

도 11을 참조하면, 사용자가 직업영상을 생성하기 위해 선택된 직업은 경찰로서, 배경영상은 직업으로서 경찰을 표현하는 영상이다. 단계(S510)에서, 대상의 나이변환 얼굴 및 배경영상의 얼굴에서 복수의 랜드마크를 포함한 랜드마크 세트가 각각 추출된다. Referring to FIG. 11 , a job selected by the user to generate a job image is a police officer, and a background image is an image representing a police officer as a job. In step S510, a landmark set including a plurality of landmarks is extracted from the age-converted face of the subject and the face of the background image, respectively.

일부 실시예에서, 배경영상의 얼굴의 랜드마크는 배경영상과 함께 미리 저장되어 있을 수 있다. 이 경우, 배경영상을 검색 시 검색된 배경영상에 연관된, 미리 저장된 랜드마크를 단계(S510)에서 사용할 수 있다. In some embodiments, the landmark of the face of the background image may be pre-stored together with the background image. In this case, a pre-stored landmark related to the searched background image may be used in step S510 when the background image is searched.

Based on the extracted landmarks, the landmarks of the age-converted face are mapped to the face region of the background image (S520).

FIG. 12 is a diagram showing the landmarks of the age-converted image mapped onto the background image, according to an embodiment of the present invention.

Referring to FIG. 12, the landmarks of the age-converted image shown in FIG. 11 are mapped onto the background image.

In one embodiment, the mapping of step S520 is performed based on the anatomical facial feature that each landmark represents.
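As a rough illustration of step S520, the sketch below pairs landmarks that carry the same anatomical label and places the age-converted landmarks on the background face by matching centroid and scale. The label names, the `map_landmarks` helper, and the centroid-and-scale alignment are assumptions made for illustration only; the embodiment does not state how the mapping points are computed.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def spread(points, center):
    """Root-mean-square distance of the points from a center."""
    total = sum((x - center[0]) ** 2 + (y - center[1]) ** 2 for x, y in points)
    return math.sqrt(total / len(points))


def map_landmarks(aged, background):
    """Map age-converted landmarks onto the background face (hypothetical helper).

    aged, background: dict of anatomical label -> (x, y).
    Only labels present in both sets are paired, so each landmark is matched
    with the landmark that represents the same anatomical feature.
    """
    labels = sorted(set(aged) & set(background))
    if not labels:
        raise ValueError("no shared anatomical labels")
    a_pts = [aged[k] for k in labels]
    b_pts = [background[k] for k in labels]
    ca, cb = centroid(a_pts), centroid(b_pts)
    sa = spread(a_pts, ca)
    scale = spread(b_pts, cb) / sa if sa else 1.0
    # Assumed alignment: translate to the background centroid and rescale.
    return {
        k: (cb[0] + scale * (aged[k][0] - ca[0]),
            cb[1] + scale * (aged[k][1] - ca[1]))
        for k in labels
    }
```

The returned dictionary plays the role of the mapping points toward which landmarks are later moved in steps S530 and S540.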

After the mapping, the landmarks of the face in the age-converted image are moved to the positions of the landmarks mapped onto the background image (S530). As a result, the face shape in the age-converted image can be warped to match the face shape in the background image.

FIG. 13 is a diagram showing the result of moving the landmarks of the face in the background image, according to an embodiment of the present invention.

Referring to FIG. 13, the landmarks of the age-converted image shown in FIG. 11 move to the mapping points of FIG. 12. In step S530, the face shape in the age-converted image may be warped based on the moved landmark positions of the age-converted image.

In addition, after the mapping, the landmarks of the face in the background image are moved to the positions of the landmarks mapped onto the background image (S540).

FIG. 14 is a diagram showing the result of moving the landmarks of the face in the age-converted image, according to an embodiment of the present invention.

Referring to FIG. 14, the landmarks of the background image shown in FIG. 11 move to the mapping points of FIG. 12 (S540). In step S540, the face shape in the background image may be warped based on the moved landmark positions of the background image.

By performing steps S530 and S540, the face shape in the age-converted image is matched to the face shape in the background image.

The position movement of step S530 or S540 is performed based on the mapping result (i.e., the mapping points) of step S520.
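The embodiment does not state how the pixels between landmarks follow the moved landmarks. One common choice, assumed here, is piecewise-affine warping over triangles of landmarks. The sketch below shows the core of that technique for a single triangle: a point is expressed in barycentric coordinates of the source triangle and re-expressed in the destination triangle. The function names are hypothetical.

```python
def barycentric(p, tri):
    """Barycentric weights of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    w2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return (w1, w2, 1.0 - w1 - w2)


def warp_point(p, src_tri, dst_tri):
    """Move p from the source triangle to the same relative spot in dst_tri.

    src_tri: three landmark positions before the move.
    dst_tri: the same three landmarks after being moved to their mapping points.
    """
    weights = barycentric(p, src_tri)
    x = sum(w * v[0] for w, v in zip(weights, dst_tri))
    y = sum(w * v[1] for w, v in zip(weights, dst_tri))
    return (x, y)
```

Applying `warp_point` to every pixel of every landmark triangle would warp a whole face region; a production system would more likely rely on an image-processing library for this.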

Then, a composite-region mask that filters the inner region may be generated based on the landmarks of the face in the background image (S550).

In one embodiment, the composite-region mask is generated using the outermost landmarks among the moved landmarks of the background image.

FIG. 15 is a diagram showing a composite-region mask according to an embodiment of the present invention.

Using the moved landmarks of the face in the background image of FIG. 13, the composite-region mask shown in FIG. 15 can be obtained.
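A minimal sketch of step S550 follows, assuming that the "outermost landmarks" can be approximated by the convex hull of the moved landmarks and that the mask is a boolean grid. Both choices, and the names `convex_hull`, `inside`, and `make_mask`, are illustrative assumptions rather than the method of the embodiment.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(poly, x, y):
    """Even-odd ray casting test for a point against a polygon."""
    result = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result


def make_mask(landmarks, width, height):
    """Boolean mask that is True for pixels whose center lies inside the hull."""
    hull = convex_hull(landmarks)
    return [[inside(hull, col + 0.5, row + 0.5) for col in range(width)]
            for row in range(height)]
```

Interior landmarks such as the nose tip do not affect the result, since only the hull vertices bound the mask.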

After step S550, the subject's face region in the warped age-converted image is filtered using the composite-region mask (S560). The filtered face region is then transplanted into the face region of the background image to generate the job image (S560).

Because of the warping that follows the landmark movement in steps S530 and S540, the subject's face shape in the age-converted image matches the face shape in the background image. Through the composite-region mask, the subject's face region at the target age, whose shape has been matched to the background image, is filtered.
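The filtering and transplanting of step S560 can be sketched as a per-pixel selection: where the mask is set, the pixel comes from the warped age-converted image, and elsewhere it comes from the background image. The nested-list image representation and the `transplant` name are assumptions for illustration; the embodiment does not specify blending at the mask boundary, and none is applied here.

```python
def transplant(background, warped_face, mask):
    """Copy masked pixels of warped_face over background (hypothetical helper).

    All three arguments are row-major grids of the same size. A new image is
    returned and the inputs are left unchanged.
    """
    if not (len(background) == len(warped_face) == len(mask)):
        raise ValueError("images and mask must have the same height")
    result = []
    for bg_row, face_row, mask_row in zip(background, warped_face, mask):
        if not (len(bg_row) == len(face_row) == len(mask_row)):
            raise ValueError("images and mask must have the same width")
        result.append([face if keep else bg
                       for bg, face, keep in zip(bg_row, face_row, mask_row)])
    return result
```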

FIG. 16 is a diagram showing a composite image when the target job is police officer, according to an embodiment of the present invention.

Referring to FIG. 16, it can be seen that the face shape of the age-converted image of FIG. 9 has been warped by the age conversion processing. In step S560, the subject's face region from the warped age-converted image is transplanted into the background image depicting a police officer.

As a result, a job image can be generated that shows the 7-year-old subject as a police officer at age 25 (S560).

FIG. 17 is a diagram showing a composite image when the target job is doctor, according to an embodiment of the present invention.

Referring to FIG. 17, it can be seen that the face shape of the age-converted image of FIG. 9 is warped somewhat differently from FIG. 16. This is because the mapping result (S520) obtained with the background image depicting a doctor differs from the mapping result (S520) obtained with the background image depicting a police officer.

In step S560, the subject's face region from the warped age-converted image is transplanted into the background image depicting a doctor. As a result, a job image can be generated that shows the 7-year-old subject as a doctor at age 25 (S560).

The job image generation process described above is not limited to still images. In one embodiment, the job-depicting image used as the background image may be a video composed of a plurality of frames. In this case, at least one of the plurality of frames includes an image depicting a job (e.g., a still background image).

In this case, a job image for a frame is generated using the still age-converted image together with at least one frame of the background video as the still background image.

The result of synthesizing the age-converted image and the background image may be generated as a video composed of a plurality of frames (hereinafter, "job video"). In one embodiment, the job video is generated based on the still age-converted image and a video in which the background image appears (hereinafter, "background video").

For example, a job image for a frame is generated using the still age-converted image together with at least one frame of the background video as the still background image, and a job video composed of frames having the job image is generated.

The process of generating the job image for each frame is similar to the process of generating a job image from a still age-converted image and a still background image, so a detailed description is omitted.
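The per-frame procedure can be sketched as a loop that applies the still-image synthesis to each frame that depicts the job and passes other frames through unchanged. The callbacks `make_job_image` and `depicts_job` are hypothetical placeholders for the still-image pipeline (S510 to S560) and for whatever test decides that a frame contains a usable face region.

```python
def make_job_video(frames, make_job_image, depicts_job):
    """Build a job video from a background video (illustrative sketch).

    frames: sequence of background-video frames.
    make_job_image: callable that synthesizes the age-converted face into one frame.
    depicts_job: callable that returns True for frames to be synthesized.
    """
    return [make_job_image(frame) if depicts_job(frame) else frame
            for frame in frames]
```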

The job image generated in step S50 is provided to the user (S60).

In step S60, a final output containing a combination that includes at least the composite photo, from among the input photo (i.e., the original image), the background photo (i.e., the background image), and the composite photo (i.e., the job image), is generated as a career guidance service image, and a screen displaying the career guidance service image is provided to the user.

Frames, stickers, text, pictures, and the like that the user can customize may be inserted into the career guidance service image.

FIGS. 18A and 18B are diagrams showing career guidance service images according to embodiments of the present invention.

Referring to FIG. 18A, the career guidance service image provided to the user may include the input photo, the composite photo, a background border, and the logo (KIST) of the company that manufactures the device 1 or provides the career guidance service.

Referring to FIG. 18B, the career guidance service image may include the background photo, the composite photo, a background border, and the company logo (KIST).

The career guidance service image may be provided to the user through photo printing, through communication such as e-mail or transmission to a mobile phone, or through display on a display device. By providing the career guidance service image, the device 1 and a system including it (e.g., a photo booth) can assist the subject in deciding on a career path.

In an alternative embodiment, rather than providing the career guidance service image immediately after the job image is generated (S50), the user's satisfaction with the job image is first checked (S51).

When the currently generated job image is provided to the user in step S51 through photo printing, through communication such as e-mail or transmission to a mobile phone, or through display on a display device, the device 1 is further configured to check the user's satisfaction with the provided job image. For example, a screen requesting a satisfied/dissatisfied judgment, such as “Are you satisfied with this image? - Yes/No” or “Would you like to print this image? - Yes/No”, may be provided as a full screen or as a pop-up.

If an input corresponding to satisfaction is received in step S51, the device 1 provides, through step S60, the final job image based on the background image determined in step S20.

If an input corresponding to dissatisfaction is received in step S51, the device 1 repeats the process from at least one of the previous steps (S10, S15, S20, S25, S30, and S50) through job image generation (S50) (S55).

In one embodiment, the device 1 is further configured to prompt, after receiving an input corresponding to dissatisfaction, for an input that determines the step from which the process restarts. The device 1 provides the user with a dissatisfaction list showing a dissatisfaction item for each step (S10, S15, S20, S25, S30, S50), and when the user selects at least one dissatisfaction item, the device may restart from the step associated with the selected item. For example, the dissatisfaction list may include “dissatisfied with the original image”, “dissatisfied with the conversion start age”, “dissatisfied with the background image”, “dissatisfied with the target age”, “dissatisfied with the age-converted image”, and “dissatisfied with the job image”. When an input for “dissatisfied with the background image” is received, the device 1 performs the operations from step S20 through step S50 again.

When a job image is then generated by the restarted process, satisfaction is checked again (S51), and when an input corresponding to satisfaction is received, a career guidance service image based on the regenerated job image is provided to the user (S60).
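The restart logic of steps S51 and S55 can be sketched as a lookup from a dissatisfaction item to the step from which processing resumes. Only the pairing of the background image item with step S20 is stated in the description; the other pairings below are inferred from the order in which the steps and items are listed, and restarting from the earliest selected step when several items are chosen is likewise an assumption.

```python
# Processing steps in execution order, as listed in the description.
STEPS = ["S10", "S15", "S20", "S25", "S30", "S50"]

# Assumed pairing of dissatisfaction items with steps (only S20 is confirmed).
ITEM_TO_STEP = {
    "original image": "S10",
    "conversion start age": "S15",
    "background image": "S20",
    "target age": "S25",
    "age-converted image": "S30",
    "job image": "S50",
}


def steps_to_rerun(items):
    """Return the steps to repeat, starting from the earliest selected item."""
    if not items:
        raise ValueError("at least one dissatisfaction item is required")
    start = min(STEPS.index(ITEM_TO_STEP[item]) for item in items)
    return STEPS[start:]
```

Selecting the background image item yields steps S20 through S50, which matches the example given in the description.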

The operations of the job image generating device 1, the system including it, and the job image generating method according to the embodiments described above may be implemented at least partially as a computer program and recorded on a computer-readable recording medium. For example, they may be implemented with a program product consisting of a computer-readable medium containing program code, which may be executed by a processor to perform any or all of the steps, operations, or processes described.

The computer may be a computing device such as a desktop computer, laptop computer, notebook, smartphone, or the like, or any device into which such a computing device may be integrated. A computer is a device having one or more general-purpose or special-purpose processors, memory, storage, and networking components (either wireless or wired). The computer may run an operating system such as, for example, an operating system compatible with Microsoft Windows, Apple OS X or iOS, a Linux distribution, or Google's Android OS.

The computer-readable recording medium includes all types of recording devices in which computer-readable data is stored. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium may also be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner. In addition, functional programs, code, and code segments for implementing the present embodiment will be readily understood by those skilled in the art to which the present embodiment belongs.

Although the present invention has been described above with reference to the embodiments shown in the drawings, these are merely examples, and those of ordinary skill in the art will understand that various modifications and variations of the embodiments are possible. Such modifications, however, should be regarded as falling within the technical protection scope of the present invention. Accordingly, the true technical protection scope of the present invention should be determined by the technical spirit of the appended claims.

Job images generated by the embodiments of the present invention may be used to provide an image service after a job experience at facilities specializing in children's job experiences (e.g., KidZania™, Job World™). They may also be used by synthesizing a child's age-converted adult face into content such as educational videos that introduce various occupations. In addition, they may be provided as souvenirs for birthdays, graduations, Children's Day, and the like, based on children's future aspirations surveyed at early childhood education facilities such as daycare centers and kindergartens.

In particular, by generating the age-converted image at the target age using a model based on machine learning, one of the fields of the Fourth Industrial Revolution, a natural-looking job image showing a young child's future appearance can be provided.

Therefore, as interest in children's future occupations continues to grow in an era of low birth rates, industrial applicability is expected to be very high.

Claims

A photo booth comprising:
a device for generating a job image having an age-converted face, the device including a processor; and a booth at least partially surrounded by walls to form a space in which a subject can be located,
wherein the device is configured to perform:
obtaining an original image including the subject's face at an original age;
obtaining original age information of the subject;
determining a background image in which a job is depicted, for use in generating a job image;
converting the subject's face at the original age into the subject's face at a target age; and
synthesizing the age-converted face into a face region of the determined background image to generate the job image of the subject,
wherein generating the job image of the subject comprises:
extracting landmarks of each of the face in the background image and the age-converted face;
mapping the landmarks of the age-converted face to the face region of the background image based on the extracted landmarks;
moving the landmarks of the age-converted face, which have positions on the age-converted image containing the age-converted face, to the positions on the job image of the landmarks of the age-converted face mapped to the background image, in order to warp the facial texture of the age-converted image;
moving the landmarks of the face in the background image, which have positions on the background image, to the positions on the job image of the landmarks of the age-converted face mapped to the background image, in order to warp the facial texture of the background image;
generating a composite-region mask for filtering an inner region based on the moved landmarks of the face in the background image; and
filtering the face region of the warped age-converted image using the composite-region mask, and transplanting the filtered face region into the face region of the background image.

The photo booth of claim 1,
wherein the device is further configured to perform providing the user with a career guidance service image that includes at least the job image from among the original image, the background image, the age-converted image, and the job image.

The photo booth of claim 1, wherein determining the background image for the job image comprises:
providing an interface screen that prompts a user input for selecting a background image by displaying at least some of previously stored candidate background images; and
when an input for selecting a background image is received, determining the candidate background image corresponding to the input as the background image for the job image of the subject,
wherein each candidate background image is configured to express characteristics of the job and includes at least part of the face of a person who has the corresponding job and who is different from the subject.

The photo booth of claim 3, wherein determining the background image for the job image further comprises:
before providing the interface screen that prompts the user input for selecting the background image, providing a selection screen for selecting a desired job by displaying a job-related list,
wherein the job-related list includes at least one of one or more job items and one or more job group items.

The photo booth of claim 3, wherein the interface screen,
when gender information of the subject is obtained, includes at least some of the candidate background images associated with the subject's gender.

The photo booth of claim 1, wherein converting into the subject's face at the target age comprises:
extracting landmarks from the subject's face in the original image;
generating a facial texture of the subject at the target age from the subject's face in the original image from which the landmarks have been extracted;
generating a face shape of the subject at the target age from the subject's face in the original image from which the landmarks have been extracted; and
generating the age-converted face of the subject based on the facial texture and the face shape of the subject at the target age.

The photo booth of claim 6, wherein generating the facial texture at the target age comprises:
generating a shape-free facial texture from the subject's face in the original image from which the landmarks have been extracted; and
applying the shape-free facial texture to a pre-trained texture transformation model to generate a shape-free facial texture of the subject at the target age.

The photo booth of claim 7, wherein the texture transformation model comprises:
a first generator that applies noise to input data of a first domain and outputs converted data of a second domain; and
a second generator that applies noise to the input data of the first domain and outputs converted data of a third domain,
wherein each generator is configured such that, when the converted data is converted back into the first domain, it is restored to the input data of the first domain.

The photo booth of claim 8, wherein the texture transformation model comprises
a generator that applies noise and condition information to input data of a first domain and outputs converted data of a plurality of different domains including a second domain and a third domain.

The photo booth of claim 8, wherein the texture transformation model is
a model trained in advance using a plurality of training samples to output data corresponding to a facial texture at the target age, wherein each training sample includes a facial texture of a training subject at the target age and label data indicating the gender of the training subject.

The photo booth of claim 7, wherein generating the face shape of the subject at the target age comprises:
extracting facial shape features of the subject in the original image based on the landmarks of the face in the original image;
generating facial shape features of the subject at the target age by applying the facial shape features of the original image to a pre-trained shape transformation model; and
restoring the face shape of the subject at the target age based on the facial shape features of the subject at the target age.

The photo booth of claim 11, wherein the shape transformation model
is generated by modeling the relationship between age and the facial shape features at that age, and
is modeled to calculate the facial shape features of the subject at the target age based on the difference between the age function value at the target age and the age function value at the original age, and on the facial shape features of the subject in the original image.

The photo booth of claim 12, wherein the shape transformation model,
when the facial shape features are N-dimensional (where N is an integer greater than or equal to 1), is modeled based on an age function for each facial shape feature.

The photo booth of claim 11, wherein the shape transformation model is
a model trained in advance, using a plurality of training samples and label data indicating the target age, to output facial shape features at the target age, wherein the training samples in each set include facial shape features of a training subject at the corresponding age.

The photo booth of claim 7, wherein generating the age-converted face of the subject comprises:
warping the facial texture at the target age onto the face shape of the subject at the target age, thereby generating the face at the target age as the age-converted face.

(Deleted)

The photo booth of claim 1, wherein the mapping
is performed based on the anatomical facial feature that each landmark represents.

The photo booth of claim 2, wherein providing the career guidance service image comprises
at least one of displaying the career guidance service image, printing the career guidance service image, and transmitting the career guidance service image.

(Deleted)

The photo booth of claim 1,
further comprising at least one of: lighting that provides light to the subject; a chair on which the subject sits; and a screen that separates the inside of the booth from the outside.