KR20200097201A - Method and apparatus of generating 3d data based on deep learning

Info

Publication number: KR20200097201A
Application number: KR1020200009896A
Authority: KR
Inventors: 신승식
Original assignee: 주식회사 블루프린트랩
Priority date: 2019-02-07
Filing date: 2020-01-28
Publication date: 2020-08-18
Also published as: KR102160955B1

Abstract

Disclosed are a deep learning-based 3D data generation method and an apparatus therefor. According to an embodiment of the present invention, the deep learning-based 3D data generation method comprises the steps of: acquiring an image including a face; extracting a region of interest including at least a part of the face from the image; generating, based on the region of interest, an input corresponding to an input layer of a convolutional neural network; applying the input to the convolutional neural network to acquire an output generated by the convolutional neural network; generating, based on the output, 3D data of the face including positions respectively corresponding to predefined parts of the face; representing the predefined parts of the face with different constants; including some or all of the constants in any one or more of preset groups; and acquiring 3D data of a specific area of the face from the 3D data of the face based on the constants of one or more of the groups. According to the present invention, the 3D data of the face can be used in various fields.

Description

Deep learning based 3D data generation method and device {METHOD AND APPARATUS OF GENERATING 3D DATA BASED ON DEEP LEARNING}

The following embodiments relate to a technology for generating 3D data based on deep learning.

Deep learning technology based on neural networks is used in various fields. For example, deep learning-based biometric recognition and authentication applications that recognize faces, irises, fingerprints, and the like are employed in terminals and other devices. In particular, a convolutional neural network (CNN) is a multi-layer neural network that uses convolution operations, and it performs well in deep learning-based recognition, classification, and inference on videos and images.

Separately, the true depth camera technology recently introduced in smartphones and similar devices supports not only security functions such as Face ID, which recognizes a face through the camera, but also functions such as Animoji, which analyzes the user's facial expressions, muscles, and bone structure to create animal emoticons. In this way, it helps 3D data of the face to be used in various ways.

However, the true depth camera is currently applied only to a small number of high-end smartphones such as the iPhone XS and iPhone XR, and the smartphones, tablet computers, and similar devices used by most general users include cameras to which true depth camera technology is not applied. Therefore, although their resolution may have improved, the cameras in commonly used smartphones other than a few high-end models can still capture only 2D images, and 3D data cannot be obtained from them. In short, the majority of general users, who do not use top-end phones, cannot yet enjoy the improved functions that use 3D data of a human face.

Accordingly, there is a demand for a method and an apparatus that, based on a convolutional neural network, which performs well in video and image recognition among deep learning technologies, generate 3D data of a human face from a photograph taken with an ordinary camera alone, thereby helping many users make use of 3D data of the face.

A deep learning-based 3D data generation method according to an embodiment for solving the above problem may include: acquiring an image including a face; extracting, from the image, a region of interest including at least a part of the face; generating, based on the region of interest, an input corresponding to an input layer of a convolutional neural network; applying the input to the convolutional neural network to acquire an output generated by the convolutional neural network; and generating, based on the output, 3D data of the face including positions respectively corresponding to predefined parts of the face.

According to an embodiment, generating the 3D data of the face may include: generating, in the convolutional neural network, first values corresponding to a first part of the face by combining first output values respectively obtained from first output nodes corresponding to the first part; obtaining, from the constants, a first constant corresponding to the first part; determining, for each of the groups constituting the face (where at least some of the constants are allowed to be duplicated and included in different groups), whether the first constant is included, and thereby identifying a first group and a second group that include the first constant among the groups that constitute the face by overlapping the parts with one another; identifying the order of the first constant within the sequence of constants in the first group and including the first values at the identified order in the first group; identifying the order of the first constant within the sequence of constants in the second group and including the first values at the identified order in the second group; and generating 3D data corresponding to the first part, which is included in the intersection of the predefined parts of the face corresponding to the constants of the first group and the predefined parts of the face corresponding to the constants of the second group, by referring to the order corresponding to the first constant, which is included in the intersection of the constants of the first group and the constants of the second group, in the sequence of the union of the constants of the first group and the constants of the second group.

According to an embodiment, the predefined parts of the face may each be represented by different constants.

A deep learning-based 3D data generation method according to an embodiment may include: including some or all of the constants in any one or more of preset groups; and acquiring 3D data of a specific area of the face from the 3D data of the face based on the constants of one or more of the groups.
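The grouping of constants described above can be illustrated with a minimal Python sketch. The constants, coordinates, and group names (`left_eye`, `nose`) below are hypothetical values chosen for illustration and do not appear in the specification; the sketch only assumes that each predefined part is identified by a distinct integer constant and that a constant may belong to more than one group.

```python
# Hypothetical 3D data of a face: constant -> (x, y, z) position of that part.
face_3d = {
    0: (0.10, 0.52, 0.03),
    1: (0.18, 0.50, 0.05),
    2: (0.25, 0.55, 0.09),
    3: (0.25, 0.70, 0.08),
    4: (0.32, 0.71, 0.06),
}

# Hypothetical preset groups. Constant 2 is shared by both groups,
# since duplication across groups is allowed.
groups = {
    "left_eye": [0, 1, 2],
    "nose": [2, 3, 4],
}


def groups_containing(groups, constant):
    """Names of every group whose sequence includes the given constant."""
    return [name for name, members in groups.items() if constant in members]


def order_in_group(groups, name, constant):
    """Position of the constant within the group's sequence of constants."""
    return groups[name].index(constant)


def region_3d(face, groups, names):
    """3D data of the area covered by the named groups.

    Constants are taken in first-seen order over the union of the groups,
    so a constant shared by two groups appears only once.
    """
    ordered = []
    for name in names:
        for constant in groups[name]:
            if constant not in ordered:
                ordered.append(constant)
    return {constant: face[constant] for constant in ordered}


nose_only = region_3d(face_3d, groups, ["nose"])
eye_and_nose = region_3d(face_3d, groups, ["left_eye", "nose"])
```

Looking up an area this way needs only the group's list of constants, which is why downstream uses such as tracking or editing a single facial area do not have to search the full 3D data.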

According to an embodiment, acquiring the output may include: generating a feature map by performing a convolution operation on the input through a convolution layer; applying an activation function to the feature map; generating a pooled feature map by pooling the feature map to which the activation function has been applied; and acquiring the output from the pooled feature map through a fully connected layer.
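The four stages above (convolution, activation, pooling, fully connected layer) can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the network of the embodiment: the 5x5 input, the 2x2 kernel, the max pooling choice, and the weights are all hypothetical, and the three outputs merely stand in for the x, y, z values of one part.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 sliding-window product sum.

    As in most CNN frameworks, the kernel is not flipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Rectified linear unit applied element-wise."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling over size x size windows."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def fully_connected(feature_map, weights, biases):
    """Flatten the pooled feature map and apply one dense layer."""
    flat = [v for row in feature_map for v in row]
    return [sum(w * x for w, x in zip(ws, flat)) + b
            for ws, b in zip(weights, biases)]


def forward(image, kernel, weights, biases):
    feature_map = conv2d(image, kernel)    # 5x5 input -> 4x4 feature map
    activated = relu(feature_map)
    pooled = max_pool(activated)           # 4x4 -> 2x2 pooled feature map
    return fully_connected(pooled, weights, biases)


# Hypothetical single-channel input and parameters.
image = [[float(i + j) for j in range(5)] for i in range(5)]
kernel = [[-1.0, 1.0], [-1.0, 1.0]]
weights = [[0.25] * 4, [0.5] * 4, [0.0] * 4]
biases = [0.0, 0.0, 1.0]
output = forward(image, kernel, weights, biases)  # [2.0, 4.0, 1.0]
```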

According to an embodiment, the activation function may be a rectified linear unit (ReLU).

According to an embodiment, generating the input may include processing the region of interest so that its horizontal length and vertical length are equal.

According to an embodiment, generating the input may include converting the region of interest into a black-and-white image.

A deep learning-based 3D data generation method according to an embodiment may include generating motion tracking data of the specific area of the face based on the 3D data of the specific area of the face.

A deep learning-based 3D data generation method according to an embodiment may include: generating modified 3D data from the 3D data of the face, based on the 3D data of the face and the 3D data of the specific area of the face, by changing the positions corresponding to some or all of the constants of the group corresponding to the 3D data of the specific area of the face; and generating a stereolithography (STL) file based on the modified 3D data.
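As a rough illustration of this step, the sketch below moves the positions belonging to one group and serializes the result as an ASCII STL string. The specification does not say how the positions are connected into triangles, so the `triangles` index list, the four-point "face", and the function names are all hypothetical; only the ASCII STL layout (`solid`, `facet normal`, `outer loop`, `vertex`, `endloop`, `endfacet`, `endsolid`) follows the standard format.

```python
import math


def shift_region(face, constants, offset):
    """Return a copy of the 3D data with the given constants moved by offset."""
    dx, dy, dz = offset
    return {
        c: (x + dx, y + dy, z + dz) if c in constants else (x, y, z)
        for c, (x, y, z) in face.items()
    }


def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c) via the cross product (b-a) x (c-a)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    length = math.sqrt(sum(k * k for k in n))
    return [k / length for k in n] if length else [0.0, 0.0, 0.0]


def to_ascii_stl(face, triangles, name="face"):
    """Serialize triangles (triples of constants) as an ASCII STL string."""
    lines = [f"solid {name}"]
    for i, j, k in triangles:
        a, b, c = face[i], face[j], face[k]
        nx, ny, nz = facet_normal(a, b, c)
        lines.append(f"  facet normal {nx:.6e} {ny:.6e} {nz:.6e}")
        lines.append("    outer loop")
        for x, y, z in (a, b, c):
            lines.append(f"      vertex {x:.6e} {y:.6e} {z:.6e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# Hypothetical data: four positions, with constant 3 forming the edited area.
face = {
    0: (0.0, 0.0, 0.0),
    1: (1.0, 0.0, 0.0),
    2: (0.0, 1.0, 0.0),
    3: (1.0, 1.0, 0.0),
}
edited_group = [3]
triangles = [(0, 1, 2), (1, 3, 2)]  # assumed connectivity

modified = shift_region(face, set(edited_group), (0.0, 0.0, 0.5))
stl_text = to_ascii_stl(modified, triangles)
```

Writing `stl_text` to a file with an `.stl` extension would give a mesh that common slicers and viewers can open, provided the assumed triangle list describes a sensible surface.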

A deep learning-based 3D data generation method according to an embodiment may include generating 3D authentication data of all or part of a user's face based on the 3D data of the face or the 3D data of the specific area of the face.

According to an embodiment, training of the convolutional neural network may include: acquiring an image including a face as training data; extracting, from the training data, a region of interest of the training data including at least a part of the face; generating, based on the region of interest of the training data, an input of the training data corresponding to the input layer of the convolutional neural network; applying the input of the training data to the convolutional neural network to acquire an output of the training data generated by the convolutional neural network; and optimizing the convolutional neural network by comparing the output of the training data with labeled data.

According to an embodiment, the labeled data may be data obtained by processing the 3D data of the face included in the training data into the output format of the convolutional neural network.
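The idea of comparing the network output with labeled data and optimizing can be sketched as follows. Several things here are assumptions made only for illustration: a single linear layer stands in for the convolutional neural network, the comparison uses mean squared error, the optimizer is plain gradient descent, and the output format is taken to be x, y, z per constant in constant order. The specification does not fix any of these choices.

```python
def to_label_vector(face_3d, constants):
    """Flatten 3D positions into the assumed output layout: x, y, z per constant."""
    return [coord for c in constants for coord in face_3d[c]]


def mse(output, label):
    """Mean squared error between the network output and the labeled data."""
    return sum((o - t) ** 2 for o, t in zip(output, label)) / len(label)


def train_step(weights, x, label, lr):
    """One gradient-descent step for output = W x (stand-in for the CNN).

    Returns the updated weights and the loss measured before the update.
    """
    output = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    n = len(label)
    new_weights = [
        [w - lr * (2.0 / n) * (o - t) * xi for w, xi in zip(row, x)]
        for row, o, t in zip(weights, output, label)
    ]
    return new_weights, mse(output, label)


# Hypothetical training sample: a 2-value input and one labeled 3D position.
x = [1.0, 0.5]
label = to_label_vector({0: (0.1, 0.2, 0.3)}, [0])
weights = [[0.0, 0.0] for _ in label]

losses = []
for _ in range(50):
    weights, loss = train_step(weights, x, label, lr=0.5)
    losses.append(loss)
```

In this toy setting the loss shrinks at every step, which mirrors the role of the comparison with labeled data in driving the optimization.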

The deep learning-based 3D data generation method according to an embodiment may be executed by a computer program stored in a medium.

A deep learning-based 3D data generation apparatus according to an embodiment may include a processor that: acquires an image including a face; extracts, from the image, a region of interest including at least a part of the face; generates, based on the region of interest, an input corresponding to an input layer of a convolutional neural network; applies the input to the convolutional neural network to acquire an output generated by the convolutional neural network; generates, based on the output, 3D data of the face including positions respectively corresponding to predefined parts of the face; represents the predefined parts of the face with different constants; includes some or all of the constants in any one or more of preset groups; and acquires 3D data of a specific area of the face from the 3D data of the face based on the constants of one or more of the groups.

According to the deep learning-based 3D data generation method of an embodiment, the 3D data of the face generated by the deep learning-based 3D data generation apparatus includes positions respectively corresponding to predefined parts of the face; the predefined parts of the face are each represented by different constants; some or all of the constants may be included in any one or more of preset groups; and 3D data of a specific area of the face can be acquired from the 3D data of the face based on the constants of one or more of the preset groups. Because of these features, the 3D data of the face can be used in various fields.

For example, because the deep learning-based 3D data generation apparatus can easily obtain, from the 3D data of the face, the positions that make up the 3D data of a specific area of the face by referring to the constants of the group corresponding to that area, it can easily generate motion tracking data for a specific part of the face.

In addition, because the deep learning-based 3D data generation apparatus can easily obtain, from the 3D data of the face, the positions that make up the 3D data of a specific area of the face by referring to the constants of the group corresponding to that area, and can easily change those positions, it can easily generate a stereolithography (STL) file in which a specific area of the face has been changed.

Furthermore, because the deep learning-based 3D data generation apparatus can easily obtain, from the 3D data of the face, the positions that make up the 3D data of a specific area of the face by referring to the constants of the group corresponding to that area, it can easily generate 3D authentication data composed of a part of the face.

Meanwhile, the effects of the embodiments are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art from the description below.

FIG. 1 is a flowchart illustrating a deep learning-based 3D data generation method according to an embodiment.
FIG. 2 is a diagram illustrating a step of extracting a region of interest from an acquired image and a step of generating, based on the extracted region of interest, an input corresponding to an input layer of a convolutional neural network, according to an embodiment.
FIG. 3 is a diagram illustrating a convolutional neural network according to an embodiment.
FIG. 4 is a diagram illustrating 3D data of a face according to an embodiment.
FIG. 5 is a diagram illustrating an application example of 3D data of a face according to an embodiment.
FIG. 6 is a flowchart illustrating a process of training a convolutional neural network according to an embodiment.
FIG. 7 is a block diagram illustrating a process of training a convolutional neural network according to an embodiment.
FIG. 8 is an exemplary diagram of the configuration of a deep learning-based 3D data generation apparatus according to an embodiment.

Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only, and the embodiments may be modified and implemented in various forms. Accordingly, the embodiments are not limited to a specific disclosed form, and the scope of the present specification includes changes, equivalents, and substitutes that fall within its technical idea.

Terms such as first and second may be used to describe various components, but these terms should be interpreted only as distinguishing one component from another. For example, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component.

When a component is referred to as being "connected" to another component, it should be understood that it may be directly connected or coupled to the other component, but that other components may also exist in between.

Singular expressions include plural expressions unless the context clearly indicates otherwise. In the present specification, terms such as "include" or "have" are intended to specify the presence of the described features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood as not precluding the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by one of ordinary skill in the relevant technical field. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related technology, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this specification.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart illustrating a deep learning-based 3D data generation method according to an embodiment. Referring to FIG. 1, the deep learning-based 3D data generation apparatus may acquire an image including a human face (110). The deep learning-based 3D data generation apparatus is an apparatus that, in addition to performing general computation, storage, communication, and input/output functions, processes an artificial neural network such as a convolutional neural network to generate 3D data of a human face or the like, and it may be implemented as a software module, a hardware module, or a combination thereof. For example, the deep learning-based 3D data generation apparatus may deliver or process operations, computations, and instructions related to the convolutional neural network to generate the desired 3D data. The deep learning-based 3D data generation apparatus may be mounted on various computing devices and/or systems, such as smartphones, tablet computers, laptop computers, desktop computers, televisions, smart watches, wearable devices, security systems, smart home systems, dedicated servers, and general-purpose servers.

Meanwhile, as long as the image including a human face can be acquired by the deep learning-based 3D data generation apparatus, there is no restriction on the specific acquisition method. For example, if the hardware device includes a camera, as a smartphone or tablet computer does, the deep learning-based 3D data generation apparatus may be connected to an application linked to the camera or photo album to acquire captured and stored images. Alternatively, the deep learning-based 3D data generation apparatus may acquire an image including a human face by reading, through its hardware and software configuration, an image file stored in non-volatile memory such as a USB drive, SSD, hard drive, or DVD, or in volatile memory such as DRAM or SRAM. Alternatively, the deep learning-based 3D data generation apparatus may acquire an image including a human face by receiving an image file from a server or another device over the wired or wireless Internet, a leased line, an intranet, or the like.

Next, the deep learning-based 3D data generation apparatus may extract, from the acquired image, a region of interest (RoI) including at least a part of the face (120). The way the deep learning-based 3D data generation apparatus extracts the region of interest including at least a part of the face from the acquired image may be as shown in FIG. 2. FIG. 2 is a diagram illustrating a step of extracting a region of interest from an acquired image and a step of generating, based on the extracted region of interest, an input corresponding to an input layer of a convolutional neural network, according to an embodiment.

Referring to FIG. 2, the deep learning-based 3D data generation apparatus may extract a region of interest (210) from the acquired image (200) so that some or all of the eyes, nose, mouth, forehead, cheekbones, cheeks, chin, and the like are included. To extract the region of interest (210) from the acquired image (200), a general-purpose face recognition API may be used as is, or the region of interest may be extracted from the image including the face through additional computation on top of the general-purpose face recognition API.

For example, general-purpose face recognition APIs may include the Face API provided by Microsoft Azure and the face detection feature of the Cloud Vision API provided by Google Cloud. Meanwhile, if the target region of interest cannot be extracted by using the general-purpose face recognition API alone, the region of interest may be extracted from the image including the face through additional computation on top of the general-purpose face recognition API.

For example, if the general-purpose face recognition API extracts too much of the background around the face from the acquired image (200), the deep learning-based 3D data generation apparatus may extract a region of interest (210) that excludes the background portion through additional computation and control. Alternatively, if the general-purpose face recognition API does not sufficiently extract the periphery of the face, such as the chin, cheeks, and cheekbones, from the acquired image (200), the deep learning-based 3D data generation apparatus may extract a region of interest (210) that includes the periphery of the face through additional computation and control.
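One simple form such additional computation could take is rescaling the detector's bounding box about its center. The sketch below assumes a hypothetical box format of `(x, y, width, height)` in pixels and a hypothetical `adjust_roi` helper; it does not reproduce the response format of any particular face recognition API.

```python
def adjust_roi(box, image_size, scale):
    """Scale a bounding box about its center and clamp it to the image.

    box        -- (x, y, width, height) in pixels (assumed detector output)
    image_size -- (image_width, image_height) in pixels
    scale      -- > 1 grows the box to take in the chin and cheeks,
                  < 1 shrinks it to cut away background
    """
    x, y, w, h = box
    img_w, img_h = image_size
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * scale, h * scale
    left = max(0, int(round(cx - new_w / 2.0)))
    top = max(0, int(round(cy - new_h / 2.0)))
    right = min(img_w, int(round(cx + new_w / 2.0)))
    bottom = min(img_h, int(round(cy + new_h / 2.0)))
    return left, top, right - left, bottom - top


detected = (100, 80, 200, 200)                     # hypothetical detector result
wider = adjust_roi(detected, (640, 480), 1.2)      # include more face periphery
tighter = adjust_roi(detected, (640, 480), 0.8)    # drop surrounding background
clamped = adjust_roi((0, 0, 100, 100), (120, 120), 1.5)
```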

Next, returning to FIG. 1, the deep learning-based 3D data generation apparatus may generate, based on the region of interest, an input corresponding to the input layer of the convolutional neural network (130). The way the deep learning-based 3D data generation apparatus generates the input (220) based on the region of interest (210) may be as shown in FIG. 2.

In generating the input (220) from the region of interest (210), there may be a process of making the horizontal length and the vertical length of the region of interest (210) equal. The unit of length may be a physical unit such as cm, mm, or inch, or an image unit such as dpi or ppi. As a result, the input (220) generated from the region of interest (210) may be a square image file. In general, the convolution operation is performed with a square kernel, and the basic unit of pooling is also square, so making the input (220) a square image file allows the convolution operation to be performed properly on every part of the image extracted as the region of interest (210).
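A minimal sketch of one way to equalize the two side lengths is shown below. It pads the shorter side with a fill value so that no part of the region of interest is lost; the specification does not say whether padding, cropping, or resizing is used, so the padding strategy, the `pad_to_square` name, and the nested-list pixel layout are assumptions.

```python
def pad_to_square(pixels, fill=0):
    """Center a rectangular pixel grid inside a square grid.

    pixels -- list of rows, each a list of pixel values
    fill   -- value used for the added border
    """
    height, width = len(pixels), len(pixels[0])
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    square = [[fill] * side for _ in range(side)]
    for i, row in enumerate(pixels):
        for j, value in enumerate(row):
            square[top + i][left + j] = value
    return square


roi = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
square_input = pad_to_square(roi)
# [[0, 0, 0, 0],
#  [1, 2, 3, 4],
#  [5, 6, 7, 8],
#  [0, 0, 0, 0]]
```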

In addition, in generating the input (220) from the region of interest (210), there may be a process of converting the region of interest (210) into a black-and-white image. As a result, the input (220) generated from the region of interest (210) may be a single-channel image file without color information. When a single-channel image file is input to the convolutional neural network, the network only needs to compute feature maps that have height and width but no depth, so the target output can be obtained faster and with less computing power. In this way, the computation of the convolutional neural network can be performed easily even when the deep learning-based 3D data generation apparatus is mounted on an ordinary smartphone rather than on a dedicated server or a top-end smartphone.
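The conversion to a single channel can be sketched as a weighted sum of the red, green, and blue values. The weights below are the widely used ITU-R BT.601 luma coefficients; the specification does not state which conversion is used, so this choice and the `to_grayscale` helper are assumptions.

```python
def to_grayscale(rgb_pixels):
    """Convert rows of (R, G, B) tuples into rows of single 0-255 gray values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in rgb_pixels
    ]


color_roi = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 255, 0)],
]
gray_roi = to_grayscale(color_roi)  # [[255, 0], [76, 150]]
```

Each pixel drops from three values to one, which is the reduction in input size that the paragraph above relies on.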

Next, returning again to FIG. 1, the deep learning-based 3D data generation apparatus may apply the input to the convolutional neural network to acquire the output generated by the convolutional neural network (140). The configuration and operating principle of the convolutional neural network included in the deep learning-based 3D data generation apparatus may be as shown in FIG. 3. FIG. 3 is a diagram illustrating a convolutional neural network according to an embodiment.

The convolutional neural network (300) may load a kernel or the input (220) from a pre-built database, and the database may be implemented as memory included in the deep learning-based 3D data generation apparatus or as an external device, such as a server, that can be connected to the deep learning-based 3D data generation apparatus by wire, wirelessly, or over a network.

In machine learning, the convolutional neural network 300, which is a type of neural network, includes convolutional layers 310 and 320 designed to perform convolution operations. Each convolutional layer 310, 320 of the convolutional neural network 300 may perform a convolution operation on its input using at least one kernel. When the convolutional neural network 300 includes a plurality of convolutional layers 310 and 320, as in FIG. 3, the deep learning-based 3D data generation apparatus may perform the convolution operation corresponding to each convolutional layer 310, 320, and thus may perform a plurality of convolution operations. The sizes of the input, kernel, and output of each convolutional layer 310, 320 may be defined according to how that layer is designed.

Specifically, the first convolutional layer 310 serves as the input layer that receives the input 220, and the input 220 may be convolved with a kernel to generate a feature map 311. The kernels may include, for example, a kernel that detects brightness features of the input 220 or a kernel that detects edge features, and kernels from known models such as AlexNet, ConvNet, LeNet-5, the Inception network, and GoogLeNet may be used. However, the kernels are not limited to these, and kernels used in various convolutional neural network models may be used.
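As a rough sketch of how a kernel turns the single-channel input into a feature map, the following slides a 3 x 3 kernel over an image without padding. The Sobel-like edge kernel and the 5 x 5 test image are illustrative assumptions, not kernels taken from the description. As in most CNN frameworks, the kernel is applied without flipping (strictly a cross-correlation).

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the window and the kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Assumed edge-detecting kernel (Sobel-like, responds to vertical edges).
edge_kernel = np.array([[1.0, 0.0, -1.0],
                        [2.0, 0.0, -2.0],
                        [1.0, 0.0, -1.0]])

# Hypothetical image: dark on the left, bright from column 3 onward.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

feature_map = conv2d_valid(image, edge_kernel)  # shape (3, 3)
```

The feature map is zero where the window sees a flat region and non-zero where the window straddles the dark-to-bright edge.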

Next, an activation function 312 may be applied to the obtained feature map 311. This allows the network to learn nonlinear relationships. Known activation functions such as the sigmoid function, the hyperbolic tangent function, and the rectified linear unit (ReLU) may be used. However, the activation function is not limited to these, and activation functions used in various convolutional neural network models may be used.

Specifically, the activation function 312 may be ReLU. When ReLU is used as the activation function 312, the vanishing gradient problem that arises in deep neural networks composed of many layers can be avoided. In addition, ReLU is a simpler function than, for example, the sigmoid function, so the target output can be obtained faster and with less computing power. As a result, the computation of the convolutional neural network can be performed easily even when the deep learning-based 3D data generation apparatus is mounted on an ordinary smartphone rather than a dedicated server or a high-end smartphone.
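The contrast between ReLU and the sigmoid function can be illustrated with a short sketch. The function names and sample values are hypothetical. The gradient helpers show the property behind the vanishing-gradient remark: the sigmoid derivative never exceeds 0.25, whereas the ReLU derivative is 1 for every positive input.

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    # A single comparison per element; no exponential is needed.
    return np.maximum(x, 0.0)

def relu_grad(x: np.ndarray) -> np.ndarray:
    return (x > 0).astype(np.float64)

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x: np.ndarray) -> np.ndarray:
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical feature-map values.
feature_map = np.array([[-4.0, 0.0], [2.5, -1.0]])
activated = relu(feature_map)

# Largest possible sigmoid slope (attained at 0) versus the ReLU slope at 2.5.
max_sigmoid_slope = float(sigmoid_grad(np.array([0.0]))[0])
relu_slope = float(relu_grad(np.array([2.5]))[0])
```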

Next, the feature maps to which the activation function has been applied may be pooled (313) to generate a first pooled feature map 314. Known pooling methods such as max pooling and mean pooling may be used. However, the pooling is not limited to these, and pooling methods used in various convolutional neural network models may be used. The pooling 313 reduces the total number of nodes and the amount of computation in the convolutional neural network while retaining meaningful information, so the target output can be obtained faster and with less computing power. As a result, the computation of the convolutional neural network can be performed easily even when the deep learning-based 3D data generation apparatus is mounted on an ordinary smartphone rather than a dedicated server or a high-end smartphone.
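A minimal sketch of max pooling, assuming non-overlapping 2 x 2 windows (the description does not fix a window size or stride), shows how the number of values drops by a factor of four while the strongest response in each window is kept:

```python
import numpy as np

def max_pool2d(feature_map: np.ndarray, size: int = 2) -> np.ndarray:
    """Non-overlapping max pooling; trailing rows/columns that do not fill a window are dropped."""
    h, w = feature_map.shape
    out_h, out_w = h // size, w // size
    trimmed = feature_map[:out_h * size, :out_w * size]
    # Split each axis into (window index, offset inside window), then reduce the offsets.
    return trimmed.reshape(out_h, size, out_w, size).max(axis=(1, 3))

# Hypothetical 4 x 4 activated feature map with values 0..15.
feature_map = np.arange(16, dtype=np.float64).reshape(4, 4)
pooled = max_pool2d(feature_map)  # 16 values reduced to 4
```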

Meanwhile, in the convolutional neural network 300, the process of applying an activation function to a feature map generated by a convolutional layer and then pooling it may be performed multiple times. Specifically, the second convolutional layer 320 of the convolutional neural network 300 may obtain the first pooled feature map 314 as its input and generate a second feature map 321 through a convolution operation. An activation function 322 may then be applied to the second feature map 321, and the second feature map to which the activation function has been applied may be pooled (323) to generate a second pooled feature map 324.

In addition, although not shown, there may be an n-th convolutional layer. The n-th convolutional layer may obtain the (n-1)-th pooled feature map as its input and generate an n-th feature map through a convolution operation. An activation function may then be applied to the n-th feature map, and the n-th feature map to which the activation function has been applied may be pooled to generate an n-th pooled feature map. Generating pooled feature maps multiple times in this way reduces the total number of nodes and the amount of computation in the convolutional neural network while retaining meaningful information, so the target output can be obtained faster and with less computing power.

Subsequently, a fully connected layer 390 may connect all of the values of the last pooled feature map 324 to generate an output 391 as the result.
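The fully connected step can be sketched as a single matrix multiplication over the flattened pooled feature map. The sizes used here (a 2 x 2 pooled map, three facial parts with three coordinates each) and the random weights are placeholders. The description does not state how many output nodes the layer has.

```python
import numpy as np

def fully_connected(pooled: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Connect every value of the pooled feature map to every output node."""
    return weights @ pooled.ravel() + bias

rng = np.random.default_rng(0)
pooled = rng.standard_normal((2, 2))      # hypothetical last pooled feature map
num_parts = 3                             # assumed number of predefined facial parts
weights = rng.standard_normal((num_parts * 3, pooled.size))
bias = np.zeros(num_parts * 3)

output = fully_connected(pooled, weights, bias)  # one value per output node
```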

Referring again to FIG. 1, the deep learning-based 3D data generation apparatus may generate, based on the output, 3D data of the face that includes positions respectively corresponding to predefined parts of the face (150). The 3D data of the face including the positions respectively corresponding to the predefined parts of the face may be as shown in FIG. 4. FIG. 4 is a diagram illustrating 3D data of a face according to an embodiment.

Referring to FIG. 4, the deep learning-based 3D data generation apparatus may generate, as 3D data, the positions respectively corresponding to the predefined parts of the face. For example, there may be a predefined first part (1) defined as "the part of the right eye located farthest from the nose". This part exists on every person's face, but its specific position generally differs from face to face. Based on the output 391 of the convolutional neural network 300, the deep learning-based 3D data generation apparatus may generate specific position data for the predefined "part of the right eye located farthest from the nose", that is, for the first part (1) on the face in the acquired image 200.

The position of the first part (1) may be expressed as three real numbers (x1, y1, z1) relative to a preset reference coordinate (O). The positions corresponding to the other predefined parts may likewise be expressed as three real numbers (xn, yn, zn) relative to the same, unchanged reference coordinate (O).

Meanwhile, in addition to the first part (1), the deep learning-based 3D data generation apparatus may generate position data for other predefined parts. For example, there may be a predefined seventh part (7) defined as "the part of the right eye located closest to the nose". This part exists on every person's face, but its specific position generally differs from face to face. Based on the output 391 of the convolutional neural network 300, the deep learning-based 3D data generation apparatus may generate specific position data for the predefined "part of the right eye located closest to the nose", that is, for the seventh part (7) on the face in the acquired image 200.

By repeating this process, the deep learning-based 3D data generation apparatus may generate position data for the predefined parts of the face, and may combine the position data to generate 3D data of the face that includes the positions respectively corresponding to the predefined parts of the face.

Meanwhile, the deep learning-based 3D data generation apparatus may represent the predefined parts of the face with mutually different constants. For example, referring to FIG. 4, the first part (1), "the part of the right eye located farthest from the nose", may be represented by the constant 1; the second part (2), "the highest part of the right eye", by the constant 2; and the seventh part (7), "the part of the right eye located closest to the nose", by the constant 7. Specifically, when the 3D data of the face is stored as a list or an array, the indices of the list or array may serve as the constants corresponding to the predefined parts of the face. However, the constants need not be numbers. As long as they distinguish the predefined parts of the face from one another, they may consist of letters, special characters, numbers, or a combination of these.
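One way to picture the constant-as-index idea is an array with one (x, y, z) row per predefined part. The coordinate values below are invented for illustration, and leaving row 0 unused so that the constants start at 1 is a choice made for this sketch, not something the description requires. A dictionary covers the case where the constants are not numbers.

```python
import numpy as np

# Row i holds the (x, y, z) position of the part whose constant is i.
# All coordinates are hypothetical, measured from the reference coordinate O.
face_3d = np.array([
    [0.0, 0.0, 0.0],       # row 0 unused so that constants can start at 1
    [-45.0, 30.0, -20.0],  # constant 1: right eye, farthest from the nose
    [-32.0, 36.0, -12.0],  # constant 2: right eye, highest part
])

RIGHT_EYE_OUTER = 1
x1, y1, z1 = face_3d[RIGHT_EYE_OUTER]

# Constants need not be numeric; a mapping works for labels such as "b".
face_3d_by_label = {"b": (0.0, 34.0, -2.0)}
```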

In addition, the deep learning-based 3D data generation apparatus may include some or all of the constants in one or more of preset groups 400. For example, the constants 1, 2, and 7, corresponding to the first part (1), the second part (2), and the seventh part (7), may be included in a preset "right eye" group 410. A constant need not belong to only one group and may belong to a plurality of groups. For example, the constant 7 may also be included in a "glabella" group 430, which covers the area between the eyebrows. Depending on need and purpose, some constants may not be assigned to any of the preset groups.

Subsequently, the deep learning-based 3D data generation apparatus may obtain 3D data of a specific region of the face from the 3D data of the face, based on the constants of one or more of the preset groups 400. For example, by referring to the constants 1, 2, 7, and so on included in the "right eye" group 410 and extracting the positions corresponding to those constants from the 3D data of the face, 3D data of the "right eye" region can be obtained. Likewise, to obtain 3D data of the "both eyes and glabella" region, the apparatus may refer to the constants included in the "right eye", "left eye", and "glabella" groups 410, 420, and 430 and extract the positions corresponding to those constants from the 3D data of the face.
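The group-based extraction can be sketched with plain dictionaries. The group memberships follow the examples in the description (right eye: 1, 2, 7; glabella: 5, 6, 7, b), while the coordinates, the dictionary layout, and the helper name `extract_region` are assumptions made for illustration.

```python
# Hypothetical face data: constant -> (x, y, z) relative to the reference coordinate O.
face_3d = {
    1: (-45.0, 30.0, -20.0),
    2: (-32.0, 36.0, -12.0),
    5: (-8.0, 38.0, -3.0),
    6: (8.0, 38.0, -3.0),
    7: (-15.0, 30.0, -8.0),
    "b": (0.0, 34.0, -2.0),
}

# Preset groups; constant 7 belongs to two groups at once.
groups = {
    "right_eye": [1, 2, 7],
    "glabella": [5, 6, 7, "b"],
}

def extract_region(face, group_table, group_names):
    """Collect the positions of every constant in the named groups, without duplicates."""
    selected = []
    for name in group_names:
        for constant in group_table[name]:
            if constant not in selected:
                selected.append(constant)
    return {constant: face[constant] for constant in selected}

right_eye_3d = extract_region(face_3d, groups, ["right_eye"])
eye_and_glabella_3d = extract_region(face_3d, groups, ["right_eye", "glabella"])
```

Because the lookup only follows group membership, adding a new region amounts to adding one more entry to `groups`.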

According to an embodiment, the deep learning-based 3D data generation apparatus may generate values corresponding to a specific part of the face by combining the output values obtained from the output nodes that correspond to that part. For example, from among the output values 391 of the output nodes of the output layer of the convolutional neural network, the apparatus may combine the output values of the output nodes corresponding to a part of the face (the predefined "part of the right eye located closest to the nose") to generate the values corresponding to that part. Each such value is a scalar: one of the x-axis, y-axis, or z-axis values, relative to the preset reference coordinate (O), of the position corresponding to the predefined part.

According to an embodiment, the deep learning-based 3D data generation apparatus may obtain, from the constants, a specific constant corresponding to the specific part. For example, the apparatus may obtain, from the constants (1, 2, ...), the constant 7 corresponding to a part of the face (the predefined "part of the right eye located closest to the nose").

According to an embodiment, the deep learning-based 3D data generation apparatus may determine, for each of the groups constituting the face, whether the specific constant is included in that group, and may thereby identify a first group and a second group that include the specific constant from among the groups, which constitute the face with their parts overlapping one another. In these groups, at least some of the constants are allowed to be duplicated and are included in different groups. For example, among the groups (410, 420, 430, ...) constituting the face, at least one (7) of the constants (1, 2, ...) is allowed to be duplicated and is included in different groups (410, 430), and the apparatus may determine whether the constant (7) is included in the groups (410, 430). The apparatus may identify the groups (410, 430) that include the constant (7) from among the groups (410, 420, 430, ...) that constitute the face with overlapping parts ("right eye", "glabella").

According to an embodiment, the deep learning-based 3D data generation apparatus may identify the order of the specific constant within the sequence of constants in the first group and include the specific values at the identified position within that group. For example, the apparatus may identify the constant (7) within the sequence of constants (1, 2, 7, ...) in the right eye group 410, combine the output values of the output nodes corresponding to the part of the face (the predefined "part of the right eye located closest to the nose") to generate the values corresponding to that part, and include those values at the identified position within the right eye group 410. Each such value is a scalar: one of the x-axis, y-axis, or z-axis values, relative to the preset reference coordinate (O), of the position corresponding to the predefined part.

According to an embodiment, the deep learning-based 3D data generation apparatus may identify the order of the specific constant within the sequence of constants in the second group and include the specific values at the identified position within the second group. For example, the apparatus may identify the constant (7) within the sequence of constants (5, 6, 7, b, c, d, ...) in the glabella group 430, combine the output values of the output nodes corresponding to the part of the face (the predefined "part of the right eye located closest to the nose") to generate the values corresponding to that part, and include those values at the identified position within the glabella group 430. Each such value is a scalar: one of the x-axis, y-axis, or z-axis values, relative to the preset reference coordinate (O), of the position corresponding to the predefined part.

According to an embodiment, the deep learning-based 3D data generation apparatus may refer, within the sequence formed by the union of the constants of the first group and the constants of the second group, to the order corresponding to a first constant included in the intersection of the two groups' constants, and may thereby generate 3D data corresponding to a first part included in the intersection of the predefined parts corresponding to the first group's constants and the predefined parts corresponding to the second group's constants. For example, in the sequence (1, 2, 7, ..., 5, 6, 7, b, c, d, ...) formed by the union of the constants (1, 2, 7, ...) of the right eye group 410 and the constants (5, 6, 7, b, c, d, ...) of the glabella group 430, the apparatus may refer to the order corresponding to the constant (7) included in the intersection (7) of the two groups' constants, and may generate 3D data corresponding to the "part of the right eye located closest to the nose", which is included in the intersection (7) of the parts corresponding to the constants of the right eye group 410 and the parts corresponding to the constants of the glabella group 430.
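The membership test, the intersection, and the per-group ordering described in the preceding embodiments can be sketched as follows. This is one possible reading of the text rather than the claimed procedure. The group sequences are taken from the examples above, while the helper names are hypothetical.

```python
# Group sequences from the examples in the description (elided members omitted).
groups = {
    "right_eye": [1, 2, 7],
    "glabella": [5, 6, 7, "b", "c", "d"],
}

def groups_containing(constant, group_table):
    """Names of every group whose sequence includes `constant`."""
    return [name for name, members in group_table.items() if constant in members]

def shared_constants(first, second):
    """Constants that appear in both sequences, in the order of the first sequence."""
    return [constant for constant in first if constant in second]

owners = groups_containing(7, groups)
shared = shared_constants(groups["right_eye"], groups["glabella"])

# Position of the shared constant inside each group's own sequence (0-based).
order_in_group = {name: groups[name].index(7) for name in owners}

# Concatenated sequence of both groups, as written out in the description.
union_sequence = groups["right_eye"] + groups["glabella"]
```

Knowing the position of constant 7 in each sequence is what lets the same set of coordinate values be written into both groups.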

As described above, the 3D data of the face generated by the deep learning-based 3D data generation apparatus includes positions respectively corresponding to predefined parts of the face. The predefined parts are represented by mutually different constants, some or all of the constants may be included in one or more of the preset groups 400, and 3D data of a specific region of the face can be obtained from the 3D data of the face based on the constants of one or more of the preset groups 400. Because of these features, the 3D data of the face can be used in various fields, as illustrated in FIG. 5. FIG. 5 is a diagram illustrating application examples of the 3D data of a face according to an embodiment.

Referring to FIG. 5, the deep learning-based 3D data generation apparatus may generate motion tracking data (510). In particular, the apparatus may generate motion tracking data for a specific region of the face. Because the apparatus can easily obtain the positions that make up the 3D data of a specific region of the face from the 3D data of the face by referring to the constants of the group corresponding to that region, it can easily generate motion tracking data for that region.

For example, by referring to the constants belonging to the "right eye" group 410, the "left eye" group 420, and the "glabella" group 430, the deep learning-based 3D data generation apparatus can easily obtain 3D data of the "both eyes and glabella" region of a user who is looking at himself or herself through, for example, the front camera of a smartphone, and can perform motion tracking on it. Based on this, an image of eyeglasses can, for example, be displayed in real time on the user's face shown on the user's screen. Because the apparatus obtains and controls the 3D data of the user's "both eyes and glabella" region, motion tracking can be natural even when the user turns his or her head: with real-time processing and no delay, it looks as though the user is turning the head while wearing the glasses.

In addition, the deep learning-based 3D data generation apparatus may generate, from the 3D data of the face, an STL (stereolithography) file that can be 3D printed (520). Furthermore, based on the 3D data of the face and the 3D data of a specific region of the face, the apparatus may change the positions corresponding to some or all of the constants of the group corresponding to that region to generate modified 3D data of the face, and may generate an STL file based on the modified 3D data. Because the apparatus can easily obtain the positions that make up the 3D data of a specific region of the face by referring to the constants of the corresponding group, and can therefore easily change those positions, it can easily generate an STL file in which a specific region of the face has been altered.

For example, to simulate the result of plastic surgery on the nose, the deep learning-based 3D data generation apparatus may refer to the constants of a "nose" group, extract the positions corresponding to those constants from the 3D data of the face, and change the specific positions to obtain 3D data of the "reshaped nose". Recombining this with the 3D data of the face excluding the nose region and then generating an STL file makes it possible to simulate the reshaped nose.
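The extract, modify, and recombine sequence can be sketched as below. The "nose" constants `n1` and `n2`, all coordinates, and the uniform 2-unit forward offset are invented for illustration. A real simulation would move each position individually, and the STL export is omitted because the description does not give the mesh connectivity.

```python
import numpy as np

# Hypothetical face data: constant -> position relative to the reference coordinate O.
face_3d = {
    "n1": np.array([0.0, 0.0, 10.0]),   # assumed nose-tip constant
    "n2": np.array([0.0, 8.0, 6.0]),    # assumed nose-bridge constant
    1: np.array([-45.0, 30.0, -20.0]),  # right eye, farthest from the nose
}
groups = {"nose": ["n1", "n2"]}

def reshape_region(face, member_constants, offset):
    """Return a copy of the face in which only the listed constants are moved by `offset`."""
    modified = dict(face)  # positions outside the region are carried over unchanged
    for constant in member_constants:
        modified[constant] = face[constant] + offset
    return modified

# Raise the whole nose region by 2 units along z (an arbitrary example edit).
modified_face = reshape_region(face_3d, groups["nose"], np.array([0.0, 0.0, 2.0]))
```

The original `face_3d` is left untouched, so the before and after shapes can be compared or exported separately.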

In addition, the deep learning-based 3D data generation apparatus may generate 3D authentication data for part or all of the user's face (530). For example, a security system may hold 3D data of all or part of a user's face for the purpose of identifying and authenticating the user. However, if the user's smartphone is not a high-end model, its hardware alone may not be able to generate 3D data of the user's face for comparison with the 3D data held by the security system. In this case, the deep learning-based 3D data generation apparatus can generate 3D data of the face from only a 2D image that includes the user's face. Moreover, because the apparatus can easily obtain the positions that make up the 3D data of a specific region of the face by referring to the constants of the group corresponding to that region, it can easily generate 3D authentication data consisting of a part of the face.

In this way, the deep learning-based 3D data generation apparatus can generate various kinds of application data from the 3D data of the face, and the application data are not limited to the examples described above. For example, based on the 3D position data of the predefined parts of the face and the 3D position data of specific regions of the face, the apparatus may measure exact sizes such as the distance between two predefined parts of the face and the length, height, and area of a specific region of the face, and may thereby generate face size data for the user.
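Size measurements of this kind reduce to simple vector arithmetic on the stored positions. The sketch below uses made-up coordinates and two hypothetical helpers: `distance` for the straight-line distance between two parts, and `extent` for the per-axis span of a region. Treating the axis-aligned span as the region's length and height is an assumption made for this sketch.

```python
import numpy as np

# Hypothetical positions (same units on every axis) relative to the reference coordinate O.
face_3d = {
    1: np.array([-45.0, 30.0, -20.0]),  # right eye, farthest from the nose
    2: np.array([-32.0, 36.0, -12.0]),  # right eye, highest part
    7: np.array([-15.0, 30.0, -8.0]),   # right eye, closest to the nose
}

def distance(face, a, b):
    """Euclidean distance between the parts with constants a and b."""
    return float(np.linalg.norm(face[a] - face[b]))

def extent(face, constants):
    """Per-axis span (x, y, z) of the parts in a region."""
    points = np.stack([face[c] for c in constants])
    return points.max(axis=0) - points.min(axis=0)

right_eye_length = distance(face_3d, 1, 7)
right_eye_extent = extent(face_3d, [1, 2, 7])
```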

Accordingly, the deep learning-based 3D data generation apparatus may also be used to custom-make products such as a mask pack fitted to the shape of the user's face. That is, if the apparatus measures, based on the 3D position data of the predefined parts of the user's face and the 3D data of specific regions of the face, the lengths of the user's right and left eyes, the distance between the eyes, the length and height of the nose, the area of the forehead, the width between the cheekbones, and so on, and generates face size data for the user, a customized mask pack can be produced for the user on that basis.

Meanwhile, the convolutional neural network 300 included in the deep learning-based 3D data generation apparatus may be trained to generate, for an input generated from a region of interest extracted from an image including a face, an output that serves as the basis for the 3D data of the face. This process may be as shown in FIG. 6. FIG. 6 is a diagram illustrating a process of training a convolutional neural network according to an embodiment.

According to an embodiment, training for deep learning-based 3D data generation may be performed by a learning device. The learning device is a device that trains the convolutional neural network that generates 3D data, and may be implemented as, for example, a software module, a hardware module, or a combination thereof.

From training data, which is an image including a face, the learning device may, as the deep learning-based 3D data generation device does, acquire the image including the face (610), extract from the acquired image a region of interest including at least a portion of the face (620), and generate, based on the region of interest, an input corresponding to the input layer of the convolutional neural network 300 (630).

The learning device may apply the input of the training data to the convolutional neural network 300 to obtain the output of the training data generated by the convolutional neural network (640), and may then optimize the convolutional neural network 300 by comparing the output of the training data with labeled data (650). This process may be as shown in FIG. 7. FIG. 7 is a block diagram illustrating a process in which a convolutional neural network is trained according to an embodiment.

Referring to FIG. 7, the learning device may apply an input 710 of the training data, generated from training data 700 that is an image including a person's face, to the convolutional neural network 300 to obtain an output 720 of the training data. Here, the convolutional neural network 300 may perform, once or a plurality of times, the steps of generating a feature map by convolving the data input through a convolution layer with a kernel, applying an activation function to the feature map, and pooling the feature map to which the activation function has been applied to generate a pooled feature map. Thereafter, the convolutional neural network 300 may generate the output 720 of the training data from the last pooled feature map through a fully connected layer.
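The sequence of convolution, activation, pooling, and fully connected layer can be sketched in plain Python as follows. This is an illustrative sketch only: the choice of ReLU as the activation function, 2x2 max pooling, the 5x5 input, and all kernel and weight values are assumptions made here, not details taken from this document. As is common in CNN implementations, the "convolution" is computed without flipping the kernel.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D convolution producing a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Activation function (ReLU is an assumption of this sketch)."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling (also an assumption of this sketch)."""
    return [
        [
            max(fmap[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def fully_connected(fmap, weights, bias):
    """Flatten the pooled feature map and apply one dense layer."""
    flat = [v for row in fmap for v in row]
    return [sum(w * x for w, x in zip(ws, flat)) + b
            for ws, b in zip(weights, bias)]


# Toy 5x5 input and hand-picked parameters, for illustration only.
image = [[float(5 * i + j) for j in range(5)] for i in range(5)]
kernel = [[1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.25, 0.25, 0.25, 0.25]]
bias = [0.0, 0.0, 0.0]

pooled = max_pool(relu(conv2d(image, kernel)))
output = fully_connected(pooled, weights, bias)
print(pooled, output)
```

In a trained network the kernel, weights, and bias would be learned rather than fixed by hand, and the convolution, activation, and pooling steps could be repeated several times before the fully connected layer.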

Subsequently, the learning device may optimize the convolutional neural network 300 by comparing the output 720 of the training data with the labeled data 730 (740).

Here, the labeled data 730 may be data obtained by processing the 3D data of the face included in the training data 700 into the output format of the convolutional neural network 300. For example, each output node of the convolutional neural network 300 may hold, as a scalar, one of the x-axis value, the y-axis value, or the z-axis value of a position corresponding to one of the predefined parts of the face, measured relative to a preset coordinate (O), and the labeled data 730 may be data obtained by processing the 3D data of the face included in the training data 700 so that it conforms to the format of these output nodes.
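One way to picture this output format is a flat vector in which every entry is a single axis value relative to the preset coordinate (O). The sketch below assumes an interleaved ordering (x, y, z of the first part, then x, y, z of the second part, and so on); the document does not state the ordering, and the function names and sample values are hypothetical.

```python
def to_label_vector(points, origin):
    """Flatten 3D positions into scalars relative to the preset coordinate O.

    The interleaved x, y, z ordering is an assumption of this sketch.
    """
    ox, oy, oz = origin
    vector = []
    for x, y, z in points:
        vector.extend((x - ox, y - oy, z - oz))
    return vector


def from_output_vector(vector):
    """Regroup a flat vector of scalars into (x, y, z) positions."""
    return [tuple(vector[i:i + 3]) for i in range(0, len(vector), 3)]


# Two hypothetical predefined parts and a hypothetical origin O.
points = [(10.0, 20.0, 30.0), (11.0, 22.0, 33.0)]
origin = (10.0, 20.0, 30.0)

label = to_label_vector(points, origin)
restored = from_output_vector(label)
print(label)
print(restored)
```

With N predefined parts, this layout gives 3N output nodes, and the same regrouping turns the network's output back into 3D positions.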

Specifically, the training data 700 may be an image including a person's face captured with the front camera of a smartphone such as an iPhone XR or iPhone XS, and the labeled data 730 may be data obtained by processing a 3D point cloud of the same person's face, generated using the TrueDepth camera technology of a smartphone such as an iPhone XR or iPhone XS, into the output format of the convolutional neural network 300.

Alternatively, the training data 700 may be an image including a person's face captured with an ordinary camera, and the labeled data 730 may be data obtained by processing an STL file of the same face as the face included in the image, generated using a 3D scanner in a 3D printing studio, into the output format of the convolutional neural network 300.
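As a rough illustration of the first step of such processing, the sketch below collects the unique vertices from an ASCII STL file, whose facets list their corners on lines of the form `vertex x y z`. It assumes the ASCII variant of STL (binary STL would need different parsing), and the sample mesh is made up. How vertices are then matched to the predefined parts of the face is not described in this document and is not attempted here.

```python
def stl_vertices(text):
    """Return the unique vertices of an ASCII STL document, in file order."""
    seen = set()
    vertices = []
    for line in text.splitlines():
        parts = line.split()
        # Vertex lines look like: "vertex 0.0 1.0 2.0"
        if len(parts) == 4 and parts[0] == "vertex":
            vertex = tuple(float(p) for p in parts[1:])
            if vertex not in seen:
                seen.add(vertex)
                vertices.append(vertex)
    return vertices


# A made-up two-triangle mesh, for illustration only.
SAMPLE_STL = """solid sample
facet normal 0 0 1
  outer loop
    vertex 0 0 0
    vertex 1 0 0
    vertex 0 1 0
  endloop
endfacet
facet normal 0 0 1
  outer loop
    vertex 1 0 0
    vertex 1 1 0
    vertex 0 1 0
  endloop
endfacet
endsolid sample
"""

vertices = stl_vertices(SAMPLE_STL)
print(vertices)
```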

However, the labeled data 730 is not limited to the above examples; any output that can serve as a basis for generating 3D data of the same face as the person's face included in the training data 700 may be used as the labeled data 730.

Referring again to FIG. 7, the learning device may optimize the convolutional neural network 300 by comparing the output 720 of the training data with the labeled data 730 (740). The comparison between the output 720 of the training data and the labeled data 730 may be made through a loss function. A known loss function such as mean squared error (MSE) or cross entropy error (CEE) may be used. However, the loss function is not limited thereto, and any loss function used in various convolutional neural network models may be used as long as it can measure the deviation or error between the output 720 of the training data and the labeled data 730.
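For the mean squared error named above, the comparison is the average of the squared differences between corresponding output and label scalars. A minimal sketch, with made-up vectors standing in for the output 720 and the labeled data 730:

```python
def mse(output, label):
    """Mean squared error between two equal-length vectors of scalars."""
    if len(output) != len(label):
        raise ValueError("output and label must have the same length")
    return sum((o - t) ** 2 for o, t in zip(output, label)) / len(label)


# Made-up values, for illustration only.
network_output = [1.0, 2.0, 3.0]
labeled_data = [1.0, 2.0, 5.0]

loss = mse(network_output, labeled_data)
print(loss)
```

A loss of zero means the output reproduces the labeled positions exactly; larger values mean larger positional error.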

Meanwhile, the convolutional neural network 300 may be optimized by repeating a process of estimating the minimum value of the loss function and resetting the weights of the convolutional neural network so that the loss function takes the estimated minimum value. A known backpropagation algorithm, stochastic gradient descent, or the like may be used to optimize the convolutional neural network 300. However, the optimization is not limited thereto, and weight optimization algorithms used in various convolutional neural network models may be used.
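The repeated weight update can be illustrated with stochastic gradient descent on a deliberately tiny model. The sketch below is a stand-in, not the network of this document: it fits a single linear unit y = w*x + b to synthetic data under a squared-error loss, so the gradient can be written by hand instead of being obtained by backpropagation through several layers. The learning rate, epoch count, and data are assumptions chosen for illustration.

```python
import random


def sgd_fit(samples, lr=0.05, epochs=200, seed=0):
    """Fit y = w*x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    data = list(samples)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)          # visit the samples in random order
        for x, y in data:
            err = (w * x + b) - y  # prediction error for this sample
            # Gradients of err**2 with respect to w and b.
            w -= lr * 2.0 * err * x
            b -= lr * 2.0 * err
    return w, b


# Synthetic, noise-free data generated from y = 2x + 1.
samples = [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, b = sgd_fit(samples)
print(w, b)
```

In a convolutional neural network the same update rule is applied to every kernel and fully connected weight, with backpropagation supplying the gradient of the loss with respect to each of them.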

Next, FIG. 8 is an exemplary diagram of the configuration of a device according to an embodiment.

Referring to FIG. 8, the device 800 includes a processor 810 and a memory 820. The processor 810 may include at least one of the devices described above with reference to FIGS. 1 to 7, or may perform at least one of the methods described above with reference to FIGS. 1 to 7. The memory 820 may store information related to the inputs of the convolution layers, information related to the kernels, and a program in which the deep learning-based 3D data generation method is implemented. The memory 820 may be a volatile memory or a nonvolatile memory.

The processor 810 may execute the program and control the device 800. The code of the program executed by the processor 810 may be stored in the memory 820. The device 800 may be connected to an external device (for example, a personal computer or a network) through an input/output device (not shown) and may exchange data with it.

According to an embodiment, the device 800 may be employed in a CNN accelerator, a neural processing unit (NPU), or a vision processing unit (VPU) that processes operations related to a convolutional neural network at high speed, and may control the corresponding dedicated processor. Depending on design intent, the device 800 may employ various hardware or be employed in various hardware, and is not limited to the embodiments of the illustrated components. When the above-described embodiments are applied to convolutional neural network processing, the number of data loads and the number of operations (for example, the number of multiply-accumulate (MAC) operations) required for that processing can be reduced, saving memory and increasing processing speed. The above-described embodiments may therefore be suitable for environments with limited resources or for embedded terminals.

The embodiments described above may be implemented as hardware components, software components, and/or a combination of hardware components and software components. For example, the devices, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications executed on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described as being used, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as a parallel processor, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may command the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine language code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiment, and vice versa.

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the relevant technical field may apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set out below.

Claims

A deep learning-based 3D data generation method, comprising:
obtaining an image including a face;
extracting a region of interest including at least a portion of the face from the image;
generating an input corresponding to an input layer of a convolutional neural network based on the region of interest;
applying the input to the convolutional neural network to obtain an output generated by the convolutional neural network; and
generating, based on the output, 3D data of the face including positions respectively corresponding to predefined parts of the face.

The deep learning-based 3D data generation method according to claim 1, wherein the predefined parts in the face are each represented by different constants.

The method of claim 2, further comprising:
including some or all of the constants in any one or more of preset groups; and
obtaining 3D data of a specific area of the face from the 3D data of the face based on the constants of one or more of the groups.

The method of claim 1, further comprising generating, based on 3D data of a specific area of the face, motion tracking data of the specific area of the face.

The method of claim 4, further comprising:
generating, based on the 3D data of the face and the 3D data of a specific area of the face, modified 3D data from the 3D data of the face by changing the positions corresponding to some or all of the constants of the group corresponding to the 3D data of the specific area of the face; and
generating an STL (stereolithography) file based on the modified 3D data.

The method of claim 4, further comprising generating, based on the 3D data of the face or the 3D data of a specific area of the face, 3D authentication data of all or part of the user's face.

A learning method for deep learning-based 3D data generation, comprising:
acquiring an image including a face as training data;
extracting, from the training data, a region of interest of the training data including at least a portion of the face;
generating an input of the training data corresponding to an input layer of a convolutional neural network based on the region of interest of the training data;
applying the input of the training data to the convolutional neural network to obtain an output of the training data generated by the convolutional neural network; and
optimizing the convolutional neural network by comparing the output of the training data with labeled data,
wherein the labeled data is data obtained by processing 3D data of the face included in the training data into an output format of the convolutional neural network.

A computer program stored in a medium to execute the method of any one of claims 1 to 7.

A deep learning-based 3D data generation device, comprising a processor configured to:
acquire an image including a face;
extract a region of interest including at least a portion of the face from the image;
generate an input corresponding to an input layer of a convolutional neural network based on the region of interest;
apply the input to the convolutional neural network to obtain an output generated by the convolutional neural network;
generate, based on the output, 3D data of the face including positions respectively corresponding to predefined parts of the face;
express each of the predefined parts of the face with a different constant;
include some or all of the constants in any one or more of preset groups; and
obtain 3D data of a specific area of the face from the 3D data of the face based on the constants of one or more of the groups.