KR102334666B1

KR102334666B1 - A method for creating a face image

Info

Publication number: KR102334666B1
Application number: KR1020210064840A
Authority: KR
Inventors: 최광민; 김다운
Original assignee: 알레시오 주식회사
Priority date: 2021-05-20
Filing date: 2021-05-20
Publication date: 2021-12-07

Abstract

In one embodiment of the present disclosure, a method of creating a face image performed by one or more processors of a computing device is disclosed. The method may include the steps of: receiving, by the processor, first image data and second image data; extracting, by the processor, common characteristic information corresponding to a plurality of first reference images constituting the first image data; deriving, by the processor, one or more similar images based on the extracted common characteristic information; and adjusting, by the processor, the one or more similar images based on a plurality of second reference images constituting the second image data, thereby creating a target prediction image.

Description

{A METHOD FOR CREATING A FACE IMAGE}

The present disclosure relates to an image generation method and, more particularly, to a method for generating a face image similar to a target predicted face by using an artificial neural network.

Amid the recent paradigm shift of the Fourth Industrial Revolution, artificial intelligence is drawing attention as a key driver of a new era. Artificial intelligence is a software technology that implements the whole of the human thinking process, including cognition, learning, reasoning, and judgment, through algorithm design. It has the character of a general-purpose technology that dramatically improves productivity across all industries rather than being limited to a specific one.

Meanwhile, since deep learning was proposed, image generation technology based on artificial intelligence has advanced considerably. In particular, the generative adversarial network (GAN) provided a learning method that can reproduce fine detail as well as the general structural characteristics of data, making it possible to generate data that appears to have been created by a human. A GAN aims to find the Nash equilibrium between a generator and a discriminator, and it greatly improved on the performance of earlier models in estimating data distributions.

Image generation technology using such artificial intelligence aims to generate an image that matches the user's intention. However, because each user may want a different way of expressing an image, or a different target image to be output from a specific image (e.g., a reference image), it may not be easy to train an artificial neural network using target images that match each user's intention or the training images on which they are based. That is, there is a limit to generating and providing a suitable image for an arbitrary intention.

As a specific example, when an image related to the future face of a fetus is to be generated from information about an ultrasound image of the fetus, generating the target image (the image of the future face) may be difficult because of the inconsistent imaging angle and the noise of the ultrasound image on which generation is based. That is, an ultrasound image used as a reference image contains only morphological information about the fetus, which makes it difficult to recognize the natural face shape of the fetus, and distortion in the ultrasound image creates a risk that the neural network is trained to output a predicted face shape that differs from the real one.

Accordingly, there may be demand in the industry for a computing device that generates and provides a target image with high prediction accuracy even from varied reference images.

Korean Laid-Open Patent Publication No. 10-2021-0047920

An object of the present disclosure is to solve the problems described above by providing a computing device that generates a face image similar to a target predicted face by using an artificial neural network.

The problems to be solved by the present disclosure are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.

To solve the problems described above, a method for generating a face image performed by one or more processors of a computing device according to various embodiments of the present disclosure is disclosed. The method may include: obtaining, by the processor, first image data and second image data; extracting, by the processor, common characteristic information corresponding to a plurality of first reference images constituting the first image data; deriving, by the processor, one or more similar images based on the extracted common characteristic information; and generating, by the processor, a target prediction image by adjusting the one or more similar images based on a plurality of second reference images constituting the second image data.
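
The four steps of the method can be pictured as a simple chain of stages. The following Python sketch is illustrative only and is not the implementation of the present disclosure: the function name `create_target_prediction`, the three stage callables, and the representation of images and characteristic information as lists of floats are all assumptions introduced here for explanation.

```python
from typing import Callable, List, Sequence

Image = List[float]      # hypothetical stand-in for image data
Features = List[float]   # hypothetical stand-in for characteristic information


def create_target_prediction(
    first_refs: Sequence[Image],
    second_refs: Sequence[Image],
    extract_common: Callable[[Sequence[Image]], Features],
    derive_similar: Callable[[Features], List[Image]],
    adjust: Callable[[List[Image], Sequence[Image]], Image],
) -> Image:
    """Chain the steps of the method; every stage is supplied by the caller."""
    if not first_refs or not second_refs:
        raise ValueError("both sets of reference images must be non-empty")
    # Extract common characteristic information from the first reference images.
    common = extract_common(first_refs)
    # Derive one or more similar images from that common information.
    similar = derive_similar(common)
    # Adjust the similar images using the second reference images.
    return adjust(similar, second_refs)
```

Passing the stages in as callables keeps the sketch independent of any particular model: a trained feature extractor, a database search, and an image generation model could each be substituted without changing the chain.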

According to an alternative embodiment, the first image data and the second image data are configured to include domain information different from each other, and the common characteristic information may include information about face landmarks.

According to an alternative embodiment, deriving the one or more similar images may include: obtaining, by the processor, one or more pieces of characteristic information corresponding to each first reference image by processing each of the plurality of first reference images as an input to a feature extraction model; obtaining, by the processor, the common characteristic information based on the one or more pieces of characteristic information; and deriving, by the processor, the one or more similar images based on the common characteristic information.
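
As a rough illustration of obtaining common characteristic information from per-image characteristic information, the sketch below runs an extractor over every first reference image and averages the results element-wise. This passage does not state how the per-image information is combined, so the element-wise mean is purely an assumption, as are the names `common_characteristics` and `extract` and the use of float lists as feature vectors.

```python
from typing import Callable, List, Sequence

Features = List[float]


def common_characteristics(
    images: Sequence[object],
    extract: Callable[[object], Features],
) -> Features:
    """Run the extractor on every reference image and average the results."""
    per_image = [extract(image) for image in images]
    if not per_image:
        raise ValueError("at least one reference image is required")
    length = len(per_image[0])
    if any(len(features) != length for features in per_image):
        raise ValueError("all feature vectors must have the same length")
    # Element-wise mean: an assumed aggregation rule, chosen only for illustration.
    return [sum(column) / len(per_image) for column in zip(*per_image)]
```

For instance, if `extract` returned face landmark coordinates, the averaged vector would describe landmark positions that are stable across the reference images, which is one plausible reading of "common" here.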

According to an alternative embodiment, the feature extraction model is configured through at least a part of a trained autoencoder, and may output the one or more pieces of common characteristic information and one or more pieces of domain information corresponding to each of the plurality of first reference images.

According to an alternative embodiment, deriving, by the processor, the one or more similar images based on the common characteristic information may include searching, by the processor, a target image database based on the common characteristic information to derive the one or more similar images whose similarity is at or above a predetermined level, wherein the target image database may be configured to correspond to each of a plurality of domains.
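
A thresholded search over a per-domain database could look like the following sketch. Cosine similarity is used only as an example metric, since this passage does not name one; the dictionary layout of the database, the function names, and the domain label and threshold used in the usage note are likewise hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Entry = Tuple[str, List[float]]  # (image id, feature vector), a hypothetical layout


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar_images(
    common: Sequence[float],
    database: Dict[str, List[Entry]],
    domain: str,
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (image id, similarity) pairs at or above the threshold, best first."""
    scored = [
        (image_id, cosine_similarity(common, features))
        for image_id, features in database.get(domain, [])
    ]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Keying the database by domain mirrors the statement that the target image database is configured per domain, so a query such as `find_similar_images(common, database, "photo", 0.7)` only compares against entries of the requested domain.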

According to an alternative embodiment, the method may include: obtaining, by the processor, one or more pieces of characteristic information corresponding to each image by processing each of a plurality of first images constituting the first image data as an input to a feature extraction model; calculating, by the processor, a probability value corresponding to each of a plurality of items related to the components included in a face image based on the one or more pieces of characteristic information; and selecting, by the processor, some of the plurality of first images as the plurality of first reference images based on the respective probability values.
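
One way to picture the selection step is to keep only those first images in which every face-component item is detected with sufficient probability. The rule below (minimum probability at or above a threshold) is an assumption made for illustration; this passage does not state the selection criterion, and the item names and image identifiers are hypothetical.

```python
from typing import Dict, List


def select_reference_images(
    probabilities: Dict[str, Dict[str, float]],
    threshold: float = 0.5,
) -> List[str]:
    """Keep images whose lowest per-item probability meets the threshold.

    `probabilities` maps an image id to per-item probabilities, for example
    {"frame_1": {"eyes": 0.9, "nose": 0.8, "mouth": 0.7}}.
    """
    selected = []
    for image_id, items in probabilities.items():
        # An image with no scored items carries no evidence, so it is skipped.
        if items and min(items.values()) >= threshold:
            selected.append(image_id)
    return selected
```

Under this assumed rule, an ultrasound frame in which the mouth is occluded would receive a low probability for that item and be excluded from the first reference images.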

According to an alternative embodiment, generating, by the processor, the target prediction image by adjusting the one or more similar images based on the plurality of second reference images constituting the second image data may include: obtaining, by the processor, one or more pieces of additional characteristic information by processing the one or more second reference images as an input to a feature extraction model; obtaining, by the processor, additional common characteristic information based on the one or more pieces of additional characteristic information; and generating, by the processor, the target prediction image based on the additional common characteristic information and the common characteristic information.

According to an alternative embodiment, generating the target prediction image may include: obtaining, by the processor, final characteristic information based on the additional common characteristic information and the common characteristic information; and generating, by the processor, the target prediction image by processing the final characteristic information as an input to an image generation model, wherein the image generation model may be configured through at least a part of a trained autoencoder.
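
The combination of the two kinds of common characteristic information into final characteristic information, followed by image generation, might be sketched as below. The linear blend and its `weight` parameter are assumptions, because this passage does not specify how the two are combined, and `decoder` is a hypothetical placeholder for the image generation model (for example, the decoder half of a trained autoencoder).

```python
from typing import Callable, List

Features = List[float]


def final_characteristics(
    common: Features,
    additional: Features,
    weight: float = 0.5,
) -> Features:
    """Blend two feature vectors; `weight` is the share given to `additional`."""
    if len(common) != len(additional):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [(1.0 - weight) * c + weight * a for c, a in zip(common, additional)]


def generate_target_image(
    common: Features,
    additional: Features,
    decoder: Callable[[Features], List[float]],
    weight: float = 0.5,
) -> List[float]:
    """Feed the blended features to the (placeholder) image generation model."""
    return decoder(final_characteristics(common, additional, weight))
```

With `weight=0.0` the output depends only on the common characteristic information of the first reference images, and with `weight=1.0` only on the additional common characteristic information, which makes the influence of the second reference images easy to inspect in this toy setting.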

According to another embodiment of the present disclosure, a computing device that performs the face image generation method is disclosed. The computing device includes a memory that stores one or more instructions and a processor that executes the one or more instructions stored in the memory, and the processor may perform the face image generation method described above by executing the one or more instructions.

According to yet another embodiment of the present disclosure, a computer program stored in a computer-readable recording medium is disclosed. The computer program may be combined with a computer, which is hardware, to perform the face image generation method described above.

According to various embodiments of the present disclosure, a computing device that generates a face image similar to a target predicted face by using an artificial neural network can be provided.

The effects of the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

Various aspects are now described with reference to the drawings, in which like reference numerals are used to refer to like elements throughout. In the following embodiments, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of one or more aspects. It will be evident, however, that such aspect(s) may be practiced without these specific details.
FIG. 1 is a conceptual diagram illustrating a system in which various aspects of a computing device for providing a face image generation method according to an embodiment of the present disclosure may be implemented.
FIG. 2 is a block diagram of a computing device for providing a face image generation method according to an embodiment of the present disclosure.
FIG. 3 is a diagram showing, by way of example, an overall schematic of a face image generation method according to an embodiment of the present disclosure.
FIG. 4 is an exemplary diagram illustrating a process of outputting common characteristic information corresponding to a plurality of first reference images according to an embodiment of the present disclosure.
FIG. 5 is an exemplary diagram illustrating a process of generating a target prediction image based on a second reference image according to an embodiment of the present disclosure.
FIG. 6 is a flowchart illustrating, by way of example, a method for generating a face image according to an embodiment of the present disclosure.
FIG. 7 is a schematic diagram illustrating one or more network functions according to an embodiment of the present disclosure.

Various embodiments are now described with reference to the drawings. In this specification, various descriptions are presented to provide an understanding of the present disclosure. However, it is apparent that these embodiments may be practiced without these specific descriptions.

The terms "component," "module," "system," and the like, as used herein, refer to a computer-related entity, hardware, firmware, software, a combination of software and hardware, or software in execution. For example, a component may be, but is not limited to, a procedure running on a processor, a processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a processor and/or a thread of execution. A component may be localized within one computer, or may be distributed between two or more computers. In addition, these components may execute from various computer-readable media having various data structures stored therein. Components may communicate by way of local and/or remote processes, for example in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or data transmitted by way of the signal to other systems across a network such as the Internet).

In addition, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or." That is, unless otherwise specified or clear from context, "X employs A or B" is intended to mean any of the natural inclusive permutations. That is, if X employs A, X employs B, or X employs both A and B, then "X employs A or B" applies in any of these cases. It should also be understood that the term "and/or" as used herein refers to and includes all possible combinations of one or more of the listed related items.

Also, the terms "comprises" and/or "comprising" should be understood to mean that the feature and/or element in question is present. However, these terms should be understood not to exclude the presence or addition of one or more other features, elements, and/or groups thereof. Also, unless otherwise specified or clear from context to refer to a singular form, the singular in this specification and the claims should generally be construed to mean "one or more."

Those skilled in the art will further appreciate that the various illustrative logical blocks, configurations, modules, circuits, means, logics, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, configurations, means, logics, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application. However, such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.

The description of the presented embodiments is provided to enable a person of ordinary skill in the art of the present disclosure to use or practice the present disclosure. Various modifications to these embodiments will be apparent to those of ordinary skill in the art. The generic principles defined herein may be applied to other embodiments without departing from the scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments presented herein, but is to be interpreted in the widest scope consistent with the principles and novel features presented herein.

In this specification, a computer means any kind of hardware device including at least one processor, and depending on the embodiment may be understood to also encompass the software configuration operating on that hardware device. For example, a computer may be understood to include, but is not limited to, smartphones, tablet PCs, desktops, notebooks, and the user clients and applications running on each of these devices.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.

Each step described in this specification is described as being performed by a computer, but the subject of each step is not limited thereto, and depending on the embodiment at least some of the steps may be performed by different devices.

FIG. 1 is a conceptual diagram illustrating a system in which various aspects of a computing device for providing a face image generation method according to an embodiment of the present disclosure may be implemented.

A system according to embodiments of the present disclosure may include a computing device 100, a user terminal 10, an external server 20, and a network. The components illustrated in FIG. 1 are exemplary; additional components may be present, or some of the components illustrated in FIG. 1 may be omitted. The computing device 100, the user terminal 10, and the external server 20 according to embodiments of the present disclosure may transmit and receive data for the system to and from one another through the network.

The network according to embodiments of the present disclosure may use various wired communication systems such as the Public Switched Telephone Network (PSTN), x Digital Subscriber Line (xDSL), Rate Adaptive DSL (RADSL), Multi Rate DSL (MDSL), Very High Speed DSL (VDSL), Universal Asymmetric DSL (UADSL), High Bit Rate DSL (HDSL), and a local area network (LAN).

In addition, the network presented herein may use various wireless communication systems such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single Carrier FDMA (SC-FDMA), and other systems.

The network according to embodiments of the present disclosure may be configured regardless of its communication mode, whether wired or wireless, and may be composed of various communication networks such as a personal area network (PAN) and a wide area network (WAN). In addition, the network may be the well-known World Wide Web (WWW), and may use a wireless transmission technology used for short-range communication, such as Infrared Data Association (IrDA) or Bluetooth. The techniques described herein may be used in the networks mentioned above as well as in other networks.

According to an embodiment of the present disclosure, the user terminal 10 may be a terminal associated with a user who accesses the computing device 100 to obtain a target prediction image based on a reference image. The user terminal 10 may transmit a reference image captured by or stored in the user terminal 10 to the computing device 100. For example, the reference image may be an image related to an ultrasound image of a fetus, and the target prediction image may be a live-action image corresponding to the ultrasound image of the fetus. Here, a live-action image may mean an image rendered realistically by restoring the skin color and skin texture of the fetus and the size of each of its eyes, nose, and mouth. As another example, the reference image may be an actual photograph of the user, and the target prediction image may be an illustration rendered from that photograph. The specific descriptions of the reference image and the target prediction image above are only examples, and the present disclosure is not limited thereto.

The user terminal 10 may be, for example, a terminal associated with an examiner (e.g., a medical specialist) who provides a live-action image of a fetus based on a reference image related to an ultrasound image. When the user terminal 10 is associated with such an examiner, the live-action image received from the computing device 100 may be used as auxiliary medical information for assessing the health of the fetus. The user terminal 10 has a display, so it can receive the user's input and provide any form of output to the user.

The user of the user terminal 10 may be a medical professional such as a doctor, nurse, clinical pathologist, or medical imaging specialist, or may be a technician who repairs medical devices, but is not limited thereto. For example, the user may also be an expectant mother who wishes to obtain a live-action image of her fetus using the system according to the disclosed embodiment.

The user terminal 10 may refer to any type of entity or entities in the system having a mechanism for communicating with the computing device 100. For example, the user terminal 10 may include a personal computer (PC), a notebook, a mobile terminal, a smartphone, a tablet PC, a wearable device, and the like, and may include any type of terminal capable of accessing a wired or wireless network. In addition, the user terminal 10 may include any server implemented by at least one of an agent, an application programming interface (API), and a plug-in. The user terminal 10 may also include an application source and/or a client application.

According to an embodiment of the present disclosure, the external server 20 may be a server that stores reference images related to fetal ultrasound images, the actual image related to each reference image, and the like. For example, the external server 20 may be at least one of a hospital server and a government server, and may store information about reference images related to a plurality of fetal ultrasound images and the actual images related to each reference image. The information stored in the external server 20 may be used as training data, validation data, and test data for training the neural network of the present disclosure. That is, the external server 20 may be a server that stores information about a data set for training the neural network model of the present disclosure.

The computing device 100 of the present disclosure may build a training data set based on a plurality of reference images obtained from the external server 20 and the actual images corresponding to each reference image, and may generate the feature extraction model and the image generation model of the present disclosure by training a neural network model including one or more network functions on the training data set.

The external server 20 may be a digital device equipped with a processor and a memory and having computing capability, such as a laptop computer, a notebook computer, a desktop computer, a web pad, or a mobile phone. The external server 20 may be a web server that processes a service. The above-described types of servers are merely examples, and the present disclosure is not limited thereto.

According to an embodiment of the present disclosure, the computing device 100 may generate a target prediction image based on a reference image received from the user terminal 10. Here, the reference image is the image on which generation of the target prediction image is based, and may include, for example, an image related to an ultrasound image of a fetus or an actual photographed image of a user. The target prediction image is the target image to be generated in accordance with the user's intention, and may include, for example, a photorealistic image corresponding to an ultrasound image of a fetus, or an illustrated image corresponding to an actual photographed image. That is, the computing device 100 may generate, based on a reference image, a target prediction image corresponding to that reference image. The target prediction image may be generated in order to produce an image having a specific mode of expression based on the reference image. For example, it may be generated in order to predict the actual face of a fetus based on an image related to a fetal ultrasound image, which is difficult to interpret visually. In an embodiment, the reference image may include a first reference image on which derivation of a similar image is based and a second reference image on which adjustment of the similar image is based. That is, the computing device 100 may generate the target prediction image, which is the final output of the present disclosure, from first image data including a plurality of first reference images and second image data including a plurality of second reference images.

More specifically, the computing device 100 may receive the first image data and the second image data from the user terminal 10. The first image data may be image data composed of a plurality of first reference images, and the second image data may be image data composed of a plurality of second reference images. In other words, the first image data may be a video that includes the plurality of first reference images as frames, and the second image data may be a video that includes the plurality of second reference images as frames. As a specific example, the plurality of first reference images may be images related to an ultrasound video of a fetus, and the plurality of second reference images may be images related to an actual video of a parent. The detailed description of each reference image above is only an example, and the present disclosure is not limited thereto.

In the present disclosure, as described above, the first reference image may be used as an image for deriving a similar image, and the second reference image may be used as an image for adjusting the derived similar image. That is, the computing device 100 of the present disclosure may perform a process of deriving one or more similar images based on the plurality of first reference images, and a process of generating the target prediction image by adjusting the derived one or more similar images based on the plurality of second reference images. The computing device 100 may derive, using the first reference images, one or more similar images having at least a preset degree of similarity, and may adjust the derived similar images based on the second reference images, thereby generating a target prediction image that matches the user's intention.
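The two processes described above (derivation of similar images, then adjustment) can be illustrated with the following minimal sketch. It is not the disclosed implementation: it assumes that each image has already been reduced to a feature vector, uses cosine similarity against the mean first-reference feature as the similarity measure, and uses a simple linear blend as a stand-in for the adjustment performed by the image generation model. The function names, the threshold of 0.8, and the blend weight are all hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def derive_similar_images(first_features, candidate_features, threshold=0.8):
    """Process 1 (assumed form): return indices of candidates whose similarity
    to the mean of the first-reference features is at least `threshold`."""
    query = np.mean(first_features, axis=0)
    return [
        i
        for i, candidate in enumerate(candidate_features)
        if cosine_similarity(query, candidate) >= threshold
    ]


def adjust_similar_image(similar_feature, second_features, weight=0.5):
    """Process 2 (assumed form): pull a derived similar image toward the mean
    of the second-reference features. A linear blend is used here only as a
    placeholder for a learned adjustment."""
    target = np.mean(second_features, axis=0)
    return (1.0 - weight) * similar_feature + weight * target


# Toy feature vectors; real features would come from a feature extraction model.
first = np.array([[1.0, 0.0], [1.0, 0.0]])                # e.g. ultrasound frames
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]])
second = np.array([[0.0, 1.0], [0.0, 1.0]])               # e.g. parent frames

similar_idx = derive_similar_images(first, candidates)
adjusted = adjust_similar_image(candidates[similar_idx[0]], second)
```

In this sketch the preset similarity threshold controls how many similar images survive the first process, and the second process is applied to each surviving image independently.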

As described above, the computing device 100 of the present disclosure may improve the accuracy of the output target prediction image by performing two processes on different reference images (that is, the first reference image and the second reference image): derivation of a similar image and adjustment of the derived similar image. This may have the effect of generating a highly accurate target prediction image even when the reference images come from an unclear video, a video with an inconsistent shooting angle, or a video containing a large amount of noise.

In an embodiment, the computing device 100 may be a terminal or a server, and may include any type of device. The computing device 100 may be a digital device equipped with a processor and a memory and having computing capability, such as a laptop computer, a notebook computer, a desktop computer, a web pad, or a mobile phone. The computing device 100 may be a web server that processes a service. The types of computing devices described above are merely examples, and the present disclosure is not limited thereto.

According to an embodiment of the present disclosure, the computing device 100 may be a server that provides a cloud computing service. More specifically, the computing device 100 may be a server that provides a cloud computing service, a type of Internet-based computing in which information is processed not by the user's computer but by another computer connected to the Internet. The cloud computing service may be a service that stores data on the Internet and allows the user to use the necessary data or programs anytime and anywhere through Internet access, without installing them on his or her own computer, and data stored on the Internet can be easily shared and delivered with simple operations and clicks. In addition, the cloud computing service may be a service that not only stores data on a server on the Internet, but also allows a user to perform desired tasks using the functions of application programs provided on the web without installing a separate program, and allows multiple people to work on a document while sharing it at the same time. In addition, the cloud computing service may be implemented in the form of at least one of Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), a virtual machine-based cloud server, and a container-based cloud server. That is, the computing device 100 of the present disclosure may be implemented in the form of at least one of the above-described cloud computing services. The detailed description of the cloud computing service above is merely an example, and the present disclosure may include any platform for building a cloud computing environment.

The method and process of training the neural network in the present disclosure, the method of deriving a similar image based on a reference image, the method of adjusting the similar image based on a reference image, and the method of generating a target prediction image corresponding to a reference image will be described in detail below with reference to FIG. 2.

FIG. 2 is a block diagram of a computing device for providing a method for generating a face image according to an embodiment of the present disclosure.

As shown in FIG. 2, the computing device 100 may include a network unit 110, a memory 120, and a processor 130. The components included in the computing device 100 described above are exemplary, and the scope of the present disclosure is not limited to them. That is, additional components may be included, or some of the above-described components may be omitted, depending on how the embodiments of the present disclosure are implemented.

According to an embodiment of the present disclosure, the computing device 100 may include the network unit 110, which transmits and receives data to and from the user terminal 10 and the external server 20. The network unit 110 may exchange, with other computing devices, servers, and the like, the data for performing the method for generating a face image according to an embodiment of the present disclosure and the training data set for training the neural network model. That is, the network unit 110 may provide a communication function between the computing device 100 and the user terminal 10 and the external server 20. For example, the network unit 110 may receive reference image data from the user terminal 10. As another example, the network unit 110 may receive, from the external server 20, a training data set for training the feature extraction model or the image generation model of the present disclosure. Additionally, the network unit 110 may allow information to be transferred between the computing device 100 and the user terminal 10 and the external server 20 by way of procedure calls to the computing device 100.

The network unit 110 according to an embodiment of the present disclosure may use various wired communication systems such as the Public Switched Telephone Network (PSTN), x Digital Subscriber Line (xDSL), Rate Adaptive DSL (RADSL), Multi Rate DSL (MDSL), Very High Speed DSL (VDSL), Universal Asymmetric DSL (UADSL), High Bit Rate DSL (HDSL), and a local area network (LAN).

In addition, the network unit 110 presented herein may use various wireless communication systems such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single Carrier-FDMA (SC-FDMA), and other systems.

In the present disclosure, the network unit 110 may be configured regardless of its communication mode, such as wired or wireless, and may be composed of various communication networks such as a personal area network (PAN) and a wide area network (WAN). In addition, the network may be the well-known World Wide Web (WWW), and may use a wireless transmission technology used for short-range communication, such as Infrared Data Association (IrDA) or Bluetooth. The techniques described herein may be used not only in the networks mentioned above but also in other networks.

According to an embodiment of the present disclosure, the memory 120 may store a computer program for performing the method for generating a face image according to an embodiment of the present disclosure, and the stored computer program may be read and executed by the processor 130. In addition, the memory 120 may store any type of information generated or determined by the processor 130 and any type of information received by the network unit 110. The memory 120 may also store information about similar images corresponding to the reference images, or photorealistic images corresponding to the reference images. For example, the memory 120 may temporarily or permanently store input/output data (for example, the first image data, the second image data, one or more similar images, the target prediction image, and photorealistic images).

According to an embodiment of the present disclosure, the memory 120 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, and an optical disk. The computing device 100 may also operate in connection with web storage that performs the storage function of the memory 120 over the Internet. The description of the memory above is only an example, and the present disclosure is not limited thereto.

According to an embodiment of the present disclosure, the processor 130 may consist of one or more cores, and may include a processor for data analysis and deep learning, such as a central processing unit (CPU), a general purpose graphics processing unit (GPGPU), or a tensor processing unit (TPU) of the computing device.

The processor 130 may read a computer program stored in the memory 120 and perform data processing for deep learning according to an embodiment of the present disclosure. According to an embodiment of the present disclosure, the processor 130 may perform operations for training a neural network. The processor 130 may perform the calculations needed to train a neural network in deep learning (DL), such as processing input data for training, extracting features from the input data, calculating an error, and updating the weights of the neural network using backpropagation.
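The training calculations listed above (forward pass, error calculation, gradient-based weight update) can be illustrated with a deliberately small sketch. It assumes a single linear layer trained with mean squared error and plain gradient descent; the disclosure does not specify the network architecture, loss, or optimizer, so the function `train_linear` and all of its parameters are hypothetical.

```python
import numpy as np


def train_linear(x, y, lr=0.1, epochs=200):
    """Train y ~ x @ w + b with mean squared error and gradient descent.
    For a single layer, backpropagation reduces to the two gradients below."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=(x.shape[1], 1))   # weights to be learned
    b = np.zeros((1,))                     # bias to be learned
    losses = []
    for _ in range(epochs):
        pred = x @ w + b                   # forward pass
        err = pred - y                     # error calculation
        losses.append(float(np.mean(err ** 2)))
        grad_w = 2.0 * x.T @ err / len(x)  # dLoss/dw
        grad_b = 2.0 * np.mean(err, axis=0)  # dLoss/db
        w -= lr * grad_w                   # weight update
        b -= lr * grad_b
    return w, b, losses


# Toy data following y = 2x + 1.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * x + 1.0
w, b, losses = train_linear(x, y)
```

A multi-layer network would apply the same pattern, with the chain rule propagating the error back through each layer.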

In addition, at least one of the CPU, GPGPU, and TPU of the processor 130 may process the training of a network function. For example, the CPU and the GPGPU may together process the training of a network function and data classification using the network function. Also, in an embodiment of the present disclosure, the processors of a plurality of computing devices may be used together to process the training of a network function and data classification using the network function. In addition, the computer program executed on the computing device according to an embodiment of the present disclosure may be a CPU-, GPGPU-, or TPU-executable program.

In the present specification, the term network function may be used interchangeably with artificial neural network and neural network. A network function may include one or more neural networks, in which case the output of the network function may be an ensemble of the outputs of the one or more neural networks.
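As one illustrative, non-limiting reading of "ensemble", the outputs of several networks may simply be averaged. The disclosure does not state which ensemble rule is used, so the averaging below is an assumption, and `ensemble_output` is a hypothetical helper in which each network is represented by a plain Python callable.

```python
import numpy as np


def ensemble_output(networks, x):
    """Average the outputs of several networks evaluated on the same input.
    Averaging is an assumed ensemble rule; others (voting, weighting) exist."""
    outputs = [net(x) for net in networks]
    return np.mean(outputs, axis=0)


# Two stand-in "networks" that scale their input.
nets = [lambda v: v * 2.0, lambda v: v * 4.0]
combined = ensemble_output(nets, np.array([1.0, 2.0]))
```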

The processor 130 may read a computer program stored in the memory 120 and provide the feature extraction model and the image generation model according to an embodiment of the present disclosure. According to an embodiment of the present disclosure, the processor 130 may generate a target prediction image based on a reference image. According to an embodiment of the present disclosure, the processor 130 may perform calculations for training a classification model.

According to an embodiment of the present disclosure, the processor 130 may typically handle the overall operation of the computing device 100. The processor 130 may provide or process appropriate information or functions for the user or the user terminal by processing signals, data, and information input or output through the components described above, or by running an application program stored in the memory 120.

According to an embodiment of the present disclosure, the processor 130 may acquire image data including a plurality of reference images. Acquiring image data according to an embodiment of the present disclosure may mean receiving or loading image data stored in the memory 120. Acquiring image data may also mean receiving or loading data from another storage medium, another computing device, or a separate processing module within the same computing device, by way of a wired/wireless communication means.

According to an embodiment of the present disclosure, the processor 130 may acquire first image data and second image data. Here, image data refers to video data, that is, a combination of individual still images. Each piece of image data may consist of a plurality of images as its frames.

In the present disclosure, the first image data may be image data including a plurality of first reference images on which derivation of one or more similar images is based, and the second image data may be image data including a plurality of second reference images on which adjustment of the derived similar images is based.

That is, as shown in FIG. 3, the processor 130 may derive one or more similar images based on the plurality of first reference images constituting the first image data, and may adjust the derived similar images based on the plurality of second reference images constituting the second image data, thereby generating a target prediction image that matches the user's intention.

According to an embodiment of the present disclosure, image data may include domain information. Here, domain information may be information that categorizes a video according to the various ways in which it is composed. That is, the domain information of image data may include information that serves as a criterion for distinguishing one group of image data from another. As a specific example, first image data may include first domain information indicating that the first images constituting it relate to ultrasound images of a fetus. As another example, second image data may include second domain information indicating that the second images constituting it relate to the parents of the fetus. As yet another example, third image data may include third domain information indicating that the third images constituting it relate to a side view of a face. As a further example, fourth image data may include fourth domain information indicating that the fourth images constituting it relate to a frontal view of a face. The domain information described above for each piece of image data is merely an example to aid understanding of the present disclosure, and the present disclosure is not limited thereto.

According to an embodiment, the first image data and the second image data may be configured to include different domain information. As a specific example, when the first image data includes first domain information indicating that it relates to ultrasound images of a fetus, the second image data may include second domain information indicating that it relates to actual images of the parents of the fetus. That is, the processor 130 may acquire two pieces of image data having different domain information, use the images included in the first image data in the process of searching for one or more similar images, and use the images included in the second image data in the process of adjusting the derived similar images. In other words, the output accuracy of the target prediction image may be further improved by using a plurality of image data composed in different ways.

According to an embodiment of the present disclosure, when the processor 130 acquires a plurality of image data, it may classify each piece of image data as first image data or second image data. In the present disclosure, the first image data includes the first reference images that serve as the reference for deriving a similar image, and the second image data includes the second reference images that serve as the reference for adjusting the derived similar image. That is, when the processor 130 acquires two pieces of image data, it may determine one of them to be the first image data, which is the reference for deriving the similar image, and the other to be the second image data, which is the reference for adjusting the derived similar image.

In more detail, the processor 130 may classify each piece of image data as first image data or second image data based on the domain information of each of the plurality of image data. Specifically, when the processor 130 acquires a plurality of image data, it may identify the domain information of each piece of image data. The domain information of each piece of image data may be information that serves as a criterion for distinguishing one group of image data from another. The processor 130 may determine the first image data and the second image data by comparing the domain information of the respective image data. The processor 130 may identify the domain information of each piece of image data, determine the image data whose domain information is expected to contain less feature information about the face image to be the first image data, and determine the image data whose domain information is expected to contain relatively more feature information about the face image to be the second image data. For example, because the second reference images used for adjusting the similar image serve as the reference for finely adjusting or correcting the derived similar image, they may need to contain more information about the fine details of the face than the first reference images used for deriving (or searching for) the similar image. Accordingly, the processor 130 may determine the image data whose domain information is expected to contain relatively more feature information about the face image to be the second image data.

As a specific example, when one piece of image data includes first domain information indicating that it relates to ultrasound images of a fetus, and another piece of image data includes second domain information indicating that it relates to actual images of the parents of the fetus, the processor 130 may determine the image data including the first domain information to be the first image data, which is the reference for deriving the similar image, and the image data including the second domain information to be the second image data, which is the reference for adjusting the similar image. That is, based on the domain information of each piece of image data, the processor 130 may predict that the actual images of the parents contain relatively more feature information about the face than the fetal ultrasound images (that is, more information about fine details that can be used for detailed adjustment), and may accordingly determine the one piece of image data to be the first image data and the other to be the second image data. The detailed description of the domain information above is only an example, and the present disclosure is not limited thereto.

That is, when acquiring a plurality of pieces of image data, the processor 130 may classify each piece, based on its domain information, as either first image data for deriving similar images or second image data for adjusting similar images. In other words, when receiving a plurality of pieces of image data from the user terminal 10, the processor 130 automatically determines how each piece is to be used in generating the final output image, which provides convenience to the user. In addition, this classification operation by the processor 130 may further improve the output accuracy of the target prediction image described below.
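The domain-based classification can be illustrated with a minimal sketch. The domain labels, the `EXPECTED_FACE_DETAIL` ranking, and the `assign_roles` helper are hypothetical names introduced only for illustration; the disclosure does not specify how the expected amount of facial detail per domain is encoded.

```python
# Hypothetical ranking of how much fine facial detail each domain is expected
# to carry. The labels and scores are illustrative assumptions only.
EXPECTED_FACE_DETAIL = {
    "fetal_ultrasound": 1,
    "parent_photo": 2,
}


def assign_roles(image_data_a, image_data_b):
    """Return (first_image_data, second_image_data).

    The item whose domain is expected to carry less facial detail becomes the
    first image data (used to derive similar images); the other becomes the
    second image data (used to adjust those similar images).
    """
    ranked = sorted(
        (image_data_a, image_data_b),
        key=lambda item: EXPECTED_FACE_DETAIL[item["domain"]],
    )
    return ranked[0], ranked[1]


ultrasound = {"name": "scan", "domain": "fetal_ultrasound"}
parents = {"name": "parents", "domain": "parent_photo"}

# The order in which the two items arrive does not matter.
first_image_data, second_image_data = assign_roles(parents, ultrasound)
```

In this sketch the user never states which upload plays which role; the role follows from the domain information alone.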

According to an embodiment of the present disclosure, the processor 130 may derive one or more similar images based on the first image data. The processor 130 may extract common characteristic information corresponding to a plurality of first reference images constituting the first image data. The common characteristic information is the basis for deriving similar images and may include information on face landmarks. It may be information on the characteristic information commonly included in the plurality of first reference images. Characteristic information is information related to the feature values contained in each image and may include, for example, information on composition, sharpness, brightness, contrast, obstacles (hands, feet, placenta), eyes, nose, mouth, and face outline. Such common characteristic information may be information on image characteristics that the plurality of first reference images and the similar images have in common. The processor 130 may derive one or more similar images based on the common characteristic information. That is, the processor 130 may identify, through the common characteristic information, the characteristics shared by the first reference images and derive one or more similar images using the identified common characteristics as the criterion. In other words, the processor 130 may derive one or more similar images based on the composition, contrast, obstacles, eyes, nose, mouth, face outline, and the like commonly observed in the first reference images.

In more detail, the processor 130 may process each of the plurality of first reference images constituting the first image data as an input to a feature extraction model, thereby obtaining one or more pieces of characteristic information corresponding to each first reference image.

In one embodiment, the feature extraction model may be built on a convolutional neural network (CNN) trained by heatmap regression. Specifically, the feature extraction model may be a neural network model trained to output, for each feature point, a heatmap representing the probability that the feature point is present. For example, such a feature extraction model may be trained, through a machine learning algorithm, to identify 68 specific points on the face called landmarks (e.g., the top of the chin, the outer edge of the eye, the inner edge of the eyebrow). The feature extraction model may obtain one or more pieces of characteristic information for each of the plurality of first reference images, and the common characteristic information may be obtained on that basis. That is, the common characteristic information may include information on the facial feature points commonly included in the first reference images.
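As an illustration of how heatmap outputs can be turned into landmark positions, the following sketch takes the peak of each heatmap as the landmark location. It assumes the heatmaps are already available as nested lists of scores; the `landmarks_from_heatmaps` helper and the toy 3x3 heatmaps are hypothetical, and the CNN that would produce real heatmaps is not shown.

```python
def landmarks_from_heatmaps(heatmaps):
    """For each landmark heatmap (a 2-D list of scores), return the
    (row, col, score) of its highest-scoring cell."""
    points = []
    for heatmap in heatmaps:
        best = (0, 0, heatmap[0][0])
        for r, row in enumerate(heatmap):
            for c, score in enumerate(row):
                if score > best[2]:
                    best = (r, c, score)
        points.append(best)
    return points


# Two toy 3x3 heatmaps standing in for two of the 68 landmark channels.
toy_heatmaps = [
    [[0.0, 0.1, 0.0],
     [0.1, 0.9, 0.1],
     [0.0, 0.1, 0.0]],
    [[0.0, 0.0, 0.0],
     [0.0, 0.2, 0.1],
     [0.0, 0.1, 0.7]],
]

peaks = landmarks_from_heatmaps(toy_heatmaps)
```

A low peak score for a channel can be read as a sign that the corresponding landmark is occluded or absent in that image.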

In another embodiment, the feature extraction model may be built from at least part of a trained autoencoder. Specifically, the feature extraction model may be implemented through the dimensionality reduction network function that constitutes part of the trained autoencoder.

The trained autoencoder of the present disclosure may consist of a dimensionality reduction network function and a dimensionality restoration network function that the processor 130 has trained to output data similar to the input data. An autoencoder may be a kind of artificial neural network for outputting data similar to its input. More specifically, the autoencoder may include at least one hidden layer, and the at least one hidden layer may be disposed between the input and output layers. The number of nodes in each layer may shrink from the input layer down to an intermediate layer called the bottleneck layer (the encoding), and then expand from the bottleneck layer to the output layer (which is symmetric with the input layer), mirroring the reduction. The autoencoder may perform non-linear dimensionality reduction. The number of nodes in the input and output layers may correspond to the number of input data items remaining after preprocessing of the input data. In the autoencoder structure, the number of nodes in the hidden layers of the dimensionality reduction network function (i.e., the encoder) may decrease with distance from the input layer. If the bottleneck layer (the layer with the fewest nodes, located between the encoder and the decoder) has too few nodes, a sufficient amount of information may not be conveyed, so its node count may be kept at or above a certain number (e.g., at least half the size of the input layer).

According to an embodiment, the processor 130 may train the autoencoder through unsupervised learning. Specifically, the processor 130 may train the dimensionality reduction network function (e.g., the encoder) and the dimensionality restoration network function (e.g., the decoder) constituting the autoencoder so that the output data is similar to the input data. During encoding through the dimensionality reduction network function, only the core feature data (or features) of the input image data may be learned through the hidden layers, and the remaining information may be lost. In this case, during decoding through the dimensionality restoration network function, the output data may be an approximation of the input data rather than a perfect copy. That is, the processor 130 may train the autoencoder by adjusting the weights so that the output data matches the input data as closely as possible. The input data used for training the autoencoder in the present disclosure may be, for example, real (photographic) images of a fetus.
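The reconstruction-based training described here can be sketched with a toy linear autoencoder in plain Python. This is only an illustration under simplifying assumptions: it uses one linear encoder matrix and one linear decoder matrix (no non-linearity, no image input), a four-value input with a two-node bottleneck, and full-batch gradient descent on the mean squared reconstruction error. All function names, the learning rate, and the data are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def mean_reconstruction_loss(enc, dec, data):
    """Mean squared error between each input and its reconstruction."""
    total = 0.0
    for x in data:
        y = matvec(dec, matvec(enc, x))
        total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return total / len(data)


def train_autoencoder(data, bottleneck, epochs=500, lr=0.05, seed=0):
    """Train encoder/decoder weights so that decode(encode(x)) approximates x."""
    rng = random.Random(seed)
    n = len(data[0])
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(bottleneck)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(bottleneck)] for _ in range(n)]
    for _ in range(epochs):
        g_enc = [[0.0] * n for _ in range(bottleneck)]
        g_dec = [[0.0] * bottleneck for _ in range(n)]
        for x in data:
            z = matvec(enc, x)   # bottleneck features (the embedding)
            y = matvec(dec, z)   # reconstruction of x
            err = [2.0 * (yi - xi) / len(data) for yi, xi in zip(y, x)]
            g_z = [sum(err[i] * dec[i][j] for i in range(n))
                   for j in range(bottleneck)]
            for i in range(n):
                for j in range(bottleneck):
                    g_dec[i][j] += err[i] * z[j]
            for j in range(bottleneck):
                for k in range(n):
                    g_enc[j][k] += g_z[j] * x[k]
        # Gradient-descent step: adjust weights so output moves toward input.
        for i in range(n):
            for j in range(bottleneck):
                dec[i][j] -= lr * g_dec[i][j]
        for j in range(bottleneck):
            for k in range(n):
                enc[j][k] -= lr * g_enc[j][k]
    return enc, dec


# Toy inputs with four values each; the bottleneck keeps two (half the input).
data = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
enc0, dec0 = train_autoencoder(data, bottleneck=2, epochs=0)  # untrained weights
enc, dec = train_autoencoder(data, bottleneck=2)
initial_loss = mean_reconstruction_loss(enc0, dec0, data)
final_loss = mean_reconstruction_loss(enc, dec, data)
features = matvec(enc, data[0])  # encoder alone acts as a feature extractor
```

After training, only the encoder half is needed to obtain features, which mirrors the use of the dimensionality reduction network function as the feature extraction model.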

In one embodiment, the autoencoder trained through the above process may take a real image of a fetus as input and output an output image (i.e., a photographic image) corresponding to that real image. In this case, the output image may be an approximation of the real image of the fetus rather than a perfect copy.

The dimensionality reduction network function (i.e., the encoder) included in the trained autoencoder may take an image (e.g., an ultrasound image of a fetus) as input and extract features (i.e., an embedding). That is, the dimensionality reduction network function may receive training input data related to fetal ultrasound images from the processor 130 and, with the feature vector sequence of the training input data designated as the output, learn the intermediate process by which input data is converted into features. In addition, the processor 130 may pass the embedding (i.e., the features) output by the dimensionality reduction network function to the dimensionality restoration network function, which may take the features as input and output an image related to those features (i.e., an image approximating the fetal ultrasound image).

In other words, the dimensionality reduction network function included in the trained autoencoder is trained to extract features from input data, and the dimensionality restoration network function may be trained to output an image corresponding to the features.

According to an embodiment of the present disclosure, the processor 130 may process each of the plurality of first reference images constituting the first image data as an input to the feature extraction model composed of the dimensionality reduction network function, thereby outputting one or more pieces of characteristic information. The processor 130 may then acquire the common characteristic information based on the one or more pieces of characteristic information output for each first reference image.

Described in more detail with reference to FIG. 4, the processor 130 may process each of the first reference first image 211, the first reference second image 212, and the first reference n-th image 213 constituting the first image data as an input to a respective one of one or more dimensionality reduction network functions. In this case, each dimensionality reduction network function is implemented through the encoder portion of the trained autoencoder and, given a specific image as input, may output the core features corresponding to that image, that is, its characteristic information. Here, the characteristic information is information related to the feature values contained in each image and may include, for example, information on composition, sharpness, brightness, contrast, obstacles (hands, feet, placenta), eyes, nose, mouth, and face outline.

In other words, the first dimensionality reduction network function 221 takes the first reference first image 211 as input and outputs first characteristic information 231, the second dimensionality reduction network function 222 takes the first reference second image 212 as input and outputs second characteristic information 232, and the n-th dimensionality reduction network function 223 takes the first reference n-th image 213 as input and outputs n-th characteristic information 233.

The processor 130 may acquire common characteristic information 240 based on the characteristic information output through each dimensionality reduction network function. The processor 130 may generate the common characteristic information by merging the first characteristic information 231, the second characteristic information 232, and the n-th characteristic information 233. In one embodiment, the processor 130 may generate the common characteristic information through an average or a weighted sum of the pieces of characteristic information. In an additional embodiment, a separate additional neural network model for aggregating the pieces of characteristic information may be configured, and the common characteristic information may be generated through that additional neural network model. That is, when a plurality of pieces of characteristic information are input, the additional neural network model may merge them and output the common characteristic information. The foregoing description of the method for generating common characteristic information is only an example, and the present disclosure is not limited thereto.

That is, the processor 130 may generate the common characteristic information by merging the plurality of pieces of characteristic information corresponding to the respective first reference images, using a weighted sum, an average, or an additional neural network model. The common characteristic information generated in this way may include information on the features commonly included in the first reference images.
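The average and weighted-sum variants of this merge can be sketched as follows, assuming each piece of characteristic information is a fixed-length feature vector. The `merge_features` helper and the example vectors and weights are hypothetical; the variant using an additional neural network model is not shown.

```python
def merge_features(features, weights=None):
    """Combine equal-length feature vectors.

    With weights=None the result is the element-wise mean; otherwise it is
    the element-wise weighted sum using one weight per feature vector.
    """
    if weights is None:
        weights = [1.0 / len(features)] * len(features)
    if len(weights) != len(features):
        raise ValueError("one weight per feature vector is required")
    dim = len(features[0])
    return [sum(w * f[k] for w, f in zip(weights, features)) for k in range(dim)]


feature_1 = [1.0, 0.0, 2.0]
feature_2 = [3.0, 2.0, 0.0]

mean_feature = merge_features([feature_1, feature_2])
weighted_feature = merge_features([feature_1, feature_2], [0.75, 0.25])
```

A weighted sum lets clearer reference images contribute more to the common characteristic information than noisier ones, if such per-image weights are available.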

In an additional embodiment, the feature extraction model may output one or more pieces of characteristic information and one or more pieces of domain information corresponding to each first reference image. The feature extraction model may be implemented through part of the trained autoencoder, namely the encoder. The processor 130 may tag the input data with domain information during pre-training of the autoencoder. For example, when inputting a real image of a fetus into the autoencoder as first input data, the processor 130 may tag the first input data with the domain information "ultrasound picture of a fetus". Accordingly, during training the autoencoder may learn to output both output data similar to the first input data and the domain information "ultrasound picture of a fetus".

In addition, the processor 130 may derive one or more similar images based on the common characteristic information. More specifically, the processor 130 may search a target image database based on the common characteristic information and derive one or more similar images whose similarity is at or above a predetermined level. In this case, the target image database may be organized per domain, with one database corresponding to each of a plurality of domains. For example, the target image database may include a first target image database classified under a first domain related to ultrasound images of fetuses and a second target image database classified under a second domain related to actual photographs of users. The foregoing description of the first domain and the second domain is only an example, and the present disclosure is not limited thereto.

In one embodiment, the processor 130 may determine or identify, based on the domain information corresponding to the first reference images, which of the target image databases built for the respective domains is to be used for deriving or searching for similar images. For example, the processor 130 may obtain domain information for the first reference images through the feature extraction model and select the target database classified under the matching domain as the target image database for deriving similar images. That is, by building the target image databases separately for the various domains, the processor 130 may improve data retrieval efficiency. In other words, because the search for similar images is performed only within the target image database matching the domain information of the first reference images, the search is faster, which may improve the efficiency of the similar image derivation process.

In one embodiment, the processor 130 may search the target image database based on the common characteristic information and derive one or more similar images whose similarity is at or above a predetermined level. In this case, the determination of whether the similarity is at or above the predetermined level may be based on Euclidean distance or cosine similarity. Specifically, the common characteristic information may be generated by merging the characteristic information of the individual images. Each piece of characteristic information is output by the feature extraction model and may be a feature vector containing information on the characteristics of the corresponding image (i.e., first reference image). For example, each piece of characteristic information may be represented at a location in a vector space, where similar characteristic information lies close together and characteristic information lies farther apart the more it differs.

That is, the processor 130 may derive, as the one or more similar images, the images in the target image database whose feature vectors lie close to the feature vector corresponding to the common characteristic information. In other words, the processor 130 may evaluate similarity by comparing distances in the vector space between the common characteristic information and the images included in the target image database.
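The domain-restricted search with a similarity threshold can be sketched as below. The sketch assumes each database entry already stores a feature vector, uses cosine similarity for the threshold test, and also defines Euclidean distance as the alternative measure named in the text. The database layout, image identifiers, `find_similar` helper, and the 0.9 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_similar(common_feature, domain, databases, min_similarity=0.9):
    """Search only the database matching `domain` and return the ids of images
    whose cosine similarity to `common_feature` meets the threshold,
    most similar first."""
    candidates = databases[domain]
    scored = [
        (cosine_similarity(common_feature, vector), image_id)
        for image_id, vector in candidates.items()
    ]
    return [
        image_id
        for score, image_id in sorted(scored, reverse=True)
        if score >= min_similarity
    ]


# Hypothetical per-domain target image databases: image id -> feature vector.
databases = {
    "fetal_ultrasound": {"u1": [1.0, 0.0], "u2": [0.9, 0.1], "u3": [0.0, 1.0]},
    "real_photo": {"p1": [1.0, 0.0]},
}

similar_ids = find_similar([1.0, 0.0], "fetal_ultrasound", databases)
```

Because the lookup is keyed by domain first, entries in the other databases are never scored, which is the source of the retrieval speed-up described for per-domain databases.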

In an additional embodiment, the processor 130 may determine at least some of the plurality of first images constituting the first image data as the plurality of first reference images. In the present disclosure, the plurality of first reference images may be the images on which the similar image search is based. Because the first reference images are the criterion for the similar image search, the accuracy of the retrieved similar images may increase when the first reference images contain clear characteristic information. In other words, the more clearly the characteristics are reflected in an image, the closer the similar images retrieved from the target image database can be to the user's intention. For example, if images that are unclear because of composition, noise, contrast, or obstacles (e.g., hands, feet, placenta) are determined as first reference images, the one or more similar images retrieved from the target image database may differ from what the user intended, which may in turn lead to target prediction data that does not match the user's intention. Accordingly, the processor 130 may select, from among the plurality of first images constituting the first image data, the images that contain relatively many characteristics (i.e., relatively clear images) and determine them as the plurality of first reference images.

Specifically, the processor 130 may process each of the plurality of first images constituting the first image data as an input to the feature extraction model, thereby obtaining one or more pieces of characteristic information corresponding to each image. Here, the characteristic information may relate to an embedding in a vector space corresponding to each image.

In addition, the processor 130 may calculate a probability value for each of a plurality of items based on the one or more pieces of characteristic information. In this case, the plurality of items relate to the components contained in a face image and may be, for example, the eyes, nose, mouth, and ears. A probability value may be an output value expressing the predicted likelihood that the corresponding component is present in the image. Specifically, the processor 130 may process each piece of characteristic information output by the feature extraction model as an input to a probability estimation model comprising one or more network functions, and output a probability value for each item for each piece of characteristic information. Here, the probability estimation model may be a neural network model trained to convert the output of the feature extraction model (i.e., characteristic information related to an embedding in a vector space) into probability values for the components. The probability estimation model may include one or more logistic regression models, each trained to output a probability value for one of the plurality of items (i.e., facial components) constituting a face image. For example, a first logistic regression model may take the first characteristic information as input and output a probability value of 0.6 for 'nose', and a second logistic regression model may take the first characteristic information as input and output a probability value of 0.85 for 'eye'. The foregoing description of the output values of the logistic regression models is only an example, and the present disclosure is not limited thereto. In one embodiment, at least some of the parameters of the one or more logistic regression models may be shared among them.
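A per-item logistic regression head can be sketched as follows. The feature vector, weights, and biases are hypothetical values chosen only so that the outputs land near the 0.6 and 0.85 figures used in the example above; they are not trained parameters, and parameter sharing between heads is not shown.

```python
import math


def sigmoid(t):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def item_probabilities(feature, heads):
    """Apply one logistic regression head per item.

    `heads` maps an item name to (weights, bias); the result maps each item
    name to the probability that the item is present in the image.
    """
    return {
        item: sigmoid(sum(w * f for w, f in zip(weights, feature)) + bias)
        for item, (weights, bias) in heads.items()
    }


# Hypothetical embedding and head parameters (illustrative, not trained).
first_feature = [1.0, 2.0]
heads = {
    "nose": ([0.2, 0.1], 0.0),
    "eye": ([1.0, 0.5], -0.3),
}

probabilities = item_probabilities(first_feature, heads)
```

Each head reads the same embedding, so adding a further item (for example, 'ear') only requires one more weight vector and bias.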

In addition, the processor 130 may select some of the plurality of first images as the plurality of first reference images based on the probability values for the plurality of items. As a specific example, when the first images are input, the feature extraction model may output one or more pieces of characteristic information corresponding to each image. The processor 130 may process the one or more pieces of characteristic information as inputs to the probability estimation model and output a probability value for each of the plurality of items. For example, based on the characteristic information corresponding to one first image, the probability estimation model may output probability values of 0.1 for the eyes, 0.8 for the nose, and 0.9 for the mouth. That is, probability values for the eyes, nose, and mouth are obtained for that image. In this way, the probability estimation model may output a probability value for each item with respect to a given image. A higher probability value for an item means that the item's characteristics are more likely to be present in the image, and a lower probability value means that they are less likely to be present. The processor 130 may determine, as the plurality of first reference images, the images with the largest number of items at or above a predetermined probability value (e.g., 0.8). That is, the processor 130 may identify, among the first images constituting the first image data, the images in which more items are present and determine them as the plurality of first reference images. In other words, the processor 130 may select the frames carrying the most characteristic information as the first reference images on which the common characteristic information is based. Composing the plurality of first reference images from the relatively clear data in the image data (or the image data containing more characteristic information) improves the accuracy of the resulting common characteristic information and, consequently, helps generate target prediction data that better matches the user's intention.
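The selection rule (keep the images with the most items at or above the predetermined probability value) can be sketched as follows. The frame names and probability values are hypothetical, and ties are resolved here by keeping every frame that reaches the maximum count, which is one possible reading of the rule.

```python
def select_reference_images(item_probs_per_image, threshold=0.8):
    """Return the ids of the images with the largest number of items whose
    probability is at or above `threshold`."""
    counts = {
        image_id: sum(1 for p in probs.values() if p >= threshold)
        for image_id, probs in item_probs_per_image.items()
    }
    best = max(counts.values())
    return [image_id for image_id, n in counts.items() if n == best]


# Hypothetical per-frame item probabilities from the probability estimation model.
frames = {
    "frame_01": {"eye": 0.1, "nose": 0.8, "mouth": 0.9},    # 2 items pass
    "frame_02": {"eye": 0.85, "nose": 0.9, "mouth": 0.95},  # 3 items pass
    "frame_03": {"eye": 0.9, "nose": 0.82, "mouth": 0.88},  # 3 items pass
}

reference_ids = select_reference_images(frames)
```

Here `frame_01` is dropped because its eye probability falls below the threshold, for instance when the eyes are covered by a hand in that frame.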

According to an embodiment of the present disclosure, the processor 130 may generate a target prediction image by adjusting the one or more similar images based on a plurality of second reference images constituting the second image data. For example, the plurality of second reference images may be images taken from actual video of the parents. The configuration for generating the target prediction image by adjusting the one or more similar images is described in detail below with reference to FIG. 5.

Specifically, the processor 130 may process one or more second reference images (i.e., the second reference first image through the second reference n-th image) as inputs to the feature extraction model to obtain one or more pieces of additional characteristic information. Here, the additional characteristic information refers to the characteristic information output for each second reference image. That is, the additional characteristic information is information related to the feature values contained in each of the plurality of second reference images and may include, for example, information on composition, sharpness, brightness, contrast, obstacles (hands, feet, placenta), eyes, nose, mouth, and face outline. The processor 130 may then acquire additional common characteristic information 312 based on the one or more pieces of additional characteristic information. The processor 130 may generate the additional common characteristic information 312 by merging the first through n-th additional characteristic information. That is, the generated additional common characteristic information may be information on the image characteristics commonly included in the plurality of second reference images.

일 실시예에서, 프로세서(130)는 각 추가 특성 정보의 평균값 또는 가중합(weighted sum)을 통해 추가 공통 특성 정보(312)를 생성할 수 있다. 추가적인 실시예에서, 각 추가 특성 정보를 병합하기 위한 별도의 추가적인 신경망 모델이 구성될 수 있으며, 해당 추가 신경망 모델을 통해 추가 공통 특성 정보(312)가 생성될 수 있다. 즉, 추가 신경망 모델은 복수의 추가 특성 정보가 입력되는 경우, 각각의 추가 특성 정보를 병합하여 추가 공통 특성 정보(312)를 출력할 수 있다. 전술한 추가 공통 특성 정보(312) 생성 방법에 대한 구체적인 기재는 예시일 뿐, 본 개시는 이에 제한되지 않는다.In an embodiment, the processor 130 may generate the additional common characteristic information 312 through an average value or a weighted sum of each additional characteristic information. In an additional embodiment, a separate additional neural network model for merging each additional characteristic information may be configured, and additional common characteristic information 312 may be generated through the additional neural network model. That is, when a plurality of additional characteristic information is input, the additional neural network model may output additional common characteristic information 312 by merging each additional characteristic information. The detailed description of the above-described method for generating the additional common characteristic information 312 is only an example, and the present disclosure is not limited thereto.

That is, the processor 130 may generate the additional common characteristic information by merging the plurality of pieces of additional characteristic information corresponding to the respective second reference images, using a weighted sum, an average, or an additional neural network model. Accordingly, the generated additional common characteristic information may include information on features that the second reference images have in common.
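As a non-limiting illustration, the averaging and weighted-sum merging described above may be sketched as follows. The function name `merge_features`, the representation of each piece of characteristic information as a list of floats, and the normalization of the weights are assumptions made for this sketch and are not specified in the present disclosure.

```python
from typing import List, Optional, Sequence


def merge_features(features: Sequence[Sequence[float]],
                   weights: Optional[Sequence[float]] = None) -> List[float]:
    """Merge per-image feature vectors into one common feature vector.

    With weights=None the result is the plain average. Otherwise it is a
    weighted sum whose weights are normalized to sum to 1 (an assumption).
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(features)
    if len(weights) != len(features):
        raise ValueError("one weight per feature vector is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Combine the vectors component by component.
    return [sum(w * f[i] for w, f in zip(weights, features)) / total
            for i in range(dim)]
```

For example, `merge_features([[1.0, 2.0], [3.0, 4.0]])` averages the two vectors, while passing `weights=[3.0, 1.0]` would give the first reference image three times the influence of the second.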

The processor 130 may generate a target prediction image based on the additional common characteristic information 312 and the common characteristic information 240. Specifically, the processor 130 may obtain final characteristic information 320 based on the additional common characteristic information 312 and the common characteristic information 240. The final characteristic information 320 may be generated by reflecting the additional common characteristic information 312 in the common characteristic information 240. That is, the final characteristic information 320 may include information in which at least part of the common characteristic information 240 has been adjusted based on the additional common characteristic information 312.

또한, 프로세서(130)는 최종 특성 정보(320)를 이미지 생성 모델(330)의 입력으로 처리하여 목표 예측 이미지를 생성할 수 있다. 여기서, 이미지 생성 모델(330)은, 학습된 오토인코더의 적어도 일부를 통해 구성될 수 있다. 학습된 오토인코더는 차원 감소 네트워크 함수 및 차원 복원 네트워크 함수를 포함할 수 있다. 이 경우, 차원 감소 네트워크 함수는, 입력 데이터에 대한 피처를 추출하도록 사전 학습되며, 차원 복원 네트워크 함수는, 피처에 대응하는 이미지를 출력하도록 사전 학습될 수 있다.In addition, the processor 130 may generate the target prediction image by processing the final characteristic information 320 as an input of the image generation model 330 . Here, the image generation model 330 may be configured through at least a part of the learned autoencoder. The learned autoencoder may include a dimensionality reduction network function and a dimensionality restoration network function. In this case, the dimensionality reduction network function may be pre-trained to extract features for the input data, and the dimension restoration network function may be pre-trained to output an image corresponding to the features.

The image generation model 330 may be implemented through the dimension restoration network function (i.e., the decoder) of the trained autoencoder. The image generation model 330, composed of the dimension restoration network function, may be pre-trained to take a feature (i.e., an embedding in a vector space) as input and output an image related to that feature (i.e., an image approximating an actually photographed image).

That is, the processor 130 may generate the target prediction image by providing the image generation model 330 with the final characteristic information 320 as input, which is information in which at least part of the common characteristic information 240 has been adjusted based on the additional common characteristic information 312. In this case, because the final characteristic information 320 is the common characteristic information 240 corresponding to the first reference images adjusted based on the additional common characteristic information 312 corresponding to the second reference images, the target prediction image output by the image generation model 330 may relate to an image that is similar to the first reference images and has been adjusted based on the second reference images.
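As a non-limiting illustration, the adjustment of the common characteristic information 240 and the subsequent decoding may be sketched as follows. The present disclosure does not specify the adjustment rule, so the linear interpolation controlled by a hypothetical parameter `alpha` is an assumption. The single linear layer `toy_decoder` is only a stand-in for the pre-trained image generation model 330, not its actual architecture.

```python
from typing import List, Sequence


def adjust_features(common: Sequence[float],
                    additional: Sequence[float],
                    alpha: float = 0.3) -> List[float]:
    """Move the common features (240) toward the additional common features (312).

    alpha = 0 keeps the common features unchanged and alpha = 1 replaces them.
    Linear interpolation is an assumed rule chosen for illustration.
    """
    if len(common) != len(additional):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * c + alpha * a for c, a in zip(common, additional)]


def toy_decoder(feature: Sequence[float],
                weights: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical stand-in for model 330: one linear layer, one output per row."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


# Usage: adjust the features, then decode them into (toy) pixel values.
final_feature = adjust_features([0.0, 1.0], [1.0, 0.0], alpha=0.5)
pixels = toy_decoder(final_feature, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

In practice the decoder would be the dimension restoration network function of a trained autoencoder, and the adjustment might act on only part of the feature vector.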

For example, because the way each user wants an image to be expressed, or the target image each user wants to output from a particular image (e.g., a reference image), differs from user to user, it may be difficult to generate a target image that matches each user's intention. As a specific example, when an image of a fetus's future face is to be generated from information on ultrasound images of the fetus, generating the target image may be difficult because of the inconsistent imaging angles and the noise of the ultrasound images on which it is based. In other words, an ultrasound image used as a reference image contains only morphological information about the fetus, which makes it difficult to train a neural network to recognize the fetus's natural facial shape, and distortion in the ultrasound images may cause the neural network to be trained to output a predicted facial shape that differs from reality. That is, there are limits to generating and providing an image or video suited to an arbitrary intention.

The processor 130 of the present disclosure may perform a step of deriving one or more similar images based on a plurality of first reference images and a step of generating a target prediction image by adjusting the derived similar images based on a plurality of second reference images. The processor 130 may derive, by means of the first reference images, one or more similar images having at least a preset degree of similarity and adjust the derived similar images based on the second reference images, thereby generating a target prediction image that matches the user's intention. For example, the processor 130 may derive similar images by searching a target database for images similar to the ultrasound images of a particular fetus and adjust the derived similar images based on actual images of the parents, thereby generating a target prediction image (i.e., a photorealistic image of the fetus's future face) with improved prediction accuracy.

As described above, the processor 130 of the present disclosure may improve the accuracy of the output target prediction image by performing two processes, deriving similar images and adjusting the derived similar images, using different reference images (i.e., the first reference images and the second reference images). This may have the effect of generating a highly accurate target prediction image even from reference images that are unclear, captured at inconsistent angles, or noisy.

FIG. 6 is a flowchart illustrating an example of a method for generating a face image according to an embodiment of the present disclosure.

According to an embodiment of the present disclosure, the method may include a step 410 of obtaining first image data and second image data.

According to an embodiment of the present disclosure, the method may include a step 420 of extracting common characteristic information corresponding to a plurality of first reference images constituting the first image data.

According to an embodiment of the present disclosure, the method may include a step 430 of deriving one or more similar images based on the common characteristic information.

According to an embodiment of the present disclosure, the method may include a step 440 of generating a target prediction image by adjusting the one or more similar images based on a plurality of second reference images constituting the second image data.

전술한 도 6에 도시된 단계들은 필요에 의해 순서가 변경될 수 있으며, 적어도 하나 이상의 단계가 생략 또는 추가될 수 있다. 즉, 전술한 단계는 본 개시의 일 실시예에 불과할 뿐, 본 개시의 권리 범위는 이에 제한되지 않는다.The order of the steps illustrated in FIG. 6 described above may be changed if necessary, and at least one or more steps may be omitted or added. That is, the above-described steps are merely an embodiment of the present disclosure, and the scope of the present disclosure is not limited thereto.
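As a non-limiting illustration, the step 430 of deriving similar images having at least a preset degree of similarity may be sketched as a threshold search over feature vectors. The use of cosine similarity, the default threshold of 0.9, and the names `cosine_similarity` and `derive_similar_images` are assumptions made for this sketch. The present disclosure does not prescribe a particular similarity measure.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def derive_similar_images(common_feature: Sequence[float],
                          database: Dict[str, Sequence[float]],
                          threshold: float = 0.9) -> List[str]:
    """Return ids of database images whose similarity meets the preset threshold.

    The ids are ordered from most to least similar.
    """
    scored = [(cosine_similarity(common_feature, feature), image_id)
              for image_id, feature in database.items()]
    return [image_id
            for score, image_id in sorted(scored, reverse=True)
            if score >= threshold]
```

Here `database` plays the role of the target database, mapping a hypothetical image identifier to a precomputed feature vector.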

FIG. 7 is a schematic diagram illustrating one or more network functions according to an embodiment of the present disclosure.

본 명세서에 걸쳐, 연산 모델, 신경망, 네트워크 함수, 뉴럴 네트워크(neural network)는 동일한 의미로 사용될 수 있다. 신경망은 일반적으로 “노드”라 지칭될 수 있는 상호 연결된 계산 단위들의 집합으로 구성될 수 있다. 이러한 “노드”들은 “뉴런(neuron)”들로 지칭될 수도 있다. 신경망은 적어도 하나 이상의 노드들을 포함하여 구성된다. 신경망들을 구성하는 노드(또는 뉴런)들은 하나 이상의“링크”에 의해 상호 연결될 수 있다.Throughout this specification, computational model, neural network, network function, and neural network may be used interchangeably. A neural network may be composed of a set of interconnected computational units, which may generally be referred to as “nodes”. These “nodes” may also be referred to as “neurons”. A neural network is configured to include at least one or more nodes. Nodes (or neurons) constituting neural networks may be interconnected by one or more “links”.

신경망 내에서, 링크를 통해 연결된 하나 이상의 노드들은 상대적으로 입력 노드 및 출력 노드의 관계를 형성할 수 있다. 입력 노드 및 출력 노드의 개념은 상대적인 것으로서, 하나의 노드에 대하여 출력 노드 관계에 있는 임의의 노드는 다른 노드와의 관계에서 입력 노드 관계에 있을 수 있으며, 그 역도 성립할 수 있다. 상술한 바와 같이, 입력 노드 대 출력 노드 관계는 링크를 중심으로 생성될 수 있다. 하나의 입력 노드에 하나 이상의 출력 노드가 링크를 통해 연결될 수 있으며, 그 역도 성립할 수 있다.In the neural network, one or more nodes connected through a link may relatively form a relationship between an input node and an output node. The concepts of an input node and an output node are relative, and any node in an output node relationship with respect to one node may be in an input node relationship in a relationship with another node, and vice versa. As described above, an input node-to-output node relationship may be created around a link. One or more output nodes may be connected to one input node through a link, and vice versa.

In an input node and output node relationship connected through a single link, the value of the output node may be determined based on the data input to the input node. Here, the link interconnecting the input node and the output node may have a weight. The weight may be variable and may be changed by a user or an algorithm so that the neural network performs a desired function. For example, when one or more input nodes are each connected to one output node by respective links, the output node value may be determined based on the values input to the input nodes connected to the output node and the weights set on the links corresponding to the respective input nodes.
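As a non-limiting illustration, the determination of an output node value from the input values and the link weights may be sketched as a weighted sum. The optional `bias` term and the function name `output_node_value` are assumptions made for this sketch. An activation function, which a practical network would typically apply to the sum, is omitted.

```python
from typing import Sequence


def output_node_value(inputs: Sequence[float],
                      link_weights: Sequence[float],
                      bias: float = 0.0) -> float:
    """Weighted sum of the input node values over their links, plus an assumed bias."""
    if len(inputs) != len(link_weights):
        raise ValueError("one link weight per input node is required")
    return sum(x * w for x, w in zip(inputs, link_weights)) + bias
```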

상술한 바와 같이, 신경망은 하나 이상의 노드들이 하나 이상의 링크를 통해 상호 연결되어 신경망 내에서 입력 노드 및 출력 노드 관계를 형성한다. 신경망 내에서 노드들과 링크들의 개수 및 노드들과 링크들 사이의 연관관계, 링크들 각각에 부여된 가중치의 값에 따라, 신경망의 특성이 결정될 수 있다. 예를 들어, 동일한 개수의 노드 및 링크들이 존재하고, 링크들 사이의 가중치 값이 상이한 두 신경망이 존재하는 경우, 두 개의 신경망들은 서로 상이한 것으로 인식될 수 있다.As described above, in a neural network, one or more nodes are interconnected through one or more links to form an input node and an output node relationship in the neural network. The characteristics of the neural network may be determined according to the number of nodes and links in the neural network, the correlation between the nodes and the links, and the value of a weight assigned to each of the links. For example, when the same number of nodes and links exist and there are two neural networks having different weight values between the links, the two neural networks may be recognized as different from each other.

A neural network may include one or more nodes. Some of the nodes constituting the neural network may form one layer based on their distances from the initial input node. For example, the set of nodes at distance n from the initial input node may form layer n. The distance from the initial input node may be defined as the minimum number of links that must be traversed to reach the node from the initial input node. However, this definition of a layer is arbitrary and given for explanation, and the order of a layer within a neural network may be defined differently. For example, the layer of a node may be defined by its distance from the final output node.
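As a non-limiting illustration, the layer of each node under the definition above (the minimum number of links from an initial input node) may be computed with a breadth-first search. The adjacency-dictionary representation and the name `layer_indices` are assumptions made for this sketch.

```python
from collections import deque
from typing import Dict, Iterable, List


def layer_indices(links: Dict[str, List[str]],
                  initial_inputs: Iterable[str]) -> Dict[str, int]:
    """Map each reachable node to its layer.

    `links` maps a node to the nodes it feeds. A node's layer is the minimum
    number of links traversed from any initial input node.
    """
    depth = {node: 0 for node in initial_inputs}
    queue = deque(depth)
    while queue:
        node = queue.popleft()
        for successor in links.get(node, []):
            # Breadth-first order guarantees the first visit is the shortest path.
            if successor not in depth:
                depth[successor] = depth[node] + 1
                queue.append(successor)
    return depth
```

If a node is reachable both through a hidden node and through a direct link from an input node, it is placed in the layer given by the shorter path.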

The initial input node may refer to one or more nodes in the neural network into which data is input directly, without passing through a link from any other node. Alternatively, it may refer to nodes that, in the link-based relationships between nodes in the neural network, have no other input node connected to them by a link. Similarly, the final output node may refer to one or more nodes in the neural network that have no output node in relation to the other nodes. A hidden node may refer to a node constituting the neural network that is neither an initial input node nor a final output node. A neural network according to an embodiment of the present disclosure may have the same number of nodes in the input layer as in the output layer, with the number of nodes decreasing and then increasing again from the input layer through the hidden layers. A neural network according to another embodiment of the present disclosure may have fewer nodes in the input layer than in the output layer, with the number of nodes decreasing from the input layer to the hidden layers. A neural network according to yet another embodiment of the present disclosure may have more nodes in the input layer than in the output layer, with the number of nodes increasing from the input layer to the hidden layers. A neural network according to still another embodiment of the present disclosure may be a combination of the neural networks described above.

A deep neural network (DNN) may refer to a neural network that includes a plurality of hidden layers in addition to an input layer and an output layer. A deep neural network can be used to identify the latent structures of data, that is, the latent structures of photographs, text, video, speech, and music (for example, which objects appear in a photograph, what the content and sentiment of a text are, or what the content and emotion of speech are). Deep neural networks may include a convolutional neural network (CNN), a recurrent neural network (RNN), an autoencoder, generative adversarial networks (GAN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a Q network, a U network, a Siamese network, and the like. The foregoing description of deep neural networks is only an example, and the present disclosure is not limited thereto.

A neural network may be trained by at least one of supervised learning, unsupervised learning, and semi-supervised learning. Training a neural network aims to minimize the error of its output. Training consists of repeatedly inputting training data into the neural network, computing the error between the network's output for the training data and the target, and backpropagating that error from the output layer toward the input layer so as to reduce it, updating the weights of the nodes of the network. Supervised learning uses training data in which each item is labeled with the correct answer (i.e., labeled training data), whereas in unsupervised learning the training data may not be labeled with the correct answer. For example, the training data for supervised learning of data classification may be data in which each item is labeled with a category. The labeled training data is input to the neural network, and the error may be computed by comparing the network's output (a category) with the label of the training data. As another example, in unsupervised learning of data classification, the error may be computed by comparing the input training data with the network's output. The computed error is backpropagated through the network in the reverse direction (i.e., from the output layer toward the input layer), and the connection weights of the nodes in each layer may be updated accordingly. The amount by which each connection weight changes may be determined by the learning rate. One computation of the network on the input data together with the backpropagation of the error may constitute a learning cycle (epoch). The learning rate may be applied differently depending on the number of learning cycles completed. For example, a high learning rate may be used early in training so that the network quickly reaches a certain level of performance, improving efficiency, and a low learning rate may be used late in training to improve accuracy.
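As a non-limiting illustration, the learning cycle described above (forward computation, error calculation, weight update, and a learning rate that is high early and low late) may be sketched for the simplest possible model, a single weight fitted to labeled data. The exponential decay schedule and the names `train_linear`, `initial_lr`, and `decay` are assumptions made for this sketch. With only one weight, the backpropagation step reduces to a single gradient computation.

```python
from typing import List, Sequence, Tuple


def train_linear(data: Sequence[Tuple[float, float]],
                 epochs: int = 50,
                 initial_lr: float = 0.1,
                 decay: float = 0.95) -> Tuple[float, List[float]]:
    """Fit y = w * x to labeled (x, y) pairs by gradient descent on the mean squared error.

    Returns the learned weight and the loss measured at the start of each epoch.
    """
    w = 0.0
    losses: List[float] = []
    for epoch in range(epochs):
        # Learning rate schedule: high in early epochs, lower in later ones.
        lr = initial_lr * (decay ** epoch)
        # Forward pass: error between the output and the label for every example.
        errors = [w * x - y for x, y in data]
        losses.append(sum(e * e for e in errors) / len(data))
        # Backward pass: gradient of the mean squared error with respect to w.
        grad = sum(2.0 * e * x for e, (x, _) in zip(errors, data)) / len(data)
        w -= lr * grad
    return w, losses
```

Calling `train_linear([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])` would drive `w` toward 2, the slope that generated the labels.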

뉴럴 네트워크의 학습에서 일반적으로 학습 데이터는 실제 데이터(즉, 학습된 뉴럴 네트워크를 이용하여 처리하고자 하는 데이터)의 부분집합일 수 있으며, 따라서, 학습 데이터에 대한 오류는 감소하나 실제 데이터에 대해서는 오류가 증가하는 학습 사이클이 존재할 수 있다. 과적합(overfitting)은 이와 같이 학습 데이터에 과하게 학습하여 실제 데이터에 대한 오류가 증가하는 현상이다. 예를 들어, 노란색 고양이를 보여 고양이를 학습한 뉴럴 네트워크가 노란색 이외의 고양이를 보고는 고양이임을 인식하지 못하는 현상이 과적합의 일종일 수 있다. 과적합은 머신러닝 알고리즘의 오류를 증가시키는 원인으로 작용할 수 있다. 이러한 과적합을 막기 위하여 다양한 최적화 방법이 사용될 수 있다. 과적합을 막기 위해서는 학습 데이터를 증가시키거나, 레귤라이제이션(regularization), 학습의 과정에서 네트워크의 노드 일부를 생략하는 드롭아웃(dropout) 등의 방법이 적용될 수 있다.In the training of neural networks, in general, the training data may be a subset of real data (that is, data to be processed using the trained neural network), and thus, the error on the training data is reduced, but the error on the real data is reduced. There may be increasing learning cycles. Overfitting is a phenomenon in which errors on actual data increase by over-learning on training data as described above. For example, a phenomenon in which a neural network that has learned a cat by seeing a yellow cat does not recognize that it is a cat when it sees a cat other than yellow may be a type of overfitting. Overfitting can act as a cause of increasing errors in machine learning algorithms. In order to prevent such overfitting, various optimization methods can be used. In order to prevent overfitting, methods such as increasing training data, regularization, or dropout in which a part of nodes in the network are omitted in the process of learning, may be applied.
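As a non-limiting illustration, dropout may be sketched as randomly zeroing node activations during training. The "inverted dropout" scaling of the surviving activations by 1/(1 - rate) is a common convention assumed here and is not stated in the present disclosure. The function name and parameters are likewise hypothetical.

```python
import random
from typing import List, Sequence


def dropout(activations: Sequence[float],
            rate: float,
            rng: random.Random,
            training: bool = True) -> List[float]:
    """Zero each activation with probability `rate` during training.

    Surviving activations are scaled by 1 / (1 - rate) so that the expected
    value is unchanged. Outside training the input is returned unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must lie in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Passing an explicit `random.Random` instance keeps the sketch reproducible when a seed is fixed.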

Throughout this specification, the terms computational model, neural network, and network function may be used interchangeably. (Hereinafter, the term neural network is used throughout.) A data structure may include a neural network, and the data structure including the neural network may be stored in a computer-readable medium. The data structure including the neural network may also include data input to the neural network, weights of the neural network, hyperparameters of the neural network, data obtained from the neural network, an activation function associated with each node or layer of the neural network, and a loss function for training the neural network. The data structure including the neural network may include any of the components listed above. That is, it may include all of, or any combination of, the data input to the neural network, the weights of the neural network, the hyperparameters of the neural network, the data obtained from the neural network, the activation function associated with each node or layer of the neural network, and the loss function for training the neural network. In addition to these, the data structure including the neural network may include any other information that determines the characteristics of the neural network. The data structure may also include any form of data used or generated in the computation of the neural network and is not limited to the foregoing. The computer-readable medium may include a computer-readable recording medium and/or a computer-readable transmission medium. A neural network may be composed of a set of interconnected computational units, which may generally be referred to as nodes. These nodes may also be referred to as neurons. A neural network includes at least one node.

데이터 구조는 신경망에 입력되는 데이터를 포함할 수 있다. 신경망에 입력되는 데이터를 포함하는 데이터 구조는 컴퓨터 판독가능 매체에 저장될 수 있다. 신경망에 입력되는 데이터는 신경망 학습 과정에서 입력되는 학습 데이터 및/또는 학습이 완료된 신경망에 입력되는 입력 데이터를 포함할 수 있다. 신경망에 입력되는 데이터는 전처리(pre-processing)를 거친 데이터 및/또는 전처리 대상이 되는 데이터를 포함할 수 있다. 전처리는 데이터를 신경망에 입력시키기 위한 데이터 처리 과정을 포함할 수 있다. 따라서 데이터 구조는 전처리 대상이 되는 데이터 및 전처리로 발생되는 데이터를 포함할 수 있다. 전술한 데이터 구조는 예시일 뿐 본 개시는 이에 제한되지 않는다.The data structure may include data input to the neural network. A data structure including data input to the neural network may be stored in a computer-readable medium. The data input to the neural network may include learning data input in a neural network learning process and/or input data input to the neural network in which learning is completed. Data input to the neural network may include pre-processing data and/or pre-processing target data. The preprocessing may include a data processing process for inputting data into the neural network. Accordingly, the data structure may include data to be pre-processed and data generated by pre-processing. The above-described data structure is merely an example, and the present disclosure is not limited thereto.

데이터 구조는 신경망의 가중치를 포함할 수 있다. (본 명세서에서 가중치, 파라미터는 동일한 의미로 사용될 수 있다.) 그리고 신경망의 가중치를 포함한 데이터 구조는 컴퓨터 판독가능 매체에 저장될 수 있다. 신경망은 복수개의 가중치를 포함할 수 있다. 가중치는 가변적일 수 있으며, 신경망이 원하는 기능을 수행하기 위해, 사용자 또는 알고리즘에 의해 가변 될 수 있다. 예를 들어, 하나의 출력 노드에 하나 이상의 입력 노드가 각각의 링크에 의해 상호 연결된 경우, 출력 노드는 상기 출력 노드와 연결된 입력 노드들에 입력된 값들 및 각각의 입력 노드들에 대응하는 링크에 설정된 파라미터에 기초하여 출력 노드 값을 결정할 수 있다. 전술한 데이터 구조는 예시일 뿐 본 개시는 이에 제한되지 않는다.The data structure may include the weights of the neural network. (In this specification, weight and parameter may be used interchangeably.) And the data structure including the weight of the neural network may be stored in a computer-readable medium. The neural network may include a plurality of weights. The weight may be variable, and may be changed by a user or an algorithm in order for the neural network to perform a desired function. For example, when one or more input nodes are interconnected to one output node by respective links, the output node sets values input to input nodes connected to the output node and links corresponding to the respective input nodes. An output node value may be determined based on the parameter. The above-described data structure is merely an example, and the present disclosure is not limited thereto.

제한이 아닌 예로서, 가중치는 신경망 학습 과정에서 가변되는 가중치 및/또는 신경망 학습이 완료된 가중치를 포함할 수 있다. 신경망 학습 과정에서 가변되는 가중치는 학습 사이클이 시작되는 시점의 가중치 및/또는 학습 사이클 동안 가변되는 가중치를 포함할 수 있다. 신경망 학습이 완료된 가중치는 학습 사이클이 완료된 가중치를 포함할 수 있다. 따라서 신경망의 가중치를 포함한 데이터 구조는 신경망 학습 과정에서 가변되는 가중치 및/또는 신경망 학습이 완료된 가중치를 포함한 데이터 구조를 포함할 수 있다. 그러므로 상술한 가중치 및/또는 각 가중치의 조합은 신경망의 가중치를 포함한 데이터 구조에 포함되는 것으로 한다. 전술한 데이터 구조는 예시일 뿐 본 개시는 이에 제한되지 않는다.By way of example and not limitation, the weight may include a weight variable in a neural network learning process and/or a weight in which neural network learning is completed. The variable weight in the neural network learning process may include a weight at a time point at which a learning cycle starts and/or a weight variable during the learning cycle. The weight for which neural network learning is completed may include a weight for which a learning cycle is completed. Accordingly, the data structure including the weights of the neural network may include a data structure including the weights that vary in the process of learning the neural network and/or the weights on which the learning of the neural network is completed. Therefore, it is assumed that the above-described weights and/or combinations of weights are included in the data structure including the weights of the neural network. The above-described data structure is merely an example, and the present disclosure is not limited thereto.

신경망의 가중치를 포함한 데이터 구조는 직렬화(serialization) 과정을 거친 후 컴퓨터 판독가능 저장 매체(예를 들어, 메모리, 하드 디스크)에 저장될 수 있다. 직렬화는 데이터 구조를 동일하거나 다른 컴퓨팅 장치에 저장하고 나중에 다시 재구성하여 사용할 수 있는 형태로 변환하는 과정일 수 있다. 컴퓨팅 장치는 데이터 구조를 직렬화하여 네트워크를 통해 데이터를 송수신할 수 있다. 직렬화된 신경망의 가중치를 포함한 데이터 구조는 역직렬화(deserialization)를 통해 동일한 컴퓨팅 장치 또는 다른 컴퓨팅 장치에서 재구성될 수 있다. 신경망의 가중치를 포함한 데이터 구조는 직렬화에 한정되는 것은 아니다. 나아가 신경망의 가중치를 포함한 데이터 구조는 컴퓨팅 장치의 자원을 최소한으로 사용하면서 연산의 효율을 높이기 위한 데이터 구조(예를 들어, 비선형 데이터 구조에서 B-Tree, Trie, m-way search tree, AVL tree, Red-Black Tree)를 포함할 수 있다. 전술한 사항은 예시일 뿐 본 개시는 이에 제한되지 않는다.The data structure including the weights of the neural network may be stored in a computer-readable storage medium (eg, memory, hard disk) after being serialized. Serialization can be the process of converting a data structure into a form that can be reconstructed and used later by storing it on the same or a different computing device. The computing device may serialize the data structure to send and receive data over the network. A data structure including weights of the serialized neural network may be reconstructed in the same computing device or in another computing device through deserialization. The data structure including the weight of the neural network is not limited to serialization. Furthermore, the data structure including the weights of the neural network is a data structure to increase the efficiency of computation while using the resources of the computing device to a minimum (e.g., B-Tree, Trie, m-way search tree, AVL tree, Red-Black Tree). The foregoing is merely an example, and the present disclosure is not limited thereto.
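As a non-limiting illustration, serialization and deserialization of a data structure containing neural network weights may be sketched with JSON. The choice of JSON, the layout of the weights as a dictionary of nested lists keyed by layer name, and the function names are assumptions made for this sketch. Practical systems may use binary formats instead.

```python
import json
from typing import Dict, List

Weights = Dict[str, List[List[float]]]


def serialize_weights(weights: Weights) -> bytes:
    """Convert the weight structure into bytes that can be stored or transmitted."""
    return json.dumps(weights, sort_keys=True).encode("utf-8")


def deserialize_weights(payload: bytes) -> Weights:
    """Reconstruct the weight structure on the same or another computing device."""
    return json.loads(payload.decode("utf-8"))


# Usage: a round trip reproduces the original structure.
original = {"layer1": [[0.1, -0.2], [0.3, 0.4]], "layer2": [[1.5, -2.5]]}
restored = deserialize_weights(serialize_weights(original))
```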

The data structure may include hyperparameters of the neural network. The data structure including the hyperparameters of the neural network may be stored in a computer-readable medium. A hyperparameter may be a variable that can be changed by a user. The hyperparameters may include, for example, a learning rate, a cost function, the number of training cycle iterations, weight initialization (e.g., setting the range of weight values to be initialized), and the number of hidden units (e.g., the number of hidden layers and the number of nodes in each hidden layer). The above-described data structure is merely an example, and the present disclosure is not limited thereto.
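
As a rough illustration of a data structure holding the hyperparameters listed above, the following sketch groups them in a Python dataclass and converts them to a storable form. The class name `HyperParameters`, the field names, and every default value are assumptions chosen for the example; none of them comes from the disclosure.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class HyperParameters:
    """User-adjustable settings; all defaults are illustrative assumptions."""
    learning_rate: float = 0.001
    cost_function: str = "mse"                       # name of the cost function
    num_training_cycles: int = 100                   # training cycle iterations
    weight_init_range: Tuple[float, float] = (-0.05, 0.05)
    hidden_layer_sizes: Tuple[int, ...] = (128, 64)  # nodes per hidden layer

    @property
    def num_hidden_layers(self) -> int:
        return len(self.hidden_layer_sizes)


# A user overrides one value and keeps the remaining defaults.
hp = HyperParameters(learning_rate=0.01)

# Store the structure in a form suitable for a computer-readable medium.
stored = json.dumps(asdict(hp))
loaded = json.loads(stored)
```

JSON has no tuple type, so the tuple fields come back as lists after loading; code that needs tuples again would have to convert them.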

The steps of a method or algorithm described in connection with an embodiment of the present disclosure may be implemented directly in hardware, in a software module executed by hardware, or in a combination of the two. The software module may reside in random access memory (RAM), read-only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable recording medium well known in the art to which the present disclosure pertains.

The components of the present disclosure may be implemented as a program (or application) and stored in a medium in order to be executed in combination with a computer, which is hardware. The components of the present disclosure may be implemented through software programming or software elements. Similarly, embodiments may be implemented in a programming or scripting language such as C, C++, Java, or assembler, and may include various algorithms implemented as a combination of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms running on one or more processors.

Those of ordinary skill in the art of the present disclosure will understand that the various illustrative logical blocks, modules, processors, means, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented by electronic hardware, by various forms of program or design code (referred to herein, for convenience, as "software"), or by a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Those of ordinary skill in the art of the present disclosure may implement the described functionality in various ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present disclosure.

The various embodiments presented herein may be implemented as a method, an apparatus, or an article of manufacture using standard programming and/or engineering techniques. The term "article of manufacture" includes a computer program, carrier, or media accessible from any computer-readable device. For example, computer-readable media include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips), optical disks (e.g., CDs, DVDs), smart cards, and flash memory devices (e.g., EEPROMs, cards, sticks, key drives). In addition, the various storage media presented herein include one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" includes, but is not limited to, wireless channels and various other media capable of storing, holding, and/or conveying instruction(s) and/or data.

It is to be understood that the specific order or hierarchy of steps in the presented processes is one example of possible approaches. It is to be understood that, based on design priorities, the specific order or hierarchy of steps in the processes may be rearranged within the scope of the present disclosure. The appended method claims present the elements of the various steps in a sample order, but are not meant to be limited to the specific order or hierarchy presented.

The description of the presented embodiments is provided to enable any person of ordinary skill in the art of the present disclosure to make or use the present disclosure. Various modifications to these embodiments will be apparent to those of ordinary skill in the art, and the general principles defined herein may be applied to other embodiments without departing from the scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments presented herein, but is to be construed in the widest scope consistent with the principles and novel features presented herein.

Claims

1. A method of generating a face image, performed on one or more processors of a computing device, the method comprising:
acquiring, by the processor, a plurality of image data having different domain information;
dividing, by the processor, the plurality of image data into first image data and second image data based on domain information of each image data;
extracting, by the processor, common characteristic information corresponding to a plurality of first reference images constituting the first image data;
deriving, by the processor, one or more similar images based on the extracted common characteristic information; and
generating, by the processor, a target prediction image by adjusting the one or more similar images based on a plurality of second reference images constituting the second image data,
wherein the first image data includes the plurality of first reference images serving as a reference for deriving the one or more similar images, and the second image data includes the plurality of second reference images serving as a reference for adjusting the derived one or more similar images.

2. The method of claim 1,
wherein the common characteristic information is information serving as a basis for deriving the similar images and includes information on facial landmarks.

3. The method of claim 1,
wherein the deriving of the one or more similar images comprises:
obtaining, by the processor, one or more characteristic information corresponding to each of the first reference images by processing each of the plurality of first reference images as an input of a feature extraction model;
obtaining, by the processor, the common characteristic information based on the one or more characteristic information; and
deriving, by the processor, the one or more similar images based on the common characteristic information.

4. The method of claim 3,
wherein the feature extraction model is configured through at least a part of a trained autoencoder and outputs one or more common characteristic information and one or more domain information corresponding to each of the plurality of first reference images.

5. The method of claim 3,
wherein the deriving, by the processor, of the one or more similar images based on the common characteristic information comprises:
deriving, by the processor, the one or more similar images having a degree of similarity greater than or equal to a predetermined similarity by performing a search on a target image database based on the common characteristic information,
wherein the target image database is configured to correspond to each of a plurality of domains.

6. The method of claim 1, further comprising:
obtaining, by the processor, one or more characteristic information corresponding to each image by processing each of a plurality of first images constituting the first image data as an input of a feature extraction model;
calculating, by the processor, a probability value corresponding to each of a plurality of items related to a component included in a face image based on the one or more characteristic information; and
selecting, by the processor, some of the plurality of first images as the plurality of first reference images based on the respective probability values.

7. The method of claim 1,
wherein the generating, by the processor, of the target prediction image by adjusting the one or more similar images based on the plurality of second reference images constituting the second image data comprises:
processing, by the processor, the one or more second reference images as an input of a feature extraction model to obtain one or more additional characteristic information;
obtaining, by the processor, additional common characteristic information based on the one or more additional characteristic information; and
generating, by the processor, the target prediction image based on the additional common characteristic information and the common characteristic information.

8. The method of claim 7,
wherein the generating of the target prediction image comprises:
obtaining, by the processor, final characteristic information based on the additional common characteristic information and the common characteristic information; and
generating, by the processor, the target prediction image by processing the final characteristic information as an input of an image generation model,
wherein the image generation model is configured through at least a part of a trained autoencoder.

9. A computing device comprising:
a memory storing one or more instructions; and
a processor executing the one or more instructions stored in the memory,
wherein the processor performs the method of claim 1 by executing the one or more instructions.

10. A computer program stored in a computer-readable recording medium, the computer program being combined with a computer, which is hardware, to perform the method of claim 1.