KR102555027B1

KR102555027B1 - Latent Space Control System for Learned Generative Neural Networks Using Visualization Autoencoder

Info

Publication number: KR102555027B1
Application number: KR1020220120407A
Authority: KR
Inventors: 고영민; 고선우; 민정익; 송주환
Original assignee: 고영민; 고선우; 민정익; 송주환
Priority date: 2022-09-23
Filing date: 2022-09-23
Publication date: 2023-07-18
Also published as: KR20240042351A

Abstract

The present invention relates to a system and method for manipulating the latent space of a trained generative neural network using a visualization autoencoder. More specifically, it provides a visualization autoencoder algorithm that manipulates the latent space of a trained generative neural network so that desired data can be generated and the transformation process can be visualized.

Description

Latent Space Control System for Learned Generative Neural Networks Using Visualization Autoencoder

The present invention relates to a system and method for manipulating the latent space of a trained generative neural network using a visualization autoencoder. More specifically, for any trained generative neural network, it visualizes an arbitrary latent space or hidden layer of that network in three or fewer dimensions, so that samples of interest can be generated and the transformation process can be visualized and manipulated.

In general, as shown in FIG. 1, neural networks are one approach to artificial intelligence and, from a learning perspective, an influential method used in supervised learning, unsupervised learning, and reinforcement learning alike.

As shown in FIG. 2, a deep neural network learns by approximating a function: given data, it minimizes a loss function using gradient descent. Structurally, a deep neural network is a repetition of linear transformations, nonlinear transformations, and dimension changes.
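The learning structure described for FIG. 2 can be sketched in a few lines. The example below is only an illustration under assumptions that are not in the source: the toy target y = x^2, the 1 -> 8 -> 1 layer sizes, the tanh activation, the mean squared error loss, and the learning rate are all chosen here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (an assumed example, not from the source): y = x^2 on [-1, 1].
x = np.linspace(-1.0, 1.0, 64).reshape(-1, 1)
y = x ** 2

# Parameters of a network 1 -> 8 -> 1: two linear maps with a tanh in between.
W1 = rng.normal(scale=0.5, size=(1, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)

learning_rate = 0.05
losses = []
for step in range(500):
    # Forward pass: linear transformation, nonlinear transformation, linear transformation.
    h = np.tanh(x @ W1 + b1)
    pred = h @ W2 + b2
    losses.append(float(np.mean((pred - y) ** 2)))

    # Backward pass: gradients of the mean squared error.
    g_pred = 2.0 * (pred - y) / len(x)
    g_W2 = h.T @ g_pred
    g_b2 = g_pred.sum(axis=0)
    g_pre = (g_pred @ W2.T) * (1.0 - h ** 2)   # tanh'(u) = 1 - tanh(u)^2
    g_W1 = x.T @ g_pre
    g_b1 = g_pre.sum(axis=0)

    # Gradient descent update.
    W1 -= learning_rate * g_W1
    b1 -= learning_rate * g_b1
    W2 -= learning_rate * g_W2
    b2 -= learning_rate * g_b2
```

The list `losses` records the loss at every step, so the effect of gradient descent can be checked by comparing its first and last entries.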

Meanwhile, recent advances in deep neural networks such as VAEs and GANs have spurred active research into manipulating and understanding latent spaces.

The present work presupposes a trained deep neural network that is assumed to have the two mathematical properties needed to manipulate a latent space: one-to-one correspondence and local smoothness.

It proposes a visualization autoencoder method that manipulates the latent space of such a trained deep neural network to generate samples of interest in the output space or to visualize the transformation process.

The proposed method is a deep autoencoder that takes as input data mapped into a latent space of arbitrary dimension of the trained deep neural network, maps it to a visualizable latent space of three or fewer dimensions, and reconstructs it.

Specifically, consider a visualization autoencoder with a two-dimensional latent space that takes the latent vectors of the trained deep neural network as input and reconstructs them as closely as possible. Its two-dimensional latent space corresponds to a curved surface in the original latent space that passes as closely as possible through the input vectors. Because of the one-to-one correspondence and local smoothness properties, this surface can be used to generate samples of interest or to manipulate the transformation process.

However, existing research related to the present invention has concentrated only on studying the latent spaces of models such as VAEs and GANs.

Therefore, a new algorithm is needed that can directly manipulate the latent space of a trained generative neural network while visually inspecting that latent space.

The present invention has been devised to address the problems described above. Its object is to provide a visualization autoencoder that, for any trained generative neural network, visualizes an arbitrary latent space or hidden layer of that network in three or fewer dimensions, so that samples of interest can be generated and the transformation process can be visualized and manipulated.

A further object, according to an embodiment of the present invention, is to provide a system for manipulating the latent space of a trained generative neural network using the visualization autoencoder.

To achieve these objects, an embodiment of the present invention visualizes, for any trained generative neural network, an arbitrary latent space or hidden layer of that network in three or fewer dimensions, making it possible to generate samples of interest or to visualize and manipulate the transformation process. The trained generative neural network is, for example, an AE, VAE, or GAN trained on training data.

Using the visualization autoencoder, an arbitrary latent space or hidden layer of any trained generative neural network can be visualized in three or fewer dimensions to generate a sample of interest, or the transformation process of the sample of interest can be visualized and manipulated.

The present invention provides a system for manipulating the latent space of a trained generative neural network using a visualization autoencoder, in which an original vector is compressed to a lower dimension in the manner of an autoencoder and then restored to its original dimension, and this is repeated until the value of the loss function L satisfies a set value. The system includes a visualization autoencoder that, based on the definition of latent space manipulation for a trained deep neural network and the latent space manipulation method, can generate samples of interest from the trained deep neural network, visualize the transformation process, and manipulate the learned latent space.

The trained generative neural network and the visualization autoencoder are each compositions of linear and nonlinear transformations. If the function mapping the latent space to the output space cannot guarantee one-to-one correspondence and local smoothness, a sample of interest in the output space cannot be interpreted or traced back through the latent space. The trained generative neural network and the visualization autoencoder are therefore assumed to have the one-to-one correspondence and local smoothness properties mathematically.

Assume that the trained generative neural network is a trained autoencoder. Given an input data matrix whose rows are samples with a fixed number of feature dimensions, the autoencoder reconstructs the matrix through its encoder and decoder such that the loss between the input matrix and the reconstructed matrix is below a threshold. This threshold is the criterion under which the reconstruction is close to the input and belongs to the population. The visualization autoencoder takes as input the data mapped into the latent space of the trained autoencoder, passes it through a two-dimensional latent space, and reconstructs it. If the threshold for the visualization autoencoder is assumed to be the criterion under which the reconstructed latent vectors are close to the input latent vectors and lie in their neighborhood set, then the reconstructed latent vectors lie within that neighborhood. The decoder of the trained autoencoder maps this neighborhood into a subset of the population, so the reconstructed latent vectors are mapped to samples in the original data dimension (Equation 6).

As in Equation 7, a suitable region is chosen in the two-dimensional visualized space so that it contains the two-dimensional codes of the latent vectors. The decoder of the visualization autoencoder maps this region one-to-one onto a curved surface in the latent space of the trained autoencoder, and the decoder of the trained autoencoder maps that surface one-to-one onto a curved surface in the data space. Because the two-dimensional region corresponds one-to-one to a surface in the data space, the transformation path of a sample of interest can be visualized. In addition, by the local smoothness property, the neighborhood of a code in the two-dimensional region generates samples of interest on the surface in the data space. In this way, the visualization autoencoder can be used to generate samples of interest from the trained deep neural network or to visualize the transformation process.

The visualization autoencoder (10) includes: a neural network generation module that generates a generative neural network trained on m training images; a first selection module that selects k data items of interest from among the m data items mapped into the latent space by the encoder function; a design module that designs a visualization autoencoder V that takes vectors of the latent dimension as input and has a visualizable two-dimensional latent space and suitable hidden layers; an iteration module that trains V on the k selected latent vectors, repeating until the loss between each input and its reconstruction by V falls below a suitable threshold, L < delta; a mapping module that, after training is complete, maps the k latent vectors into the two-dimensional latent space through the encoder of V; a second selection module that selects, in the two-dimensional visualized space, a suitable region containing the mapped points (for example, the region between their smallest and largest values); and a conversion module that maps the extracted region into the data space through each decoder function and converts the result O into images.

The present invention also provides a method for manipulating the latent space of a trained generative neural network using a visualization autoencoder, the method including: step 1 of generating a generative neural network trained on m training images; step 2 of selecting k data items of interest from among the m data items mapped into the latent space by the encoder function; step 3 of designing a visualization autoencoder V (consisting of an encoder function and a decoder function) that takes vectors of the latent dimension as input and has a visualizable two-dimensional latent space and suitable hidden layers; and step 4 of training V on the k selected latent vectors, repeating until the loss between each input and its reconstruction by V falls below a suitable threshold, L < delta.

The method includes step 5 of mapping, after the training of step 4 is complete, the k latent vectors into the two-dimensional latent space through the encoder of V.

The method includes step 6 of selecting, after step 5, a suitable region in the two-dimensional visualized space that contains the mapped points.

The method includes step 7 of mapping, after step 6, the extracted region into the data space through each decoder function and converting the result O into images.

The method includes a step of visualizing the training process, manipulating the learned latent space, and returning to step 1 to repeat.
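Steps 3 to 7 above can be sketched as follows. This is a minimal illustration, not the patented implementation. Several things are assumptions introduced here: the k latent vectors are synthetic points rather than the output of a trained generative network (steps 1 and 2), the layer sizes, tanh activations, mean squared error loss, learning rate, and 10 x 10 grid are arbitrary choices, and an iteration cap is added so the loop always terminates. The final conversion of O into images through the trained network's own decoder is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for steps 1-2 (assumption): k latent vectors of a trained network.
# They are synthetic points on a curved 2-D surface inside an n = 5 dimensional space.
k, n, hidden = 100, 5, 16
uv = rng.uniform(-1.0, 1.0, size=(k, 2))
Z = np.column_stack([uv[:, 0], uv[:, 1], uv[:, 0] * uv[:, 1],
                     uv[:, 0] ** 2, uv[:, 1] ** 2])

# Step 3: design V as n -> hidden -> 2 -> hidden -> n.
sizes = [n, hidden, 2, hidden, n]
Ws = [rng.normal(scale=1.0 / np.sqrt(a), size=(a, b))
      for a, b in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(b) for b in sizes[1:]]
TANH_LAYERS = (0, 2)  # hidden layers use tanh; the 2-D code and the output are linear

def run_layers(X, first, last):
    """Apply layers first..last (inclusive) and return every activation."""
    acts = [X]
    for i in range(first, last + 1):
        pre = acts[-1] @ Ws[i] + bs[i]
        acts.append(np.tanh(pre) if i in TANH_LAYERS else pre)
    return acts

def encode(X):
    return run_layers(X, 0, 1)[-1]

def decode(C):
    return run_layers(C, 2, 3)[-1]

# Step 4: train V until L(Z, V(Z)) < delta; max_steps is a safeguard added here.
delta, learning_rate, max_steps = 1e-2, 0.05, 3000
history = []
for step in range(max_steps):
    acts = run_layers(Z, 0, 3)
    L = float(np.mean((acts[-1] - Z) ** 2))
    history.append(L)
    if L < delta:
        break
    grad = 2.0 * (acts[-1] - Z) / Z.size
    for i in reversed(range(4)):
        if i in TANH_LAYERS:
            grad = grad * (1.0 - acts[i + 1] ** 2)
        g_W, g_b = acts[i].T @ grad, grad.sum(axis=0)
        grad = grad @ Ws[i].T          # propagate before the weights change
        Ws[i] -= learning_rate * g_W
        bs[i] -= learning_rate * g_b

# Step 5: map the k latent vectors to the 2-D latent space.
codes = encode(Z)

# Step 6: choose the region between the smallest and largest code values.
lo, hi = codes.min(axis=0), codes.max(axis=0)
grid_side = 10
g0, g1 = np.meshgrid(np.linspace(lo[0], hi[0], grid_side),
                     np.linspace(lo[1], hi[1], grid_side))
region = np.column_stack([g0.ravel(), g1.ravel()])

# Step 7: decode the region back to n dimensions (O in the text).
O = decode(region)
```

Plotting `codes` gives the two-dimensional view of the latent vectors, and each row of `O` is the n-dimensional point that a grid position in that view decodes to. Whether the loop stops because the loss fell below `delta` or because it hit `max_steps` depends on the data and the hyperparameters.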

According to an embodiment, the present invention proposes a visualization autoencoder, a deep autoencoder method that maps to a visualizable latent space of three or fewer dimensions, and improved results can be obtained even when it is applied to a deep neural network such as a GAN.

In addition, according to an embodiment of the present invention, the latent space of a trained deep neural network can be manipulated to visualize the transformation between samples of interest in the output space or to generate samples of interest.

FIG. 1 is a schematic diagram of the prior art (neural networks in artificial intelligence from a learning perspective).
FIG. 2 is a diagram showing the learning structure of a general deep neural network in the prior art.
FIG. 3 is a view of the present invention from a geometric perspective.
FIG. 4 is a diagram showing the result of minimizing an l2-norm distance loss for the problem of projecting three points distributed nonlinearly in a two-dimensional space onto one dimension.
FIG. 5 is a diagram showing data mapped into the latent space being taken as input, passed through the two-dimensional latent space, and reconstructed.
FIGS. 6a and 6b show (a) the specific configuration of the present invention and (b) the structure of the deep autoencoder trained in step 1.
FIG. 7 shows the 100 images of the digit 5 selected in step 2.
FIG. 8 shows step 3, in which a visualization autoencoder V (consisting of an encoder function and a decoder function) is designed that takes vectors of the latent dimension as input and has a visualizable two-dimensional latent space and suitable hidden layers.
FIG. 9 shows step 4, in which V is trained on the k selected latent vectors, repeating until the loss between each input and its reconstruction by V falls below a suitable threshold set by the user, L < delta.
FIG. 10 shows step 5, in which, after training is complete, the k latent vectors are mapped into the two-dimensional latent space through the encoder of V.
FIG. 11 shows step 6, in which a suitable region containing the mapped points (for example, the region between their smallest and largest values) is selected in the two-dimensional visualized space.
FIG. 12 shows step 7, in which the extracted region is mapped into the data space through each decoder function and the result O is converted into images.
FIG. 13 shows examples of generating diverse data using a two-dimensional visualization autoencoder.
FIG. 14 shows (a) the case where a thick 0 and a thin 0 are the two input images, (b) the case where a horizontally wide 0 and a vertically tall 0 are the two input images, (c) the case where two images of the digit 3 at different angles are the inputs, and (d) images drawn on a grid over the one-dimensional space mapped after training a one-dimensional visualization autoencoder.
FIG. 15 shows examples of visualizing transformation processes: (a) the case where five images of the digit 1 are the inputs, and (b) images drawn on a grid over the one-dimensional space mapped after training a one-dimensional visualization autoencoder.

The present invention as described above will now be described in detail with reference to the accompanying drawings and embodiments.

Note that the technical terms used herein are used only to describe specific embodiments and are not intended to limit the present invention. Unless specifically defined otherwise herein, technical terms should be interpreted with the meaning generally understood by a person of ordinary skill in the art to which the present invention belongs, and should be interpreted neither too broadly nor too narrowly. If a technical term used herein is an incorrect term that fails to express the idea of the present invention accurately, it should be replaced by, and understood as, a technical term that a person skilled in the art can correctly understand. General terms used herein should be interpreted according to their dictionary definitions or the surrounding context, and should not be interpreted too narrowly.

Singular expressions used herein include the plural unless the context clearly indicates otherwise. Terms such as "consists of" or "includes" should not be construed as necessarily including all of the components or steps described; some components or steps may be absent, and additional components or steps may be included.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure in which they appear, and duplicate descriptions are omitted.

In describing the present invention, detailed descriptions of related known technology are omitted where they would obscure the gist of the invention. The accompanying drawings are provided only to aid understanding of the idea of the present invention and should not be construed as limiting it.

The present invention provides a system for manipulating the latent space of a trained generative neural network using a visualization autoencoder. Using the visualization autoencoder, the system visualizes an arbitrary latent space or hidden layer of any trained generative neural network in three or fewer dimensions, so that samples of interest can be generated and the transformation process can be visualized and manipulated.

In one embodiment, an arbitrary latent space or hidden layer of any trained generative neural network is visualized in three or fewer dimensions using the visualization autoencoder to generate a sample of interest, or the transformation process of the sample of interest is visualized and manipulated.

The system for manipulating the latent space of a trained deep neural network using the visualization autoencoder is described in detail below with reference to FIGS. 3 to 5.

FIG. 3 is a view of the present invention from a geometric perspective. FIG. 4 shows the result of minimizing an l2-norm distance loss for the problem of projecting three points distributed nonlinearly in a two-dimensional space onto one dimension. FIG. 5 shows data mapped into the latent space being taken as input, passed through the two-dimensional latent space, and reconstructed.

The trained generative neural network and the visualization autoencoder are each compositions of linear and nonlinear transformations. If the function mapping the latent space to the output space cannot guarantee one-to-one correspondence and local smoothness, a sample of interest in the output space cannot be interpreted or traced back through the latent space. It can therefore be assumed that the trained generative neural network and the visualization autoencoder have the one-to-one correspondence and local smoothness properties mathematically.

Specifically, we first consider the definition of latent space manipulation for a trained deep neural network and the properties it requires. In the present invention, manipulating the latent space of a trained deep neural network is defined as the following two operations.

1. Manipulating the learned latent space to visualize the transformation between samples of interest in the output space.

2. Manipulating the learned latent space to generate samples of interest in the output space.

To make these manipulations possible, assume that the trained deep neural network has two mathematical properties: one-to-one correspondence and local smoothness.

The first is one-to-one correspondence. A trained deep neural network is a composition of linear and nonlinear transformations, and if the function mapping the latent space to the output space cannot guarantee one-to-one correspondence, a sample of interest in the output space cannot be interpreted or traced back through the latent space.

To address this problem, assume that every linear transformation in the trained deep neural network has a rank sufficient to preserve one-to-one correspondence.

Assume also that each nonlinear transformation, which is applied element-wise, uses a parametric activation function that satisfies the one-to-one property, for example the abTanh function.
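The two assumptions above can be checked numerically for a single layer. The sketch below rests on assumptions not stated in the source: abTanh is taken to have the form a * tanh(b * x) with a, b > 0 (the source names the function but does not define it), and the weight matrix, its 2 -> 4 shape, and the parameter values are arbitrary.

```python
import numpy as np

def ab_tanh(x, a=1.5, b=0.8):
    # Assumed form a * tanh(b * x); strictly increasing when a > 0 and b > 0.
    return a * np.tanh(b * x)

def ab_tanh_inverse(y, a=1.5, b=0.8):
    # Exact inverse of ab_tanh on its range (-a, a).
    return np.arctanh(y / a) / b

rng = np.random.default_rng(0)

# Linear map from a 2-D space to a 4-D space, acting on row vectors: z -> z @ W.
W = rng.normal(size=(2, 4))
rank = np.linalg.matrix_rank(W)   # rank 2 (full row rank) makes z -> z @ W one-to-one

z = rng.uniform(-1.0, 1.0, size=(3, 2))
y = ab_tanh(z @ W)                # one layer: linear transformation, then activation

# Because both parts are invertible on their ranges, z can be recovered from y.
z_recovered = ab_tanh_inverse(y) @ np.linalg.pinv(W)
```

Recovering `z` from `y` is what one-to-one correspondence means in practice: no two latent vectors collapse onto the same output, so an output of interest can be traced back to a unique latent vector.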

The second is local smoothness. When a sample of interest is generated in the output space through the learned latent space, if local smoothness cannot be guaranteed, there is a risk that the generated sample differs greatly from the sample of interest.

To address this risk, assume that local smoothness (Equation 1) holds for any vector of interest in any latent space of the trained network.

Equation 1 states that, when an arbitrary nonzero real-valued noise vector is added to the vector of interest, the loss between the value generated from the perturbed vector and the value generated from the original vector is smaller than a suitable threshold (for example, the level beyond which samples of no interest are generated).
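One way to write the condition described for Equation 1 is given below. The symbols are assumptions chosen here for illustration, because the source states the condition only in words: f denotes the map from the chosen latent space to the output space, z a latent vector of interest, \epsilon the noise vector, L the loss function, and \delta the threshold. Given the word "locally", the noise is presumably restricted to a neighborhood of z.

```latex
% Assumed notation: f maps the chosen latent space to the output space,
% z is a latent vector of interest, \epsilon is a nonzero noise vector,
% L is the loss function, and \delta is the threshold.
L\bigl(f(z + \epsilon),\, f(z)\bigr) < \delta, \qquad \epsilon \neq 0
```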

Let us look in more detail at why a trained deep neural network that satisfies these two assumptions can be manipulated in its latent space.

For example, suppose we are given a trained deep autoencoder that satisfies the one-to-one correspondence and local smoothness properties (Equation 2).

In Equation 2, given an input data matrix whose rows are samples with a fixed number of feature dimensions, the reconstructed data matrix is obtained through the autoencoder, which consists of an encoder function and a decoder function.

Further assume that the input matrix and the reconstructed matrix satisfy Equation 3.

In practice, because the optimization structure of deep neural networks is currently not understood, it is hard to minimize the loss function value exactly. We can therefore suppose an autoencoder whose parameters make the loss smaller than a suitable threshold (Equation 3).

From a probabilistic viewpoint, the trained deep autoencoder can be regarded as having been trained on a finite set of observed samples drawn from a population.

As a concrete example, consider a deep autoencoder trained on MNIST, a dataset of digit images. The population can be thought of as the images of the digits 0 through 9 in the input space.

In the input space, the digits 0 through 9 may be distributed nonlinearly, and there may be a suitable neighborhood around each image within which the image is still recognizable as the same digit even when a little noise is added to its pixels.

Interpreted geometrically (FIG. 3), the autoencoder maps the observed samples of the population defined in the input space, through the latent space, to their reconstructions (input space = output space).

Now suppose that the criterion that the loss between the samples and their reconstructions is below the threshold means that each reconstruction is close to its sample and is a point belonging to the population.

Then the neighborhood set of a latent vector mapped into the latent space can, by the one-to-one correspondence and local smoothness properties, generate a subset of the population in the data space.

Specifically, suppose we are interested in one image vector among the samples that depicts the digit 3. There exists a latent vector in one-to-one correspondence with it, and by the local smoothness property, manipulating the neighborhood of that latent vector generates images of the digit 3 in the data space.

Likewise, suppose we want to visualize, through the latent space, the transformation between vectors among the samples that depict the digits 3 and 7.

Latent vectors corresponding to the 3 and the 7 exist in the latent space, and by the one-to-one correspondence property, a path connecting the two latent vectors can also be mapped into the data space and generated there.
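The idea of decoding a path between two latent vectors can be sketched as follows. Everything concrete here is an assumption for illustration: the source does not say the path must be straight (a straight segment is simply the easiest choice), `toy_decoder` is a hypothetical stand-in for a trained decoder, and the two latent vectors are made-up values rather than codes of real digit images.

```python
import numpy as np

def latent_path(z_start, z_end, steps=9):
    """Evenly spaced points on the straight segment from z_start to z_end."""
    t = np.linspace(0.0, 1.0, steps).reshape(-1, 1)
    return (1.0 - t) * z_start + t * z_end

def toy_decoder(z):
    # Hypothetical stand-in for a trained decoder: a full-rank linear map followed by
    # tanh, so distinct latent vectors give distinct outputs.
    W = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, -0.5]])
    return np.tanh(z @ W)

z_three = np.array([-1.0, 0.5])   # assumed latent vector of a "3" image
z_seven = np.array([1.0, -0.5])   # assumed latent vector of a "7" image

path = latent_path(z_three, z_seven)
frames = toy_decoder(path)        # one output vector per point on the path
```

With a real decoder, each row of `frames` would be reshaped into an image, and viewing the rows in order would show the transformation from one digit to the other.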

Turning to latent space manipulation with a visualization autoencoder, manipulating the latent space of a trained deep neural network makes it possible to generate, in the output space, samples of interest or the transformation process between samples of interest.

However, how to manipulate the latent space in practice remains unclear. There is no guarantee that the latent vectors mapped into the trained latent space are linearly distributed, and in particular, when the latent space has more than three dimensions, it cannot be visualized and handled directly.

To solve this problem, the present invention proposes the visualization autoencoder, a deep autoencoder method that maps to a visualizable latent space of three or fewer dimensions.

First, to understand what a visualization autoencoder means, consider the trained visualization autoencoder of FIG. 4(B).

FIG. 4 considers the problem of projecting three points that are nonlinearly distributed in a two-dimensional space onto one dimension, with the l2-norm distance minimized as the loss function.

In FIG. 4(A), a linear dimensionality reduction that uses only a linear transformation cannot minimize the loss for all three points. By contrast, the visualization autoencoder, which combines linear and nonlinear transformations, can produce a curve in the two-dimensional space that passes through all three points.
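The contrast between the linear and nonlinear cases can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the three points are made up (they are not the points of FIG. 4), the linear case uses a principal-axis projection, and a quadratic curve stands in for the nonlinear decoder of an autoencoder.

```python
import numpy as np

# Illustrative points that do not lie on a straight line (assumed values).
points = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])

# Linear reduction: project onto the first principal axis, then map back.
mean = points.mean(axis=0)
centered = points - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
axis = vt[0]
codes_linear = centered @ axis                      # 1-D codes
recon_linear = mean + np.outer(codes_linear, axis)  # back to 2-D
linear_error = np.sum((points - recon_linear) ** 2)

# Nonlinear stand-in: 1-D code t = x coordinate, decoder t -> (t, quadratic(t)).
codes_nonlinear = points[:, 0]
coeffs = np.polyfit(codes_nonlinear, points[:, 1], deg=2)
recon_nonlinear = np.column_stack(
    [codes_nonlinear, np.polyval(coeffs, codes_nonlinear)]
)
nonlinear_error = np.sum((points - recon_nonlinear) ** 2)
```

Here the straight line leaves a nonzero squared l2 error, while the curve passes through all three points, which is the behavior the text attributes to combining linear and nonlinear transformations.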

Extending this idea, a visualization autoencoder with a one-dimensional latent space has a latent space that is a curve passing through data mapped in a three-dimensional space, and one with a two-dimensional latent space has a latent space that is a surface passing through data mapped in a three-dimensional space.

Extending it further, a visualization autoencoder can map latent vectors lying in a latent space of arbitrary dimension into a latent space that can be visualized, such as a surface passing through those vectors.

Now let us manipulate the latent space concretely using the visualization autoencoder.

Suppose we are given the trained deep autoencoder described above in the definition of latent space manipulation of a trained deep neural network, together with a trained visualization autoencoder that takes as input the latent vectors mapped into the latent space of that deep autoencoder and visualizes them in a two-dimensional latent space (Equation 4).

The visualization autoencoder, which reconstructs these latent vectors, consists of an encoder function that maps into the two-dimensional latent space and a decoder function that restores to the same latent space as that of the deep autoencoder. As with the deep autoencoder, assume that it satisfies the one-to-one correspondence and local smoothness properties so that its latent space can be manipulated.

Assume further that the visualization autoencoder has the following loss-function value (Equation 5).

As before, the exact structure of the optimization is unknown, so assume a suitable threshold value.
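One way to write this criterion compactly is shown below. The notation is assumed for illustration, since the original equation images are not available here: $z$ is a latent vector of the trained generative network, $g_{\mathrm{enc}}$ and $g_{\mathrm{dec}}$ are the encoder and decoder of the visualization autoencoder, $L$ is the loss function, and $\delta$ is the threshold.

```latex
\[
  \hat{z} = g_{\mathrm{dec}}\bigl(g_{\mathrm{enc}}(z)\bigr), \qquad
  L\bigl(z, \hat{z}\bigr) < \delta
\]
```

Read this way, a smaller $\delta$ demands that the restored latent vector $\hat{z}$ stay closer to the input latent vector $z$.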

Interpreted geometrically, the visualization autoencoder is as shown in FIG. 5.

FIG. 5 shows a latent vector mapped into the latent space of the deep autoencoder being taken as input, passed through the two-dimensional latent space, and restored.

In FIG. 5, the legend entries denote, in order: a surface containing the indicated points; a surface in one-to-one correspondence with another surface (two such entries); samples generated by the local smoothness property; and the transformation path of samples.

Let the threshold criterion on this loss be the criterion that the restored latent vector is close to the input latent vector and is a point belonging to the corresponding surface in FIG. 5.

Then the restored latent vector belongs to that surface, which in turn, through the decoder, belongs to a subset of the population, so it is mapped to a sample in the data space (Equation 6) (FIG. 5).

In other words, a vector mapped into two dimensions is mapped, by the decoder of the visualization autoencoder and the decoder of the deep autoencoder, to a sample close to the original.

Now take a suitable region of the two-dimensional visualized space that contains the mapped vectors (FIG. 6). Through the decoder of the visualization autoencoder, this region is mapped to a surface in the latent space of the deep autoencoder that corresponds to it one-to-one. Through the decoder of the deep autoencoder, that surface in turn produces a one-to-one corresponding surface in the data space (FIG. 6 and Equation 7).

In other words:

1. By mapping a path in the two-dimensional space to the one-to-one corresponding surface in the data space, the transformation path of the samples of interest can be visualized (FIG. 6).

2. In addition, by the local smoothness property, the neighborhood of a mapped point in the two-dimensional space can generate samples of interest in the data space (FIG. 6).

In this way, the visualization autoencoder makes it possible to generate samples of interest from a trained deep neural network or to visualize the transformation process.

Specifically, referring to FIG. 6A, the visualization autoencoder (10) includes: a neural network generation module (11) that generates a generative neural network trained on m training images; a first selection module (12) that selects k items of interest from the m items mapped into the latent space through the encoder function; a design module (13) that designs a visualization autoencoder, composed of an encoder function and a decoder function, which takes latent vectors as input and has a visualizable two-dimensional latent space and suitable hidden layers; an iteration module (14) that trains the visualization autoencoder on the k latent vectors, repeating until the loss L between each latent vector and its reconstruction satisfies L < delta; a mapping module (15) that, after training is complete, maps the k latent vectors into the two-dimensional latent space through the encoder of the visualization autoencoder; a second selection module (16) that selects a suitable region of the two-dimensional visualized space containing the mapped vectors (for example, between their smallest and largest values); and a conversion module (17) that maps the extracted region into the data space through each decoder function and converts the mapped output O into images.

The dataset used in the experiment is the MNIST digit data, a set of 60,000 digit images of 28 x 28 pixels each.

FIG. 6B shows, among other things, the structure of the deep autoencoder trained in step 1.

In the figure, F is the number of filters, K the kernel size, S the stride, pad whether padding is used, abTanh and Sigmoid the activation functions, and N the number of nodes; the figure also shows the total number of parameters and the loss-function value at each epoch.

For example, the deep autoencoder was trained with Keras using binary cross-entropy as the loss function, Adam as the optimizer, a batch size of 128, and 50 epochs.
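The training settings just listed could be wired up in Keras roughly as follows. Only the loss, optimizer, batch size, and epoch count come from the text; the dense layer sizes, the `tanh` activations, `latent_dim`, and the function names are hypothetical, and the architecture of FIG. 6B (which lists filter, kernel, and stride settings and an abTanh activation) is not reproduced.

```python
# Training settings stated in the text; everything else below is an assumption.
TRAIN_CONFIG = {
    "loss": "binary_crossentropy",
    "optimizer": "adam",
    "batch_size": 128,
    "epochs": 50,
}

def build_autoencoder(latent_dim=32):
    """Build a small dense autoencoder for flattened 28 x 28 images.

    Layer sizes, activations, and latent_dim are hypothetical choices.
    """
    # Imported lazily so TRAIN_CONFIG can be inspected without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(784,)),
        keras.layers.Dense(128, activation="tanh"),
        keras.layers.Dense(latent_dim, activation="tanh"),
        keras.layers.Dense(128, activation="tanh"),
        keras.layers.Dense(784, activation="sigmoid"),
    ])
    model.compile(optimizer=TRAIN_CONFIG["optimizer"], loss=TRAIN_CONFIG["loss"])
    return model

def train(model, images):
    """Fit the autoencoder to reconstruct its own input (pixels scaled to [0, 1])."""
    return model.fit(
        images, images,
        batch_size=TRAIN_CONFIG["batch_size"],
        epochs=TRAIN_CONFIG["epochs"],
    )
```

The sigmoid output layer pairs naturally with binary cross-entropy when pixel values are scaled to the range 0 to 1.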

FIG. 7 shows the 100 images of the digit 5 selected in step 2.

The dataset used in the experiment according to the present invention is the MNIST digit data, a set of 60,000 digit images of 28 x 28 pixels each.

For example, suppose the k items are 100 images chosen at random from the roughly 6,000 images of the digit 5 among the 60,000 digit images.
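The random selection in this example can be sketched with NumPy. The function name and the seed are assumptions, and the label array below is synthetic, standing in for the real MNIST labels.

```python
import numpy as np

def select_samples_of_digit(labels, digit, k, seed=0):
    """Return k distinct indices, chosen at random, whose label equals digit."""
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero(labels == digit)
    return rng.choice(candidates, size=k, replace=False)

# Synthetic labels standing in for the 60,000 MNIST labels.
labels = np.random.default_rng(1).integers(0, 10, size=60_000)
chosen = select_samples_of_digit(labels, digit=5, k=100)
```

Sampling without replacement keeps the 100 selected images distinct.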

FIG. 8 shows step 3, in which a visualization autoencoder, composed of an encoder function and a decoder function, is designed to take latent vectors as input and to have a visualizable two-dimensional latent space and suitable hidden layers.

FIG. 9 shows step 4, in which the visualization autoencoder is trained on the k latent vectors, repeating until the loss between each latent vector and its reconstruction satisfies L < delta.
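The stopping rule of step 4 amounts to a loop that trains until the loss falls below delta. The sketch below is generic: `train_until_threshold`, the `max_epochs` safety cap, and the toy step function are assumptions, and a real training step (one optimizer pass over the k latent vectors) would replace `toy_step`.

```python
def train_until_threshold(train_step, delta, max_epochs=1000):
    """Call train_step() once per epoch until its returned loss is below delta.

    Returns (epochs_run, last_loss). max_epochs is a safety cap that the
    source text does not mention.
    """
    loss = float("inf")
    for epoch in range(1, max_epochs + 1):
        loss = train_step()
        if loss < delta:
            return epoch, loss
    return max_epochs, loss

# Toy step whose loss halves each epoch, standing in for a real optimizer step.
state = {"loss": 1.0}

def toy_step():
    state["loss"] *= 0.5
    return state["loss"]

epochs_run, final_loss = train_until_threshold(toy_step, delta=0.01)
```

With the toy step, the loss first drops below 0.01 at the seventh epoch (0.5 raised to the seventh power is 0.0078125).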

FIG. 10 shows step 5, in which, after training is complete, the k latent vectors are mapped into the two-dimensional latent space through the encoder of the visualization autoencoder.

FIG. 11 shows step 6, in which a suitable region of the two-dimensional visualized space containing the mapped vectors is selected (for example, between their smallest and largest values).

That is, data are extracted from the region D' on a 10 x 10 grid.
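Extracting a 10 x 10 grid between the smallest and largest mapped values can be sketched as follows. The function name and the sample codes are assumptions; in practice `codes_2d` would be the k latent vectors after mapping into two dimensions.

```python
import numpy as np

def grid_over_codes(codes_2d, points_per_axis=10):
    """Return grid points covering the bounding box of the 2-D codes.

    The result has shape (points_per_axis ** 2, 2).
    """
    lo = codes_2d.min(axis=0)
    hi = codes_2d.max(axis=0)
    xs = np.linspace(lo[0], hi[0], points_per_axis)
    ys = np.linspace(lo[1], hi[1], points_per_axis)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])

# Assumed 2-D codes for illustration.
codes = np.array([[0.0, -1.0], [2.0, 3.0], [1.0, 0.5]])
grid = grid_over_codes(codes)
```

Each of the 100 grid points is a candidate location in the visualized space from which a new sample can be generated.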

FIG. 12 shows step 7, in which the extracted region is mapped into the data space through each decoder function and the mapped output O is converted into images.
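Step 7 chains two decoders and reshapes the result into images. In the sketch below both decoders are hypothetical stand-ins built from random weights, and `LATENT_DIM` is an assumed latent size; trained decoders would replace them.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 16  # assumed latent size of the trained generative network
W_vis = rng.normal(size=(2, LATENT_DIM))
W_gen = rng.normal(size=(LATENT_DIM, 28 * 28))

def visualization_decoder(points_2d):
    """Hypothetical stand-in: 2-D codes -> latent vectors of the generative network."""
    return np.tanh(points_2d @ W_vis)

def generative_decoder(latents):
    """Hypothetical stand-in: latent vectors -> flattened 28 x 28 images."""
    return 1.0 / (1.0 + np.exp(-(latents @ W_gen)))

def decode_grid_to_images(points_2d):
    """Apply both decoders in turn and reshape the output O into images."""
    outputs = generative_decoder(visualization_decoder(points_2d))
    return outputs.reshape(-1, 28, 28)

# A 10 x 10 grid of 2-D points over an assumed region.
axis_values = np.linspace(-1.0, 1.0, 10)
grid_points = np.stack(np.meshgrid(axis_values, axis_values), axis=-1).reshape(-1, 2)
images = decode_grid_to_images(grid_points)
```

The 100 decoded images can then be tiled in the same 10 x 10 layout as the grid to show how samples change across the visualized space.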

FIG. 13 shows various examples of data generation, in particular using a two-dimensional visualization autoencoder.

FIG. 14 shows (a) the case where two images, a thick 0 and a thin 0, are given as input; (b) the case where two images, a horizontally elongated 0 and a vertically elongated 0, are given as input; (c) the case where two images of the digit 3 at different angles are given as input; and (d) images drawn on a grid over the one-dimensional space obtained after training a one-dimensional visualization autoencoder.

FIG. 15 shows examples of visualizing various transformation processes: (a) the case where five images of the digit 1 are given as input, and (b) images drawn on a grid over the one-dimensional space obtained after training a one-dimensional visualization autoencoder.

In summary, the present invention is a latent space manipulation method for a trained generative neural network using a visualization autoencoder, comprising: step 1 of generating a generative neural network trained on m training images; step 2 of selecting k items of interest from the m items mapped into the latent space through the encoder function (FIG. 7); step 3 of designing a visualization autoencoder, composed of an encoder function and a decoder function, that takes latent vectors as input and has a visualizable two-dimensional latent space and suitable hidden layers (FIG. 8); step 4 of training the visualization autoencoder with the k latent vectors as input, the training being repeated until the loss L falls below a suitable user-set reference value, that is, until the loss between each latent vector and its reconstruction satisfies L < delta (FIG. 9); step 5 of mapping, after the training of step 4 is complete, the k latent vectors into the two-dimensional latent space through the encoder of the visualization autoencoder (FIG. 10); step 6 of selecting, after step 5, a suitable region of the two-dimensional visualized space containing the mapped vectors (for example, between their smallest and largest values) (FIG. 11); step 7 of mapping the region extracted in step 6 into the data space through each decoder function and converting the mapped output O into images (FIG. 12); and a step of visualizing the learning process, manipulating the trained latent space, and returning to step 1 to repeat.

10: visualization autoencoder
11: neural network generation module
12: first selection module
13: design module
14: iteration module
15: mapping module
16: second selection module
17: conversion module

Claims

A latent space manipulation (control) system for a trained generative neural network using a visualization autoencoder,
wherein the system is limited to a visualization autoencoder and a trained generative neural network in which the mapping function from the latent space to the output space satisfies the one-to-one correspondence property and the local smoothness property and which can generate data belonging to the population of the output space,
wherein, assuming an autoencoder trained as the trained generative neural network, the loss function between the feature matrix of the input data and the matrix restored through the encoder and the decoder is taken to be smaller than a threshold, the threshold being a criterion under which the restored matrix is close to the input matrix and belongs to the population, and
wherein the system uses the visualization autoencoder to visualize any latent space or hidden layer of any trained generative neural network in three or fewer dimensions, so as to generate a sample of interest or to visualize and manipulate the transformation process of a sample of interest.

(Deleted)

The system of claim 1,
wherein the visualization autoencoder takes as input a latent vector mapped into the latent space, passes it through the two-dimensional latent space, and restores it, and wherein, when the threshold on its loss is taken as a criterion under which the restored latent vector is close to the input latent vector and is a point belonging to the corresponding surface, the restored latent vector belongs to that surface and, through the decoder, to a subset of the population, so that it is mapped to a sample in the data space by Equation 6.
[Equation 6]

(a vector mapped into two dimensions is mapped, by the two decoders, to a sample close to the original)

The system of claim 1,
wherein the trained autoencoder, as in Equation 7 below, takes a suitable region of the two-dimensional visualized space containing the mapped vectors; the region is mapped, through the decoder of the visualization autoencoder, to a surface that corresponds to it one-to-one in the latent space; and that surface, through the decoder of the generative neural network, produces a one-to-one corresponding surface in the data space,
wherein mapping a path in the two-dimensional space to the one-to-one corresponding surface in the data space allows the transformation path of the samples of interest to be visualized, and, by the local smoothness property, the neighborhood of a mapped point in the two-dimensional space can generate samples of interest in the data space,
whereby a sample of interest can be generated, or the transformation process visualized, in a deep neural network trained using the visualization autoencoder.
[Equation 7]

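The system of claim 1,
wherein the visualization autoencoder (10) includes:
a neural network generation module that generates a generative neural network trained on m training images;
a first selection module that selects k items of interest from the m items mapped into the latent space through the encoder function;
a design module that designs a visualization autoencoder, composed of an encoder function and a decoder function, which takes latent vectors as input and has a visualizable two-dimensional latent space and suitable hidden layers;
an iteration module that trains the visualization autoencoder on the k latent vectors, repeating until the loss L between each latent vector and its reconstruction satisfies L < delta;
a mapping module that, after training is complete, maps the k latent vectors into the two-dimensional latent space through the encoder of the visualization autoencoder;
a second selection module that selects a suitable region of the two-dimensional visualized space containing the mapped vectors (for example, between their smallest and largest values); and
a conversion module that maps the extracted region into the data space through each decoder function and converts the mapped output O into images.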

A latent space manipulation method for a trained generative neural network, using the latent space manipulation system of claim 1, the method comprising, by the system:
step 1 of generating a generative neural network trained on m training images;
step 2 of selecting k items of interest from the m items mapped into the latent space through the encoder function;
step 3 of designing a visualization autoencoder, composed of an encoder function and a decoder function, that takes latent vectors as input and has a visualizable two-dimensional latent space and suitable hidden layers; and
step 4 of training the visualization autoencoder on the k latent vectors, repeating until the loss L between each latent vector and its reconstruction satisfies L < delta.

The method of claim 6,
further comprising step 5 of mapping, after the training of step 4 is complete, the k latent vectors into the two-dimensional latent space through the encoder of the visualization autoencoder.

The method of claim 7,
further comprising step 6 of selecting, after step 5, a suitable region of the two-dimensional visualized space containing the mapped vectors.

The method of claim 8,
further comprising step 7 of mapping the region extracted in step 6 into the data space through each decoder function and converting the mapped output O into images.

The method of any one of claims 6 to 9,
further comprising a step of visualizing the learning process, manipulating the trained latent space, and returning to step 1 to repeat.