KR102103727B1

KR102103727B1 - Apparatus and method for generating image using skim-pixel convolution neural network

Info

Publication number: KR102103727B1
Application number: KR1020180104354A
Authority: KR
Inventors: 유영준; 전상혁; 유재준; 윤상두; 하정우
Original assignee: 네이버 주식회사; 라인 가부시키가이샤
Priority date: 2018-09-03
Filing date: 2018-09-03
Publication date: 2020-04-24
Also published as: JP2020038639A; JP7045351B2; KR20200026435A

Abstract

The present application relates to a method and an apparatus for automatically generating an image using a skim-pixel CNN. A method for automatically generating an image using a skim-pixel CNN according to an embodiment of the present invention may include: a prediction step in which a pixel prediction unit simultaneously generates pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image; a confidence generation step in which a confidence estimation unit generates, for each target pixel, a confidence for the pixel prediction value; a skimming step in which, if the confidence of a target pixel is greater than or equal to a set value, a pixel generation unit sets the pixel prediction value as the pixel value of that target pixel; and a draw step in which, if the confidence of the target pixel is less than the set value, the pixel generation unit generates a pixel inference value for the target pixel using a PixelCNN (Pixel Convolution Neural Network) model and sets the pixel inference value as the pixel value of that target pixel.

Description

Method and apparatus for automatically generating an image using a skim-pixel CNN {Apparatus and method for generating image using skim-pixel convolution neural network}

The present application relates to an automatic image generation method and an automatic image generation apparatus capable of automatically generating a new image from learned images, and more particularly to an automatic image generation method and an automatic image generation apparatus that use a PixelCNN model.

Machine learning is a core technology used in almost every field, including Internet information retrieval, text mining, speech recognition, robotics, and the service industry. Deep learning, a branch of machine learning, has recently attracted attention in many fields. In image-based object recognition in particular, machine learning techniques that use a convolutional neural network (CNN), a kind of deep learning technology, are drawing attention.

Convolutional neural network technology (also known in Korean as 회선 신경망) is actively studied across image processing and computer vision for tasks such as interpreting an input image through computation, extracting features to obtain information, and generating new images. It is a kind of artificial neural network technology designed to mimic the human nervous system.

The present application aims to provide a method and an apparatus for automatically generating an image using a skim-pixel CNN, which can automatically generate a new image from learned images.

The present application also aims to provide a method and an apparatus for automatically generating an image using a skim-pixel CNN, which can drastically reduce the amount of computation and the computation time required for image generation.

A method for automatically generating an image using a skim-pixel CNN according to an embodiment of the present invention may include: a prediction step in which a pixel prediction unit simultaneously generates pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image; a confidence generation step in which a confidence estimation unit generates, for each target pixel, a confidence for the pixel prediction value; a skimming step in which, if the confidence of a target pixel is greater than or equal to a set value, a pixel generation unit sets the pixel prediction value as the pixel value of that target pixel; and a draw step in which, if the confidence of the target pixel is less than the set value, the pixel generation unit generates a pixel inference value for the target pixel using a PixelCNN (Pixel Convolution Neural Network) model and sets the pixel inference value as the pixel value of that target pixel.

An apparatus for automatically generating an image using a skim-pixel CNN according to an embodiment of the present invention may include: a pixel prediction unit that simultaneously generates pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image; a confidence estimation unit that generates, for each target pixel, a confidence for the pixel prediction value; and a pixel generation unit that sets the pixel prediction value as the pixel value of a target pixel if the confidence of that target pixel is greater than or equal to a set value, and that, if the confidence of the target pixel is less than the set value, generates a pixel inference value for the target pixel using a PixelCNN (Pixel Convolution Neural Network) model and sets the pixel inference value as the pixel value of that target pixel.

An apparatus for automatically generating an image using a skim-pixel CNN according to another embodiment of the present invention includes a processor and a memory coupled to the processor. The memory includes one or more modules configured to be executed by the processor, and the one or more modules may include instructions that: simultaneously generate pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image; generate, for each target pixel, a confidence for the pixel prediction value; set the pixel prediction value as the pixel value of a target pixel if the confidence of that target pixel is greater than or equal to a set value; and, if the confidence of the target pixel is less than the set value, generate a pixel inference value for the target pixel using a PixelCNN (Pixel Convolution Neural Network) model and set the pixel inference value as the pixel value of that target pixel.

The means for solving the problem described above do not list every feature of the present invention. The various features of the present invention, and the advantages and effects that follow from them, can be understood in more detail with reference to the specific embodiments below.

With the method and apparatus for automatically generating an image using a skim-pixel CNN according to an embodiment of the present invention, pixel values in regions that are relatively unimportant for image generation can be set with a simple prediction model. That is, generation of pixel inference values with the PixelCNN model can be skipped (skimmed) for those regions, which drastically reduces the required amount of computation. In regions of relatively high importance, pixel inference values are generated directly with the PixelCNN model. It is therefore possible to generate a high-quality image while reducing the computation needed for image generation and improving generation speed.

However, the effects achievable by the method and apparatus for automatically generating an image using a skim-pixel CNN according to embodiments of the present invention are not limited to those mentioned above. Other effects not mentioned will be clearly understood from the description below by a person of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a schematic diagram showing an image generation apparatus according to an embodiment of the present invention.
FIGS. 2 and 3 are block diagrams showing an image generation apparatus according to an embodiment of the present invention.
FIGS. 4 to 6 are schematic diagrams showing image generation by an image generation apparatus according to an embodiment of the present invention.
FIG. 7 shows images generated using an image generation apparatus according to an embodiment of the present invention.
FIG. 8 is a graph showing the image generation speed of an image generation apparatus according to an embodiment of the present invention.
FIG. 9 is a flowchart showing an image generation method according to an embodiment of the present invention.

Hereinafter, embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not by themselves have distinct meanings or roles. That is, the term 'unit' as used in the present invention means software or a hardware component such as an FPGA or ASIC, and a 'unit' performs certain roles. However, a 'unit' is not limited to software or hardware. A 'unit' may be configured to reside in an addressable storage medium or may be configured to run on one or more processors. Thus, as an example, a 'unit' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functionality provided within components and 'units' may be combined into a smaller number of components and 'units' or further separated into additional components and 'units'.

In describing the embodiments disclosed in this specification, detailed descriptions of related known technologies are omitted where they could obscure the gist of the embodiments. The accompanying drawings are provided only to make the embodiments easier to understand. They do not limit the technical idea disclosed in this specification, which should be understood to cover all modifications, equivalents, and substitutes that fall within the spirit and technical scope of the present invention.

FIG. 1 is a schematic diagram showing an automatic image generation apparatus according to an embodiment of the present invention.

The automatic image generation apparatus (100) can learn training images (t_i) using a machine learning technique such as deep learning, and can newly generate an arbitrary image (g_i) based on what it has learned. For example, after the automatic image generation apparatus (100) has been trained on portrait photographs and is instructed to generate a new portrait, it can generate a new portrait that differs from the images it has already learned. Here, the probability distributions of the pixels in images of the same kind tend to be similar to one another, so the automatic image generation apparatus (100) can obtain the probability distribution of the pixels in images of that kind by learning a plurality of training images (t_i).

As shown in FIG. 1, the automatic image generation apparatus (100) can generate an image (g_i) containing a plurality of pixels arranged in n rows and m columns. The rows are generated sequentially from top to bottom, and within each row the pixel values are set from left to right. The image (g_i) generated by the automatic image generation apparatus (100) can therefore be represented as a sequence of the pixel values of its pixels. For example, an image X can be written as X = {x_i | i = 1, ..., n×m}.
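The raster ordering described here (rows top to bottom, pixels left to right within a row) can be sketched as an index mapping. This is only an illustration: the function names `index_to_position` and `position_to_index` are hypothetical and do not appear in the source, and the 1-based indexing follows X = {x_i | i = 1, ..., n×m}.

```python
def index_to_position(i, m):
    """Map a 1-based sequence index i to a 1-based (row, column) pair
    for an image with m columns, scanned row by row, left to right."""
    if i < 1 or m < 1:
        raise ValueError("i and m must be positive")
    row, col = divmod(i - 1, m)
    return row + 1, col + 1


def position_to_index(row, col, m):
    """Inverse mapping: 1-based (row, column) to the 1-based sequence index."""
    if row < 1 or not (1 <= col <= m):
        raise ValueError("position out of range")
    return (row - 1) * m + col
```

With m = 4 columns, the fifth pixel in the sequence is the first pixel of the second row.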

Here, the automatic image generation apparatus (100) can generate an image using a PixelCNN (Pixel Convolution Neural Network) model. The PixelCNN model can be expressed by the following equation (Equation 1).
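The equation image for Equation 1 is not reproduced in this text. As a hedged reconstruction, the standard autoregressive factorization used by PixelCNN, written with the symbols this description defines (X_≤i, X_j:i, p_θ), would read as follows. The exact typeset form in the original publication may differ, and the second expression is an assumption inferred from the definitions of X_≤i and X_j:i.

```latex
p(X) = \prod_{l=1}^{n \times m} p_\theta\left(x_l \mid x_1, \dots, x_{l-1}\right),
\qquad
p\left(X_{j:i} \mid X_{\le i}\right) = \prod_{l=i+1}^{j} p_\theta\left(x_l \mid x_1, \dots, x_{l-1}\right)
```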

Here, X_≤i = {x_1, ..., x_i} denotes the pixel values of the existing pixels already generated in the image being generated, X_j:i = {x_{i+1}, ..., x_j} denotes the pixel values of the target pixels to be generated, and p(X) is the probability function over the pixel values of an image X containing n×m pixels. In addition, n×m is the total number of pixels in the image, and j > i, ∀ j, i ∈ [1, n×m].

In the PixelCNN model, p_θ(x_l | x_1, ..., x_{l-1}) can be approximated by filtering with masked convolution. In this way, the PixelCNN model can infer the pixel value of the next pixel (x_{i+1}) from the pixel values of the existing pixels (X_≤i) in the image, and can generate an image from the inferred pixel inference values.

With the PixelCNN model, however, the pixel value of the next pixel can be inferred only once the pixel values up to the immediately preceding pixel are known. Pixel inference values must therefore be computed for every pixel in the image to be generated, and they must be computed sequentially, one at a time. In other words, a new image can be generated with the PixelCNN model, but the amount of computation is large and image generation can take a relatively long time.
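The sequential dependency described above can be illustrated with a toy sampler. This is a sketch under stated assumptions, not the PixelCNN model itself: `toy_conditional` is a hypothetical stand-in for p_θ(x_l | x_1, ..., x_{l-1}), `LEVELS` is an assumed number of discrete pixel values, and `sample_sequentially` only shows that each pixel needs its own model evaluation.

```python
import random

LEVELS = 4  # assumed number of discrete pixel values in this toy example


def toy_conditional(prefix):
    """Hypothetical stand-in for p_theta(x_l | x_1, ..., x_{l-1}).

    Returns a probability list over LEVELS values that favours
    repeating the previous pixel value. It is not a real PixelCNN."""
    weights = [1.0] * LEVELS
    if prefix:
        weights[prefix[-1]] += 4.0
    total = sum(weights)
    return [w / total for w in weights]


def sample_sequentially(n, m, conditional, rng):
    """Generate n*m pixel values one at a time in raster order.

    Every pixel requires one model evaluation that depends on all
    previously generated pixels, so the evaluations cannot be batched."""
    pixels, calls = [], 0
    for _ in range(n * m):
        probs = conditional(pixels)
        calls += 1
        pixels.append(rng.choices(range(LEVELS), weights=probs)[0])
    return pixels, calls
```

For an n×m image the loop performs n×m model evaluations, which is the cost the skim-pixel CNN approach aims to reduce.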

To address this, the automatic image generation apparatus (100) according to an embodiment of the present invention uses a skim-pixel CNN (skim-pixel convolution neural network), with which pixel values in regions that are relatively unimportant for image generation can be set by a simple prediction model. That is, generation of pixel inference values with the PixelCNN model can be skipped (skimmed) for those regions, which drastically reduces the required amount of computation. In regions of relatively high importance, pixel inference values are generated directly with the PixelCNN model, so a high-quality image can be generated while the computation needed for image generation is reduced and generation speed is improved.

FIG. 2 is a block diagram showing the automatic image generation apparatus (100) according to an embodiment of the present invention.

Referring to FIG. 2, the automatic image generation apparatus (100) according to an embodiment of the present invention may include a pixel prediction unit (110), a confidence estimation unit (120), and a pixel generation unit (130).

The automatic image generation apparatus according to an embodiment of the present invention is described below with reference to FIG. 2.

The pixel prediction unit (110) can simultaneously generate pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image. As shown in FIG. 4(a), the pixel prediction unit (110) can set a target region (p1) in advance and simultaneously generate the pixel prediction values of the target pixels in the target region (p1). Then, once the pixel values for one row have been set, the next preset number of rows (for example, two) is set as the target region, and a pixel prediction value is generated for each target pixel in that region.

Specifically, when a pixel prediction value is generated for the j-th target pixel while pixel values have been set up to the i-th pixel (j > i, ∀ j, i ∈ [1, n×m]), the pixel prediction unit (110) can generate the pixel prediction value for the j-th target pixel by applying to the PixelCNN model the pixel values up to the i-th pixel together with the pre-prediction values for the (i+1)-th through (j-1)-th target pixels.

With the conventional PixelCNN model, computing the pixel inference value for the j-th target pixel requires first computing all pixel inference values up to the (j-1)-th target pixel. The pixel prediction unit (110), by contrast, applies pre-prediction values for the (i+1)-th through (j-1)-th pixels, so it can generate the pixel prediction value for the j-th target pixel in advance even when the pixel values up to the (j-1)-th pixel are not yet known. That is, because the pixel prediction unit (110) can apply the pixel values of the existing pixels and the pre-prediction values to the PixelCNN model in parallel, it can compute the pixel prediction values for a plurality of target pixels simultaneously.

Here, the pre-prediction values can be extracted using a U-net neural network. The U-net neural network has an autoencoder structure and, using the pixel values from the first pixel through the i-th pixel, provides approximations of the pixel values of the (i+1)-th through (j-1)-th target pixels. The pre-prediction values are IID (independent and identically distributed), so the pixel prediction value for the j-th target pixel and the pixel prediction value for the (j+1)-th target pixel can be computed at the same time. That is, because the pre-prediction values for the target pixels can be generated in advance, the pixel prediction unit (110) can apply the pixel values of the existing pixels and the pre-prediction values to the PixelCNN model in parallel, and can thereby compute the pixel prediction values for a plurality of target pixels simultaneously.

Specifically, the operation of the pixel prediction unit (110) can be expressed as Equation 2 below, which is obtained by applying the pre-prediction values to Equation 1.
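The equation image for Equation 2 is not reproduced in this text. A hedged reconstruction, assembled only from the terms this description gives (q, f_w, Z_j-1:i, and the conditional p_θ(x_l | X_≤i, z_{i+1}, ..., z_{l-1})), might read as follows. Writing q as a conditional over the target pixels X_j:i is an assumption, and the original typeset form may differ.

```latex
Z_{j-1:i} = \{z_{i+1}, \dots, z_{j-1}\} = f_w\left(X_{\le i}\right),
\qquad
q\left(X_{j:i} \mid X_{\le i}\right) = \prod_{l=i+1}^{j} p_\theta\left(x_l \mid X_{\le i}, z_{i+1}, \dots, z_{l-1}\right)
```

Because every factor conditions only on X_≤i and on pre-prediction values that are available up front, the factors can be evaluated in parallel rather than one after another.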

Here, q(x) is the approximated probability function of image X with the pre-prediction values applied. The pre-prediction values for the target pixels, Z_j-1:i = {z_{i+1}, ..., z_{j-1}}, can be defined as Z_j-1:i = f_w(X_≤i), and the pre-prediction values can be IID (independent and identically distributed) given X_≤i. Here, f_w(X_≤i) may correspond to the U-net neural network, and p_θ(x_l | X_≤i, z_{i+1}, ..., z_{l-1}) can be computed, as in the PixelCNN model, by filtering with masked convolution. q(x) can be obtained by learning a plurality of training images together with the PixelCNN model.

The confidence estimation unit (120) can generate a confidence for each pixel prediction value generated by the pixel prediction unit (110). Because the pixel prediction unit (110) computes its pixel prediction values by applying the pre-prediction values for the (i+1)-th through (j-1)-th target pixels in an AR (autoregressive) manner, the error contained in the pre-prediction values can grow progressively. That is, if an image is generated using only the values predicted by the pixel prediction unit (110), there is a risk that accumulated error produces an image different from the desired result. The confidence estimation unit (120) therefore computes how far each pixel prediction value generated by the pixel prediction unit (110) can be trusted and provides this as a quantitative value.

Specifically, the confidence can be calculated using Equation 3 below.
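The equation image for Equation 3 is not reproduced in this text. Based on the accompanying explanation that the confidence corresponds to the probability that the pixel prediction value equals the pixel inference value computed with the PixelCNN model, one hedged way to write it is shown below. The hat notation for the pixel inference value is an assumption introduced here for illustration, not the notation of the original publication.

```latex
f_k = \Pr\left(x_k = \hat{x}_k\right), \qquad k = i+1, \dots, j
```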

Here, f_k is the confidence of the pixel prediction value x_k of the k-th target pixel, for k = i+1, ..., j, and the pixel inference value is the value computed for that pixel with the PixelCNN model [inline formula images for the pixel inference value not reproduced]. That is, the confidence (f_k) corresponds to the probability that the pixel prediction value (x_k) is equal to the pixel inference value computed with the PixelCNN model. The higher the probability that the pixel prediction value equals the pixel inference value computed with the PixelCNN model, the higher the confidence in the pixel prediction value; the lower that probability, the lower the confidence.

The confidence estimation unit (120) can learn the difference between the pixel inference values generated by the PixelCNN model and the pixel prediction values generated by the pixel prediction unit (110) using a machine learning technique such as deep learning, and can compute the confidence according to the learned model. In some embodiments, sample images are first generated with the PixelCNN model, and the difference between the pixel inference value of each pixel in a generated sample image and the pixel prediction value generated by the pixel prediction unit (110) is then learned.

In addition, in some embodiments the confidence can be expressed through binary classification. That is, when the confidence is greater than or equal to the set value (attention threshold), the pixel prediction value is judged to be trustworthy and the confidence is reset to 1; when the confidence is less than the set value, the pixel prediction value is judged to be untrustworthy and the confidence is reset to 0.
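The binary classification of confidence values can be sketched in a few lines. The function name `binarize_confidence` and the list-based representation are assumptions made for illustration; only the comparison against the set value comes from the description.

```python
def binarize_confidence(confidences, threshold):
    """Return 1 where the confidence is at least the threshold
    (prediction judged trustworthy) and 0 otherwise."""
    return [1 if c >= threshold else 0 for c in confidences]
```

A confidence exactly equal to the set value maps to 1, matching the "greater than or equal to" condition in the text.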

The pixel generation unit (130) can set the pixel value of a target pixel according to the confidence of the pixel prediction value for that pixel. That is, if the confidence of the target pixel is greater than or equal to the set value, the pixel prediction value is set as the pixel value of the target pixel; if the confidence is less than the set value, the pixel inference value generated with the PixelCNN model is set as the pixel value of the target pixel.

As shown in FIG. 4, once a target region (P1) is set, the pixel prediction unit (110) and the confidence estimation unit (120) can generate the pixel prediction values and confidences corresponding to the target region (P1). The confidences can be displayed as a binarized image relative to the set value: white (1) where the confidence is greater than or equal to the set value, and black (0) where it is less than the set value.

Then, as shown in FIG. 5(a), pixel values can be set sequentially, starting with the target pixels in region a1. In FIG. 5(a), the confidence corresponding to region a1 is shown in white, so the pixel prediction values there can be trusted. The pixel values for region a1 can therefore be set according to the pixel prediction values.

In addition, the pixel values of the target pixels included in region a2, which follows region a1, may be set as shown in FIG. 5(b). The reliability corresponding to region a2 is shown in black, so the pixel prediction values cannot be trusted. Accordingly, the pixel prediction values corresponding to region a2 may not be applied to region a2. Instead, pixel inference values may be computed using the pixel CNN model, and the computed pixel inference values may be set as the pixel values of region a2. In this case, the pixel values of region a2 may be set as shown in FIG. 6(a).

Meanwhile, as in FIG. 6(a), when the pixel value of a target pixel is set to a pixel inference value, the pixel prediction values and reliabilities for the remaining target region p2 may be recomputed and updated. That is, by updating the pixel prediction values and reliabilities to reflect the pixel value that was set to the pixel inference value, errors contained in the previous pixel prediction values can be removed, and more accurate pixel prediction values and reliabilities can be generated.

Additionally, the pixel generator 130 may generate the first k pixels of the image in advance. That is, for the first k pixels, pixel inference values may be computed using only the pixel CNN model, without computing pixel prediction values and the like, and the image may be generated from them. In some cases, random pixel values may instead be assigned to the first k pixels. For example, during image generation, the pixel values up to the first three columns may be set to pixel inference values from the pixel CNN model or to random values.

The pixel generator 130 may repeat the above-described processes until image generation is complete.

Meanwhile, as shown in FIG. 3, the apparatus 100 for automatically generating an image according to an embodiment of the present invention may include physical components such as a processor 10 and a memory 40, and the memory 40 may contain one or more modules configured to be executed by the processor 10. Specifically, the one or more modules may include a pixel prediction module, a reliability estimation module, and a pixel generation module.

The processor 10 may execute various software programs and instruction sets stored in the memory 40 to perform various functions and process data. The peripheral interface unit 30 may connect the input/output peripheral devices of the apparatus 100 to the processor 10 and the memory 40, and the memory controller 20 may control memory access when the processor 10 or another component of the apparatus 100 accesses the memory 40. Depending on the embodiment, the processor 10, the memory controller 20, and the peripheral interface unit 30 may be implemented on a single chip or as separate chips.

The memory 40 may include high-speed random access memory, one or more magnetic disk storage devices, non-volatile memory such as flash memory devices, and the like. In addition, the memory 40 may further include a storage device located remotely from the processor 10, or network-attached storage accessed through a communication network such as the Internet.

The display unit 50 may be a component that displays the generated image so that the user can check it visually. For example, the display unit 50 may present the image using a liquid crystal display, a thin film transistor-liquid crystal display, an organic light-emitting diode, a flexible display, a 3D display, an electrophoretic display, or the like. However, the present invention is not limited thereto, and the display unit may be implemented in various other ways.

The input unit 60 receives input from a user; a keyboard, a keypad, a mouse, a touch pen, a touch pad, a touch panel, a jog wheel, a jog switch, and the like may correspond to the input unit 60.

Meanwhile, as shown in FIG. 3, the apparatus 100 for automatically generating an image according to an embodiment of the present invention may include, in the memory 40, an operating system as well as a pixel prediction module, a reliability estimation module, and a pixel generation module corresponding to application programs. Here, each module is a set of instructions for performing the functions described above and may be stored in the memory 40.

Accordingly, in the terminal device 100 according to an embodiment of the present invention, the processor 10 may access the memory 40 and execute the instructions corresponding to each module. Since the pixel prediction module, the reliability estimation module, and the pixel generation module correspond to the pixel prediction unit, the reliability estimation unit, and the pixel generator described above, respectively, a detailed description is omitted here.

FIG. 7 is an example showing images generated using the Skim-PixelCNN according to an embodiment of the present invention. In FIG. 7, the left column shows (in white) the regions to which pixel prediction values were applied, and the center column shows the reliability of the image, with higher reliability shown in red and lower reliability in blue. The right column is the actually generated image. FIG. 7(a) is a case in which the entire image was generated using pixel prediction values, FIG. 7(f) is a case in which the entire image was generated from pixel inference values using the pixel CNN model, and the set value for applying the pixel CNN model was set higher going from FIG. 7(a) to FIG. 7(f). Referring to FIG. 7, for portrait images the reliability is relatively low in the regions that characterize the person, such as the facial features, and the image quality does not differ greatly from the case of generating the image using only the pixel CNN model. In addition, as shown in FIG. 8, the higher the proportion of pixels set from pixel prediction values, the faster the image generation.

FIG. 9 is a flowchart showing a method for automatically generating an image according to an embodiment of the present invention.

Referring to FIG. 9, the method for automatically generating an image according to an embodiment of the present invention may include an initial generation step (S10), a prediction step (S20), a reliability generation step (S30), a skimming step (S40), a draw step (S50), and an update step (S60).

Hereinafter, the method for automatically generating an image according to an embodiment of the present invention will be described with reference to FIG. 9.

In the initial generation step (S10), the pixel generator may generate the first k pixels of the image to be generated, using pixel inference values extracted from the pixel CNN model. That is, for the first k pixels, pixel inference values may be computed using only the pixel CNN model, without computing pixel prediction values and the like, and the image may be generated from them. In some cases, random pixel values may instead be assigned to the first k pixels. For example, during image generation, the pixel values up to the first three columns may be set to pixel inference values from the pixel CNN model or to random values.
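
The initial generation step might be sketched as follows. This is only an illustration under stated assumptions: `slow_model` is a hypothetical stand-in for the pixel CNN model (any callable that maps the list of earlier pixel values to the next value), pixels are filled in raster order, and the random fallback draws 8-bit values; none of these details are fixed by the source.

```python
import random


def initial_generation(n_rows, n_cols, k, slow_model=None, seed=0):
    """Fill the first k pixels (raster order) of an n_rows x n_cols image.

    If slow_model is given, each pixel is inferred from all earlier pixels;
    otherwise a random 8-bit value is assigned. Unset pixels stay None.
    """
    if k > n_rows * n_cols:
        raise ValueError("k exceeds the number of pixels in the image")
    rng = random.Random(seed)
    image = [[None] * n_cols for _ in range(n_rows)]
    for index in range(k):
        row, col = divmod(index, n_cols)
        if slow_model is not None:
            # Earlier pixels in raster order form the context of the model.
            context = [image[i // n_cols][i % n_cols] for i in range(index)]
            image[row][col] = slow_model(context)
        else:
            image[row][col] = rng.randrange(256)
    return image


# Hypothetical stand-in model: returns the number of pixels generated so far.
seeded_image = initial_generation(4, 4, k=6, slow_model=lambda ctx: len(ctx))
random_image = initial_generation(2, 3, k=4)
```

Only the first k positions are filled; the remaining pixels are left for the prediction, skimming, and draw steps.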

In the prediction step (S20), the pixel prediction unit may simultaneously generate pixel prediction values for a plurality of target pixels to be generated, using the pixel values of existing pixels already generated in the image. Here, the pixel prediction unit may set a target region in advance and simultaneously generate the pixel prediction values of the target pixels included in the target region. Thereafter, when the pixel values for one row have been set, the next predetermined number of rows (for example, two) may be set as the target region, and a pixel prediction value may be generated for each target pixel in that target region.
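
Selecting the next target region can be sketched as below. The function name `next_target_region`, the zero-based row indices, and the clamping at the bottom edge of the image are assumptions added for illustration.

```python
def next_target_region(completed_row, rows_per_region, n_rows, n_cols):
    """Return the (row, col) coordinates of the next target region.

    The region covers the rows_per_region rows that follow completed_row
    (zero-based), clamped to the image height, listed in raster order.
    """
    start = completed_row + 1
    stop = min(start + rows_per_region, n_rows)
    return [(row, col) for row in range(start, stop) for col in range(n_cols)]


# After row 2 of a 6 x 3 image is complete, the next two rows are targeted.
region = next_target_region(completed_row=2, rows_per_region=2, n_rows=6, n_cols=3)
```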

Specifically, when generating a pixel prediction value for the j-th target pixel in a state where pixel values have been set up to the i-th pixel (j > i, ∀j, i ∈ [1, n×m]), the pixel prediction unit may generate the pixel prediction value for the j-th target pixel by applying, to the pixel CNN model, the pixel values up to the i-th pixel together with the pre-prediction values from the (i+1)-th target pixel to the (j-1)-th target pixel. Here, the pre-prediction values may be extracted using a U-net neural network and may have the IID (independent and identically distributed) property. That is, since the pre-prediction values for the respective target pixels can be generated in advance, the pixel prediction unit can apply the pixel values of the existing pixels and the pre-prediction values to the pixel CNN model in parallel, which makes it possible to compute the pixel prediction values for the plurality of target pixels simultaneously.
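
The reason the pre-prediction values allow simultaneous computation can be shown with a small sketch. Everything here is a toy under stated assumptions: `pixel_cnn` is a hypothetical stand-in for the pixel CNN model (a simple arithmetic function, not a neural network), `pre_predictions` stands in for the U-net output, and a thread pool is used only to make the independence of the per-pixel computations visible.

```python
from concurrent.futures import ThreadPoolExecutor


def pixel_cnn(context):
    """Hypothetical stand-in for the pixel CNN model: context -> next value."""
    return (sum(context) + len(context)) % 256


def predict_targets(known, pre_predictions):
    """Compute prediction values for pixels i+1 .. i+len(pre_predictions).

    known holds the values already set for pixels 1..i. For the j-th pixel,
    the context is known plus the pre-prediction values of pixels i+1..j-1,
    so no prediction depends on another prediction.
    """
    def predict_one(offset):
        context = known + pre_predictions[:offset]
        return pixel_cnn(context)

    # Each target pixel is independent, so the calls can run concurrently;
    # map() returns the results in target-pixel order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(predict_one, range(len(pre_predictions))))


known = [10, 20]                 # pixel values set up to the i-th pixel
pre_predictions = [30, 40, 50]   # assumed pre-prediction values
predictions = predict_targets(known, pre_predictions)
```

In a purely sequential scheme, the context of the j-th pixel would need the final values of pixels i+1 to j-1; substituting pre-prediction values removes that dependency.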

In the reliability generation step (S30), the reliability estimation unit may generate a reliability (confidence) for the pixel prediction value of each target pixel. That is, the reliability estimation unit may compute how far the pixel prediction value generated by the pixel prediction unit can be used and provide it as a quantitative value. Here, the reliability may be the probability that the pixel prediction value for a target pixel matches the pixel inference value for that target pixel, and the reliability estimation unit may compute the reliability by learning the difference between pixel inference values and pixel prediction values with a machine learning technique such as deep learning. Specifically, after a sample image is generated using the pixel CNN model, training may proceed by comparing the pixel inference value of each pixel in the generated sample image with the pixel prediction value generated by the pixel prediction unit.
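
One way to turn this comparison into training targets is sketched below. It assumes an exact-match label (1 when the prediction value equals the inference value, 0 otherwise), which is an illustrative choice; the source only states that the difference between the two values is learned. The sample values are made up.

```python
def match_labels(inference_values, prediction_values):
    """Label each pixel 1 if its prediction equals its inference value, else 0.

    Such labels could serve as targets for a binary classifier whose output
    approximates the probability that a prediction matches the inference.
    """
    if len(inference_values) != len(prediction_values):
        raise ValueError("both sequences must cover the same pixels")
    return [1 if p == q else 0
            for p, q in zip(prediction_values, inference_values)]


# Hypothetical values for four pixels of a sample image.
inference_values = [12, 200, 37, 90]    # from the pixel CNN model
prediction_values = [12, 198, 37, 91]   # from the pixel prediction unit
labels = match_labels(inference_values, prediction_values)
match_rate = sum(labels) / len(labels)  # empirical fraction of matches
```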

In the skimming step (S40), the reliability of a target pixel may be compared with the set value, and if the reliability is greater than or equal to the set value, the pixel generator may set the pixel prediction value as the pixel value of the target pixel. That is, the reliability may be checked sequentially for each target pixel in the target region, and for a target pixel whose reliability is greater than or equal to the set value, the pixel value may be set to the pixel prediction value. In this case, no pixel inference value is computed with the pixel CNN model, so the pixel value can be set quickly. On the other hand, if the reliability of the target pixel is less than the set value, the method may proceed to the draw step (S50).

In the draw step (S50), if the reliability of the target pixel is less than the set value, the pixel generator may generate a pixel inference value of the target pixel using the pixel CNN model (Pixel Convolution Neural Network) and set the pixel inference value as the pixel value of the target pixel. That is, because the reliability of the pixel prediction value of the target pixel is low, the pixel value may be set to the pixel inference value from the pixel CNN model instead of the pixel prediction value. Computing the pixel inference value with the pixel CNN model may take relatively long, but it allows more accurate image generation.

Meanwhile, after the pixel value of a target pixel has been set to a pixel inference value, an update step (S60) may be performed in which the pixel prediction values and reliabilities for the remaining target region are recomputed and updated. That is, by updating the pixel prediction values and reliabilities to reflect the pixel value that was set to the pixel inference value, errors contained in the previous pixel prediction values can be removed, and more accurate pixel prediction values and reliabilities can be generated.

Thereafter, the skimming step (S40), the draw step (S50), and the update step (S60) may be repeated until image generation is complete, thereby generating the image.
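
Steps S10 to S60 can be put together in a minimal control-flow sketch. It is an illustration, not the patented implementation: the image is flattened to a one-dimensional raster sequence, and `slow_model`, `predictor`, and `estimator` are hypothetical toy callables standing in for the pixel CNN model, the pixel prediction unit, and the reliability estimation unit.

```python
def generate(n_pixels, k, region_size, threshold, slow_model, predictor, estimator):
    """Generate n_pixels values; return them and the number of draw steps."""
    pixels, drawn = [], 0
    for _ in range(k):                                   # S10: initial generation
        pixels.append(slow_model(pixels))
    while len(pixels) < n_pixels:
        region_end = min(len(pixels) + region_size, n_pixels)
        preds = predictor(pixels, region_end - len(pixels))   # S20: prediction
        confs = estimator(pixels, preds)                      # S30: reliability
        pos = 0
        while len(pixels) < region_end:
            if confs[pos] >= threshold:                  # S40: skimming
                pixels.append(preds[pos])
                pos += 1
            else:                                        # S50: draw
                pixels.append(slow_model(pixels))
                drawn += 1
                # S60: update predictions and reliabilities for the remainder.
                preds = predictor(pixels, region_end - len(pixels))
                confs = estimator(pixels, preds)
                pos = 0
    return pixels, drawn


def slow_model(context):
    """Toy stand-in for the pixel CNN model."""
    return (3 * len(context) + 1) % 256


def predictor(context, count):
    """Toy predictor: agrees with slow_model except at positions divisible by 4."""
    values = []
    for position in range(len(context), len(context) + count):
        value = (3 * position + 1) % 256
        values.append((value + 50) % 256 if position % 4 == 0 else value)
    return values


def estimator(context, preds):
    """Toy estimator: low reliability exactly where the toy predictor is wrong."""
    return [0.2 if (len(context) + offset) % 4 == 0 else 0.9
            for offset in range(len(preds))]


image, draw_count = generate(12, 2, 5, 0.5, slow_model, predictor, estimator)
```

In this toy setup the draw step is invoked only at the two low-reliability positions, so the result equals what the slow model alone would produce while calling it far fewer times.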

The present invention described above can be implemented as computer-readable code on a medium on which a program is recorded. The computer-readable medium may continuously store a computer-executable program, or may temporarily store it for execution or download. In addition, the medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software. Accordingly, the above detailed description should not be construed as restrictive in any respect and should be considered illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

The present invention is not limited by the embodiments described above or by the accompanying drawings. It will be apparent to those of ordinary skill in the art to which the present invention pertains that the components according to the present invention may be substituted, modified, and changed without departing from the technical spirit of the present invention.

100: apparatus for automatically generating an image 110: pixel prediction unit
120: reliability estimation unit 130: pixel generator
S10: initial generation step S20: prediction step
S30: reliability generation step S40: skimming step
S50: draw step S60: update step

Claims

A method for automatically generating an image using a Skim-PixelCNN (Skim-Pixel Convolution Neural Network), the method comprising:
a prediction step in which a pixel prediction unit generates pixel prediction values of a plurality of target pixels to be generated, using pixel values of existing pixels already generated in an image and pre-prediction values for respective pixels in the image whose pixel values have not been determined;
a reliability generation step in which a reliability estimation unit generates a reliability (confidence) for the pixel prediction value of each target pixel;
a skimming step in which, if the reliability of the target pixel is greater than or equal to a set value, a pixel generator sets the pixel prediction value as the pixel value of the target pixel; and
a draw step in which, if the reliability of the target pixel is less than the set value, the pixel generator generates a pixel inference value of the target pixel by applying the pixel values up to the pixel immediately preceding the target pixel to a pixel CNN model (Pixel Convolution Neural Network), and sets the pixel inference value as the pixel value of the target pixel.

The method of claim 1, wherein the image
is generated to include a plurality of pixels arranged in n rows and m columns,
the rows being generated sequentially from top to bottom, and the pixel values of the pixels in each row being generated from left to right.

The method of claim 1, wherein, in the prediction step,
when the pixel values for one row have been set, pixel prediction values are generated simultaneously for a target region including a predetermined number of rows.

The method of claim 3,
further comprising an update step in which, when the pixel value of the target pixel is set to the pixel inference value, the pixel prediction values and reliabilities of the remaining target region after the target pixel are updated to reflect the pixel value of the target pixel.

A method for automatically generating an image using a Skim-PixelCNN (Skim-Pixel Convolution Neural Network), the method comprising:
a prediction step in which a pixel prediction unit simultaneously generates pixel prediction values of a plurality of target pixels to be generated, using pixel values of existing pixels already generated in an image;
a reliability generation step in which a reliability estimation unit generates a reliability (confidence) for the pixel prediction value of each target pixel;
a skimming step in which, if the reliability of the target pixel is greater than or equal to a set value, a pixel generator sets the pixel prediction value as the pixel value of the target pixel; and
a draw step in which, if the reliability of the target pixel is less than the set value, the pixel generator generates a pixel inference value of the target pixel using a pixel CNN model (Pixel Convolution Neural Network), and sets the pixel inference value as the pixel value of the target pixel,
wherein the image
is generated to include a plurality of pixels arranged in n rows and m columns,
the rows being generated sequentially from top to bottom, and the pixel values of the pixels in each row being generated from left to right, and
wherein, in the prediction step,
when a pixel prediction value for the j-th target pixel is generated in a state where pixel values have been set up to the i-th pixel (j > i, ∀j, i ∈ [1, n×m]), the pixel prediction value for the j-th target pixel is generated by applying, to the pixel CNN model, the pixel values up to the i-th pixel and the pre-prediction values from the (i+1)-th target pixel to the (j-1)-th target pixel.

The method of claim 5, wherein, in the prediction step,
the pre-prediction values from the (i+1)-th target pixel to the (j-1)-th target pixel are each extracted using a U-net neural network.

The method of claim 6, wherein, in the prediction step,
the pixel values of the existing pixels and the pre-prediction values are applied to the pixel CNN model in parallel, and the respective pixel prediction values for the plurality of target pixels are computed simultaneously.

A method for automatically generating an image using a Skim-PixelCNN (Skim-Pixel Convolution Neural Network), the method comprising:
a prediction step in which a pixel prediction unit simultaneously generates pixel prediction values of a plurality of target pixels to be generated, using pixel values of existing pixels already generated in an image;
a reliability generation step in which a reliability estimation unit generates a reliability (confidence) for the pixel prediction value of each target pixel;
a skimming step in which, if the reliability of the target pixel is greater than or equal to a set value, a pixel generator sets the pixel prediction value as the pixel value of the target pixel; and
a draw step in which, if the reliability of the target pixel is less than the set value, the pixel generator generates a pixel inference value of the target pixel using a pixel CNN model (Pixel Convolution Neural Network), and sets the pixel inference value as the pixel value of the target pixel,
wherein the reliability estimation unit
generates the reliability by learning, using a sample image generated with the pixel CNN model, the difference between the pixel inference value of each pixel included in the sample image and the pixel prediction value generated by the pixel prediction unit.

The method of claim 8, wherein the reliability estimation unit
computes the probability that the pixel prediction value for the target pixel matches the pixel inference value for the target pixel, and provides the probability as the reliability for the target pixel.

The method of claim 8, wherein the pixel prediction unit
is trained together with the pixel CNN model using a plurality of training images.

The method of claim 1,
further comprising an initial generation step in which the pixel generator generates the first k pixels of the image using pixel inference values extracted from the pixel CNN model.

A computer program stored in a medium in order to execute, in combination with hardware, the method for automatically generating an image of any one of claims 1 to 11.

An apparatus for automatically generating an image using a Skim-PixelCNN (Skim-Pixel Convolution Neural Network), the apparatus comprising:
a pixel prediction unit that generates pixel prediction values of a plurality of target pixels to be generated, using pixel values of existing pixels already generated in an image and pre-prediction values for respective pixels in the image whose pixel values have not been determined;
a reliability estimation unit that generates a reliability (confidence) for the pixel prediction value of each target pixel; and
a pixel generator that, if the reliability of the target pixel is greater than or equal to a set value, sets the pixel prediction value as the pixel value of the target pixel, and, if the reliability of the target pixel is less than the set value, generates a pixel inference value of the target pixel by applying the pixel values up to the pixel immediately preceding the target pixel to a pixel CNN model (Pixel Convolution Neural Network) and sets the pixel inference value as the pixel value of the target pixel.

An apparatus for automatically generating an image using a Skim-PixelCNN (Skim-Pixel Convolution Neural Network), the apparatus comprising:
a processor; and
a memory coupled to the processor,
wherein the memory includes one or more modules configured to be executed by the processor, and
the one or more modules include instructions for:
generating pixel prediction values of a plurality of target pixels to be generated, using pixel values of existing pixels already generated in an image and pre-prediction values for respective pixels in the image whose pixel values have not been determined;
generating a reliability (confidence) for the pixel prediction value of each target pixel;
if the reliability of the target pixel is greater than or equal to a set value, setting the pixel prediction value as the pixel value of the target pixel; and
if the reliability of the target pixel is less than the set value, generating a pixel inference value of the target pixel by applying the pixel values up to the pixel immediately preceding the target pixel to a pixel CNN model (Pixel Convolution Neural Network), and setting the pixel inference value as the pixel value of the target pixel.