KR20180108288A - Apparatus and method for compressing image

Info

Publication number: KR20180108288A
Application number: KR1020170037840A
Authority: KR
Inventors: 최형원; 성낙호; 신진우
Original assignee: 주식회사 엔씨소프트; 한국과학기술원
Priority date: 2017-03-24
Filing date: 2017-03-24
Publication date: 2018-10-04
Also published as: KR101990092B1

Abstract

The present invention discloses an apparatus and a method for compressing an image. According to an embodiment of the present invention, the apparatus comprises: an input unit which receives an image; a down-sampling unit which down-samples the input image; a first compression unit which generates a first compressed image from the down-sampled image by using at least one first autoencoder formed of a convolutional neural network; an up-sampling unit which up-samples the first compressed image; a subtraction unit which subtracts the up-sampled image from the input image; a second compression unit which generates a second compressed image from the subtracted image by using at least one second autoencoder formed of a convolutional neural network; and an addition unit which generates a result image by adding the up-sampled image and the second compressed image. The apparatus and the method can thereby compress an image by using convolutional neural networks.

Description

APPARATUS AND METHOD FOR COMPRESSING IMAGE

Embodiments of the present invention relate to image compression technology.

A convolutional neural network is a feed-forward neural network applied mainly to image processing, and it is being studied as one approach to image compression (in particular, lossy image compression). However, conventional image compression techniques based on convolutional neural networks have not performed as well as codecs such as JPEG (Joint Photographic Experts Group) and JPEG 2000.

To address this problem, techniques that improve the performance of convolutional-neural-network-based image compression by dividing the complexity of the task have recently been studied.

Korean Registered Patent No. 10-0248072 (registered on December 15, 1999)

Embodiments of the present invention are intended to provide an apparatus and a method for compressing an image using a convolutional neural network.

An image compression apparatus according to an embodiment of the present invention includes: an input unit that receives an image; a down-sampling unit that down-samples the input image; a first compression unit that generates a first compressed image from the down-sampled image using at least one first autoencoder composed of a convolutional neural network; an up-sampling unit that up-samples the first compressed image; a subtraction unit that subtracts the up-sampled image from the input image; a second compression unit that generates a second compressed image from the subtracted image using at least one second autoencoder composed of a convolutional neural network; and an addition unit that generates a result image by adding the up-sampled image and the second compressed image.

The second compression unit may repeatedly compress the subtracted image using a plurality of second autoencoders.

The first autoencoder and the second autoencoder may be trained to minimize the mean squared error (MSE) between the input image and the result image.

The second autoencoder may be trained using the parameters of the trained first autoencoder.

The input image may include a plurality of channels, and the first compression unit may compress each channel of the down-sampled image using a plurality of first autoencoders that share at least one neural network layer.

The input image may include a plurality of channels, and the second compression unit may compress each channel of the subtracted image using a plurality of second autoencoders that share at least one neural network layer.

An image compression method according to an embodiment of the present invention includes: receiving an image; down-sampling the input image; generating a first compressed image from the down-sampled image using at least one first autoencoder composed of a convolutional neural network; up-sampling the first compressed image; subtracting the up-sampled image from the input image; generating a second compressed image from the subtracted image using at least one second autoencoder composed of a convolutional neural network; and generating a result image by adding the up-sampled image and the second compressed image.

The generating of the second compressed image may repeatedly compress the subtracted image using a plurality of second autoencoders.

The input image may include a plurality of channels, and the generating of the first compressed image may compress each channel of the down-sampled image using a plurality of first autoencoders that share at least one neural network layer.

The input image may include a plurality of channels, and the generating of the second compressed image may compress each channel of the subtracted image using a plurality of second autoencoders that share at least one neural network layer.

According to embodiments of the present invention, the overall performance of the image compression apparatus can be improved by dividing the complexity of the task.

In addition, according to embodiments of the present invention, when an image has a plurality of channels, the autoencoders used for the respective channels share some information with one another, so that images having a plurality of channels can be compressed efficiently.

FIG. 1 is a block diagram of an image compression apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram of an autoencoder according to an embodiment of the present invention.
FIG. 3 is a block diagram of a residual block included in the encoder of an autoencoder according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of an image compression process according to an embodiment of the present invention.
FIG. 5 is a diagram showing the configuration of autoencoders according to a further embodiment of the present invention.
FIG. 6 is a flowchart illustrating an image compression method according to an embodiment of the present invention.

Hereinafter, specific embodiments of the present invention are described with reference to the drawings. The following detailed description is provided to aid a comprehensive understanding of the methods, apparatuses, and/or systems described herein. However, it is merely an example, and the present invention is not limited thereto.

In describing the embodiments of the present invention, a detailed description of known technology related to the invention is omitted where it is judged that it could unnecessarily obscure the gist of the invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of a user or operator. Their definitions should therefore be based on the content of this specification as a whole. The terms used in the detailed description serve only to describe embodiments of the invention and should in no way be limiting. Unless clearly used otherwise, a singular expression includes the plural. In this description, expressions such as "including" or "having" indicate certain features, numbers, steps, operations, elements, or some part or combination thereof, and should not be construed to exclude the presence or possibility of one or more other features, numbers, steps, operations, elements, or some part or combination thereof beyond those described.

FIG. 1 is a block diagram of an image compression apparatus according to an embodiment of the present invention.

Referring to FIG. 1, an image compression apparatus 100 according to an embodiment of the present invention includes an input unit 110, a down-sampling unit 120, a first compression unit 130, an up-sampling unit 140, a subtraction unit 150, a second compression unit 160, and an addition unit 170.

The image compression apparatus 100 may be an apparatus for compressing an input image using a convolutional neural network (CNN). Specifically, according to an embodiment of the present invention, the image compression apparatus 100 may compress an input image using at least one compressive autoencoder composed of a convolutional neural network.

The input unit 110 receives an image.

The image input to the input unit 110 may be, for example, a grayscale image, a 3-channel RGB image, a normal map, or a specular map. However, the input image is not limited to these examples and may be any of various types of images with various sizes and various numbers of channels.

The down-sampling unit 120 down-samples the image input through the input unit 110.

Specifically, the down-sampling unit 120 may reduce the input image by a ratio smaller than 1 through down-sampling.

The down-sampling unit 120 may down-sample the input image using any of various known down-sampling techniques.
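The description leaves the down-sampling technique open. As a minimal sketch of one possible choice, the following assumes a fixed ratio of 1/2 and a 2x2 box (averaging) filter; the function name `downsample_2x` and the sample values are hypothetical and are not taken from the specification.

```python
def downsample_2x(image):
    """Halve height and width by averaging each 2x2 block (box filter).

    `image` is a single channel given as a list of rows. The 1/2 ratio and
    the box filter are assumptions made for this sketch only.
    """
    h, w = len(image), len(image[0])
    assert h % 2 == 0 and w % 2 == 0, "sketch assumes even dimensions"
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]
small = downsample_2x(image)  # 4x4 input becomes 2x2
```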

The first compression unit 130 generates a first compressed image from the image down-sampled by the down-sampling unit 120, using at least one first autoencoder composed of a convolutional neural network.

The first autoencoder may include, for example, an encoder, a binarizer, and a decoder, each composed of a convolutional neural network. The method by which the first compression unit 130 generates a compressed image is described in detail with reference to FIGS. 2 and 3.

The up-sampling unit 140 up-samples the first compressed image generated by the first compression unit 130.

Specifically, the up-sampling unit 140 may enlarge the first compressed image by a ratio larger than 1 through up-sampling. The up-sampling unit 140 may enlarge the first compressed image so that its size equals the size of the image input through the input unit 110.

The up-sampling unit 140 may up-sample the first compressed image using any of various known up-sampling techniques, such as bilinear or bicubic interpolation.
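Bilinear interpolation is one of the techniques named above. The following is a minimal single-channel sketch that resizes to an explicit target size, so the output can be made to match the original input; the corner-aligned coordinate mapping and the name `upsample_bilinear` are assumptions of this sketch, not details given in the specification.

```python
def upsample_bilinear(image, out_h, out_w):
    """Resize a single-channel image (list of rows) to out_h x out_w.

    Uses corner-aligned bilinear interpolation: the first and last output
    samples coincide with the first and last input samples.
    """
    in_h, in_w = len(image), len(image[0])
    out = []
    for y in range(out_h):
        # Map the output row back to a fractional input row.
        fy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            # Interpolate horizontally on the two rows, then vertically.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


enlarged = upsample_bilinear([[0, 2], [4, 6]], 3, 3)
```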

The subtraction unit 150 subtracts the image up-sampled by the up-sampling unit 140 from the image input through the input unit 110.

For example, the subtraction unit 150 may subtract the pixel values of the up-sampled image from the pixel values of the image input through the input unit 110.

The second compression unit 160 generates a second compressed image from the image produced by the subtraction unit 150, using at least one second autoencoder composed of a convolutional neural network.

The second autoencoder may include, for example, an encoder, a binarizer, and a decoder, each composed of a convolutional neural network. The method by which the first compression unit 130 generates a compressed image is described in detail with reference to FIGS. 2 and 3.

In addition, the second compression unit 160 may repeatedly compress the subtracted image using a plurality of second autoencoders.

For example, the second compression unit 160 may compress the subtracted image using a 2-1 autoencoder, and then compress the image produced by the 2-1 autoencoder once more using a 2-2 autoencoder.
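The chaining described here, where the output of the 2-1 autoencoder is fed to the 2-2 autoencoder, can be sketched as a simple composition of stages. In the sketch below each stage is a hypothetical stand-in (a quantizer that floors to a multiple of a step), not a trained convolutional autoencoder; `make_stage` and `compress_repeatedly` are names invented for illustration.

```python
from functools import reduce


def make_stage(step):
    """Return a stand-in lossy stage that floors pixels to multiples of `step`."""
    def stage(image):
        return [[step * (v // step) for v in row] for row in image]
    return stage


def compress_repeatedly(image, stages):
    """Feed the output of each stage into the next, in order."""
    return reduce(lambda current, stage: stage(current), stages, image)


stage_2_1 = make_stage(2)  # stands in for the 2-1 autoencoder
stage_2_2 = make_stage(8)  # stands in for the 2-2 autoencoder
subtracted = [[-7, 5], [13, 22]]
second_compressed = compress_repeatedly(subtracted, [stage_2_1, stage_2_2])
```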

The addition unit 170 generates a result image by adding the image up-sampled by the up-sampling unit 140 and the second compressed image generated by the second compression unit 160.

Specifically, the addition unit 170 may generate the result image by adding each pixel value of the up-sampled image to the corresponding pixel value of the second compressed image.

In an embodiment, the input unit 110, down-sampling unit 120, first compression unit 130, up-sampling unit 140, subtraction unit 150, second compression unit 160, and addition unit 170 shown in FIG. 1 may be implemented on one or more computing devices that include one or more processors and a computer-readable recording medium connected to the processors. The computer-readable recording medium may be inside or outside the processor and may be connected to the processor by various well-known means. A processor in a computing device may cause that computing device to operate according to the exemplary embodiments described herein. For example, the processor may execute instructions stored in the computer-readable recording medium, and the instructions, when executed by the processor, may be configured to cause the computing device to perform operations according to the exemplary embodiments described herein.

FIG. 2 is a block diagram of an autoencoder according to an embodiment of the present invention, and FIG. 3 is a block diagram of a residual block included in the encoder of an autoencoder according to an embodiment of the present invention.

The autoencoder shown in FIG. 2 may be, for example, a first autoencoder included in the first compression unit 130 or a second autoencoder included in the second compression unit 160.

Referring to FIGS. 2 and 3, an autoencoder according to an embodiment of the present invention may include an encoder, a binarizer, and a decoder.

Specifically, the encoder and the decoder may each be composed of, for example, a neural network including one convolution layer (Conv) and three residual blocks (Residual block #1, #2, #3).

In the examples shown in FIGS. 2 and 3, the notation "C x K x K" on each convolution layer denotes a K x K convolution with C filters, and "Stride" denotes the spacing between the positions at which the filter is applied. For example, the notation "64x5x5 Stride 2" on the encoder's convolution layer denotes a 5x5 convolution with 64 filters and a stride of 2.
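To make the notation concrete, the following sketch computes the output shape of a convolution layer from the standard relation out = floor((in + 2p - K) / s) + 1. The padding p is not stated in the description, so the value 2 used below, like the 32x32 input size and the name `conv_output_shape`, is an assumption chosen only for illustration.

```python
def conv_output_shape(height, width, filters, kernel, stride, padding):
    """Output shape (channels, height, width) of a K x K convolution.

    Applies out = floor((in + 2 * padding - kernel) / stride) + 1 to each
    spatial dimension; the channel count equals the number of filters.
    """
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    return (filters, out_h, out_w)


# "64x5x5 Stride 2" applied to a 32x32 input, assuming padding of 2.
shape = conv_output_shape(32, 32, filters=64, kernel=5, stride=2, padding=2)
```

Under these assumptions a stride of 2 halves each spatial dimension while the 64 filters set the number of output channels.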

The residual blocks included in the encoder may each include, for example, three convolution layers (Conv) and one summation layer (sum).

The binarizer may perform binarization using a neural network that includes one convolution layer (Conv). Specifically, the binarizer may, for example, use the tanh function as the activation function of its output layer and perform binarization (b(x)) as in Equation 1 below.

[Equation 1]

(The expression for Equation 1 and the condition that follows it are not reproduced in this text.)
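Because the expression for Equation 1 does not survive in this text, the following is only a sketch under a stated assumption: the tanh activation squashes each value into (-1, 1), and the code is then taken deterministically as the sign, giving values in {-1, +1}. The actual b(x) in the specification may differ (for example, it may be stochastic); `binarize` and `to_bits` are hypothetical names.

```python
import math


def binarize(pre_activations):
    """Map real values to {-1, +1} via tanh followed by a sign decision.

    The sign rule is an assumption of this sketch, not the patent's Equation 1.
    """
    codes = []
    for x in pre_activations:
        t = math.tanh(x)  # output-layer activation, in (-1, 1)
        codes.append(1 if t >= 0 else -1)
    return codes


def to_bits(codes):
    """Store each {-1, +1} code as a single bit {0, 1}."""
    return [(c + 1) // 2 for c in codes]


codes = binarize([-2.0, -0.1, 0.0, 0.3, 5.0])
bits = to_bits(codes)
```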

Although not shown in FIGS. 2 and 3, the residual blocks of the decoder may include at least one deconvolution layer or at least one sub-pixel convolution layer.

The configurations of the autoencoder and the residual block are not limited to the examples shown in FIGS. 2 and 3 and may be modified in various ways depending on the embodiment.

According to an embodiment of the present invention, the autoencoder may be trained to minimize the mean squared error (MSE) between the image input through the input unit 110 and the result image generated by the addition unit 170.

Specifically, the parameters of the first autoencoder of the first compression unit 130 may be trained to minimize the mean squared error between the image input through the input unit 110 and the image produced by the addition unit 170, and the parameters of the second autoencoder of the second compression unit 160 may be trained using the parameters of the trained first autoencoder.
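The training objective named above is the ordinary mean squared error, that is, the average of the squared pixel differences between the input image and the result image. A minimal single-channel sketch follows; the function name `mse` and the pixel values are illustrative, and the optimization procedure itself is not shown.

```python
def mse(image_a, image_b):
    """Mean squared error between two equally sized single-channel images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


input_image = [[10, 20], [30, 40]]
result_image = [[10, 20], [29, 42]]
loss = mse(input_image, result_image)  # (0 + 0 + 1 + 4) / 4
```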

FIG. 4 is a diagram illustrating an example of an image compression process according to an embodiment of the present invention.

Referring to FIG. 4, the down-sampling unit 120 of the image compression apparatus 100 according to an embodiment of the present invention may down-sample an input image (i₀) to generate a down-sampled image (i₁) that is smaller than the input image (i₀).

The first compression unit 130 of the image compression apparatus 100 may generate a first compressed image (c₁) from the down-sampled image (i₁).

The up-sampling unit 140 of the image compression apparatus 100 may up-sample the first compressed image (c₁), for example by bicubic sampling, to generate an up-sampled image (c₀) of the same size as the input image (i₀).

The subtraction unit 150 of the image compression apparatus 100 may subtract the up-sampled image (c₀) from the input image (i₀).

The second compression unit 160 of the image compression apparatus 100 may compress the image produced by the subtraction unit 150 to generate a second compressed image (r₀).

The addition unit 170 of the image compression apparatus 100 may add the up-sampled image (c₀) and the second compressed image (r₀) to generate a result image (r₀+c₀).
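The data flow of FIG. 4 can be sketched end to end with simple stand-ins. In the sketch below the two autoencoders are replaced by hypothetical quantizers (rounding to a multiple of a step), down-sampling is a 2x2 box filter, and up-sampling is nearest-neighbour rather than bicubic; all of these substitutions, and every function name, are assumptions made to keep the example short. Only the order of operations and the variable names i0, i1, c1, c0, and r0 follow the description.

```python
import math


def quantize(image, step):
    """Stand-in for a learned autoencoder: round pixels to multiples of step."""
    return [[step * math.floor(v / step + 0.5) for v in row] for row in image]


def downsample_2x(image):
    """2x2 box-filter down-sampling (assumed technique)."""
    return [
        [(image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
         for x in range(0, len(image[0]), 2)]
        for y in range(0, len(image), 2)
    ]


def upsample_2x(image):
    """Nearest-neighbour up-sampling (stand-in for bicubic)."""
    out = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def compress(i0, coarse_step=16, residual_step=4):
    i1 = downsample_2x(i0)           # down-sampled image
    c1 = quantize(i1, coarse_step)   # first compressed image
    c0 = upsample_2x(c1)             # up-sampled to the size of i0
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(i0, c0)]
    r0 = quantize(diff, residual_step)  # second compressed image
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(c0, r0)]


i0 = [
    [10, 20, 200, 210],
    [30, 40, 220, 230],
    [90, 100, 50, 60],
    [110, 120, 70, 80],
]
result = compress(i0)
max_error = max(abs(a - b) for ra, rb in zip(i0, result) for a, b in zip(ra, rb))
```

In this toy setting the error of the result image equals the error introduced when the residual is compressed, so it is bounded by half the residual step; the coarse first stage only determines how much signal is left for the second stage to encode.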

If the image input through the input unit 110 is a 3-channel image with a resolution of n x m, the total code length can be expressed by Equation 2 below.

[Equation 2]

(The expression for Equation 2 is not reproduced in this text.)

Here, N_C and N_R denote the output channel sizes of the binarization layers of the first autoencoder and the second autoencoder, respectively.

FIG. 5 is a diagram showing the configuration of autoencoders according to a further embodiment of the present invention. Specifically, FIG. 5 shows the configuration of the plurality of autoencoders used for the respective channels when the image input through the input unit 110 includes a plurality of channels (for example, the three channels of RGB).

FIG. 5(a) shows the autoencoders used for the respective channels sharing all neural network layers of the encoder included in each autoencoder, and FIG. 5(b) shows the autoencoders used for the respective channels sharing all neural network layers of the decoder included in each autoencoder.

FIG. 5(c) shows the autoencoders used for the respective channels sharing one of the neural network layers included in each autoencoder. Specifically, FIG. 5(c) shows the first layer (D1) among the neural network layers of the decoder being shared.
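The sharing pattern of FIG. 5(c) can be sketched by giving every per-channel autoencoder a reference to the same first decoder layer object while each keeps its own remaining layers. The `Layer` and `ChannelAutoencoder` classes below are hypothetical toys (each layer just scales its input) and are not the convolutional layers of the specification; the point is only that a parameter update to the shared layer is seen by all channels.

```python
class Layer:
    """Toy layer with a single parameter; stands in for a convolution layer."""

    def __init__(self, scale):
        self.scale = scale

    def __call__(self, values):
        return [self.scale * v for v in values]


class ChannelAutoencoder:
    """Per-channel autoencoder built from encoder and decoder layer lists."""

    def __init__(self, encoder_layers, decoder_layers):
        self.encoder_layers = encoder_layers
        self.decoder_layers = decoder_layers

    def reconstruct(self, values):
        for layer in self.encoder_layers + self.decoder_layers:
            values = layer(values)
        return values


shared_d1 = Layer(2.0)  # first decoder layer (D1), shared by all channels
autoencoders = {
    name: ChannelAutoencoder([Layer(0.5)], [shared_d1, Layer(1.0)])
    for name in ("R", "G", "B")
}

before = autoencoders["R"].reconstruct([1.0, 2.0])
shared_d1.scale = 4.0  # one update to D1 affects every channel
after = {name: ae.reconstruct([1.0, 2.0]) for name, ae in autoencoders.items()}
```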

The autoencoders shown in FIG. 5 may be, for example, first autoencoders included in the first compression unit 130 or second autoencoders included in the second compression unit 160.

Specifically, according to an embodiment of the present invention, the first compression unit 130 may compress each channel of the image down-sampled by the down-sampling unit 120 using a plurality of first autoencoders that share at least one neural network layer.

Also, according to an embodiment of the present invention, the second compression unit 160 may compress each channel of the image produced by the subtraction unit 150 using a plurality of second autoencoders that share at least one neural network layer.

For example, if the image input through the input unit 110 includes three channels, the first compression unit 130 may use a 1-1 autoencoder, a 1-2 autoencoder, and a 1-3 autoencoder to compress the respective channels of the image down-sampled by the down-sampling unit 120. The 1-1 to 1-3 autoencoders may share at least one neural network layer, as in the examples shown in FIG. 5.

Likewise, the second compression unit 160 may use a 2-1 autoencoder, a 2-2 autoencoder, and a 2-3 autoencoder to compress the respective channels of the image produced by the subtraction unit 150. The 2-1 to 2-3 autoencoders may share at least one neural network layer, as in the examples shown in FIG. 5.

FIG. 6 is a flowchart illustrating an image compression method according to an embodiment of the present invention.

The method shown in FIG. 6 may be performed, for example, by the image compression apparatus 100 shown in FIG. 1.

Although the flowchart in FIG. 6 describes the operation as a plurality of steps, at least some of the steps may be performed in a different order, combined with other steps and performed together, omitted, divided into sub-steps, or supplemented with one or more steps not shown.

Referring to FIG. 6, the image compression apparatus 100 receives an image (601). The image input to the image compression apparatus 100 may include a plurality of channels.

The image compression apparatus 100 down-samples the input image (602).

The image compression apparatus 100 generates a first compressed image from the down-sampled image using at least one first autoencoder composed of a convolutional neural network (603). The image compression apparatus 100 may compress each channel of the down-sampled image using a plurality of first autoencoders that share at least one neural network layer.

The image compression apparatus 100 up-samples the first compressed image (604).

The image compression apparatus 100 subtracts the up-sampled image from the input image (605).

The image compression apparatus 100 generates a second compressed image from the subtracted image using at least one second autoencoder composed of a convolutional neural network (606). The image compression apparatus 100 may repeatedly compress the subtracted image using a plurality of second autoencoders.

In addition, the image compression apparatus 100 may compress each channel of the subtracted image using a plurality of second autoencoders that share at least one neural network layer.

The image compression apparatus 100 generates a result image by adding the up-sampled image and the second compressed image (607). The first autoencoder and the second autoencoder may be trained to minimize the mean squared error (MSE) between the input image and the result image. In addition, the second autoencoder may be trained using the parameters of the first autoencoder.

한편, 본 발명의 실시예는 본 명세서에서 기술한 방법들을 컴퓨터상에서 수행하기 위한 프로그램을 포함하는 컴퓨터 판독 가능 기록매체를 포함할 수 있다. 상기 컴퓨터 판독 가능 기록매체는 프로그램 명령, 로컬 데이터 파일, 로컬 데이터 구조 등을 단독으로 또는 조합하여 포함할 수 있다. 상기 매체는 본 발명을 위하여 특별히 설계되고 구성된 것들이거나, 또는 컴퓨터 소프트웨어 분야에서 통상적으로 사용 가능한 것일 수 있다. 컴퓨터 판독 가능 기록매체의 예에는 하드 디스크, 플로피 디스크 및 자기 테이프와 같은 자기 매체, CD-ROM, DVD와 같은 광 기록 매체, 플로피 디스크와 같은 자기-광 매체, 및 롬, 램, 플래시 메모리 등과 같은 프로그램 명령을 저장하고 수행하도록 특별히 구성된 하드웨어 장치가 포함된다. 프로그램 명령의 예에는 컴파일러에 의해 만들어지는 것과 같은 기계어 코드뿐만 아니라 인터프리터 등을 사용해서 컴퓨터에 의해서 실행될 수 있는 고급 언어 코드를 포함할 수 있다.On the other hand, an embodiment of the present invention may include a computer-readable recording medium including a program for performing the methods described herein on a computer. The computer-readable recording medium may include a program command, a local data file, a local data structure, or the like, alone or in combination. The media may be those specially designed and constructed for the present invention, or may be those that are commonly used in the field of computer software. Examples of computer readable media include magnetic media such as hard disks, floppy disks and magnetic tape, optical recording media such as CD-ROMs and DVDs, magneto-optical media such as floppy disks, and magnetic media such as ROMs, And hardware devices specifically configured to store and execute program instructions. Examples of program instructions may include machine language code such as those generated by a compiler, as well as high-level language code that may be executed by a computer using an interpreter or the like.

Although the present invention has been described in detail above through representative embodiments, those of ordinary skill in the art to which the present invention pertains will understand that various modifications can be made to the above-described embodiments without departing from the scope of the present invention. Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the claims set forth below and their equivalents.

100: image compression apparatus
110: input unit
120: down-sampling unit
130: first compression unit
140: up-sampling unit
150: subtraction unit
160: second compression unit
170: addition unit

Claims

An image compression apparatus comprising:
an input unit for receiving an image;
a down-sampling unit for down-sampling the input image;
a first compression unit for generating a first compressed image from the down-sampled image using at least one first autoencoder configured as a convolutional neural network;
an up-sampling unit for up-sampling the first compressed image;
a subtraction unit for subtracting the up-sampled image from the input image;
a second compression unit for generating a second compressed image from the subtracted image using at least one second autoencoder configured as a convolutional neural network; and
an addition unit for generating a result image by adding the up-sampled image and the second compressed image.

The apparatus according to claim 1,
wherein the second compression unit repeatedly compresses the subtracted image using a plurality of second autoencoders.

The apparatus according to claim 1,
wherein the first autoencoder and the second autoencoder are trained to minimize the mean squared error (MSE) between the input image and the result image.

The apparatus according to claim 3,
wherein the second autoencoder is trained using parameters of the trained first autoencoder.

The apparatus according to claim 1,
wherein the input image includes a plurality of channels, and
wherein the first compression unit compresses each channel of the down-sampled image using a plurality of first autoencoders sharing at least one neural network layer.

The apparatus according to claim 1,
wherein the input image includes a plurality of channels, and
wherein the second compression unit compresses each channel of the subtracted image using a plurality of second autoencoders sharing at least one neural network layer.

An image compression method comprising:
receiving an image;
down-sampling the input image;
generating a first compressed image from the down-sampled image using at least one first autoencoder configured as a convolutional neural network;
up-sampling the first compressed image;
subtracting the up-sampled image from the input image;
generating a second compressed image from the subtracted image using at least one second autoencoder configured as a convolutional neural network; and
generating a result image by adding the up-sampled image and the second compressed image.

The method of claim 7,
wherein the generating of the second compressed image comprises repeatedly compressing the subtracted image using a plurality of second autoencoders.

The method of claim 7,
wherein the first autoencoder and the second autoencoder are trained to minimize the mean squared error (MSE) between the input image and the result image.

The method of claim 9,
wherein the second autoencoder is trained using parameters of the trained first autoencoder.

The method of claim 7,
wherein the input image includes a plurality of channels, and
wherein the generating of the first compressed image comprises compressing each channel of the down-sampled image using a plurality of first autoencoders sharing at least one neural network layer.

The method of claim 7,
wherein the input image includes a plurality of channels, and
wherein the generating of the second compressed image comprises compressing each channel of the subtracted image using a plurality of second autoencoders sharing at least one neural network layer.