KR20230075248A

KR20230075248A - Device of compressing data, system of compressing data and method of compressing data

Info

Publication number: KR20230075248A
Application number: KR1020210161676A
Authority: KR
Inventors: 남우승; 이경한
Original assignee: 서울대학교산학협력단
Priority date: 2021-11-22
Filing date: 2021-11-22
Publication date: 2023-05-31
Also published as: KR102706107B1

Abstract

A data compression device according to an embodiment of the present invention comprises: an input unit for acquiring data; a first encoder unit for generating first data by lossy compression of the data; a probability distribution prediction unit pre-trained to predict a plurality of parameters of the probability distribution of residual data, taking the residual data as input; and a second encoder unit for generating second data by lossless compression of the residual data using the predicted parameters. The first encoder unit and the second encoder unit may be pre-trained so that the sum of the size of the first data and the size of the second data is minimized. Accordingly, a high compression ratio can be achieved even with low complexity.

Description

Data compression device, data compression system and data compression method

The present invention relates to a data compression device, a data compression system, and a data compression method.

To differentiate themselves from existing mobile communication systems, recent 5G and 6G mobile communication systems are required to support high-quality network services, such as ultra-realistic telepresence, in which a high level of application performance is guaranteed. To guarantee the performance required by such services, high-quality volumetric data, including video at 8K or higher resolution and high-quality five-sense data, must be transmitted with ultra-low latency.

Among the various delay factors, the transmission delay of ultra-high-quality data is strongly affected by the transmission time, defined as the data size divided by the data rate (data_rate). When a physical transmission limit exists at an Internet bottleneck, the data size itself must be reduced, which calls for an ultra-high-efficiency data compression method.
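The relationship between data size, data rate, and transmission time can be illustrated with a minimal sketch. The function name and the numbers below are hypothetical and chosen only for illustration; they do not come from the patent.

```python
def transmission_time_s(data_size_bits: float, data_rate_bps: float) -> float:
    """Transmission time = data size / data rate."""
    if data_rate_bps <= 0:
        raise ValueError("data rate must be positive")
    return data_size_bits / data_rate_bps


# Hypothetical numbers: a 100 Mbit frame over a 50 Mbit/s bottleneck link.
raw_time = transmission_time_s(100e6, 50e6)
# The same frame compressed to one quarter of its size, same link.
compressed_time = transmission_time_s(25e6, 50e6)
```

With the data rate fixed by the bottleneck, the transmission time scales linearly with the compressed size, which is why the text focuses on reducing the data size.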

Meanwhile, conventional compression codecs such as PNG and GIF, and deep neural network (DNN)-based compression codecs, either losslessly compress the original data, or lossy-compress the original data and then losslessly compress residual data, defined as the difference between the original and the reconstructed data, so that the original data can be restored without loss. In the prior art, the lossy compression of the original data and the lossless compression of the residual data are performed independently, so an optimal compression ratio cannot be achieved.

An object of the present invention is to provide a data compression device, a data compression system, and a data compression method that achieve an improved compression ratio by jointly considering the lossy-compressed original data and the losslessly compressed residual data.

However, the problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

A data compression device according to an embodiment of the present invention includes: an input unit for acquiring data; a first encoder unit for generating first data by lossy compression of the data; a probability distribution prediction unit pre-trained to predict a plurality of parameters of the probability distribution of residual data, taking as input the residual data obtained by subtracting the lossy-reconstructed data from the data; and a second encoder unit for generating second data by lossless compression of the residual data using the predicted plurality of parameters, wherein the first encoder unit and the second encoder unit are pre-trained so that the sum of the size of the first data and the size of the second data is minimized.

The first encoder unit may include at least one of a generative adversarial network (GAN) and an autoencoder.

The probability distribution prediction unit may be pre-trained to predict a plurality of parameters of the probability distribution of training residual data, using the training residual data as input and the plurality of parameters of its probability distribution as labels.

The probability distribution of the training residual data may include at least one of a Gaussian distribution, a Poisson distribution, and a logistic distribution, and the plurality of parameters may be a multi-parameter mixture of the probability distribution.

The probability distribution prediction unit may include at least one of a convolutional neural network (CNN) and a recurrent neural network (RNN).

The first encoder unit and the second encoder unit may be pre-trained so that the sum of the entropy of the first data and the entropy of the second data is minimized by joint optimization.

The first encoder unit and the second encoder unit may be pre-trained so that the sum of the entropy of the first data and the cross-entropy between the second data and the plurality of parameters of the predicted probability distribution of the residual data is minimized.

A data compression system according to another embodiment of the present invention includes: an input unit for acquiring data; a first encoder unit for generating first data by lossy compression of the data; a probability distribution prediction unit pre-trained to predict a plurality of parameters of the probability distribution of residual data of the data, taking the residual data as input; a second encoder unit for generating second data by lossless compression of the residual data using the predicted plurality of parameters; and a decoder unit for obtaining and decoding the first data and the second data, and restoring the data without loss by adding the lossy-reconstructed data obtained from the first data and the residual data restored from the second data, wherein the first encoder unit and the second encoder unit may be pre-trained so that the sum of the size of the first data and the size of the second data is minimized.

A method of compressing data according to an aspect of the present invention includes: acquiring the data; generating first data by lossy compression of the acquired data; predicting, using a pre-trained probability distribution prediction neural network that takes residual data of the data as input, a plurality of parameters of the probability distribution of the residual data; and generating second data by lossless compression of the residual data using the predicted plurality of parameters, wherein training is performed in advance so that the sum of the size of the first data and the size of the second data is minimized.

A computer-readable recording medium according to another aspect of the present invention includes a computer program, the computer program including instructions for causing a processor to perform a method of compressing data, the method including: acquiring the data; generating first data by lossy compression of the acquired data; predicting, using a pre-trained probability distribution prediction neural network that takes residual data of the data as input, a plurality of parameters of the probability distribution of the residual data; and generating second data by lossless compression of the residual data using the predicted plurality of parameters, wherein training is performed in advance so that the sum of the size of the first data and the size of the second data is minimized.

A computer-readable recording medium according to yet another aspect of the present invention includes a computer program, the computer program including instructions for causing a processor to perform a method of compressing data, the method including: acquiring the data; generating first data by lossy compression of the acquired data; predicting, using a pre-trained probability distribution prediction neural network that takes residual data of the data as input, a plurality of parameters of the probability distribution of the residual data; and generating second data by lossless compression of the residual data using the predicted plurality of parameters, wherein training is performed in advance so that the sum of the size of the first data and the size of the second data is minimized.

According to an embodiment of the present invention, the amount of compressed data can be drastically reduced while the original data is still compressed losslessly, so that high-quality data can be transmitted in the shortest time. This makes it possible to realize high-value-added application services, such as ultra-realistic telepresence, in a next-generation network environment in which various types of information are rapidly accumulating as data.

In addition, according to an embodiment of the present invention, the complexity of compressing and restoring data is low, so a high compression ratio can be achieved even with low complexity.

FIG. 1 is a block diagram conceptually illustrating the functions of a data compression device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the amount of data to be transmitted when data is compressed and transmitted.
FIG. 3 is a block diagram conceptually illustrating the functions of a data compression system according to an embodiment of the present invention.
FIG. 4 is a block diagram conceptually illustrating the flow in which a data compression system according to an embodiment of the present invention compresses and restores data.
FIG. 5 is a graph comparing the performance of a data compression method according to an embodiment of the present invention with that of other data compression methods.
FIG. 6 is a block diagram illustrating a data compression device according to an embodiment of the present invention from a hardware perspective.
FIG. 7 is a flowchart illustrating a data compression method according to an embodiment of the present invention.

Advantages and features of the present invention, and methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only to make the disclosure of the present invention complete and to fully inform those of ordinary skill in the art to which the present invention pertains of the scope of the invention, which is defined only by the scope of the claims.

In describing the embodiments of the present invention, detailed descriptions of known functions or configurations are omitted where they could unnecessarily obscure the subject matter of the present invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or custom of a user or operator. Their definitions should therefore be based on the content of this specification as a whole.

FIG. 1 is a block diagram conceptually illustrating the functions of a data compression device according to an embodiment of the present invention.

Referring to FIG. 1, the data compression device 100 may include an input unit 110, a first encoder unit 120, a second encoder unit 130, and a probability distribution prediction unit 140. However, the configuration of the data compression device 100 is not limited to that shown in FIG. 1. For example, although not shown in FIG. 1, the data compression device 100 may include a communication unit for communicating with other devices, a display unit for displaying compression results, and the like.

According to an embodiment, the data compression device 100 may receive data to be compressed, lossy-compress the original data, and losslessly compress the residual data obtained by subtracting the lossy-reconstructed data from the original data. Here, the data compression device 100 may set the compression ratio so that the sum of the lossy-compressed data and the losslessly compressed data is minimized.

The input unit 110 may receive data to be compressed from an external device separate from the data compression device 100. The input unit 110 may receive the data over wired or wireless communication, may receive it directly from a user, or may receive data stored in the data compression device 100. The method by which the input unit 110 receives the data to be compressed is not limited thereto.

With reference to FIG. 2, the amount of data to be transmitted when the data compression device 100 compresses data is described first.

FIG. 2(a) shows the amount of data to be transmitted when data is losslessly compressed and transmitted.

Referring further to FIG. 2(a), the size of the compressed original data (Compressed data) can be expressed as the product of the number of symbols N of the losslessly compressed data and the average symbol length L of the losslessly compressed data. In addition, restoring the compressed data requires the probability distribution of the symbols of the losslessly compressed data. If this probability distribution is denoted p, the actual transmission amount T required to restore the original data can be defined as the sum of the compressed data size NL and the data I(p) needed to convey the probability distribution.
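The decomposition T = NL + I(p) can be illustrated with a small sketch. It assumes an ideal entropy coder, so that the average symbol length L equals the empirical entropy, and it assumes a simple cost model for I(p) (a fixed number of bits per table entry). Both assumptions, and the function name, are illustrative and not taken from the patent.

```python
import math
from collections import Counter


def transmission_bits(symbols, bits_per_table_entry=32):
    """Return (N*L, I(p), T) for a symbol sequence.

    L is approximated by the empirical entropy in bits per symbol
    (the ideal average code length). I(p) is modeled, as an assumption,
    as a fixed cost per distinct symbol in the probability table.
    """
    counts = Counter(symbols)
    n = len(symbols)
    avg_len = -sum(c / n * math.log2(c / n) for c in counts.values())
    payload = n * avg_len                            # N * L
    side_info = len(counts) * bits_per_table_entry   # I(p), assumed cost model
    return payload, side_info, payload + side_info


payload, side_info, total = transmission_bits("aaaabbcd")
```

For this eight-symbol input the side information outweighs the payload, which illustrates why conveying the distribution compactly matters.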

Here, conveying the probability distribution p as it is requires a large amount of data (bits). Therefore, instead of the exact probability distribution p, data about an estimate $\hat{p}$ produced by an artificial neural network may be transmitted.

FIG. 2(b) shows the amount of data to be transmitted when data is lossy-compressed and its residual data is losslessly compressed.

Referring further to FIGS. 1 and 2(b), when data is transmitted using lossless compression alone as in FIG. 2(a), the original data may have a wide distribution range and uncertain characteristics. The data compression device 100 can compensate for this by lossy-compressing the original data and losslessly compressing its residual data.

To this end, the first encoder unit 120 may lossy-compress the data rather than losslessly compressing it, and the second encoder unit 130 may losslessly compress the residual data obtained by subtracting, from the original data, the data lossy-reconstructed from the compressed data produced by the first encoder unit 120.

To perform lossy compression, the first encoder unit 120 may include a neural network that compresses data to a small size. According to an embodiment, the first encoder unit 120 may include at least one of a generative adversarial network (GAN) and an autoencoder.

In this case, the transmission amount T of the data compressed by the data compression device 100 is the sum of the size of the lossy-compressed original data (Compressed data with loss), the size of the losslessly compressed residual data (Compressed data without loss), and the size of the data describing the probability distribution of the residual data (Distribution information) predicted by the probability distribution prediction unit 140, which is described later. The transmission amount T can be expressed as Equation 1 below.

Here, x is the original data, E() is the lossy compressor, and b() denotes the amount of data required to transmit the argument of the function. In addition, $p_r$ denotes the probability distribution of the residual data, and $\hat{p}_r$ denotes the estimated probability distribution of the residual data.
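Equation 1 does not survive in this text. One form consistent with the verbal description of T and the symbol definitions above, given here as an assumption rather than as the patent's exact formula, is the following. D() is a hypothetical name for the lossy decoder, and $\hat{p}_r$ stands for the estimated probability distribution of the residual data.

```latex
% Assumed reconstruction: three terms matching "compressed data with loss",
% "compressed data without loss", and "distribution information".
T = b\bigl(E(x)\bigr) + b\bigl(r \mid \hat{p}_r\bigr) + b\bigl(\hat{p}_r\bigr),
\qquad r = x - D\bigl(E(x)\bigr)
```

The first term is the size of the lossy-compressed data, the second is the size of the residual losslessly coded under the estimated distribution, and the third is the cost of conveying that estimated distribution.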

The probability distribution prediction unit 140 may include a pre-trained artificial neural network to predict the probability distribution of the residual data.

More specifically, the probability distribution prediction unit 140 may be trained in advance to predict a plurality of parameters of the probability distribution of training residual data, using the training residual data as input and the plurality of parameters of its probability distribution as labels.

In addition, according to an embodiment, the probability distribution prediction unit 140 may include at least one of a convolutional neural network (CNN) and a recurrent neural network (RNN) to predict the plurality of parameters for the training residual data. However, these artificial neural networks are merely examples, and the plurality of parameters for the training residual data may be predicted using other known artificial neural network models.

The probability distribution of the training residual data input to the probability distribution prediction unit 140 may include at least one of a Gaussian distribution, a Poisson distribution, and a logistic distribution, and the plurality of parameters may be a multi-parameter mixture of the probability distribution.
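As an illustration of how a small set of mixture parameters can describe a full residual distribution, the sketch below turns the weights, means, and scales of a logistic mixture into a probability for each integer residual value. The discretization over unit-width bins, the residual range, and all names and parameter values are assumptions made for this example; the patent does not specify them.

```python
import math


def logistic_cdf(v, mu, s):
    """CDF of a logistic distribution with mean mu and scale s."""
    return 1.0 / (1.0 + math.exp(-(v - mu) / s))


def residual_pmf(weights, means, scales, lo=-255, hi=255):
    """Probability of each integer residual in [lo, hi] under a logistic mixture.

    Each component is integrated over the bin [r - 0.5, r + 0.5]; the two
    edge bins absorb the tails so that the probabilities sum to 1.
    """
    pmf = {}
    for r in range(lo, hi + 1):
        p = 0.0
        for w, mu, s in zip(weights, means, scales):
            upper = 1.0 if r == hi else logistic_cdf(r + 0.5, mu, s)
            lower = 0.0 if r == lo else logistic_cdf(r - 0.5, mu, s)
            p += w * (upper - lower)
        pmf[r] = p
    return pmf


# Hypothetical parameters for a two-component mixture (weights sum to 1).
pmf = residual_pmf(weights=[0.7, 0.3], means=[0.0, 2.0], scales=[1.5, 4.0])
# Ideal code length, in bits, for a residual value of 0 under this model.
bits_for_zero = -math.log2(pmf[0])
```

Only six numbers describe the distribution over 511 possible residual values here, which is the kind of compact description an entropy coder on both sides could share.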

Here, to minimize the transmission amount T of Equation 1, the data compression device 100 may use a joint optimization method, thereby minimizing T while still performing lossless compression.

Let z be the data encoded by the lossy compressor and r the residual data. The data compression device 100 can minimize the transmission amount T by using the sum of the entropy $H(p_z)$ of the probability distribution of the encoded data z and the entropy $H(p_r)$ of the probability distribution of the residual data as a joint loss function, and training the first encoder unit 120 and the second encoder unit 130 to minimize it.

In addition, considering the lower bound of the joint loss function, instead of the sum of $H(p_z)$ and $H(p_r)$, the data compression device 100 may use as the loss function the sum of the entropy $H(p_z)$ of the encoded data z and the cross-entropy between the actual probability distribution of the residual data and the probability distribution predicted by the probability distribution prediction unit 140, and may train the first encoder unit 120 and the second encoder unit 130 to minimize it.
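The two loss formulations can be compared numerically on small discrete distributions. The sketch below only evaluates the quantities; actual training would presumably need a differentiable estimate of these terms, which is outside what this text describes. The function names and the example distributions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits: cost of coding p with a model q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def joint_loss(p_z, p_r, p_r_hat):
    """H(p_z) plus the cross-entropy between actual and predicted residual distributions."""
    return entropy(p_z) + cross_entropy(p_r, p_r_hat)


# Hypothetical distributions for the encoded data z and the residual r.
p_z = [0.5, 0.25, 0.25]
p_r = [0.5, 0.5]
p_r_hat = [0.25, 0.75]   # imperfect prediction of p_r

entropy_sum = entropy(p_z) + entropy(p_r)        # H(p_z) + H(p_r)
loss_imperfect = joint_loss(p_z, p_r, p_r_hat)   # larger than entropy_sum
loss_perfect = joint_loss(p_z, p_r, p_r)         # equals entropy_sum
```

Because the cross-entropy is never smaller than the entropy, the second formulation is bounded below by the first, and the gap closes as the predicted residual distribution approaches the actual one.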

The process of compressing and then restoring data is described in detail below.

FIG. 3 is a block diagram conceptually illustrating the functions of a data compression system according to an embodiment of the present invention, and FIG. 4 is a block diagram conceptually illustrating the flow in which the data compression system compresses and restores data.

Referring to FIG. 3, the data compression system 10 may include a data compression device 100 and a data restoration device 200. The functions of the data compression device 100 are as described above with reference to FIG. 1.

The data restoration device 200 may include a data receiving unit 210 and a decoder unit 220.

Referring further to FIG. 4, the data compression device 100 may receive the data x to be compressed and lossy-compress it with the encoder (ENC) corresponding to the first encoder 120. The second encoder 130 may losslessly compress the residual data r, which is the difference between the original data x and the reconstructed data obtained by lossy-compressing x with the encoder (ENC) and then restoring it with the decoder (DEC), and may encode it together with the probability distribution data of the residual data r predicted by the probability distribution prediction unit 140.

For convenience of explanation, FIG. 4 assumes that the first encoder 120 is implemented as an autoencoder, but the first encoder is not limited thereto. For example, the first encoder 120 may be implemented as a generative adversarial network (GAN).

Although not shown in FIG. 4, the data compression device 100 may additionally quantize the lossy-compressed data and entropy-code the quantized data. Entropy coding is an encoding method in which the length of the code representing a symbol varies with the probability of that symbol occurring; methods such as Huffman coding, range coding, and arithmetic coding may be used.
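To make the idea of probability-dependent code lengths concrete, here is a minimal sketch of Huffman code-length assignment, one of the entropy coding methods named above. It computes only the code lengths, not the bitstream, and the input sequence is a hypothetical stand-in for quantized data.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap items: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantized symbols: frequent symbols get shorter codes.
data = "aaaabbcd"
lengths = huffman_code_lengths(data)
total_bits = sum(lengths[s] for s in data)
```

The most frequent symbol receives the shortest code. Range coding and arithmetic coding follow the same principle but can spend a fractional number of bits per symbol.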

The data receiving unit 210 may receive compressed data from the data compression device 100. The data receiving unit 210 may receive the compressed data over wired or wireless communication, or, when the data compression system 10 is implemented in a single device, through internal signaling. The method by which the data receiving unit 210 obtains the compressed data from the data compression device 100 is not limited thereto.

Here, the compressed data received by the data receiving unit 210 may include the lossy-compressed original data, the losslessly compressed residual data, and data on the predicted probability distribution of the residual data.

The decoder unit 220 may use a decoder (DEC) to restore the received data, in which the original data is lossy compressed, to lossy data. The decoder unit 220 may also restore the residual data (r) by decoding with a lossless decoder, using the data in which the residual data is losslessly compressed and the data on the predicted probability distribution of the residual data. Finally, the decoder unit 220 may restore the original data (x) by adding the lossy data and the residual data (r).
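The restoration described above, in which the original data is recovered as the sum of the lossy data and the residual data, can be sketched as follows. The coarse quantizer stands in for the trained first encoder and decoder (DEC); it is an assumption made for illustration, and the names `lossy_encode`, `lossy_decode`, and `x_lossy` are hypothetical.

```python
def lossy_encode(x, step):
    """Stand-in lossy encoder (assumption): coarse uniform quantization of integer samples."""
    return [v // step for v in x]


def lossy_decode(code, step):
    """Stand-in decoder (DEC): map each lossy code back to an approximate sample."""
    return [c * step + step // 2 for c in code]


x = [12, 200, 37, 255, 0, 128]           # original data x (illustrative values)
step = 16

first_data = lossy_encode(x, step)       # lossy-compressed data
x_lossy = lossy_decode(first_data, step) # lossy data restored by the decoder
r = [a - b for a, b in zip(x, x_lossy)]  # residual data r = x - lossy data

# Decoder side: adding the lossy data and the residual restores x exactly.
x_restored = [a + b for a, b in zip(x_lossy, r)]
print(r, x_restored == x)
```

Because the residual is coded losslessly, the sum is exact regardless of how coarse the lossy stage is; the lossy stage only affects how small the residual values are.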

FIG. 5(a) shows the results of compressing and restoring data with an image size of 512x512 using a plurality of data compression methods, and FIG. 5(b) shows the results of compressing and restoring data with a size of 4K UHD using a plurality of data compression methods.

Referring to FIGS. 5(a) and 5(b), the data compression method according to the embodiment of the present invention (Ours) has the shortest time to perform compression and restoration (Encoding+Decoding Time) among the compared data compression methods, regardless of the amount of data to be compressed. It also has the lowest number of transmitted bits per pixel (Bits Per Pixel) regardless of the amount of data to be compressed, which confirms that the proposed data compression method achieves the most efficient compression ratio among the compared methods.
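For reference, bits per pixel is conventionally the total number of transmitted bits divided by the number of pixels. A sketch of that definition for this two-stream scheme is given below; the symbols $b_1$ (first data), $b_2$ (second data), $H$, and $W$ (image height and width) are notation assumed here, and any side information that is transmitted, such as the data on the predicted probability distribution, would be added to the numerator.

```latex
\mathrm{BPP} = \frac{|b_1| + |b_2|}{H \cdot W},
\qquad |b_i| = \text{size of stream } b_i \text{ in bits}
```

Under this definition, a 512x512 image has $H \cdot W = 262{,}144$ pixels, so each additional 262,144 bits of output raises the measure by exactly 1.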

FIG. 6 is a block diagram illustrating the data compression device according to an embodiment of the present invention from a hardware perspective.

Referring to FIGS. 1 and 6, the data compression device 100 may include a storage device 191 that stores at least one instruction, a processor 192 that executes the at least one instruction stored in the storage device 191, a transmitting/receiving device 193, an input interface device 194, and an output interface device 195.

The components 191, 192, 193, 194, and 195 included in the data compression device 100 are connected by a data bus 196 and may communicate with one another.

The storage device 191 may include a memory, or at least one of a volatile storage medium and a non-volatile storage medium. For example, the storage device 191 may include at least one of read-only memory (ROM) and random access memory (RAM).

The storage device 191 may further store at least one instruction to be executed by the processor 192 described below, and may store a lossy data compression method, parameters of a probability distribution, and the like that are input by a user through the input interface device 194.

The processor 192 may be a central processing unit (CPU), a graphics processing unit (GPU), a micro controller unit (MCU), or a dedicated processor on which the methods according to embodiments of the present invention are performed.

Referring further to FIG. 1, as described above, the processor 192 may perform the functions of the first encoder unit 120, the second encoder unit 130, and the probability distribution prediction unit 140 according to at least one program instruction stored in the storage device 191. Each of these units may be stored in memory in the form of at least one module and executed by the processor.

The transmitting/receiving device 193 may receive data from, or transmit data to, an internal device or an external device connected through communication, and may perform the function of the input unit 110.

The input interface device 194 may receive at least one control signal or setting value from a user. For example, the input interface device 194 may receive user input such as a data transmission command or a compression command.

The output interface device 195 may output and visualize at least one piece of information, including the compression ratio at which data is compressed by the operation of the processor 192.

FIG. 7 is a flowchart illustrating a data compression method according to an embodiment of the present invention.

Referring further to FIG. 7, the transmitting/receiving device 193 may first obtain data (S100).

Next, the processor 192 may generate first data in which the obtained data is lossy compressed (S200).

In addition, the processor 192 may use a pre-trained probability distribution prediction neural network, with the residual data of the data as input, to predict a plurality of parameters of the probability distribution of the residual data (S300).

Next, the processor 192 may use the plurality of predicted parameters of the residual data to generate second data in which the residual data is losslessly compressed (S400).
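Steps S100 to S400 can be sketched end to end as below. This is an illustration under stated assumptions, not the trained networks of the embodiment: a coarse quantizer stands in for the lossy stage, a moment-matching function `predict_parameters` stands in for the probability distribution prediction neural network, a single discretized logistic distribution is assumed, and the second data is represented only by its ideal code length (the bit count an entropy coder such as an arithmetic coder would approach). All function names are hypothetical.

```python
import math


def logistic_cdf(v, mu, s):
    """CDF of a logistic distribution with location mu and scale s."""
    return 1.0 / (1.0 + math.exp(-(v - mu) / s))


def discretized_logistic_pmf(k, mu, s):
    """Probability that a logistic(mu, s) variable falls in [k - 0.5, k + 0.5)."""
    return logistic_cdf(k + 0.5, mu, s) - logistic_cdf(k - 0.5, mu, s)


def predict_parameters(residual):
    """Stand-in for the trained predictor (assumption): fit mu and s by moment matching."""
    n = len(residual)
    mu = sum(residual) / n
    var = sum((v - mu) ** 2 for v in residual) / n
    # A logistic distribution with scale s has variance (pi * s)^2 / 3.
    s = max(math.sqrt(3.0 * var) / math.pi, 0.1)
    return mu, s


def ideal_code_length_bits(residual, mu, s):
    """Sum of -log2 p(r_i): the length an ideal entropy coder would approach."""
    return sum(
        -math.log2(max(discretized_logistic_pmf(k, mu, s), 1e-12)) for k in residual
    )


x = [12, 200, 37, 255, 0, 128]                      # S100: obtain data (illustrative)
step = 16
first_data = [v // step for v in x]                 # S200: stand-in lossy compression
x_lossy = [c * step + step // 2 for c in first_data]
residual = [a - b for a, b in zip(x, x_lossy)]
mu, s = predict_parameters(residual)                # S300: predict distribution parameters
bits = ideal_code_length_bits(residual, mu, s)      # S400: cost of lossless residual coding
print(round(mu, 3), round(s, 3), round(bits, 2))
```

The closer the predicted parameters match the actual residual statistics, the smaller the code length computed in S400, which is why the accuracy of the prediction in S300 determines the size of the second data.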

As described above, the data compression device and the data compression method according to an embodiment of the present invention can losslessly compress original data while drastically reducing the amount of compressed data, so that high-quality data can be transmitted in the shortest time. This makes it possible to realize high-value-added application services such as ultra-realistic remote reality in a next-generation network environment in which various forms of information are rapidly accumulating as data. In addition, because the complexity of compressing and restoring data is low, the time required to compress and restore data can be shortened.

Combinations of the blocks of the block diagrams and the steps of the flowcharts attached to the present invention may be performed by computer program instructions. These computer program instructions may be loaded onto an encoding processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, so that the instructions, executed by the encoding processor of the computer or other programmable data processing equipment, create means for performing the functions described in each block of the block diagram or each step of the flowchart. These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means for performing the function described in each block of the block diagram or each step of the flowchart. The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the computer or other programmable data processing equipment to produce a computer-executed process, and the instructions executed on the computer or other programmable data processing equipment provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or each step may represent a module, a segment, or a portion of code that includes one or more executable instructions for executing the specified logical function(s). It should also be noted that, in some alternative embodiments, the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially concurrently, or the blocks or steps may sometimes be performed in reverse order, depending on the corresponding function.

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from the essential characteristics of the present invention. Accordingly, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical idea of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as being included in the scope of rights of the present invention.

10: data compression system
100: data compression device
200: data restoration device

Claims

A data compression device comprising:
an input unit for obtaining data;
a first encoder unit for generating first data in which the data is lossy compressed;
a probability distribution prediction unit pre-trained to take, as input, residual data obtained by subtracting lossy-reconstructed data from the data, and to predict a plurality of parameters of a probability distribution of the residual data; and
a second encoder unit for generating second data in which the residual data is losslessly compressed using the plurality of predicted parameters of the residual data,
wherein the first encoder unit and the second encoder unit are pre-trained so that the sum of the size of the first data and the size of the second data has a minimum value.

The data compression device of claim 1,
wherein the first encoder unit is at least one of a generative adversarial network and an autoencoder.

The data compression device of claim 1,
wherein the probability distribution prediction unit is pre-trained to predict a plurality of parameters of a probability distribution of training residual data, using the training residual data as input and the plurality of parameters of the probability distribution of the training residual data as labels.

The data compression device of claim 3,
wherein the probability distribution of the training residual data is at least one of a Gaussian distribution, a Poisson distribution, and a logistic distribution, and
the plurality of parameters are parameters of a multi-parameter mixture of the probability distribution.

The data compression device of claim 1,
wherein the probability distribution prediction unit is at least one of a convolutional neural network and a recurrent neural network.

The data compression device of claim 1,
wherein the first encoder unit and the second encoder unit are pre-trained through joint optimization to minimize the sum of the entropy of the first data and the entropy of the second data.

The data compression device of claim 1,
wherein the first encoder unit and the second encoder unit are pre-trained to minimize the sum of the entropy of the first data and the cross-entropy between the second data and the plurality of parameters of the predicted probability distribution of the residual data.

A data compression system comprising:
an input unit for obtaining data;
a first encoder unit for generating first data in which the data is lossy compressed;
a probability distribution prediction unit pre-trained to take residual data of the data as input and to predict a plurality of parameters of a probability distribution of the residual data;
a second encoder unit for generating second data in which the residual data is losslessly compressed using the plurality of predicted parameters of the residual data; and
a decoder unit for obtaining and restoring each of the first data and the second data, and for restoring the data without loss by adding lossy-reconstructed data restored from the first data and residual data restored from the second data,
wherein the first encoder unit and the second encoder unit are pre-trained so that the sum of the size of the first data and the size of the second data has a minimum value.

A data compression method performed by a data compression system, the method comprising:
obtaining data;
generating first data in which the obtained data is lossy compressed;
predicting, using a pre-trained probability distribution prediction neural network with residual data of the data as input, a plurality of parameters of a probability distribution of the residual data; and
generating second data in which the residual data is losslessly compressed using the plurality of predicted parameters of the residual data,
wherein training is performed in advance so that the sum of the size of the first data and the size of the second data has a minimum value.

A computer-readable recording medium storing a computer program,
the computer program including instructions for causing a processor to perform a method of compressing data,
the method comprising:
obtaining data;
generating first data in which the obtained data is lossy compressed;
predicting, using a pre-trained probability distribution prediction neural network with residual data of the data as input, a plurality of parameters of a probability distribution of the residual data; and
generating second data in which the residual data is losslessly compressed using the plurality of predicted parameters of the residual data,
wherein training is performed in advance so that the sum of the size of the first data and the size of the second data has a minimum value.

A computer program stored on a computer-readable recording medium,
the computer program including instructions for causing a processor to perform a method of compressing data,
the method comprising:
obtaining data;
generating first data in which the obtained data is lossy compressed;
predicting, using a pre-trained probability distribution prediction neural network with residual data of the data as input, a plurality of parameters of a probability distribution of the residual data; and
generating second data in which the residual data is losslessly compressed using the plurality of predicted parameters of the residual data,
wherein training is performed in advance so that the sum of the size of the first data and the size of the second data has a minimum value.