KR20030096504A - method and apparatus for multi-symbol data compression using a binary arithmetic coder

Info

Publication number: KR20030096504A
Application number: KR1020020033019A
Authority: KR
Inventors: 이철수; 박현욱
Original assignee: 한국과학기술원
Priority date: 2002-06-12
Filing date: 2002-06-12
Publication date: 2003-12-31
Also published as: KR100462789B1

Abstract

PURPOSE: A method and an apparatus for compressing multi-symbol data using binary arithmetic coding are provided. Using binary arithmetic coding improves compression and decompression speed, and coding each bit of the binary code with a conditional probability increases the compression ratio. CONSTITUTION: Multi-symbol data is converted into binary form (S110). The most significant bit (MSB) is coded using a probability with a null condition (S120). The second bit is coded using a conditional probability that takes the value of the MSB into account (S130). The k-th bit is coded using a conditional probability that takes the values of all higher-order bits into account (S140).

Description

Method and apparatus for multi-symbol data compression using a binary arithmetic coder

The present invention relates to a method and apparatus for compressing multi-symbol data using a binary arithmetic coder (BAC), and more particularly to a method and apparatus that use a binary arithmetic coder in the multi-symbol data compression used for image data processing.

Data compression methods fall into two broad categories: lossy compression and lossless compression. JPEG is a representative example of lossy compression, and LZW (used in GIF files) is a representative example of lossless compression. Lossy compression keeps only the data that carries the most information and discards the rest. Lossless compression, by contrast, requires the data recovered after decompression to be identical to the data before compression. Lossy compression achieves a high compression ratio but loses a considerable amount of information, so a lossless method is used when the data must be restored exactly.

Lossless compression methods are designed to approach the entropy of the data. Techniques widely used in existing compression methods include Huffman coding and arithmetic coding. Huffman coding can come close to the entropy only when the symbol probabilities are powers of 1/2, whereas arithmetic coding reaches the entropy without such a restriction. Because arithmetic coding is therefore superior in compression efficiency, many algorithms adopt it.

The most widely used form of arithmetic coding is the multi-symbol arithmetic coding published by Witten. It divides the interval between 0 and 1 into several symbol intervals and assigns each interval the probability of the corresponding symbol. The intervals are computed recursively, and a single number inside the final interval is transmitted.

Arithmetic coding requires the probability of occurrence of each symbol, and various methods are used to obtain it. Multiple symbols are handled with either a fixed or an adaptive model; the adaptive method used by Witten basically computes probabilities from symbol frequencies. That is, the initial probabilities are assumed to be uniformly distributed, and each time a symbol occurs, the frequency of each symbol is recorded and divided by the total number of events so far to obtain the symbol's probability. In addition, each time a symbol occurs, the start and end points of its interval must be computed, and this computation uses multiplication.
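The frequency-based model and interval update described above can be sketched as follows. This is a minimal illustration, not Witten's actual coder: it uses exact fractions instead of finite-precision integers with renormalization, and the names `AdaptiveFrequencyModel` and `encode_interval` are hypothetical. It is meant only to show where the division (probability from counts) and the multiplication (interval narrowing) occur.

```python
from fractions import Fraction


class AdaptiveFrequencyModel:
    """Count-based adaptive model; every symbol starts with count 1 (uniform)."""

    def __init__(self, num_symbols: int) -> None:
        self.counts = [1] * num_symbols
        self.total = num_symbols

    def interval(self, symbol: int) -> tuple[Fraction, Fraction]:
        # Division: cumulative counts divided by the total number of events.
        low = sum(self.counts[:symbol])
        high = low + self.counts[symbol]
        return Fraction(low, self.total), Fraction(high, self.total)

    def update(self, symbol: int) -> None:
        self.counts[symbol] += 1
        self.total += 1


def encode_interval(symbols: list[int], num_symbols: int) -> tuple[Fraction, Fraction]:
    """Return the final [low, high) interval; any number inside it identifies the input."""
    model = AdaptiveFrequencyModel(num_symbols)
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        sym_low, sym_high = model.interval(s)
        # Multiplication: scale the symbol's sub-interval into the current interval.
        low += width * sym_low
        width *= sym_high - sym_low
        model.update(s)
    return low, low + width
```

For example, `encode_interval([0, 0], 4)` narrows the unit interval first to a quarter of its width and then, after the count of symbol 0 has been incremented, to two fifths of that.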

However, when division and multiplication must be performed repeatedly in this way, the computation is complex and compression takes a long time.

The present invention has been made to solve this problem, and its object is to provide a new method and apparatus capable of compressing multi-symbol data quickly.

Another object of the present invention is to provide a multi-symbol data compression method and apparatus having a high compression ratio.

FIG. 1 is a flowchart illustrating an encoding method according to an embodiment of the present invention.

FIG. 2 is a tree showing the nodes used to encode each bit in the encoding method according to the embodiment of the present invention.

FIG. 3 is a table comparing the compression/decompression times of the encoding method according to the embodiment of the present invention and of conventional encoding methods.

FIG. 4 is a table comparing the compression ratios of the encoding method according to the embodiment of the present invention and of conventional encoding methods.

To achieve these objects, the present invention compresses multi-symbol data using a binary arithmetic coder (BAC) and encodes each bit using the concept of conditional entropy. A BAC uses only subtraction and bit shifts to compute probability intervals, and it uses a state machine based on the Bayesian estimation principle to compute the adaptive probability distribution, so no division is needed when computing symbol probabilities and the computation is simple. Because each bit is encoded using a conditional probability that takes all higher-order bit values into account, the compression ratio can be higher than with independent encoding.

That is, the multi-symbol data compression method according to the present invention comprises converting each symbol constituting the multi-symbol data into binary form, and encoding the converted binary-form symbol using arithmetic coding.

Here, a binary arithmetic coder may be used in the encoding step. Preferably, the bits of the converted binary-form symbol are encoded sequentially in the encoding step, and each bit after the first encoded bit is encoded using a conditional probability conditioned on the previously encoded bit(s).

In addition, the multi-symbol data compression apparatus according to the present invention, that is, the encoder, includes a storage device and a processing device connected to the storage device. The storage device stores a program capable of controlling the processing device, and the processing device, operating in conjunction with the program, converts each symbol constituting the multi-symbol data into binary form and encodes the converted binary-form symbol using arithmetic coding.

The present invention also provides a method for restoring multi-symbol data compressed in the above manner, and an apparatus therefor.

Embodiments of the present invention will now be described in detail with reference to the accompanying drawings.

FIG. 1 shows a simplified flowchart of the encoding method according to an embodiment of the present invention. First, to encode multi-symbol data using binary arithmetic coding, each symbol must be converted into binary form (S110). For example, if there are M symbols to be handled, each symbol can be represented as an N-bit binary number, where $2^{N-1} < M \le 2^N$.
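Step S110 can be sketched as below. The helper names `bits_needed` and `to_bits_msb_first` are hypothetical, and the sketch assumes symbols are numbered 0 to M-1 and that N is the smallest bit width able to represent all M symbols. Bits are listed MSB first, since that is the order in which the later steps encode them.

```python
def bits_needed(num_symbols: int) -> int:
    """Smallest N such that num_symbols <= 2**N (at least one bit)."""
    if num_symbols < 1:
        raise ValueError("num_symbols must be positive")
    return max(1, (num_symbols - 1).bit_length())


def to_bits_msb_first(symbol: int, n_bits: int) -> list[int]:
    """Binary representation of a symbol as a list of bits, MSB first."""
    if not 0 <= symbol < (1 << n_bits):
        raise ValueError("symbol does not fit in n_bits")
    return [(symbol >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]
```

With a 4-symbol alphabet this gives N = 2, and with the 256 gray levels of an 8-bit image it gives N = 8.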

Next, each bit is encoded using binary arithmetic coding with its own probability state (S120 - S140). That is, N independent contexts exist for probability estimation.

For example, if the alphabet A of a random variable X has 4 symbols, X can be represented by two random variables $B_0$ and $B_1$, each of which takes one of the two values 0 and 1. Writing the probability of X as $p(x)$, we have $p(x) = \Pr\{X = x\} = \Pr\{B_0 = b_0, B_1 = b_1\}$, $b_i \in \{0, 1\}$.

When X is encoded with the coder proposed by Witten, the expected code length is slightly greater than or equal to the entropy $H(X)$, which is defined as follows.

[Equation 1]

$$H(X) = -\sum_{x} p(x) \log_2 p(x)$$

The probability distribution of the random variable X can be expressed as the joint probability distribution of the two random variables $B_0$ and $B_1$. The joint entropy $H(B_0, B_1)$ of a pair of discrete random variables $(B_0, B_1)$ with joint distribution $p(b_0, b_1)$ is defined as follows.

[Equation 2]

$$\begin{aligned}
H(B_0, B_1) &= -\sum_{b_0} \sum_{b_1} p(b_0, b_1) \log_2 p(b_0, b_1) \\
&= H(B_0) + H(B_1 \mid B_0) \\
&\le H(B_0) + H(B_1)
\end{aligned}$$

Here $H(B_1 \mid B_0)$ is the conditional entropy, and equality holds when the two random variables $B_0$ and $B_1$ are independent of each other. It is difficult, however, to guarantee that the two random variables are independent. Therefore, if $B_0$ and $B_1$ are encoded independently (independent coding of the binary representation), the expected code length is generally longer than that of multi-symbol arithmetic coding of the random variable X.
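The relation between the joint entropy and the sum of the per-bit entropies can be checked numerically. The sketch below is illustrative only: the function names and the two example distributions are assumptions chosen for the demonstration, not data from the document.

```python
from math import log2


def entropy(probs: list[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def joint_and_marginal_entropies(joint: list[list[float]]) -> tuple[float, float, float]:
    """joint[b0][b1] is p(b0, b1). Returns (H(B0, B1), H(B0), H(B1))."""
    h_joint = entropy([p for row in joint for p in row])
    p_b0 = [sum(row) for row in joint]
    p_b1 = [joint[0][j] + joint[1][j] for j in range(2)]
    return h_joint, entropy(p_b0), entropy(p_b1)


# Hypothetical distributions: one with correlated bits, one with independent bits.
correlated = [[0.4, 0.1], [0.1, 0.4]]
independent = [[0.25, 0.25], [0.25, 0.25]]
```

For `correlated`, each bit on its own looks like a fair coin, so independent coding spends about 2 bits per symbol, while the joint entropy is strictly smaller. For `independent`, the two quantities coincide.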

A binary arithmetic coder (BAC) can be used to code symbols expressed in binary. Examples of binary arithmetic coders include the Q-coder, the QM-coder, and the MQ-coder. A BAC needs only subtraction and bit shifts to compute the probability interval. In addition, it uses a state machine based on the Bayesian estimation principle to compute the adaptive probability distribution, which removes the need for division when computing symbol probabilities. A BAC is therefore easier to implement than Witten's multi-symbol arithmetic coder.

Meanwhile, to improve the compression ratio obtained when multi-symbol data is coded through its binary representation, the present invention introduces the concept of conditional entropy. First, the most significant bit (MSB) is encoded using the probability $p(B_{N-1})$, which has a null condition (S120). The second bit is encoded using the conditional probability $p(B_{N-2} \mid B_{N-1})$, conditioned on the value of the MSB (S130). In the same way, the k-th bit is encoded using the conditional probability $p(B_k \mid B_{N-1}, B_{N-2}, \ldots, B_{k+1})$ (S140).

This can be represented as a binary tree, as shown in FIG. 2.

That is, the MSB uses the root node (210), which corresponds to the null condition. The condition for encoding the second bit $b_{N-2}$ is one of the two children (220_1, 220_2) of the root, selected by the value of the MSB. To encode bit $b_k$ ($0 \le k < N$) of the binary representation of the random variable X, the present invention uses as the condition one of the descendants (240_1, ..., 240_m) of the root node at depth $N-k-1$, selected by the values of the $N-k-1$ previously encoded bits.
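One way to select the tree node that serves as the condition for each bit is sketched below. The heap-style numbering (root = 1, children of node n are 2n and 2n+1) is an assumption made for this illustration; the document does not specify how nodes are indexed, and the name `context_path` is hypothetical.

```python
def context_path(symbol: int, n_bits: int) -> list[tuple[int, int]]:
    """Return (context_node, bit) pairs for one symbol, MSB first.

    The context node for a bit is determined entirely by the higher-order
    bits already visited: the root for the MSB, then one child per bit value.
    """
    node = 1  # root node: the null condition used for the MSB
    path = []
    for shift in range(n_bits - 1, -1, -1):
        bit = (symbol >> shift) & 1
        path.append((node, bit))
        node = 2 * node + bit  # descend to the child chosen by this bit
    return path
```

Each pair would be handed to a binary arithmetic coder as "encode this bit with the probability state stored for this node". With this numbering, the nodes visited for N-bit symbols are exactly 1 through 2^N - 1.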

The method according to the present invention differs from independent coding of the binary representation in that it uses conditional probability distributions. Independent coding of the binary representation deals only with the independent probability distributions $p(B_{N-1}), p(B_{N-2}), \ldots, p(B_0)$, whereas the method according to the embodiment of the present invention uses every node of the tree shown in FIG. 2 as a condition.

According to [Equation 2], this method yields a shorter code length than independent coding of the binary representation.
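The two-bit inequality extends to N bits. The following is a sketch that relies on two standard facts, the chain rule for entropy and the fact that conditioning does not increase entropy; the indexing assumes the document's convention that $B_{N-1}$ is the MSB and $B_0$ the least significant bit.

```latex
\begin{aligned}
H(X) = H(B_{N-1}, \ldots, B_0)
  &= \sum_{k=0}^{N-1} H(B_k \mid B_{N-1}, \ldots, B_{k+1}) \\
  &\le \sum_{k=0}^{N-1} H(B_k)
\end{aligned}
```

The first line is the cost targeted by coding each bit conditioned on all higher-order bits, and the second is the cost targeted by independent coding. Equality holds when the bits are mutually independent.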

The number of contexts is now analyzed. Encoding a multi-symbol random variable X with independent coding of its binary representation requires only N contexts, because the N bits are assumed to be independent of one another. According to the present invention, the values of all $N-k-1$ previously encoded bits must be known in order to encode the k-th bit, so encoding the k-th bit involves $2^{N-k-1}$ contexts. The total number of contexts needed to encode a random variable X represented as an N-bit binary number is therefore as follows.

[Equation 3]

$$\sum_{i=1}^{N} 2^{i-1} = 2^N - 1$$

If the number of symbols in the alphabet is not a power of 2, the total number of contexts is somewhat smaller. In a BAC, the number of contexts determines the amount of memory required to store the context states.
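The context count, including the reduction for alphabets whose size is not a power of 2, can be checked by enumerating the bit prefixes that actually occur. This is an illustrative sketch; `count_contexts` is a hypothetical helper and it assumes symbols are numbered 0 to M-1.

```python
def count_contexts(num_symbols: int, n_bits: int) -> int:
    """Number of distinct (bits already coded, prefix value) pairs that occur
    when every symbol 0..num_symbols-1 is coded MSB first in n_bits bits."""
    contexts = set()
    for symbol in range(num_symbols):
        for coded in range(n_bits):
            # Value of the `coded` highest-order bits, i.e. the tree node reached.
            prefix = symbol >> (n_bits - coded)
            contexts.add((coded, prefix))
    return len(contexts)
```

A full 2-bit alphabet needs 3 contexts and a full 8-bit alphabet needs 255, matching $2^N - 1$. A 5-symbol alphabet coded in 3 bits needs only 6, because some tree nodes are never reached.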

<Experimental Example>

An experiment was carried out to compare the compression ratio and computation time of the present invention with those of other methods. Six grayscale images with different characteristics were used, each consisting of 512 × 512 pixels at 8 bits per pixel. The results of compression and decompression with three different methods are shown in the tables of FIGS. 3 and 4. FIG. 3 compares the compression/decompression times, and FIG. 4 compares the compression ratios, of the encoding method according to the embodiment of the present invention and of the conventional encoding methods.

First, each pixel value was treated as a multi-level symbol and compressed with Witten's adaptive method, so 256 symbols were handled. Second, each pixel was converted into an 8-bit binary number and each bit was compressed separately using independent coding of the binary representation. Finally, each pixel value was converted into binary and each bit was compressed with the coder of the present invention, a binary arithmetic coder with conditional probabilities as described above. The QM coder was used as the binary arithmetic coder.

Computation time and compression ratio were compared on a Pentium III (1.0 GHz) computer with 512 MB of memory running the Windows 2000 operating system.

The time taken to compress and decompress with each method is shown in FIG. 3. Because the QM coder uses neither multiplication nor division to estimate the probability distribution and compute the probability interval, it is faster than a multi-symbol arithmetic coder such as Witten's.

In FIG. 4, independent coding of the binary representation shows a slightly lower compression ratio, which means that the eight bits are not independent of one another. Compressing the images with the method of the present invention yields a higher compression ratio than the other methods. The difference in compression ratio between Witten's coder and the method of the present invention appears to stem from differences in how the probabilities are estimated and how the intervals are computed.
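The effect of dependent bits on code length can be illustrated with an idealized cost estimate. The sketch below is not the QM coder and does not reproduce the experiment: it replaces the coder's state machine with simple adaptive counts (starting at 1 for each bit value) and sums the ideal cost of -log2(p) bits per coded bit. The function name, the heap-style context numbering, and the toy input are all assumptions.

```python
from math import log2


def adaptive_cost_bits(symbols: list[int], n_bits: int, conditional: bool) -> float:
    """Ideal adaptive code length in bits for coding symbols bit by bit, MSB first.

    conditional=True  : one probability state per tree node (all higher bits known).
    conditional=False : one probability state per bit position (independent coding).
    """
    counts: dict = {}  # context -> [count of 0s, count of 1s]
    total = 0.0
    for symbol in symbols:
        node = 1  # root node: null condition
        for shift in range(n_bits - 1, -1, -1):
            bit = (symbol >> shift) & 1
            context = node if conditional else ("position", shift)
            c = counts.setdefault(context, [1, 1])
            total += -log2(c[bit] / (c[0] + c[1]))
            c[bit] += 1
            node = 2 * node + bit
    return total


# Toy input in which the low bit always equals the high bit.
toy = [0b00, 0b11] * 200
```

On `toy`, the conditional variant learns that the second bit is fully determined by the first and spends almost nothing on it, while the independent variant pays roughly one bit for each of the two positions.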

In conclusion, the experimental results in FIGS. 3 and 4 show that the method of the present invention significantly improves both the compression ratio and the computation speed of multi-symbol data compression.

Although the embodiment described above uses the QM coder to encode the binary code, this is presented merely as an example, and other kinds of binary arithmetic coders may of course be used.

The present invention has been described in detail with reference to preferred embodiments, but its scope is not limited to them and should be interpreted according to the following claims. Those of ordinary skill in the art to which the invention pertains will also understand that various modifications and changes can be made without departing from the spirit of the invention.

As described above, according to the present invention, using binary arithmetic coding improves compression and decompression speed, and encoding the binary code with conditional probabilities increases the compression ratio.

Claims

1. A method for compressing multi-symbol data using a computer, the method comprising:

converting each symbol constituting the multi-symbol data into binary form; and

encoding the converted binary-form symbol using arithmetic coding.

2. The method of claim 1,

wherein a binary arithmetic coder (BAC) is used in the encoding step.

3. The method of claim 1,

wherein, in the encoding step, the bits of the converted binary-form symbol are encoded sequentially, and each bit after the first encoded bit is encoded using a conditional probability conditioned on the previously encoded bit(s).

4. A method for restoring, using a computer, multi-symbol data compressed by the method of any one of claims 1 to 3.

5. An apparatus for compressing multi-symbol data using a computer, the apparatus comprising:

a storage device; and

a processing device connected to the storage device,

wherein the storage device stores a program for controlling the processing device, and

the processing device, operating in conjunction with the program,

converts each symbol constituting the multi-symbol data into binary form, and

encodes the converted binary-form symbol using arithmetic coding.

6. The apparatus of claim 5,

wherein the converted binary-form symbol is encoded using a BAC.

7. The apparatus of claim 5,

wherein the bits of the converted binary-form symbol are encoded sequentially, and each bit after the first encoded bit is encoded using a conditional probability conditioned on the previously encoded bit(s).

8. An apparatus for decompressing multi-symbol data, the apparatus comprising:

a storage device; and

a processing device connected to the storage device,

wherein the storage device stores a program for controlling the processing device, and

the processing device, operating in conjunction with the program,

decompresses multi-symbol data compressed by the method of any one of claims 1 to 3.

9. A recording medium having recorded thereon a program that can be read and executed by a computer,

wherein the program, when run on the computer,

converts each symbol constituting multi-symbol data into binary form, and

encodes the converted binary-form symbol using arithmetic coding.