KR101590790B1

KR101590790B1 - Binary data compression and restoration method and apparatus

Info

Publication number: KR101590790B1
Application number: KR1020140025762A
Authority: KR
Inventors: 김정훈
Original assignee: 김정훈
Priority date: 2014-03-04
Filing date: 2014-03-04
Publication date: 2016-02-02
Also published as: KR20150103992A

Abstract

The present invention relates to a method of compressing binary data performed by a binary data compression apparatus, the method comprising: receiving original binary data; scanning the original binary data and dividing it into data division units; generating a reference dictionary by assigning each piece of scanned divided data to a group, and to a rank within that group, according to its frequency of occurrence; accumulating, with reference to the reference dictionary, group data indicating the group to which each piece of divided data belongs and rank data indicating its rank within that group, for each piece of divided data arranged in a specific direction of the original binary data; and generating compressed data by combining the reference dictionary with the accumulated group data and rank data.

Description

The present invention relates to a method and apparatus for compressing and restoring binary data, and more particularly to a method and apparatus that can compress and restore binary data effectively and efficiently through simple operations and a simple hardware configuration, and that can also improve data transmission speed and efficiency.

In general, the frequency bandwidth available on an ordinary transmission channel is limited, so transmission systems such as modems have relied on data compression techniques to compress or reduce the amount of data to be transmitted when sending large volumes of data.

One such compression technique is CCITT V.42 bis, a coding algorithm standardized by the International Telecommunication Union (ITU) and adopted in data transmission systems such as modems. The standard is based on the Ziv-Lempel code (ZLC): a dictionary is built adaptively from the input data, and the dictionary address that stores the same phrase as the preceding input data is transmitted as the codeword. The dictionary is updated by performing continuous string matching against the input data and adding to the dictionary the longest matching string combined with the unmatched character that follows it.

However, such conventional compression schemes involve complex processing operations for compressing and restoring data, require relatively high-specification hardware, limit how far the processing speed can be improved, and make it difficult to increase the reliability of the compression result.

The background art of the present invention is disclosed in Korean Patent Laid-Open Publication No. 1999-0022960 (published March 25, 1999).

The technical problem to be solved by the present invention is to provide a method and apparatus for compressing and restoring binary data that can compress and restore binary data quickly and efficiently through simple operations and a simple hardware configuration, achieve a high compression ratio, increase the reliability of the compressed and restored data, and improve transmission efficiency and speed when the data is transmitted.

According to one aspect of the present invention, there is provided a method of compressing binary data performed by a binary data compression apparatus, the method comprising: receiving original binary data; scanning the original binary data and dividing it into data division units; generating a reference dictionary by assigning each piece of scanned divided data to a group, and to a rank within that group, according to its frequency of occurrence; accumulating, with reference to the reference dictionary, group data indicating the group to which each piece of divided data belongs and rank data indicating its rank within that group, for each piece of divided data arranged in a specific direction of the original binary data; and generating compressed data by combining the reference dictionary with the accumulated group data and rank data.

In the present invention, the data division unit is a specific bit-length unit by which the original binary data is divided.

In the present invention, the number of pieces of divided data in each group is either the same for every group or can be set individually for each group.

In the present invention, when the reference dictionary is generated, each piece of divided data is assigned to a group in descending or ascending order of frequency of occurrence, and its rank within the group is designated at the same time.

In the present invention, when the group data is accumulated, a first binary digit or a second binary digit is accumulated step by step, repeated a number of times that indicates the group to which each piece of divided data belongs, and the first and second binary digits are accumulated alternately from step to step.

In the present invention, when the group data is accumulated, a binary number indicating the group to which each piece of divided data belongs is accumulated step by step, and the binary number is a uniquely decodable binary number.

In the present invention, the uniquely decodable binary number is a binary number that starts with "10" at the most significant bit and ends with one or more consecutive 0s or 1s.

The present invention further comprises receiving the data division unit by which the original binary data is divided and the number of groups into which the divided binary data is grouped.

According to another aspect of the present invention, there is provided a method of restoring binary data compressed by the binary data compression method, the method comprising restoring the original binary data from the group data and rank data contained in the compressed data, with reference to the reference dictionary.

According to yet another aspect of the present invention, there is provided a binary data compression apparatus for compressing binary data, the apparatus comprising: a data scanning unit that scans input original binary data and divides it into data division units; a reference dictionary generation unit that generates a reference dictionary by assigning each piece of scanned divided data to a group, and to a rank within that group, according to its frequency of occurrence; and a compression unit that, with reference to the reference dictionary, accumulates group data indicating the group to which each piece of divided data belongs and rank data indicating its rank within that group, for each piece of divided data arranged in a specific direction of the original binary data, and generates compressed data by combining the reference dictionary with the accumulated group data and rank data.

In the present invention, when the reference dictionary is generated, the reference dictionary generation unit assigns each piece of divided data to a group in descending or ascending order of frequency of occurrence and designates its rank within the group at the same time.

In the present invention, when the group data is accumulated, the compression unit accumulates a first binary digit or a second binary digit step by step, repeated a number of times that indicates the group to which each piece of divided data belongs, and the first and second binary digits are accumulated alternately from step to step.

In the present invention, when the group data is accumulated, the compression unit accumulates step by step a binary number indicating the group to which each piece of divided data belongs, and the binary number is a uniquely decodable binary number.

In the present invention, the data division unit by which the original binary data is divided and the number of groups into which the divided binary data is grouped are either input by a user or set in advance.

According to yet another aspect of the present invention, there is provided an apparatus for restoring binary data compressed by the binary data compression apparatus, the apparatus comprising a restoration unit that restores the original binary data from the group data and rank data contained in the compressed data, with reference to the reference dictionary.

The method and apparatus for compressing and restoring binary data according to the present invention can compress and restore binary data quickly and efficiently through simple operations and a simple hardware configuration, achieve a high compression ratio, increase the reliability of the compressed and restored data, and improve transmission efficiency and speed when the data is transmitted.

FIG. 1 shows the configuration of a binary data compression apparatus and restoration apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a binary data compression method according to an embodiment of the present invention.
FIG. 3 shows the distribution of the frequency of occurrence of each piece of divided data obtained by scanning in this embodiment.
FIG. 4 is a conceptual diagram illustrating how the compression unit accumulates group data and rank data from the original binary data with reference to the reference dictionary in this embodiment.
FIG. 5 shows an example of the reference dictionary generated by the reference dictionary generation unit in this embodiment.
FIG. 6 shows another example in which the compression unit accumulates group data and rank data from the original binary data with reference to the reference dictionary in this embodiment.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that a person of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here. Parts unrelated to the description are omitted from the drawings for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

The present invention is described below with reference to the drawings: FIG. 1 shows the configuration of a binary data compression apparatus and restoration apparatus according to an embodiment of the present invention; FIG. 2 is a flowchart illustrating a binary data compression method according to an embodiment of the present invention; FIG. 3 shows the distribution of the frequency of occurrence of each piece of divided data obtained by scanning in this embodiment; FIG. 4 is a conceptual diagram illustrating how the compression unit accumulates group data and rank data from the original binary data with reference to the reference dictionary; and FIG. 5 shows an example of the reference dictionary generated by the reference dictionary generation unit.

As shown in FIG. 1, the binary data compression apparatus 100 according to this embodiment includes a data scanning unit 110, a reference dictionary generation unit 120, a compression unit 130, and a transmission unit 140.

The data scanning unit 110 scans the input original binary data and divides it into data division units.

The reference dictionary generation unit 120 generates a reference dictionary by assigning each piece of scanned divided data to a group, and to a rank within that group, according to its frequency of occurrence.

With reference to the reference dictionary, the compression unit 130 accumulates group data indicating the group to which each piece of divided data belongs and rank data indicating its rank within that group, for each piece of divided data arranged in a specific direction of the original binary data, and generates compressed data by combining the reference dictionary with the accumulated group data and rank data.

Here, the data division unit is a specific bit-length unit by which the original binary data is divided, and the number of pieces of divided data in each group is either the same for every group or can be set individually for each group.

When generating the reference dictionary, the reference dictionary generation unit 120 may assign each piece of divided data to a group in descending or ascending order of frequency of occurrence and designate its rank within the group at the same time.

When accumulating the group data, the compression unit 130 may accumulate a first binary digit or a second binary digit step by step, repeated a number of times that indicates the group to which each piece of divided data belongs, with the first and second binary digits alternating from step to step.

Alternatively, when accumulating the group data, the compression unit 130 may accumulate step by step a binary number indicating the group to which each piece of divided data belongs, in which case the binary number is a uniquely decodable binary number. A binary number with unique decodability is one that, once accumulated, can be separated back into the individual codes unambiguously when it is later decoded. A uniquely decodable binary number may be a binary number that starts with "10" at the most significant bit and ends with one or more consecutive 0s or 1s. Binary codes such as "10", "100", "101", "1000", "1001", "1011", ... are examples, and various other kinds of codes may of course also be used.

The data division unit by which the original binary data is divided and the number of groups into which the divided binary data is grouped may be input by a user or set in advance.

Also as shown in FIG. 1, the binary data restoration apparatus 200 according to this embodiment includes a receiving unit 210 and a restoration unit 220. The receiving unit 210 receives the compressed data delivered through the transmission unit 140 or the like and passes it to the restoration unit 220.

The restoration unit 220 restores the original binary data from the group data and rank data contained in the compressed data, with reference to the reference dictionary.

The operation of this embodiment, configured as described above, is explained in detail below with reference to FIGS. 1 to 6.

First, as shown in FIG. 2, an input unit (not shown) of the binary data compression apparatus 100 receives the original binary data (S201).

Next, the input unit (not shown) receives from the user the data division unit by which the original binary data is to be divided and the number of groups into which the divided binary data is to be grouped (S202). Here, the data division unit is a specific bit-length unit by which the original binary data is divided. For example, if the data division unit is 8 bits, it is the value used to divide the original binary data shown at the top of FIG. 4 into successive 8-bit units, starting from the most significant bit (or the least significant bit). This embodiment is described for the case where the data division unit is 8 bits long and the number of groups is 4, but both can be set to various values according to user settings or input.

In this embodiment, the input unit (not shown) is described as receiving the data division unit and the number of groups after step S201, but this may instead be done before step S201. Depending on the embodiment, the data division unit and the number of groups may also be preset or stored in the compression apparatus, in which case step S202 is omitted.

Next, the data scanning unit 110 scans the input original binary data and divides it into data division units (S203). That is, the data scanning unit 110 divides the original binary data shown at the top of FIG. 4 in a specific direction, for example starting from the most significant bit (MSB) and proceeding toward the lower bits, and calculates statistics such as the frequency of each divided data value. As a result, the distribution of the frequency of occurrence of each piece of divided data is obtained, as shown in FIG. 3.
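The scanning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is modeled as a string of "0"/"1" characters, the names `split_units` and `count_frequencies` are hypothetical, and rejecting a partial trailing unit is a choice made here because the description does not say how such a remainder is handled.

```python
from collections import Counter


def split_units(bits: str, unit_bits: int = 8) -> list[str]:
    """Split a bit string into fixed-width units, scanning from the MSB side."""
    if len(bits) % unit_bits != 0:
        # Assumption: the source does not specify handling of a partial unit.
        raise ValueError("bit length must be a multiple of unit_bits")
    return [bits[i:i + unit_bits] for i in range(0, len(bits), unit_bits)]


def count_frequencies(units: list[str]) -> Counter:
    """Count how often each divided-data value occurs."""
    return Counter(units)


# The first four units are the ones quoted for FIG. 4; the fifth is made up
# here so that one value occurs more than once.
units = split_units("10111000" "11111111" "01010101" "10001010" "11111111")
freq = count_frequencies(units)
```

The resulting `freq` plays the role of the frequency distribution of FIG. 3.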

Next, the reference dictionary generation unit 120 generates a reference dictionary by assigning each piece of scanned divided data to a group, and to a rank within that group, according to its frequency of occurrence. That is, as shown in FIG. 5, the reference dictionary generation unit 120 groups the pieces of divided data into a plurality of groups according to their frequency of occurrence and lists them within each group in order of frequency, thereby generating the reference dictionary. Separate rank information does not need to be stored in the reference dictionary, because the order in which the divided data is listed in the dictionary, sorted by frequency in the original data, is itself the rank. FIG. 5 shows the case where, for divided data with the frequency distribution of FIG. 3, the divided data is listed in descending order of frequency to determine the groups and ranks, but the reverse order or some other method may also be used to generate the reference dictionary. In addition, FIG. 5 shows every group containing the same number of pieces of divided data (that is, the same number of ranks within each group), namely 64, but the number may instead be set individually for each group so that the grouping adapts to the frequency distribution and improves compression efficiency.
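A minimal sketch of the dictionary construction described above, assuming equal-sized groups and descending frequency order as in FIG. 5. The function name `build_reference_dictionary` is hypothetical. Two further assumptions are made here and are not stated in the source: values that never occur are still listed (so that 256 values fill 4 groups of 64), and ties in frequency are broken by the bit pattern itself.

```python
from collections import Counter


def build_reference_dictionary(units, unit_bits=8, group_count=4):
    """Return a list of groups; each group lists unit values by rank.

    The position of a value inside its group is its rank, so no separate
    rank field is stored.
    """
    total = 2 ** unit_bits
    if total % group_count != 0:
        raise ValueError("group_count must divide 2**unit_bits evenly")
    group_size = total // group_count
    freq = Counter(units)  # missing values count as 0
    all_values = [format(v, f"0{unit_bits}b") for v in range(total)]
    # Most frequent first; tie-break on the bit pattern (assumption).
    ordered = sorted(all_values, key=lambda u: (-freq[u], u))
    return [ordered[g * group_size:(g + 1) * group_size]
            for g in range(group_count)]


# Toy run with 2-bit units and 2 groups so the result is easy to read.
toy_units = ["11", "11", "11", "01", "01", "00"]
toy_dictionary = build_reference_dictionary(toy_units, unit_bits=2, group_count=2)
```

In the toy run, "11" lands at rank 1 of group 1 and "10", which never occurs, lands last in group 2.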

If every group contains the same number of pieces of divided data and the data division unit is 8 bits, an 8-bit binary number can represent 2⁸ = 256 values, so with 4 (= 2²) groups each group contains 64 (= 2⁶) pieces of divided data (that is, 64 ranks). Likewise, with 8 (= 2³) groups each group contains 32 (= 2⁵) pieces of divided data (32 ranks).
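The same arithmetic can be written in general form, assuming equal-sized groups and a power-of-two number of groups. The symbols below are introduced here for illustration and do not appear in the source: $n$ is the data division unit in bits, $G = 2^{g}$ is the number of groups, and $r$ is the number of bits needed for a rank within a group.

```latex
\[
  \frac{2^{n}}{G} = \frac{2^{n}}{2^{g}} = 2^{\,n-g}, \qquad r = n - g
\]
\[
  n = 8,\; G = 4:\quad \frac{256}{4} = 64 = 2^{6},\; r = 6
  \qquad\qquad
  n = 8,\; G = 8:\quad \frac{256}{8} = 32 = 2^{5},\; r = 5
\]
```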

Next, with reference to the reference dictionary, the compression unit 130 accumulates group data indicating the group to which each piece of divided data belongs and rank data indicating its rank within that group, for each piece of divided data arranged in a specific direction of the original binary data (S205). In particular, when accumulating the group data, the compression unit 130 accumulates a first binary digit (for example "1") or a second binary digit (for example "0") step by step, repeated as many times as the number of the group to which each piece of divided data belongs, with the first and second binary digits alternating from step to step. When accumulating the rank data, the compression unit 130 accumulates in a rank stack a value indicating the rank of each piece of divided data within its group.

Taking FIG. 4 as an example, the original binary data shown in FIG. 4 yields divided data such as "10111000", "11111111", "01010101", "10001010", ... starting from the most significant bit (MSB). Of these, "10111000" occurs infrequently and is ranked 6th in the fourth group, "11111111" occurs frequently and is ranked 1st in the first group, "01010101" is ranked 25th in the third group, and "10001010" is ranked 64th in the second group. For "10111000", the compression unit 130 first accumulates in the group stack as many 1s (the first binary digit) as correspond to the fourth group, and accumulates in the rank stack "000101", the 6th smallest 6-bit binary number. Then, for "11111111", it accumulates in the group stack as many 0s (the second binary digit) as correspond to the first group, and accumulates in the rank stack "000000", the smallest 6-bit binary number. Next, for "01010101", it accumulates in the group stack as many 1s (the first binary digit) as correspond to the third group, and accumulates in the rank stack "011000", the 25th smallest 6-bit binary number. Then, for "10001010", it accumulates in the group stack as many 0s (the second binary digit) as correspond to the second group, and accumulates in the rank stack "111111", the 64th smallest 6-bit binary number. The compression unit 130 performs this operation over the entire original binary data to accumulate the group data and the rank data within each group.
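The FIG. 4 walk-through can be reproduced with a short sketch. It takes the (group, rank) pairs directly, both 1-based as in the text, rather than looking them up in a dictionary. The function name `accumulate_alternating` and the choice to model both stacks as bit strings are assumptions made for illustration.

```python
def accumulate_alternating(pairs, rank_bits=6, first_digit="1"):
    """Build the group stack and rank stack for a sequence of (group, rank).

    group: 1-based group number, written as a run of identical digits whose
           length equals the group number; the digit alternates each step.
    rank:  1-based rank within the group, written as a fixed-width number.
    """
    group_stack, rank_stack = [], []
    digit = first_digit
    for group, rank in pairs:
        group_stack.append(digit * group)
        rank_stack.append(format(rank - 1, f"0{rank_bits}b"))
        digit = "0" if digit == "1" else "1"  # alternate for the next step
    return "".join(group_stack), "".join(rank_stack)


# (group, rank) for "10111000", "11111111", "01010101", "10001010"
pairs = [(4, 6), (1, 1), (3, 25), (2, 64)]
group_stack, rank_stack = accumulate_alternating(pairs)
# group_stack == "1111011100"
# rank_stack  == "000101000000011000111111"
```

Under this scheme a unit in group g costs g bits of group data plus the rank bits, so with 6-bit ranks only units in the first group (7 bits) come out shorter than the original 8 bits.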

FIG. 6 shows the accumulation of group data and rank data when the number of groups is 16 (= 2⁴) and the number of pieces of divided data in each group (the number of ranks) is also 16 (= 2⁴).

Meanwhile, when accumulating the group data, the compression unit 130 may instead accumulate step by step a uniquely decodable binary number as the binary number indicating the group to which each piece of divided data belongs. A binary number with unique decodability is one that, once accumulated, can be separated back into the individual codes unambiguously when it is later decoded. A uniquely decodable binary number may be a binary number that starts with "10" at the most significant bit and ends with one or more consecutive 0s or 1s. Binary codes such as "10", "100", "101", "1000", "1001", "1011", "10000", "10001", "10011", "10111", ... are examples, and various other kinds of codes may of course also be used. In the embodiment described above, the group data for FIG. 4 would be accumulated as 1000 (fourth group), 10 (first group), 101 (third group), and 100 (second group) in turn, giving the concatenated sequence 100010101100. Because these codes are uniquely decodable, the sequence can be cut automatically in front of every "10" it contains, so it separates into 1000/10/101/100 when restored later. Such uniquely decodable binary numbers are typically most effective when the number of groups is large, for example 10 or more.
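The listed codes all have the shape "1", then one or more 0s, then zero or more 1s, so "10" can only occur at the start of a code. The sketch below generates codes in the listed order and splits a concatenated sequence in front of every "10". The function names are hypothetical, and assigning the n-th code to the n-th group is inferred from the example (10 for the first group, 100 for the second, 101 for the third, 1000 for the fourth).

```python
def group_codes(count):
    """Return the first `count` codes: 10, 100, 101, 1000, 1001, 1011, ..."""
    codes = []
    length = 2
    while len(codes) < count:
        # For a given length, the number of trailing 1s grows from 0.
        for ones in range(length - 1):
            codes.append("1" + "0" * (length - 1 - ones) + "1" * ones)
        length += 1
    return codes[:count]


def split_group_codes(stream):
    """Cut a concatenated code sequence in front of every "10"."""
    starts = [i for i in range(len(stream) - 1)
              if stream[i] == "1" and stream[i + 1] == "0"]
    bounds = starts + [len(stream)]
    return [stream[a:b] for a, b in zip(bounds, bounds[1:])]


codes = group_codes(10)
# Groups 4, 1, 3, 2 from the FIG. 4 example (1-based group numbers).
stream = "".join(codes[g - 1] for g in [4, 1, 3, 2])
parts = split_group_codes(stream)
# stream == "100010101100", parts == ["1000", "10", "101", "100"]
```

Mapping each part back to a group number is then a lookup in `codes`, for example `codes.index("1000") + 1`.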

Next, the compression unit 130 generates compressed data by combining the reference dictionary with the accumulated group data and rank data (S206). Because the accumulated group data and rank data were produced by consulting the reference dictionary, the original binary data can only be restored later by using the same reference dictionary. The compression unit 130 therefore combines the reference dictionary with the accumulated group data and rank data to generate the compressed data.

Finally, the transmission unit 140 transmits the generated compressed data to a destination apparatus, for example the binary data restoration apparatus 200 (S207).

When the binary data has been compressed and transmitted through the above process, the binary data restoration apparatus 200 receives the compressed data through the reception unit 210 and passes it to the restoration unit 220. The restoration unit 220 refers to the reference dictionary included in the compressed data and restores the original binary data from the group data and rank data included in the compressed data. The restoration unit 220 restores the binary data by performing the compression process described above in reverse, using the accumulated group data and rank data. That is, restoration starts from the group data and rank data accumulated last; at each restoration step the divided data are restored using the reference dictionary, and once the final divided data have been restored, they are combined to restore the original binary data.
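The dictionary lookup at the heart of restoration can be sketched in Python as follows. This is a simplified illustration only: it assumes the compressed data has already been parsed into a dict holding the reference dictionary, a list of group numbers, and a list of ranks (a hypothetical layout), and it walks the lists front to back rather than modelling the step-by-step reverse procedure of the embodiment. The name `restore` and the sample values are made up for the example.

```python
def restore(compressed):
    """Rebuild the original bit string from group and rank data."""
    # Invert the reference dictionary: (group, rank) -> divided data.
    inverse = {position: seg for seg, position in compressed["dictionary"].items()}
    pairs = zip(compressed["groups"], compressed["ranks"])
    # Look up each divided data and join them in order.
    return "".join(inverse[(group, rank)] for group, rank in pairs)


sample = {
    "dictionary": {"00": (1, 1), "01": (1, 2), "10": (2, 1)},
    "groups": [1, 1, 1, 1, 2],
    "ranks": [1, 2, 1, 2, 1],
}
print(restore(sample))  # 0001000110
```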

As described above, the method and apparatus for compressing and restoring binary data according to the present embodiment can compress and restore binary data quickly and efficiently through simple operations and a simple hardware configuration, achieve an excellent compression ratio, improve the reliability of the compressed and restored data, and also improve transmission efficiency and speed when data is transmitted.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited thereto; various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

100: Binary data compression apparatus
110: Data scanning unit 120: Reference dictionary generating unit
130: Compression unit 140: Transmission unit
200: Binary data restoration apparatus
210: Reception unit 220: Restoration unit

Claims

A method of compressing binary data performed by a binary data compression apparatus, the method comprising:
Receiving original binary data;
Scanning and dividing the original binary data in data division units;
Generating a reference dictionary by assigning a group to each divided data and designating a rank within each group according to the frequency of occurrence of each scanned divided data;
Accumulating, by referring to the reference dictionary, group data indicating the group that includes each divided data and rank data within that group, for each divided data arranged in a specific direction of the original binary data; and
Combining the reference dictionary with the accumulated group data and rank data to generate compressed data,
Wherein, upon accumulation of the group data,
A first binary number and a second binary number corresponding to the group including the divided data are accumulated step by step, the first binary number and the second binary number being accumulated alternately at each step.

The method according to claim 1,
Wherein the data division unit means a specific bit length unit for dividing the original binary data.

The method according to claim 1,
Wherein the number of pieces of divided data included in each group is the same for each group or can be individually set for each group.

The method according to claim 1,
Wherein, when generating the reference dictionary, a group is assigned to each divided data in descending or ascending order of occurrence frequency, and the rank within each group is designated as well.

(Deleted)

A method of compressing binary data performed by a binary data compression apparatus, the method comprising:
Receiving original binary data;
Scanning and dividing the original binary data in data division units;
Generating a reference dictionary by assigning a group to each divided data and designating a rank within each group according to the frequency of occurrence of each scanned divided data;
Accumulating, by referring to the reference dictionary, group data indicating the group that includes each divided data and rank data within that group, for each divided data arranged in a specific direction of the original binary data; and
Combining the reference dictionary with the accumulated group data and rank data to generate compressed data,
Wherein, upon accumulation of the group data,
A binary number indicating the group including the divided data is accumulated step by step, the binary number being a uniquely decodable binary number.

The method according to claim 6,
Wherein the uniquely decodable binary number is a binary number that starts with "10" from the most significant bit and ends with one or more consecutive 0s or 1s.

The method according to claim 1,
Further comprising receiving, as input, the data division unit for dividing the original binary data and the number of groups for grouping the divided data.

A method for restoring binary data compressed by the binary data compression method according to any one of claims 1 to 4 and 6 to 8, the method comprising:
Restoring the original binary data from the group data and rank data included in the compressed data by referring to the reference dictionary.

A binary data compression apparatus that compresses binary data, the apparatus comprising:
A data scanning unit that scans and divides input original binary data in data division units;
A reference dictionary generating unit that generates a reference dictionary by assigning a group to each divided data and designating a rank within each group according to the frequency of occurrence of each scanned divided data; and
A compression unit that, by referring to the reference dictionary, accumulates group data indicating the group that includes each divided data and rank data within that group, for each divided data arranged in a specific direction of the original binary data, and generates compressed data by combining the reference dictionary with the accumulated group data and rank data,
Wherein, upon accumulation of the group data,
The compression unit accumulates a first binary number or a second binary number corresponding to the group including the divided data step by step, the first binary number and the second binary number being accumulated alternately at each step.

The apparatus according to claim 10,
Wherein the data division unit means a specific bit length unit for dividing the original binary data.

The apparatus according to claim 10,
Wherein the number of pieces of divided data included in each group is the same for each group or can be set individually for each group.

The apparatus according to claim 10,
Wherein, when generating the reference dictionary, the reference dictionary generating unit assigns a group to each divided data in descending or ascending order of occurrence frequency and designates the rank within each group as well.

(Deleted)

A binary data compression apparatus that compresses binary data, the apparatus comprising:
A data scanning unit that scans and divides input original binary data in data division units;
A reference dictionary generating unit that generates a reference dictionary by assigning a group to each divided data and designating a rank within each group according to the frequency of occurrence of each scanned divided data; and
A compression unit that, by referring to the reference dictionary, accumulates group data indicating the group that includes each divided data and rank data within that group, for each divided data arranged in a specific direction of the original binary data, and generates compressed data by combining the reference dictionary with the accumulated group data and rank data,
Wherein, upon accumulation of the group data,
The compression unit accumulates a binary number indicating the group including the divided data step by step, the binary number being a uniquely decodable binary number.

The apparatus according to claim 15,
Wherein the uniquely decodable binary number is a binary number that starts with "10" from the most significant bit and ends with one or more consecutive 0s or 1s.

The apparatus according to claim 10,
Wherein the data division unit for dividing the original binary data and the number of groups for grouping the divided data are input by a user or set in advance.

An apparatus for restoring binary data compressed by the binary data compression apparatus according to any one of claims 10 to 13 and 15 to 17, the apparatus comprising:
A restoration unit that restores the original binary data from the group data and rank data included in the compressed data by referring to the reference dictionary.