KR101652735B1

KR101652735B1 - Binary data compression and restoration method and apparatus

Info

Publication number: KR101652735B1
Application number: KR1020150069003A
Authority: KR
Inventors: 김정훈
Original assignee: 김정훈
Priority date: 2015-05-18
Filing date: 2015-05-18
Publication date: 2016-08-31

Abstract

The present invention relates to a method of compressing binary data in a compression apparatus. The method includes a step in which a compression unit generates a compressed cluster group by compressing a binary cluster group having a plurality of binary clusters encountered while moving in a first direction from a first position of original binary data. The step of generating the compressed cluster group comprises: dividing the original binary data each time a bit value is inverted to obtain the plurality of binary clusters; acquiring the binary cluster group, which consists of the binary clusters from the first position up to the point at which a scatter value for the bit lengths of the binary clusters exceeds a first reference value for the first time; and compressing the last binary cluster included in the binary cluster group.

Description

TECHNICAL FIELD [0001] The present invention relates to a method and apparatus for compressing and restoring binary data, and more particularly, to a method and apparatus that compress and restore binary data quickly and efficiently through simple operations and a simple hardware configuration.

In general, since the frequency bandwidth available in a normal transmission channel is limited, various transmission systems such as a modem have used an effective data compression technique to compress or reduce the amount of transmission data in order to transmit a large amount of data.

One of the various compression schemes is CCITT V.42 bis, a coding algorithm standardized by the International Telecommunication Union (ITU) and employed in data transmission systems such as modems. This coding standard is based on the Ziv-Lempel code (ZLC). In this method, a dictionary is formed adaptively from the input data, and the address of the dictionary entry that stores the same phrase as previously input data is used as the codeword. The dictionary operation performs continuous string matching against the input data and updates the dictionary by appending the unmatched character to the longest matching string and adding the result to the dictionary.

However, such a conventional compression method requires complicated processing for data compression and decompression and a relatively high-performance hardware device, which limits improvement of the processing speed, and there have been problems with the reliability of the compression result.

The background art of the present invention is disclosed in Korean Patent Laid-Open Publication No. 1999-0022960 (published on Mar. 25, 1999).

SUMMARY OF THE INVENTION The present invention has been made in view of the above problems, and it is an object of the present invention to provide a method and apparatus for compressing and restoring binary data that can compress and restore binary data quickly and efficiently through simple computation and a simple hardware configuration, and that can improve transmission efficiency and speed.

According to an aspect of the present invention, there is provided a method of compressing binary data in a compression device, the method comprising a step in which a compression unit generates a compressed cluster group by compressing a binary cluster group having at least one binary cluster encountered while moving in a first direction from a first position of original binary data. The step of generating the compressed cluster group comprises: dividing the original binary data every time a bit value is inverted to obtain the at least one binary cluster; acquiring the binary cluster group, which consists of the binary clusters from the first position up to the point at which a scatter value for the bit lengths of the binary clusters exceeds a first reference value for the first time; and compressing the last binary cluster included in the binary cluster group.

In the present invention, the step of compressing the last binary cluster comprises: generating a compressed cluster by reducing the bit length of the last binary cluster, wherein the bit length of the compressed cluster is the minimum bit length at which the scatter value still exceeds the reference value; and merging, into the compressed cluster, an identification code indicating the bit length by which the last binary cluster was reduced.

In the present invention, the identification code is a universal code. The universal code either includes a first binary number of 1-bit length at the most significant bit, at least one second binary number following the first binary number, and at least one first binary number following the at least one second binary number, or consists of two consecutive first binary numbers (2-bit length). The second binary number is a 1-bit binary number that is the inverse of the first binary number.

The present invention is characterized by further comprising the step of generating the compressed cluster group from the next position of the generated current compressed cluster group.

The present invention is characterized in that the generation of the compressed cluster group is repeated from the next position of the generated current cluster group, and the bit length of the compressed cluster included in the current cluster group is further reflected when the scatter value is calculated.

In the present invention, each bit value of the next compressed cluster group following the current compressed cluster group is inverted or left uninverted depending on the last bit value of the current compressed cluster group.

In the present invention, the variance, standard deviation, skewness, or kurtosis is adopted as the scatter value.

According to another aspect of the present invention, there is provided a method of restoring, in a restoration device, binary data compressed by the binary data compression method, the method comprising a step in which a restoration unit restores the binary data by restoring each compressed cluster group with reference to the identification code.

According to another aspect of the present invention, there is provided a binary data compression apparatus comprising a compression unit that generates a compressed cluster group by compressing a binary cluster group having at least one binary cluster encountered while moving in a first direction from a first position of original binary data, wherein, when generating the compressed cluster group, the compression unit divides the original binary data every time a bit value is inverted to obtain the at least one binary cluster, acquires the binary cluster group consisting of the binary clusters from the first position up to the point at which the scatter value for the bit lengths of the binary clusters exceeds the first reference value for the first time, and compresses the last binary cluster included in the binary cluster group.

In the present invention, the compression unit compresses the last binary cluster by merging a compressed cluster, generated by reducing the bit length of the last binary cluster, with an identification code indicating the bit length by which the last binary cluster was reduced, wherein the bit length of the compressed cluster is the minimum bit length at which the scatter value still exceeds the reference value.

In the present invention, the compression unit repeatedly generates the next compressed cluster group from the next position of the generated current compressed cluster group.

In the present invention, the compression unit generates the next compressed cluster group using the compressed cluster included in the last binary cluster of the current compressed cluster group.

According to another aspect of the present invention, there is provided a binary data decompression apparatus for restoring binary data compressed by the binary data compression apparatus, the binary data decompression apparatus comprising a restoring unit that restores the binary data by restoring each compressed cluster group with reference to the identification code.

The method and apparatus for compressing and restoring binary data according to the present invention can compress and restore binary data quickly and efficiently through simple operations and a simple hardware configuration. They also provide an excellent compression rate and reliable compressed and restored data, and can thereby improve the efficiency and speed of data transmission.

FIG. 1 is a block diagram of a binary data compression apparatus and a decompression apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method of compressing binary data according to an embodiment of the present invention.
FIG. 3 shows an example of merging a universal code into a compressed cluster.
FIG. 4 shows another example of merging the universal code into a compressed cluster.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily carry out the present invention. The present invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. In order to clearly illustrate the present invention, parts not related to the description are omitted, and like parts are denoted by similar reference numerals throughout the specification.

Throughout the specification, when an element is referred to as "comprising ", it means that it can include other elements as well, without excluding other elements unless specifically stated otherwise.

FIG. 1 is a block diagram of a binary data compression apparatus and a decompression apparatus according to an embodiment of the present invention. Referring to FIG. 1, an embodiment according to the present invention will be described below.

As shown in FIG. 1, the binary data compression apparatus 100 according to the present embodiment includes a compression unit 110 and an output unit 120.

The compression unit 110 generates a compressed cluster group by compressing a binary cluster group having at least one binary cluster encountered while moving in a first direction (e.g., toward the lower bits) from a first position (e.g., the most significant bit) of the original binary data. The first position is the most significant bit when compression starts from the beginning of the original binary data, and may be any specific position in the original binary data when the compression operation according to the present embodiment is performed repeatedly in a specific direction.

When generating the compressed cluster group, the compression unit 110 divides the original binary data every time the bit value is inverted to obtain at least one binary cluster consisting only of 0 bits or only of 1 bits. The compression unit 110 acquires the binary cluster group composed of the binary clusters from the first position up to the point at which the scatter value for the bit lengths of the binary clusters exceeds the first reference value (for example, 100) for the first time, and compresses the last binary cluster contained in that group.

The compression unit 110 compresses the last binary cluster by merging the compressed cluster, generated by reducing the bit length of the last binary cluster, with the identification code indicating the bit length by which the last binary cluster was reduced. Here, the bit length of the compressed cluster is the minimum bit length at which the scatter value still exceeds the reference value.

The identification code is a universal code. The universal code either includes a first binary number of 1-bit length at the most significant bit, at least one second binary number following the first binary number, and at least one first binary number following the at least one second binary number, or consists of two consecutive first binary numbers (2-bit length). The second binary number is a 1-bit binary number that is the inverse of the first binary number. For example, if the first binary number is 1 and the second binary number is 0, the universal code may be a code such as 100111, 1000011, 1011111, and so on. Conversely, if the first binary number is 0 and the second binary number is 1, the universal code may be a code such as 0111100, 010000, 01111000, and so on. In the 2-bit form, where the first binary number is 1 or 0, the universal code is 11 or 00. A more detailed description of the universal code follows.

The compressing unit 110 may compress the original binary data by repeatedly performing the compression operation from the next position of the generated current compressed cluster group to repeatedly generate the compressed cluster group.

Alternatively, the compressing unit 110 may compress the original binary data by repeatedly generating a next compressed cluster group using the compressed cluster included in the generated current compressed cluster group, as described in detail later.

The dispersion value, the standard deviation, the skewness, or the kurtosis may be employed as the scatter value. In addition to the above examples, various statistical indicators may be employed that may represent the scatter value of the cluster length values.

As shown in FIG. 1, the binary data restoration apparatus 200 according to the present embodiment includes an input unit 210 and a restoring unit 220. The input unit 210 receives the compressed data transmitted through the output unit 120 and passes it to the restoring unit 220.

The restoring unit 220 reconstructs each compressed cluster group by referring to the identification code, and restores the binary data.

The operation of the present embodiment configured as described above will be described in detail with reference to FIGS. 1 and 2.

FIG. 2 is a flowchart for explaining a method of compressing binary data according to an embodiment of the present invention. Referring to FIG. 2, a method of compressing binary data according to this embodiment will be described.

First, the compression unit 110 generates a compressed cluster group by compressing a binary cluster group having at least one binary cluster encountered while moving in a first direction from a first position of the original binary data (S210). Here, the first position may be the most significant bit of the original binary data, or may be a specific position in the original binary data when the compression operation according to the present embodiment is performed repeatedly in a specific direction. The first direction may run from the most significant bit of the original binary data toward the lower bits, or from the least significant bit toward the upper bits. In this embodiment, the direction toward the lower bits is described as an example.

In step S210, the compression unit 110 divides the original binary data every time the bit value is inverted to obtain at least one binary cluster (S211).

For example, suppose you have 2,064,386 bits of binary data:

11011010000110011110001000111100000101000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011111000000000000011000000001111111011111111000010010000000000011 ....

The compression unit 110 scans the binary data one bit at a time from the most significant bit toward the lower bits and conceptually divides the binary stream every time the bit value changes (every time the bit value is inverted). The "-" below is only a notational convention marking the conceptual division of the binary stream; it is not present in the actual data. Each segment delimited by "-" is referred to as a binary cluster in this embodiment.

11-0-11-0-1-0000-11-00-1111-000-1-000-1111-00000-1-0-1-0000-11-0-11-000-1-000-11- 0-1-0-111-0000-1-00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000-11111-000000000000000-11-00000000-1111111-0-11111111-0000-1-00-1-0000000000000-11 ....

As can be seen from the above result, when the binary data starts with "1", the even-numbered binary clusters consist only of "0" bits and the odd-numbered binary clusters consist only of "1" bits.
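The division step (S211) can be illustrated with a minimal Python sketch. The function name `split_clusters` and the use of a text string of "0"/"1" characters are assumptions made for illustration only; the patent does not prescribe a data representation.

```python
from itertools import groupby


def split_clusters(bits: str) -> list[str]:
    """Split a bit string into binary clusters (maximal runs of identical bits).

    Hypothetical helper: the patent describes the division conceptually and
    does not name a function or a data type for it.
    """
    return ["".join(run) for _, run in groupby(bits)]


# First 13 bits of the example data in the text.
clusters = split_clusters("1101101000011")
print("-".join(clusters))  # 11-0-11-0-1-0000-11
```

Because the data starts with "1", the clusters at even positions in this output consist only of 0s, matching the observation above.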

Next, the compression unit 110 acquires a binary cluster group composed of the binary cluster(s) from the first position up to the point at which the scatter value for the bit lengths of the binary clusters exceeds the preset reference value for the first time (S212).

The compression unit 110 reads the binary clusters sequentially from the first position (e.g., the most significant bit) of the binary data toward the lower bits, checks the bit length of each binary cluster obtained, and accumulates the bit lengths to calculate the scatter value. The scatter value is recalculated continuously until the scatter value of the accumulated bit lengths exceeds a preset reference value. The variance, standard deviation, skewness, or kurtosis of the bit lengths may be employed as the scatter value, and various other dispersion measures capable of representing the spread of a frequency distribution may also be applied. In this embodiment, the variance is described as an example.

Table 1 below shows the result of checking the binary values of binary clusters sequentially from the original binary data and analyzing the bit length of each binary cluster.

Table 1

Binary cluster | Binary cluster bit length | Variance
11 | 2 | N/A
0 | 1 | 0.5
11 | 2 | 0.3333
0 | 1 | 0.3333
1 | 1 | 0.3
0000 | 4 | 1.3667
11 | 2 | 1.1429
00 | 2 | 0.9821
1111 | 4 | 1.3611
000 | 3 | 1.2889
1 | 1 | 1.2909
000 | 3 | 1.2424
1111 | 4 | 1.3974
00000 | 5 | 1.8077
1 | 1 | 1.8286
0 | 1 | 1.8292
1 | 1 | 1.8162
0000 | 4 | 1.8824
11 | 2 | 1.7836
0 | 1 | 1.7763
11 | 2 | 1.6905
000 | 3 | 1.6364
1 | 1 | 1.6324
000 | 3 | 1.587
11 | 2 | 1.5233
0 | 1 | 1.5215
1 | 1 | 1.5157
0 | 1 | 1.5066
111 | 3 | 1.4803
0000 | 4 | 1.5448
1 | 1 | 1.5398
000...000 | 130 | 512.2
... | ... | ...

As shown in Table 1, the bit length values are analyzed as the binary clusters are divided off one after another. For example, the variance of the bit lengths of the first and second binary clusters is 0.5. Next, the variance of the bit lengths of the first to third binary clusters is calculated to be 0.3333. These operations are repeated sequentially until the variance of the accumulated bit lengths exceeds a preset reference value (for example, 100). The compression unit 110 obtains a binary cluster group composed of the binary clusters up to the point at which the variance exceeds the reference value.

This is done to increase the efficiency of the data compression described below. Referring to Table 1, the variance of the bit lengths stays below 2 from the first to the 31st binary cluster, indicating that no binary cluster of excessive length has appeared. However, the variance from the first to the 32nd binary cluster jumps to 512.2, exceeding the reference value of 100, because the bit length of the 32nd binary cluster suddenly increases to 130. In this embodiment, the compression operation described later is performed on the 32nd binary cluster. The reference value is set to 100 in this embodiment, but the system designer or the user may set it to another appropriate value.
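The group acquisition step (S212) can be sketched as follows. The values in Table 1 are reproduced by the sample variance (divisor n-1), which is what Python's `statistics.variance` computes; the function name `first_group_end` is hypothetical, and recomputing the variance from scratch for every prefix is a simplification chosen for clarity rather than speed.

```python
from statistics import variance


def first_group_end(lengths, reference=100.0):
    """Return the 0-based index of the cluster at which the running sample
    variance of the bit lengths first exceeds `reference`, or None.

    Illustrative sketch only: the patent does not specify an implementation.
    """
    for end in range(2, len(lengths) + 1):  # variance needs at least 2 values
        if variance(lengths[:end]) > reference:
            return end - 1
    return None


# Bit lengths of the 32 binary clusters listed in Table 1.
TABLE1_LENGTHS = [2, 1, 2, 1, 1, 4, 2, 2, 4, 3, 1, 3, 4, 5, 1, 1,
                  1, 4, 2, 1, 2, 3, 1, 3, 2, 1, 1, 1, 3, 4, 1, 130]

end = first_group_end(TABLE1_LENGTHS)
print(end + 1, round(variance(TABLE1_LENGTHS[: end + 1]), 4))  # 32 512.2006
```

A production implementation would more likely update running sums of the lengths and their squares instead of rescanning the prefix on each step.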

Next, the compression unit 110 compresses the last binary cluster included in the acquired binary cluster group (S213). First, the compression unit 110 generates a compressed cluster by reducing the bit length of the last binary cluster of the binary cluster group. The bit length of the compressed cluster is the minimum bit length at which the variance still exceeds the reference value. In the above example, the bit length of the last binary cluster (the 32nd binary cluster) to be compressed is 130, and the variance is recalculated while this bit length is reduced one bit at a time. As the bit length is reduced, the variance falls to 100 or below at some point. In this embodiment, the variance is 102.4476 when the bit length is 59 bits and 98.92641 when the bit length is 58 bits. The compression unit 110 therefore compresses the last binary cluster down to 59 bits, the minimum bit length at which the variance exceeds the reference value of 100. As a result, the last binary cluster is reduced (compressed) by 71 bits, from a length of 130 bits to a length of 59 bits.
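The search for the minimum bit length can be sketched in Python as below. This is an illustration under stated assumptions: the function name `min_compressed_length` is hypothetical, the sample variance is used as in Table 1, and the lower bound of 1 bit follows the later remark that a compressed cluster must be at least 1 bit long.

```python
from statistics import variance


def min_compressed_length(prior_lengths, last_length, reference=100.0):
    """Shrink the last cluster one bit at a time and return the smallest
    length (at least 1) at which the sample variance still exceeds
    `reference`. Hypothetical helper for illustration only.
    """
    length = last_length
    while length > 1 and variance(list(prior_lengths) + [length - 1]) > reference:
        length -= 1
    return length


# Bit lengths of the first 31 binary clusters of Table 1.
PRIOR = [2, 1, 2, 1, 1, 4, 2, 2, 4, 3, 1, 3, 4, 5, 1, 1,
         1, 4, 2, 1, 2, 3, 1, 3, 2, 1, 1, 1, 3, 4, 1]

kept = min_compressed_length(PRIOR, 130)
print(kept, 130 - kept)  # 59 71
```

The result matches the text: the 130-bit cluster is kept at 59 bits, a reduction of 71 bits.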

Then, the compression unit 110 merges into the compressed cluster the identification code indicating the bit length by which the last binary cluster was reduced. The identification code is a universal code. The universal code either includes a first binary number of 1-bit length at the most significant bit, at least one second binary number following the first binary number, and at least one first binary number following the at least one second binary number, or consists of two consecutive first binary numbers (2-bit length). The second binary number is a 1-bit binary number that is the inverse of the first binary number. For example, if the first binary number is 1 and the second binary number is 0, the universal code may be a code such as 100111, 1000011, 1011111, and so on. Conversely, if the first binary number is 0 and the second binary number is 1, the universal code may be a code such as 0111100, 010000, 01111000, and so on. In the 2-bit form, where the first binary number is 1 or 0, the universal code is 11 or 00.

In the above-described example, the compression unit 110 appends an identification code (universal code) indicating that the compressed cluster was reduced by 71 bits, from a 130-bit length to a 59-bit length. The universal codes are mapped sequentially to integers starting from 0, as shown in Table 2 below. Except for "11", each universal code is built from an n-digit background code (n >= 3) consisting of one "1" followed by n-1 "0"s, such as "100", "1000", "10000", and so on: the binary codes generated by filling in "1"s one at a time from the least significant bit of the background code are taken as universal codes. In every universal code other than "11", at least one "0" must remain after the leading "1". When filling in another "1" would no longer satisfy this condition, n is increased by one and the process is repeated from the next background code. Following this rule yields the universal codes shown in Table 2 below, and these universal codes are mapped sequentially to the compressed bit lengths (0, 1, 2, 3, ...).

Table 2

Compressed bit length | Universal code | Universal code length
0 | 11 | 2
1 | 101 | 3
2 | 1001 | 4
3 | 1011 | 4
4 | 10001 | 5
5 | 10011 | 5
6 | 10111 | 5
7 | 100001 | 6
8 | 100011 | 6
9 | 100111 | 6
10 | 101111 | 6
11 | 1000001 | 7
12 | 1000011 | 7
13 | 1000111 | 7
14 | 1001111 | 7
15 | 1011111 | 7
16 | 10000001 | 8
17 | 10000011 | 8
18 | 10000111 | 8
19 | 10001111 | 8
20 | 10011111 | 8
21 | 10111111 | 8
22 | 100000001 | 9
23 | 100000011 | 9
24 | 100000111 | 9
25 | 100001111 | 9
26 | 100011111 | 9
27 | 100111111 | 9
28 | 101111111 | 9
29 | 1000000001 | 10
30 | 1000000011 | 10
31 | 1000000111 | 10
32 | 1000001111 | 10
33 | 1000011111 | 10
34 | 1000111111 | 10
35 | 1001111111 | 10
36 | 1011111111 | 10
37 | 10000000001 | 11
38 | 10000000011 | 11
39 | 10000000111 | 11
40 | 10000001111 | 11
41 | 10000011111 | 11
42 | 10000111111 | 11
43 | 10001111111 | 11
44 | 10011111111 | 11
45 | 10111111111 | 11
46 | 100000000001 | 12
47 | 100000000011 | 12
48 | 100000000111 | 12
49 | 100000001111 | 12
50 | 100000011111 | 12
51 | 100000111111 | 12
52 | 100001111111 | 12
53 | 100011111111 | 12
54 | 100111111111 | 12
55 | 101111111111 | 12
56 | 1000000000001 | 13
57 | 1000000000011 | 13
58 | 1000000000111 | 13
59 | 1000000001111 | 13
60 | 1000000011111 | 13
61 | 1000000111111 | 13
62 | 1000001111111 | 13
63 | 1000011111111 | 13
64 | 1000111111111 | 13
65 | 1001111111111 | 13
66 | 1011111111111 | 13
67 | 10000000000001 | 14
68 | 10000000000011 | 14
69 | 10000000000111 | 14
70 | 10000000001111 | 14
71 | 10000000011111 | 14
72 | 10000000111111 | 14
... | ... | ...
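The mapping in Table 2 can be reproduced with a short Python sketch. The function name `universal_code` is hypothetical, and the generation rule used here (a leading "1", at least one "0", then a run of "1"s, enumerated by increasing code length) is inferred from the rows of Table 2 rather than quoted from the patent.

```python
def universal_code(k: int) -> str:
    """Return the universal code that Table 2 assigns to the integer k.

    Assumed rule: k = 0 maps to "11"; for n = 3, 4, 5, ... there are
    n - 2 codes of length n, formed as "1" + zeros + j ones, j = 1 .. n-2.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return "11"
    n = 3              # current code length
    remaining = k - 1  # position among the codes of length >= 3
    while remaining >= n - 2:
        remaining -= n - 2
        n += 1
    ones = remaining + 1
    return "1" + "0" * (n - 1 - ones) + "1" * ones


print(universal_code(71))  # 10000000011111
```

For the example in the text, a reduction of 71 bits yields the 14-bit code 10000000011111, as in Table 2.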

FIG. 3 shows an example of merging a universal code into a compressed cluster, for the case where the compressed cluster consists only of 0s. As shown in FIG. 3, the universal code 10000000011111, which indicates that the last binary cluster was compressed by 71 bits from a 130-bit length to a 59-bit length, is merged into the compressed cluster 000 ... 000 (59 bits long). When the compressed cluster consists only of 0s, as in this example, the codes in Table 2 can be used as the universal codes as they are: the compressed cluster and the universal code can be told apart, so smooth restoration is possible.

However, if the compressed cluster consists only of 1s and the codes shown in Table 2 are used as they are, it is impossible to tell where the compressed cluster ends and the universal code begins. In this case, therefore, a code in which every 0 and 1 of the universal code shown in Table 2 is inverted is used as the universal code, so that it can be distinguished from the compressed cluster consisting only of 1s. As shown in FIG. 4, the universal code 01111111100000 (each bit of 10000000011111 inverted), which identifies a compression of 71 bits, is attached to the compressed cluster 111 ... 111 (59 bits long). In this way, the compressed cluster and the universal code can be identified separately.

On the other hand, when the last binary cluster already has a bit length that cannot be compressed further (59 bits in the above example), the compressed bit length is 0, so "00" or "11" is attached as the universal code. That is, if the last binary cluster was originally 111 ... 111 (59 bits long), the universal code "00" is attached to it, and if it was originally 000 ... 000 (59 bits long), the universal code "11" is attached to it, so that they can be identified.
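The merging cases of FIG. 3 and FIG. 4, together with the zero-compression case, can be summarized in a minimal Python sketch. The function name `merge_identification_code` and its parameters are hypothetical; the Table 2 code is passed in as a string so that the sketch stays independent of how the code is generated.

```python
INVERT = str.maketrans("01", "10")


def merge_identification_code(bit: str, compressed_length: int, table2_code: str) -> str:
    """Append the universal code to a compressed cluster.

    `bit` is the bit value ("0" or "1") the cluster consists of. For a
    cluster of 1s the Table 2 code is bit-inverted, as described for FIG. 4.
    Illustrative sketch, not the patent's implementation.
    """
    cluster = bit * compressed_length
    code = table2_code if bit == "0" else table2_code.translate(INVERT)
    return cluster + code


# FIG. 3 case: 59 zeros followed by the Table 2 code for 71.
fig3 = merge_identification_code("0", 59, "10000000011111")
# FIG. 4 case: 59 ones followed by the inverted code.
fig4 = merge_identification_code("1", 59, "10000000011111")
print(fig3[-14:], fig4[-14:])  # 10000000011111 01111111100000
```

In both cases the first bit of the appended code differs from the bits of the compressed cluster, which is what allows the boundary between the two to be found.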

In this way, the compression unit 110 compresses the last binary cluster included in the binary cluster group, and combines the universal code with the last binary cluster, thereby creating a compressed cluster group. In the above example, since the last cluster is reduced by 71 bits, while the 14-bit universal code is added, the corresponding binary cluster group is compressed by 57 bits in length.

Table 3 shows the result of performing the compression operation on the 32nd binary cluster of Table 1.

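Table 3

Binary cluster | Original binary cluster length | Compression result | Compressed cluster length | Compressed cluster | Universal code | Universal code length | Initial variance | Variance after compression
11 | 2 | 11 | | | | | |
0 | 1 | 0 | | | | | |
11 | 2 | 11 | | | | | |
0 | 1 | 0 | | | | | |
1 | 1 | 1 | | | | | |
0000 | 4 | 0000 | | | | | |
11 | 2 | 11 | | | | | |
00 | 2 | 00 | | | | | |
1111 | 4 | 1111 | | | | | |
000 | 3 | 000 | | | | | |
1 | 1 | 1 | | | | | |
000 | 3 | 000 | | | | | |
1111 | 4 | 1111 | | | | | |
00000 | 5 | 00000 | | | | | |
1 | 1 | 1 | | | | | |
0 | 1 | 0 | | | | | |
1 | 1 | 1 | | | | | |
0000 | 4 | 0000 | | | | | |
11 | 2 | 11 | | | | | |
0 | 1 | 0 | | | | | |
11 | 2 | 11 | | | | | |
000 | 3 | 000 | | | | | |
1 | 1 | 1 | | | | | |
000 | 3 | 000 | | | | | |
11 | 2 | 11 | | | | | |
0 | 1 | 0 | | | | | |
1 | 1 | 1 | | | | | |
0 | 1 | 0 | | | | | |
111 | 3 | 111 | | | | | |
0000 | 4 | 0000 | | | | | |
1 | 1 | 1 | | | | | |
000...000 | 130 | 000...000 (59 zeros) followed by 10000000011111 | 59 | 000...000 (59 zeros) | 10000000011111 | 14 | 512.2006 | 102.4476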

Next, the compression unit 110 determines whether the last bit of the original binary data has been reached (S220). If it is determined that the last bit has been reached, the compression operation is terminated. If the last bit is not reached yet, the process returns to step S210, and the above-described process is repeated to perform the compression operation. That is, the compression unit 110 compresses the binary cluster group to generate a compressed cluster group, and then performs compression on the next binary cluster group. This operation is performed on the entire original binary data.

However, each bit value of the next compressed cluster group following the generated current compressed cluster group may be inverted or left uninverted depending on the last bit value of the current compressed cluster group. That is, whether the bit values of the next compressed cluster group are inverted is determined by the last bit value of the current compressed cluster group. By doing so, the current compressed cluster group and the next compressed cluster group can be distinguished without a separate identifier. This is explained below with reference to the above embodiment.

Suppose the original binary data is 11 ... 000 (130 bits) / 11000111001111 ..., where the first binary cluster group (the first through 32nd binary clusters) ends in 0 and its universal code is 10000000011111. The first compressed cluster group then becomes 11 ... 00010000000011111 (the trailing 10000000011111 is the universal code). If the next binary cluster group 11000111001111 ... (from the 33rd binary cluster onward) were appended as it is, the boundary between the two could not be identified, and the data could not be restored later. Therefore, in the present embodiment, whether each bit value of the next compressed cluster group is inverted is determined by the value of the last bit of the current compressed cluster group, so that neighboring compressed cluster groups can be distinguished. In the above embodiment, since the last bit of the current compressed cluster group is 1, the next binary cluster group is compressed and then appended with its bit values inverted, in the form 00111000110000 ....
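The inversion rule can be sketched in Python as follows. The text only states that inversion depends on the last bit of the current compressed cluster group; the concrete rule used here (invert the next group when its first bit equals the last bit of the stream so far) is an assumption that reproduces the example above, and the function name `append_group` is hypothetical.

```python
INVERT = str.maketrans("01", "10")


def append_group(stream: str, next_group: str) -> str:
    """Append `next_group` to `stream`, inverting every bit of `next_group`
    when its first bit equals the last bit already written.

    Assumed rule for illustration; the patent states only that inversion is
    decided from the last bit of the current compressed cluster group.
    """
    if stream and next_group and stream[-1] == next_group[0]:
        next_group = next_group.translate(INVERT)
    return stream + next_group


# Tail of the first compressed cluster group, then the next group from the text.
tail = "000" + "10000000011111"
print(append_group(tail, "11000111001111")[len(tail):])  # 00111000110000
```

With this rule a group boundary always coincides with a bit change, so the run of 1s that ends the universal code is never extended by the group that follows.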

The compressing unit 110 may repeatedly generate compressed cluster groups from the next position of the current compressed cluster group in order to continue the compression operation after generating the current compressed cluster group. That is, in the above embodiment, after the first compressed cluster group is generated, the bit lengths are accumulated starting from the 33rd binary cluster to obtain the variance, and the compression process described above is performed.

On the other hand, the compression unit 110 may repeat the compressed cluster group generation steps (S211 to S213) from the next position of the generated current compressed cluster group while further reflecting, in the calculation of the variance, the bit length of the compressed cluster included in the current compressed cluster group. In other words, in the above example, when the variance is accumulated in order to compress the next binary cluster group, instead of obtaining the variance from the 33rd binary cluster onward only, the bit length of 59 of the preceding compressed cluster may also be included. Simulation shows that doing so can improve the compression effect for ordinary binary data.

To be more specific, the bit length of the last compressed cluster of Table 3 (excluding the universal code portion) is included first, and then the bit lengths of the next binary cluster(s) are collected to accumulate the variance. The reason is as follows. When the restoring unit 220 scans the bitstream to be restored sequentially, recognizing each binary cluster by its bit inversions in order to find the cluster to be restored (i.e., the compressed cluster), the universal code appears either as two consecutive identical bits ("11" or "00") or as a pattern with three bit changes that begins with a single bit A (A = "1", B = "0", C = "1", or A = "0", B = "1", C = "0"). When reading from the upper bits toward the lower bits, the restoring unit 220 can recognize this bit-change pattern, which has a kind of uniqueness, so that the compressed cluster alone can be recognized separately.

Let X denote bits in the region of the current compressed cluster group and Y denote bits in the region of the next compressed cluster group. The stream then takes the form X...XAB...BC..CY...Y (where Y, regardless of the physical value of the actual cluster, has the same length as the actual cluster and the opposite bit value of C, and X is likewise determined by the state of the immediately preceding bit) or X...XAAY...Y (where Y has the same length as the actual cluster and the opposite bit value of A, and X is likewise determined by the state of the immediately preceding bit). Reading in the direction from X to Y, the restoring unit can use this pattern to distinguish AB...BC..C, in which a bit change occurs for a single bit (A), a bit change occurs again (B..B), and then a bit change occurs once more (C..C), from AA, in which a bit change occurs for two bits.

Therefore, the compression unit 110 can use the bit length of the compressed cluster together with the bit lengths of the binary clusters of the next binary cluster group to compute the variance value, identify the binary cluster to be compressed among the subsequent binary clusters, and perform compression. This process continues until the entire original binary data has been compressed. Table 4 illustrates compressing the next binary cluster group by this method.
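The selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the compression unit 110: the name `find_target_cluster`, its parameters, and the use of Python's `statistics.variance` (sample variance with an n-1 denominator) are hypothetical choices. The sample variance is used because it reproduces the values 1458 and 587.1703 listed in Tables 4 and 5.

```python
from statistics import variance


def find_target_cluster(lengths, seed_length=None, threshold=100.0):
    """Return (index, variance) for the first cluster in `lengths` whose
    inclusion pushes the sample variance of the accumulated bit lengths
    above `threshold`, or None if the threshold is never exceeded.

    `seed_length` is the bit length of the compressed cluster of the
    preceding group (assumption: it is simply prepended to the sample).
    """
    acc = [] if seed_length is None else [seed_length]
    for i, n in enumerate(lengths):
        acc.append(n)
        if len(acc) >= 2:  # a sample variance needs at least two values
            v = variance(acc)
            if v > threshold:
                return i, v
    return None


# Table 4: previous compressed cluster of 59 bits, then a 5-bit cluster.
print(find_target_cluster([5], seed_length=59))
```

With a seed of 59, the very first following cluster already exceeds the reference value, which matches the situation shown in Table 4.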

Table 4
Binary cluster | Original bit length | Compression result | Compressed cluster length | Compressed cluster part | Universal code | Universal code length | Initial variance | Variance after compression
000 | 130 | 0000000000000000000000000000000000000000000000000000000001001000000011111 | 59 | 00000000000000000000000000000000000000000000000000000000000 | 100000000011111 | 15 | |
11111 | 5 | 010001 | 1 | 0 | 10001 | 5 | 1458 | 1682

As shown in Table 4, when the 59-bit length (the bit length of the immediately preceding compressed cluster) is immediately followed by a 5-bit binary cluster, the variance value is 1458, which exceeds the preset reference value of 100. Thus the 5-bit binary cluster becomes the compression target binary cluster. As in the compression process described above, the bit length of the compression target binary cluster is decreased 1 bit at a time to check how far it can be reduced while the variance still exceeds 100. In this embodiment, the variance grows as the bit length decreases, so the bit reduction continues in the same way. However, since a compressed binary cluster must be at least 1 bit long, a reduction of 4 bits is the maximum, at which point the variance is 1682.
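The 1-bit-at-a-time reduction described above can be sketched as follows. This is only an illustration under assumptions: the name `min_compressed_length` and its parameters are hypothetical, and the sample variance (`statistics.variance`) is assumed because it reproduces the variance values given in the tables (1682 in Table 4, 103.956 in Table 5, 102.4476 in Table 6).

```python
from statistics import variance


def min_compressed_length(prefix_lengths, target_length, threshold=100.0):
    """Shrink the target cluster 1 bit at a time while the sample variance
    of all accumulated bit lengths still exceeds `threshold`.

    `prefix_lengths` holds the bit lengths accumulated before the target
    cluster (must contain at least one value). A compressed cluster is
    never shorter than 1 bit. Returns (compressed_length, variance).
    """
    length = target_length
    while length > 1 and variance(prefix_lengths + [length - 1]) > threshold:
        length -= 1
    return length, variance(prefix_lengths + [length])


# Table 4: 59-bit predecessor, 5-bit target cluster.
length, v = min_compressed_length([59], 5)
print(length, v, "bits removed:", 5 - length)
```

The number of bits removed (here 4) is the integer that the universal code has to convey to the restoring side.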

Meanwhile, to determine whether the compressed cluster of the 5-bit compression target binary cluster is bit-inverted, the universal code of the immediately preceding compressed cluster group (the first compressed cluster group) is checked. As a result, the compressed cluster is stored as "0" with a 1-bit length, its bit inverted to "0", and the universal code for the 4-bit reduction is "01110", in which each bit value of "10001" shown in Table 2 is inverted. Therefore, the newly generated second compressed cluster group becomes "010001".

If the process continues again, the third compression cluster group is generated by further reflecting the 1-bit length of the compression cluster of the second compression cluster group. Table 5 shows the process of generating the third compressed cluster group.

Table 5
Binary cluster | Original bit length | Compression result | Compressed cluster length | Compressed cluster part | Universal code | Universal code length | Initial variance | Variance after compression
11111 | 5 | 010001 | 1 | 0 | 10001 | 5 | |
000000000000000 | 15 | 000000000000000 | 15 | 000000000000000 | | | |
11 | 2 | 11 | 2 | 11 | | | |
00000000 | 8 | 00000000 | 8 | 00000000 | | | |
1111111 | 7 | 1111111 | 7 | 1111111 | | | |
0 | 1 | 0 | 1 | 0 | | | |
11111111 | 8 | 11111111 | 8 | 11111111 | | | |
0000 | 4 | 0000 | 4 | 0000 | | | |
1 | 1 | 1 | 1 | 1 | | | |
00 | 2 | 00 | 2 | 00 | | | |
1 | 1 | 1 | 1 | 1 | | | |
0000000000000 | 13 | 0000000000000 | 13 | 0000000000000 | | | |
11 | 2 | 11 | 2 | 11 | | | |
000 | 94 | 00000000000000000000000000000000000000000101111111111 | 39 | 00000000000000000000000000000000000000000 | 101111111111 | 12 | 587.1703 | 103.956

In Table 5, the third compression target binary cluster is found. The bit length of the compressed cluster of the immediately preceding compressed cluster group and the bit length of each succeeding binary cluster are accumulated to obtain the variance value. When the 94-bit binary cluster is encountered, the variance value exceeds 100 for the first time, reaching 587.17. Thus the 94-bit binary cluster becomes the compression target binary cluster. The binary clusters in between are alternately bit-inverted or left as they are according to the bit inversion order, and the compressed cluster group is generated as described above.

The compressed data obtained by compressing the original binary data consists of the result values of Table 3, Table 4, and Table 5 stored as physically contiguous compressed data. Unlike conventional algorithms such as entropy encoding and LZW, no dictionary information is required for encoding and decoding, because the universal code of Table 2 can be generated by the encoding unit and the decoding unit themselves through computation, or computed for any given integer.

Next, the output unit 120 outputs the binary data compressed by the compression unit 110 to a destination apparatus such as the binary data restoration apparatus 200.

The restoration method (decompression method) will be described as follows.

First, the input unit 210 receives the binary data compressed by the binary data compression apparatus 100 and provides it to the restoring unit 220.

Then, the restoring unit 220 reconstructs each compressed cluster group by referring to the identification code (universal code), and restores the binary data. A concrete restoration method will be described with reference to the above embodiments.

The restoring unit 220 reads the compressed binary data sequentially from the upper bit to the lower bit and recognizes the bit length of each cluster every time the bit value changes. While the variance value of the accumulated bit lengths does not exceed the reference value preset at compression time (100 in the present embodiment), each cluster is treated directly as a binary cluster. If bit inversion is required according to the bit inversion order (that is, if the immediately preceding cluster in the order was not inverted, inversion is required), the bits are inverted, and the original binary data is restored sequentially.

When, in the course of this process, the variance value including the length of a specific cluster exceeds 100, that cluster is a compressed cluster, that is, the compressed cluster to be restored. The compressed cluster to be restored is decoded into a binary cluster after its universal code has been separated. The universal code is separated as follows. If two consecutive bits of identical value follow the compressed cluster to be restored, only those 2 bits are the universal code. Otherwise, the universal code is a pattern of (1 bit + n bits inverted from the immediately preceding value + m bits inverted from the immediately preceding value), such as "100...001...1".

Once the universal code is identified, the corresponding integer can be determined using the integer-to-universal-code mapping table or formula shown in Table 2. Lengthening the compressed cluster to be restored by that integer value recovers data of the original length.

Table 6 below illustrates the above process with reference to the compressed data generated in Table 3. Referring to Table 6 below, the contiguous form of the compression result is compressed data, i.e., "110110100001100111100010001111 .... ".

Table 6
Binary cluster | Original bit length | Compression result | Compressed cluster length | Compressed cluster part | Universal code | Universal code length | Initial variance | Variance after compression
11 | 2 | 11 | 2 | 11 | | | |
0 | 1 | 0 | 1 | 0 | | | |
11 | 2 | 11 | 2 | 11 | | | |
0 | 1 | 0 | 1 | 0 | | | |
1 | 1 | 1 | 1 | 1 | | | |
0000 | 4 | 0000 | 4 | 0000 | | | |
11 | 2 | 11 | 2 | 11 | | | |
00 | 2 | 00 | 2 | 00 | | | |
1111 | 4 | 1111 | 4 | 1111 | | | |
000 | 3 | 000 | 3 | 000 | | | |
1 | 1 | 1 | 1 | 1 | | | |
000 | 3 | 000 | 3 | 000 | | | |
1111 | 4 | 1111 | 4 | 1111 | | | |
00000 | 5 | 00000 | 5 | 00000 | | | |
1 | 1 | 1 | 1 | 1 | | | |
0 | 1 | 0 | 1 | 0 | | | |
1 | 1 | 1 | 1 | 1 | | | |
0000 | 4 | 0000 | 4 | 0000 | | | |
11 | 2 | 11 | 2 | 11 | | | |
0 | 1 | 0 | 1 | 0 | | | |
11 | 2 | 11 | 2 | 11 | | | |
000 | 3 | 000 | 3 | 000 | | | |
1 | 1 | 1 | 1 | 1 | | | |
000 | 3 | 000 | 3 | 000 | | | |
11 | 2 | 11 | 2 | 11 | | | |
0 | 1 | 0 | 1 | 0 | | | |
1 | 1 | 1 | 1 | 1 | | | |
0 | 1 | 0 | 1 | 0 | | | |
111 | 3 | 111 | 3 | 111 | | | |
0000 | 4 | 0000 | 4 | 0000 | | | |
1 | 1 | 1 | 1 | 1 | | | |
000 | 130 | 0000000000000000000000000000000000000000000000000000000001001000000011111 | 59 | 00000000000000000000000000000000000000000000000000000000000 | 100000000011111 | 15 | 512.2006 | 102.4476

The clusters can be separated sequentially at each bit inversion in the compression result.

11-0-11-0-1-0000-11-00-1111-000-1-000-1111-...-00000000000000000000000000000000000000000000000000000000000-[1-000000000-11111]-...

The clusters separated in this way are converted according to the bit inversion order and are treated directly as binary clusters to recover the original binary data. As shown in Table 6, when the 59-bit 32nd cluster is separated and the variance value including its length is calculated, the value 102.447 exceeds 100 for the first time. Therefore, the 59-bit 32nd cluster is the compressed cluster to be restored, and the universal code appended to it is decoded to determine how many bits must be restored. If the first run after the bit value changes at the end of the compressed cluster is 2 bits long, that run is the universal code, for example "11". In the case of 000000110..., "000000" is the cluster to be restored and "11" is the universal code. Otherwise, the universal code is an AB...BC...C pattern (A = "1", B = "0", C = "1", or A = "0", B = "1", C = "0"), that is, a pattern such as "10..01...1". For example, in the case of 000000100001110..., "000000" is the cluster to be restored, "10000111" is the universal code, and the following "0" is part of the value representing the subsequent compressed cluster.
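The two universal code shapes described above (a 2-bit run such as "11", or an AB...BC...C pattern) can be separated from the bitstream with a small run scanner. The sketch below is an illustration only: the function name `split_universal_code`, the string representation of the bitstream, and the decision to raise an error for any other shape are assumptions. Mapping the code to its integer requires Table 2 and is not attempted here.

```python
def split_universal_code(bits, start):
    """Return (code, next_index) for the universal code that begins at
    `start`, the index of the first bit after the compressed cluster.

    Assumed shapes: a run of exactly 2 identical bits, or a 1-bit run (A)
    followed by a run of inverted bits (B...B) and a further run of
    inverted bits (C...C).
    """
    def run_end(i):
        # Index just past the run of identical bits starting at i.
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1
        return j

    if start >= len(bits):
        raise ValueError("no bits after the compressed cluster")
    a_end = run_end(start)
    first_run = a_end - start
    if first_run == 2:
        return bits[start:a_end], a_end
    if first_run != 1 or a_end >= len(bits):
        raise ValueError("not a recognised universal code shape")
    b_end = run_end(a_end)
    if b_end >= len(bits):
        raise ValueError("universal code is truncated")
    c_end = run_end(b_end)
    return bits[start:c_end], c_end


print(split_universal_code("000000110", 6))
print(split_universal_code("000000100001110", 6))
```

The returned index points at the first bit of the next compressed cluster, which is where run-length scanning would resume.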

Note that when a compressed cluster is restored, what matters is the length information and the universal code information of the compressed cluster, not the binary bit values that constitute it. Whether a restored cluster is a run of "1" or a run of "0" is determined sequentially. That is, each compressed cluster is restored with its bits inverted relative to the previously restored cluster.

For example, suppose in the above example that "000000" is the cluster to be restored. If its turn in the order is to be restored as a cluster of "0", such as "00...0", the compressed cluster that follows "10000111" is restored in inverted form relative to the bit value of the previously restored cluster, such as "1...1". Conversely, if "000000" is to be restored as "1...1", the compressed cluster that follows "10000111" is restored as "0...0". Subsequent compressed clusters are likewise restored in inverted form relative to the bit value of the immediately preceding restored cluster.

Therefore, as shown in Table 6, the 100000000011111 that follows the 59-bit compressed cluster to be restored is a universal code. Since this value represents 71 as shown in Table 2, the length of the cluster is increased by 71 bits, and the result, with bit inversion applied or not according to the bit inversion order, is decoded into a binary cluster.
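The restoration step can be sketched as follows, assuming the integer carried by the universal code (71 in this example, taken from the text) has already been looked up in Table 2. The function names `invert`, `restore_runs`, and `restore_compressed_cluster` are hypothetical, and the bitstream is again modelled as a Python string.

```python
def invert(bit):
    """Return the opposite bit character."""
    return "1" if bit == "0" else "0"


def restore_runs(first_bit, lengths):
    """Rebuild a bit string from run lengths only: each restored cluster
    takes the inverse of the bit value of the previous restored cluster."""
    out = []
    bit = first_bit
    for n in lengths:
        out.append(bit * n)
        bit = invert(bit)
    return "".join(out)


def restore_compressed_cluster(prev_bit, compressed_length, removed_bits):
    """Lengthen a compressed cluster by the integer decoded from its
    universal code and give it the inverse of the previous bit value."""
    return invert(prev_bit) * (compressed_length + removed_bits)


# Table 6: the 31st cluster is "1"; the 32nd is 59 bits plus 71 removed bits.
restored = restore_compressed_cluster("1", 59, 71)
print(len(restored), restored[0])
```

Because only lengths and the alternation order are used, the bit values stored inside the compressed cluster itself play no role in the result, in line with the description above.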

As described above, the method and apparatus for compressing and restoring binary data according to the present embodiment can compress and restore binary data quickly and efficiently through simple operations and a simple hardware configuration. This not only enhances the reliability of the compressed and restored data but also improves transmission efficiency and speed in data transmission. In addition, because compression and restoration proceed sequentially in the lower bit direction (or the higher bit direction) without a compression dictionary, there is a speed gain when compressing large-scale data, and real-time transmission processing is possible.

While the invention has been shown and described in detail in the foregoing description, many alternatives, modifications, and variations will be apparent to those skilled in the art.

100: Binary data compression device
110: Compression unit
120: Output unit
200: Binary data restoration device
210: Input unit
220: Restoring unit

Claims

1. A method of compressing binary data by a compression apparatus, the method comprising:
a compressed cluster group generating step of generating a compressed cluster group by compressing a binary cluster group having at least one binary cluster encountered while moving in a first direction from a first position of original binary data,
wherein the compressed cluster group generating step comprises:
dividing the original binary data each time a bit value is inverted to obtain the at least one binary cluster;
obtaining the binary cluster group consisting of the binary clusters from the first position up to the binary cluster at which a scatter value of the bit lengths of the binary clusters exceeds a first reference value for the first time; and
compressing the last binary cluster included in the binary cluster group.

2. The method according to claim 1,
wherein compressing the last binary cluster comprises:
generating a compressed cluster by reducing the bit length of the last binary cluster, wherein the bit length of the compressed cluster is the minimum bit length at which the scatter value exceeds the first reference value; and
compressing the last binary cluster by merging the compressed cluster with an identification code indicating the bit length by which the last binary cluster was reduced.

3. The method of claim 2,
wherein the identification code is a universal code,
wherein the universal code includes either a first binary number of 1-bit length at the most significant bit, at least one second binary number disposed after the first binary number, and at least one first binary number disposed after the at least one second binary number, or first binary numbers of 2-bit length at the most significant bits, and
wherein the second binary number is a binary number of 1-bit length that is the inverse of the first binary number.

4. The method of claim 2,
further comprising repeating the generation of the compressed cluster group from the position following the generated current compressed cluster group.

5. The method of claim 2,
wherein the step of generating the compressed cluster group is repeated from the position following the generated current compressed cluster group, and the bit length of the compressed cluster included in the current compressed cluster group is further reflected when the scatter value is calculated.

6. The method according to claim 4 or 5,
wherein each bit value of the next compressed cluster following the current compressed cluster group is inverted or not inverted based on the last bit value of the current compressed cluster group.

7. The method according to claim 1,
wherein a variance, a standard deviation, a skewness, or a kurtosis is employed as the scatter value.

8. A method for restoring binary data compressed by the binary data compression method according to any one of claims 2 to 5,
the method comprising restoring the binary data by restoring the compressed cluster group with reference to the identification code.

9. A binary data compression apparatus comprising:
a compression unit that generates a compressed cluster group by compressing a binary cluster group having at least one binary cluster encountered while moving in a first direction from a first position of original binary data,
wherein the compression unit divides the original binary data each time a bit value is inverted to obtain the at least one binary cluster, acquires the binary cluster group consisting of the binary clusters from the first position up to the binary cluster at which a scatter value of the bit lengths of the binary clusters exceeds a first reference value for the first time, and compresses the last binary cluster included in the binary cluster group.

10. The apparatus of claim 9,
wherein the compression unit compresses the last binary cluster by merging a compressed cluster, generated by reducing the bit length of the last binary cluster, with an identification code indicating the bit length by which the last binary cluster was reduced, and
wherein the bit length of the compressed cluster is the minimum bit length at which the scatter value exceeds the first reference value.

11. The apparatus of claim 10,
wherein the identification code is a universal code,
wherein the universal code includes either a first binary number of 1-bit length at the most significant bit, at least one second binary number disposed after the first binary number, and at least one first binary number disposed after the at least one second binary number, or first binary numbers of 2-bit length at the most significant bits, and
wherein the second binary number is a binary number of 1-bit length that is the inverse of the first binary number.

12. The apparatus of claim 10,
wherein the compression unit repeatedly generates the next compressed cluster group from the position following the generated current compressed cluster group.

13. The apparatus of claim 10,
wherein the compression unit generates the next compressed cluster group using the compressed cluster of the last binary cluster of the generated current compressed cluster group.

14. The apparatus according to claim 12 or 13,
wherein each bit value of the next compressed cluster following the current compressed cluster group is inverted or not inverted based on the last bit value of the current compressed cluster group.

15. The apparatus of claim 9,
wherein a variance, a standard deviation, a skewness, or a kurtosis is employed as the scatter value.

16. A binary data restoration apparatus for restoring binary data compressed by the binary data compression apparatus according to any one of claims 10 to 13,
the apparatus comprising a restoring unit that restores the binary data by restoring each compressed cluster group with reference to the identification code.