KR101682829B1 - Message compression method and apparatus - Google Patents
Message compression method and apparatus
- Publication number
- KR101682829B1 (application number KR1020150134302A)
- Authority
- KR
- South Korea
- Prior art keywords
- code string
- character code
- character
- string
- message
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
Abstract
Description
The present invention relates to a message compression method and apparatus and, more particularly, to a method and apparatus that efficiently compress a message according to a character code system using simple operations and a simple hardware configuration.
In general, since the frequency bandwidth available in a normal transmission channel is limited, various transmission systems such as a modem have used an effective data compression technique to compress or reduce the amount of transmission data in order to transmit a large amount of data.
One such compression scheme is CCITT V.42 bis, a coding algorithm standardized by the International Telecommunication Union (ITU) and used in data transmission systems such as modems. This coding standard is based on the Ziv-Lempel code (ZLC). In this method, a dictionary is built adaptively from the input data, and the address of the dictionary entry that stores the same phrase as previously seen input is emitted as the codeword. The dictionary operation performs continuous string matching against the input data and updates the dictionary by appending the first unmatched character to the longest matching string and adding the result as a new entry.
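The adaptive-dictionary idea described above can be illustrated with a generic LZ78-style encoder. This is only a sketch of the general Ziv-Lempel principle, not the V.42 bis procedure itself; the function name `lz78_encode` and the output format of (index, character) pairs are assumptions made for illustration.

```python
def lz78_encode(text):
    """Generic LZ78-style sketch (not V.42 bis).

    Returns a list of (dictionary_index, next_char) pairs.
    Index 0 stands for the empty phrase.
    """
    dictionary = {"": 0}   # phrase -> address in the dictionary
    phrase = ""            # longest match found so far
    output = []
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            # Keep extending the match while the dictionary knows the phrase.
            phrase = candidate
        else:
            # Emit the address of the longest match plus the unmatched character,
            # then add "match + unmatched character" as a new dictionary entry.
            output.append((dictionary[phrase], ch))
            dictionary[candidate] = len(dictionary)
            phrase = ""
    if phrase:
        # Flush a pending match that ended exactly at the end of the input.
        output.append((dictionary[phrase], ""))
    return output
```

For the input `"abab"`, the third pair refers back to dictionary address 1 (the phrase `"a"`), which is where the saving comes from on repetitive input.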
In particular, as social network services/sites (SNS) have become widely used in recent years, the transmission and processing of text in services such as networking, communication, media sharing, and messaging have increased exponentially, and from the server's point of view the amount of data to be processed and the system load are growing excessively.
However, the conventional compression method described above has several problems: the processing required to compress a message is complicated and requires relatively high-performance hardware, the achievable processing speed is limited, and it is difficult to increase the reliability of the compression result.
The background art of the present invention is disclosed in Korean Patent Laid-Open Publication No. 1999-0022960 (published on Mar. 25, 1999).
SUMMARY OF THE INVENTION The present invention provides a message compression method and apparatus capable of efficiently compressing a message according to a character code system using simple operations and a simple hardware configuration.
According to an aspect of the present invention, there is provided a message compression method including a step in which a compression unit compresses a message based on character code information, with reference to a memory unit storing the character code information according to a set character code system. In this step, the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses the character code string corresponding to the character string or to each character according to whether it is recognized as an 11-type character code string having 11 on the most significant bit side, a 10-type character code string having 10, or a 0-type character code string having 0.
When the character code string is an 11-type character code string, the step of compressing the message may include: generating, by the compression unit, a first code string by inverting each bit of the plurality of 1s that precede the first 10 encountered from the most significant bit of the character code string, and a second code string by inverting each bit of the remaining code string (the string without that plurality of 1s) and deleting its leading zero; combining the first code string and the second code string to generate a third code string; and adding, at a predetermined position of the third code string, an origin code indicating that the string is an 11-type character code string.
In the step of compressing the message, if the character code string is a 10-type character code string, the compression unit may invert each bit of the character code string and delete the leading zero.
In the compressing of the message, when the character code string is a 0-type character code string, the compression unit may add an origin code indicating a 0-type character code string to a predetermined position of the character code string.
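The three-way classification and the simplest of the three rules (the 10-type rule) can be sketched as follows. Bit strings are modelled as Python strings of `"0"` and `"1"`; the function names `invert`, `classify`, and `compress_type_10` are illustrative assumptions, not names used in the patent.

```python
def invert(bits):
    """Flip every bit of a string made of '0' and '1'."""
    return "".join("1" if b == "0" else "0" for b in bits)


def classify(code):
    """Return '11', '10', or '0' according to the most significant bits."""
    if code.startswith("11"):
        return "11"
    if code.startswith("10"):
        return "10"
    if code.startswith("0"):
        return "0"
    raise ValueError("expected a bit string of at least one or two bits")


def compress_type_10(code):
    """10-type rule: invert every bit, then delete the leading zero."""
    if classify(code) != "10":
        raise ValueError("not a 10-type character code string")
    inverted = invert(code)   # always starts with '01'
    return inverted[1:]       # drop the single leading zero
```

A 16-bit code that starts with 10 therefore comes out 15 bits long. The 11-type and 0-type rules additionally need an origin code, whose value and position the text leaves as "predetermined".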
According to another aspect of the present invention, there is provided a message compression apparatus including: a message input unit that receives a message; a memory unit that stores character code information according to a set character code system; and a compression unit that compresses the message based on the character code information with reference to the memory unit, wherein the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses each character code string according to whether it is recognized as an 11-type character code string having 11 on the most significant bit side, a 10-type character code string having 10, or a 0-type character code string having 0.
When the character code string is an 11-type character code string, the compression unit may generate a first code string by inverting each bit of the plurality of 1s that precede the first 10 encountered from the most significant bit of the character code string, generate a second code string by inverting each bit of the remaining code string (the string without that plurality of 1s) and deleting its leading zero, and add, at a predetermined position of a third code string generated by combining the first code string and the second code string, an origin code indicating that the string is an 11-type character code string.
If the character code string is a 10-type character code string, the compression unit can delete the leading zero while inverting each bit of the character code string.
If the character code string is a 0-type character code string, the compression unit may add an origin code indicating a 0-type character code string to a predetermined position in the character code string.
According to another aspect of the present invention, there is provided a message compression method including a step in which a compression unit compresses a message based on character code information, with reference to a memory unit storing the character code information according to a set character code system. In this step, the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses the character code string corresponding to the character string or to each character according to whether it is recognized as a 00-type character code string having 00 on the most significant bit side, a 01-type character code string having 01, or a 1-type character code string having 1.
When the character code string is a 00-type character code string, the step of compressing the message may include: generating, by the compression unit, a first code string by inverting each bit of the plurality of zeros that precede the first 01 encountered from the most significant bit of the character code string, and a second code string by inverting each bit of the remaining code string (the string without that plurality of zeros) and deleting its leading 1; combining the first code string and the second code string to generate a third code string; and adding, at a predetermined position of the third code string, an origin code indicating that the string is a 00-type character code string.
In the step of compressing the message, if the character code string is a 01-type character code string, the compression unit may invert each bit of the character code string and delete the leading 1.
In the step of compressing the message, when the character code string is a 1-type character code string, the compression unit may add an origin code indicating a 1-type character code string to a predetermined position in the character code string.
According to another aspect of the present invention, there is provided a message compression apparatus including: a message input unit that receives a message; a memory unit that stores character code information according to a set character code system; and a compression unit that compresses the message based on the character code information with reference to the memory unit, wherein the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses each character code string according to whether it is recognized as a 00-type character code string having 00 on the most significant bit side, a 01-type character code string having 01, or a 1-type character code string having 1.
When the character code string is a 00-type character code string, the compression unit may generate a first code string by inverting each bit of the plurality of zeros that precede the first 01 encountered from the most significant bit of the character code string, generate a second code string by inverting each bit of the remaining code string (the string without that plurality of zeros) and deleting its leading 1, and add, at a predetermined position of a third code string generated by combining the first code string and the second code string, an origin code indicating that the string is a 00-type character code string.
If the character code string is a 01-type character code string, the compression unit may invert each bit of the character code string and delete the leading 1.
When the character code string is a 1-type character code string, the compression unit may add an origin code indicating that the character code string is a 1-type character code string at a predetermined position in the character code string.
The message compression method and apparatus according to an aspect of the present invention not only compress a message efficiently according to a character code system using simple operations and a simple hardware configuration, but also reduce the message processing burden on the system and improve its efficiency.
FIG. 1 is a block diagram of a message compression apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing, for each character code system, the distribution of the most significant bit of the character codes assigned to native-language characters.
FIG. 3A is a reference diagram for explaining a compression method for an 11-type character code string.
FIG. 3B is a reference diagram for explaining a compression method for a 10-type character code string.
FIG. 3C is a reference diagram for explaining a compression method for a 0-type character code string.
FIG. 4 is a diagram illustrating an example of a message service or messenger service of an SNS.
FIG. 5 shows a screen of the Facebook service, an SNS.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily carry out the present invention. The present invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. In order to clearly illustrate the present invention, parts not related to the description are omitted, and like parts are denoted by similar reference numerals throughout the specification.
Throughout the specification, when a part is said to "comprise" an element, this means that it may further include other elements, rather than excluding them, unless specifically stated otherwise.
FIG. 1 illustrates the configuration of a message compression apparatus according to an embodiment of the present invention. FIG. 2 illustrates, for each character code system, the distribution of the most significant bit of the character codes allocated to native-language characters. FIGS. 3A, 3B, and 3C are reference diagrams for explaining the compression methods for an 11-type, a 10-type, and a 0-type character code string, respectively. FIG. 4 illustrates an example of a message service or messenger service of an SNS, and FIG. 5 illustrates a screen of the Facebook service of an SNS.
1, the message compression apparatus according to the present embodiment includes a character code
The character code
The
The
The
The
The operation of the embodiment configured as above will now be described in detail with reference to FIGS. 1 to 5.
First, the
On the other hand, the character code
The
When various messages are input from the
Typically, the character code system used for a given language (country) assigns binary codes to the characters of that language, and it often assigns binary codes with a different bit pattern to characters of other languages (countries) and to other symbols (special characters).
For example, the 2-byte combined Hangul code system and the 2-byte complete Hangul code system, which are character code systems used for Hangul, assign binary codes starting with 1 (for example, 10..., 11...) to Hangul characters, and binary codes starting with 0 (for example, 0...) to alphabetic characters, numbers, and other special characters. That is, in a Hangul character code system, as shown in the first bar of FIG. 2, a binary code whose most significant bit is 1 is assigned to most Hangul characters.
In some character code systems in which English is the main input, binary codes starting with 0 in the most significant bit (for example, 00..., 01...) are assigned to alphabetic characters, and binary codes starting with 1 (for example, 1...) can be assigned to other special characters. That is, in such a character code system, as shown in the second bar of FIG. 2, a binary code whose most significant bit is 0 is assigned to most English characters.
On the other hand, when the most significant bit of the binary codes assigned to the characters of a language is not skewed toward 0 or 1, as in the third bar of FIG. 2 (a third character code system), the compression may not be effective.
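The skew in the most significant bit can be checked on real text. The sketch below counts, per character, whether the first encoded byte starts with 0 or 1. It assumes Python's built-in `euc-kr` codec as a stand-in for a 2-byte complete Hangul code; the patent does not name a specific codec, and the helper name `leading_bit_counts` is hypothetical.

```python
from collections import Counter


def leading_bit_counts(text, encoding="euc-kr"):
    """Count characters whose encoded form starts with bit 0 or bit 1.

    The encoding is an assumption for illustration; any codec known to
    Python can be passed instead.
    """
    counts = Counter()
    for ch in text:
        first_byte = ch.encode(encoding)[0]   # int in the range 0..255
        counts[first_byte >> 7] += 1          # top bit of the first byte
    return counts
```

Under this assumption, Hangul syllables land in the `1` bucket and ASCII letters, digits, and spaces land in the `0` bucket, which matches the first-bar distribution the description attributes to Hangul code systems.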
First, the compression process will be described taking as an example a character system (character service) using a Hangul code system such as a two-byte combination Hangul code system.
As described above, when the binary code of any character in the message input from the
First Embodiment
The
At this time, in compressing the message, the
If the character code string is an 11-type character code string, the
For example, take the character code string 11100010101. As shown in FIG. 3A, the plurality of 1s that precede the first 10 encountered from the most significant bit, that is, 11, are inverted to generate 00 (first code string), and each bit of 100010101, which is the remaining code string except 11, is inverted and the leading zero is deleted to generate 11101010 (second code string). Then, the
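The 11-type steps for this example can be traced in code. This is a sketch under two assumptions: "combining" is taken to mean concatenating the first code string followed by the second, and the origin code is left out because its value and position are not given in the surviving text. The function name `compress_type_11_body` is hypothetical.

```python
def invert(bits):
    """Flip every bit of a string made of '0' and '1'."""
    return "".join("1" if b == "0" else "0" for b in bits)


def compress_type_11_body(code):
    """Return (first, second, third) code strings for an 11-type code.

    The origin code is NOT added here; the source only says it goes at a
    "predetermined position".
    """
    split = code.find("10")   # position of the first '10' from the MSB
    if not code.startswith("11") or split < 0:
        raise ValueError("expected an 11-type code that contains '10'")
    ones, rest = code[:split], code[split:]
    first = invert(ones)          # run of 1s becomes a run of 0s
    second = invert(rest)[1:]     # inverted rest starts with '0'; drop it
    third = first + second        # assumption: combine = concatenate
    return first, second, third
```

For `11100010101` this yields `00` and `11101010`, the second of which matches the value stated in the description. A code consisting only of 1s contains no `10`; the source does not say how that case is handled, so the sketch rejects it.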
At the same time, the
If the character code string is a 10-type character code string, the
If the character code string is a 0 type character code string, the
With this compression, a 10-type character code string becomes 1 bit shorter, an 11-type character code string sees no substantial compression, and a 0-type character code string becomes longer. However, because Hangul accounts for most of the characters entered in a Hangul input system, 1 bit is saved for every Hangul character code string that starts with 10 on the most significant bit side, which yields a considerable compression effect for the server or system providing the service. For reference, compressed data is decompressed by reversing the compression process according to the rules used during compression.
Next, the compression process will be described for a character system (character service) that uses a specific character code system (for English or various other languages) in which character codes starting with 0 are assigned to the characters of a specific language.
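The net effect on message length can be tallied per character code string. The sketch below assumes a 1-bit origin code (`origin_bits=1`), which is consistent with the statement that 11-type strings see no substantial compression, but the actual origin code length is not stated in the text; both function names are hypothetical.

```python
def length_change(code, origin_bits=1):
    """Change in bit length for one character code string (first scheme).

    origin_bits is an assumed origin code length, not a value from the source.
    """
    if code.startswith("10"):
        return -1                 # one leading zero deleted, nothing added
    if code.startswith("11"):
        return origin_bits - 1    # one bit deleted, origin code added
    if code.startswith("0"):
        return origin_bits        # unchanged code plus origin code
    raise ValueError("expected a non-empty bit string")


def total_change(codes, origin_bits=1):
    """Sum the per-string changes over a whole message."""
    return sum(length_change(c, origin_bits) for c in codes)
```

A message dominated by 10-type strings shrinks, while one with an even mix of the three types may not shrink at all, which is the situation the description flags as ineffective when the most significant bit is not skewed.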
As described above, when a binary code of a character in a message input from the
Second Embodiment
The
At this time, in compressing the message, the
First, if the character code string is a 00-type character code string, the
For example, take the character code string 00011101010. Each bit of the plurality of zeros that precede the first 01 encountered from the most significant bit, that is, 00, is inverted to generate 11 (first code string), and each bit of 011101010, which is the remaining code string except 00, is inverted and the leading 1 is deleted to generate 00010101 (second code string). Then, the
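The 00-type example mirrors the 11-type rule with the roles of 0 and 1 swapped. The sketch below again assumes that combining means concatenation and omits the unspecified origin code; `compress_type_00_body` is a hypothetical name.

```python
def invert(bits):
    """Flip every bit of a string made of '0' and '1'."""
    return "".join("1" if b == "0" else "0" for b in bits)


def compress_type_00_body(code):
    """Return (first, second, third) code strings for a 00-type code.

    The origin code is NOT added; its value and position are not specified.
    """
    split = code.find("01")   # position of the first '01' from the MSB
    if not code.startswith("00") or split < 0:
        raise ValueError("expected a 00-type code that contains '01'")
    zeros, rest = code[:split], code[split:]
    first = invert(zeros)         # run of 0s becomes a run of 1s
    second = invert(rest)[1:]     # inverted rest starts with '1'; drop it
    third = first + second        # assumption: combine = concatenate
    return first, second, third
```

Running it on `00011101010` reproduces the two values given in the description, `11` and `00010101`.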
At the same time, the
If the character code string is a 01-type character code string, the
When the character code string is a 1-type character code string, the
With this compression, a 01-type character code string becomes 1 bit shorter, a 00-type character code string sees no actual compression, and a 1-type character code string becomes longer. However, because characters of the specific language account for most of the input in the input system of a language such as English, 1 bit is saved for every character code string that starts with 01 on the most significant bit side, which yields a considerable compression effect. For reference, compressed data is decompressed by reversing the compression process according to the rules used during compression.
As described above, according to the present embodiment, a message can be compressed efficiently according to a character code system using simple operations and a simple hardware configuration. In addition, the message processing burden on the system can be reduced and its efficiency improved.
While the invention has been shown and described in detail above, many alternatives, modifications, and variations will be apparent to those skilled in the art, and these fall within the scope of the claims.
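The "reverse process" can be illustrated for the 01-type rule, which is the only rule in the second scheme that needs no origin code. The sketch assumes the decoder already knows that the compressed string came from a 01-type code; how the decoder tells the types apart is not described in the surviving text. Both function names are hypothetical.

```python
def invert(bits):
    """Flip every bit of a string made of '0' and '1'."""
    return "".join("1" if b == "0" else "0" for b in bits)


def compress_type_01(code):
    """01-type rule: invert every bit, then delete the leading 1."""
    if not code.startswith("01"):
        raise ValueError("not a 01-type character code string")
    return invert(code)[1:]


def restore_type_01(compressed):
    """Reverse of the 01-type rule: put the leading 1 back, then invert."""
    return invert("1" + compressed)
```

A round trip such as `restore_type_01(compress_type_01("0110001"))` returns the original string, and the compressed form is one bit shorter.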
110: Character code system information input unit
120: message setting unit
130:
140: Memory
150:
Claims (16)
Compressing the message based on the character code information by referring to a memory unit storing character code information according to a set character code system,
Wherein the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses the character code string corresponding to the character string or to each character according to whether it is recognized as an 11-type character code string having 11 on the most significant bit side, a 10-type character code string having 10, or a 0-type character code string having 0.
The step of compressing the message may include: when the character code string is an 11-type character code string,
Generating, by the compression unit, a first code string by inverting each bit of the plurality of 1s that precede the first 10 encountered from the most significant bit of the character code string, and generating a second code string by inverting each bit of the remaining code string except for the plurality of 1s and deleting the leading zero;
Combining the first code string and the second code string to generate a third code string; And
And adding an origin code indicating that the character string is an 11 type character code string at a predetermined position of the third code string.
Wherein in the compressing of the message, when the character code string is a 10-type character code string, the compressing unit inverts each bit of the character code string and deletes the leading zero.
Wherein, in the step of compressing the message, when the character code string is a 0-type character code string, the compression unit adds an origin code indicating a 0-type character code string at a predetermined position in the character code string.
A memory unit storing character code information according to a set character code system; And
And a compression unit for referring to the memory unit and compressing the message based on the character code information,
Wherein the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses the character code string corresponding to the character string or to each character according to whether it is recognized as an 11-type character code string having 11 on the most significant bit side, a 10-type character code string having 10, or a 0-type character code string having 0.
If the character code string is an 11-type character code string,
The compression unit generates a first code string by inverting each bit of the plurality of 1s that precede the first 10 encountered from the most significant bit of the character code string, generates a second code string by inverting each bit of the remaining code string except for the plurality of 1s and deleting the leading zero, and adds, at a predetermined position of a third code string generated by combining the first code string and the second code string, an origin code indicating that the string is an 11-type character code string.
If the character code string is a 10-type character code string,
Wherein the compression unit inverts each bit of the character code string and deletes the leading zeros.
If the character code string is a 0 type character code string,
Wherein the compression unit adds an origin code indicating that the character code string is a 0 type character code string at a predetermined position in the character code string.
Compressing the message based on the character code information by referring to a memory unit storing character code information according to a set character code system,
Wherein the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses the character code string corresponding to the character string or to each character according to whether it is recognized as a 00-type character code string having 00 on the most significant bit side, a 01-type character code string having 01, or a 1-type character code string having 1.
The step of compressing the message may include: when the character code string is a 00 type character code string,
Generating, by the compression unit, a first code string by inverting each bit of the plurality of zeros that precede the first 01 encountered from the most significant bit of the character code string, and generating a second code string by inverting each bit of the remaining code string except for the plurality of zeros and deleting the leading 1;
Combining the first code string and the second code string to generate a third code string; And
And adding an origin code indicating that the character string is a 00 type character code string at a predetermined position of the third code string.
Wherein, when compressing the message, if the character code string is a 01 type character code string, the compressing unit inverts each bit of the character code string and deletes the leading one.
Wherein, when the character code string is a 1-type character code string, the compression unit adds an origin code indicating that the character code string is a 1-type character code string at a predetermined position in the character code string.
A memory unit storing character code information according to a set character code system; And
And a compression unit for referring to the memory unit and compressing the message based on the character code information,
Wherein the compression unit compresses the message in units of a series of character strings or of individual characters, and compresses the character code string corresponding to the character string or to each character according to whether it is recognized as a 00-type character code string having 00 on the most significant bit side, a 01-type character code string having 01, or a 1-type character code string having 1.
If the character code string is a 00-type character code string,
The compression unit generates a first code string by inverting each bit of the plurality of zeros that precede the first 01 encountered from the most significant bit of the character code string, generates a second code string by inverting each bit of the remaining code string except for the plurality of zeros and deleting the leading 1, and adds, at a predetermined position of a third code string generated by combining the first code string and the second code string, an origin code indicating that the string is a 00-type character code string.
If the character code string is a 01 type character code string,
Wherein the compression unit inverts each bit of the character code string and deletes the first 1 in the character code string.
If the character code string is a 1-type character code string,
Wherein the compressing section adds an origin code indicating that the character code string is a type 1 character code string at a predetermined position of the character code string.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150134302A KR101682829B1 (en) | 2015-09-23 | 2015-09-23 | Message compression method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101682829B1 (en) | 2016-12-12 |
Family
ID=57574143
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150134302A KR101682829B1 (en) | 2015-09-23 | 2015-09-23 | Message compression method and apparatus |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101682829B1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20070043334A (en) * | 2005-10-21 | 2007-04-25 | 주식회사 비즈위너스 | Compression code, letter compression method and device |
KR20140145437A (en) * | 2013-06-13 | 2014-12-23 | 김정훈 | Binary data compression and decompression method and apparatus |
KR20150077194A (en) * | 2013-12-27 | 2015-07-07 | 김정훈 | Binary data compression and restoration method and apparatus |
KR20150093060A (en) * | 2014-02-06 | 2015-08-17 | 김정훈 | Binary data compression and restoration method and apparatus |
- 2015-09-23: KR application KR1020150134302A, patent KR101682829B1 (active, IP Right Grant)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |