Image data compression method and device
Technical Field
The present invention relates to compression technologies, and in particular, to a method and an apparatus for compressing image data.
Background
Compression techniques are mainly used to save storage space. Existing compression methods can be divided into lossless compression and lossy compression. The compression effect is usually expressed as the compression ratio: the larger the compression ratio, the more space the compression saves.
To ensure that the information contained in the compressed data is consistent with the information contained in the data before compression, a lossless compression method can be adopted. However, according to Shannon's first theorem (the source coding theorem), the compression ratio of lossless compression depends on the information entropy of the source data: the larger the information entropy, the smaller the compression ratio. A natural image rich in color information has a large information entropy, so the compression ratio of lossless compression is low; that is, it is difficult to save storage space by lossless compression.
To solve the above problem, a lossy compression method, such as the Joint Photographic Experts Group (JPEG) compression method, is usually adopted for image compression; it increases the compression ratio by discarding image details that are not easily perceived by the human eye. Lossy compression offers adjustable quality and a high compression ratio, but the compression process sequentially comprises color space conversion, downsampling, discrete cosine transform (DCT), quantization, zig-zag reordering, entropy coding, and so on. The compression steps are numerous, the operation is complex, and the compression speed is slow.
Disclosure of Invention
The embodiment of the invention provides a method and a device for compressing image data, which are used for solving the problems of multiple compression steps, complex operation and low compression speed of the conventional image data compression method.
In a first aspect, an embodiment of the present invention provides a method for compressing image data, including:
performing a mask operation and a hash processing operation on data to be compressed to obtain mask data corresponding to the data to be compressed and a key value of a hash table corresponding to the mask data, wherein the data to be compressed is data of preset bytes read from a starting address of a data stream to be compressed of an input data stream; judging whether an effective address is stored at the key value position; if yes, reading a reference data stream from the effective address of the input data stream, and comparing the data stream to be compressed with the reference data stream to obtain a compression result.
According to the method, before the image data is compressed, the mask operation is firstly carried out on the data to be compressed, so that repeated data segments in the image data are increased, the compression ratio of the image data is improved, and meanwhile, the mask operation is only added in the existing lossless compression process, so that the compression steps are fewer, the operation is simple, and the compression speed is improved.
With reference to the first aspect, in a first possible implementation manner of the first aspect, performing a masking operation on data to be compressed to obtain masked data, where the masking operation includes:
acquiring a mask of the preset compression error according to the preset compression error; and performing bitwise AND operation on each byte data of the data to be compressed and the mask to obtain masked mask data.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the obtaining a mask of the preset compression error according to the preset compression error includes:
judging whether the preset compression error is a power of 2 minus 1, such as 1, 3 or 7; if so, negating the preset compression error bit by bit to obtain the mask of the preset compression error; if not, acquiring the smallest power of 2 larger than the preset compression error, subtracting 1 from it, and then negating the result bit by bit to obtain the mask of the preset compression error.
The mask is determined according to the preset compression error input by the user, so that the compression loss of the image data compression method provided by the invention is within the preset range of the user, and the compression effect is ensured.
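The mask derivation described above can be sketched as follows. This is only an illustrative sketch: the function name `mask_for_error` and the restriction of the preset compression error to one byte (0 to 255) are assumptions made for the example and do not come from the embodiments.

```python
def mask_for_error(err: int) -> int:
    """Return a one-byte mask for a preset compression error (hypothetical helper)."""
    if err < 0 or err > 255:
        raise ValueError("preset compression error must fit in one byte")
    if (err + 1) & err == 0:
        # err + 1 is a power of 2 (err is 1, 3, 7, ...): negate err directly.
        low_bits = err
    else:
        # Smallest power of 2 larger than err, minus 1.
        low_bits = (1 << err.bit_length()) - 1
    # Bitwise negation, truncated to 8 bits.
    return ~low_bits & 0xFF


print(format(mask_for_error(3), "08b"))  # error 3 clears the lower two bits
print(format(mask_for_error(5), "08b"))  # error 5 is rounded up to 7
```

Under these assumptions, a preset compression error of 5 is treated like an error of 7, so the lower three bits are cleared.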
With reference to the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, comparing the data stream to be compressed with the reference data stream to obtain a compression result, includes:
reading data of one byte from a starting address along a data stream to be compressed in sequence as first data to be compared, and moving a first pointer pointing to the starting address backward by one byte; reading data of one byte from the effective address in sequence along the reference data stream as second data to be compared, and moving a second pointer pointing to the effective address backward by one byte; and obtaining a compression result according to the comparison result of the first data to be compared and the second data to be compared.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, obtaining a compression result according to a comparison result of the first data to be compared and the second data to be compared includes:
judging whether the difference value of the first data to be compared and the second data to be compared accords with a preset rule or not; if not, acquiring a termination address according to the address currently pointed by the first pointer; if so, reading again to obtain first data to be compared and second data to be compared according to the first pointer and the second pointer, moving the first pointer and the second pointer backwards by one byte respectively, judging whether the difference value of the first data to be compared and the second data to be compared, which are obtained by reading again, accords with a preset rule or not until the first data to be compared and the second data to be compared do not accord with the preset rule, and obtaining a termination address according to the current address pointed by the first pointer; and obtaining a compression result according to the termination address, the starting address and the effective address.
With reference to the fourth possible implementation manner of the first aspect, in a fifth possible implementation manner of the first aspect, obtaining a compression result according to the termination address, the starting address, and the effective address includes:
acquiring a matching length according to the termination address and the start address; obtaining an offset value according to the starting address and the effective address; and obtaining a compression result according to the matching length and the offset value.
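The byte-by-byte comparison and the derivation of the matching length and offset value may be illustrated by the sketch below. The function `extend_match`, the callback `conforms`, and the convention that the termination address is the first position that no longer conforms to the preset rule are assumptions for illustration; addresses are modeled as indices into a single `bytes` object.

```python
from typing import Callable, Tuple


def extend_match(data: bytes, start: int, effective: int,
                 conforms: Callable[[int, int], bool]) -> Tuple[int, int]:
    """Return (matching length, offset value) for a match found at `start`."""
    first = start        # first pointer: walks the data stream to be compressed
    second = effective   # second pointer: walks the reference data stream
    while first < len(data) and conforms(data[first], data[second]):
        first += 1       # move both pointers backward (onward) by one byte
        second += 1
    termination = first
    match_length = termination - start   # from termination and start address
    offset = start - effective           # from start and effective address
    return match_length, offset


# Example rule: two bytes conform when they agree after clearing the low two bits.
same_after_mask = lambda a, b: (a & 0xFC) == (b & 0xFC)
print(extend_match(b"AAABNAABBN", 5, 0, same_after_mask))
```

Because the effective address always precedes the start address, the second pointer stays inside the input while the first pointer is in range.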
With reference to the fourth possible implementation manner of the first aspect, in a sixth possible implementation manner of the first aspect, if the preset compression error plus 1 is an exponential power of 2, the determining whether a difference between the first to-be-compared data and the second to-be-compared data meets a preset rule includes:
performing mask operation on the first data to be compared to obtain masked first data to be compared; performing mask operation on the second data to be compared to obtain masked second data to be compared; judging whether the difference value of the masked first data to be compared and the masked second data to be compared is 0; if so, determining that the difference value of the first data to be compared and the second data to be compared accords with a preset rule; if not, determining that the difference value of the first data to be compared and the second data to be compared does not accord with a preset rule.
With reference to the fourth possible implementation manner of the first aspect, in a seventh possible implementation manner of the first aspect, if the preset compression error plus 1 is not an exponential power of 2, the determining whether a difference between the first to-be-compared data and the second to-be-compared data meets a preset rule includes:
judging whether the absolute value of the difference value of the first data to be compared and the second data to be compared is smaller than a preset compression error or not; if so, determining that the difference value of the first data to be compared and the second data to be compared accords with a preset rule; if not, determining that the difference value of the first data to be compared and the second data to be compared does not accord with a preset rule.
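The two forms of the preset rule described in the sixth and seventh implementation manners can be combined in one predicate, sketched below. The name `conforms` is hypothetical, and the strict "smaller than" comparison in the second branch follows the wording above literally.

```python
def conforms(first: int, second: int, err: int) -> bool:
    """Check two bytes against the preset rule for a preset compression error."""
    if (err + 1) & err == 0:
        # err + 1 is a power of 2: compare the masked bytes.
        mask = ~err & 0xFF
        return (first & mask) - (second & mask) == 0
    # Otherwise compare the absolute difference with the preset error.
    return abs(first - second) < err


print(conforms(0x41, 0x42, 3))   # A and B agree after masking two low bits
print(conforms(0x43, 0x44, 3))   # differ by 1 but fall in different masked groups
print(conforms(10, 14, 5))       # absolute difference 4 is smaller than 5
```

Note that in the masked branch two bytes that differ by only 1 may still fail the rule when they straddle a mask boundary, as the second call shows.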
With reference to the first aspect and any one of the first to the seventh possible implementation manners of the first aspect, in an eighth possible implementation manner of the first aspect, the method further includes:
if no effective address is stored at the position of the key value, storing the initial address at the position of the key value of the hash table, moving a first pointer pointing to the initial address backwards by one byte, and sequentially reading data of preset bytes as new data to be compressed; judging whether a key value position of a hash table corresponding to the new data to be compressed stores an effective address or not; if so, acquiring the termination address currently pointed by the first pointer; if not, the first pointer is continuously moved backwards by one byte, new data to be compressed are read again until an effective address is stored at the key value position of the hash table corresponding to the new data to be compressed, and the termination address currently pointed by the first pointer is obtained; and acquiring the copy length according to the termination address and the start address, reading the copy data according to the start address and the copy length, and acquiring a compression result according to the copy data.
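The copy branch described above may be sketched as follows. Several details are assumptions for illustration only: the hash table is modeled as a dictionary keyed directly by the masked window, every scanned position is stored in the table, and a tail shorter than the preset bytes is copied as is. The name `scan_literal_run` is hypothetical.

```python
from typing import Dict, Tuple


def scan_literal_run(data: bytes, start: int, table: Dict[bytes, int],
                     mask: int, preset_bytes: int = 4) -> Tuple[int, bytes]:
    """Advance until a window with a stored address is found; return
    (termination address, copy data)."""
    first = start
    while first + preset_bytes <= len(data):
        key = bytes(b & mask for b in data[first:first + preset_bytes])
        if key in table:
            break                  # an effective address is stored: stop here
        table[key] = first         # store the address at the key position
        first += 1                 # move the first pointer by one byte
    else:
        first = len(data)          # no further window: copy the remaining tail
    termination = first
    copy_length = termination - start
    return termination, data[start:start + copy_length]


print(scan_literal_run(b"AAABNAABBN", 0, {}, 0xFC))
```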
With reference to the first aspect and any one of the sixth to seventh possible implementation manners of the first aspect, in a ninth possible implementation manner of the first aspect, the start address of the data stream to be compressed in the current compression process is the termination address of the previous compression process or a preset start address.
The following describes the image data compression apparatus provided in an embodiment of the present invention. The apparatus corresponds one to one with the method and is used to implement the image data compression method of the above embodiments; it has the same technical features and technical effects, which are not described again here.
In a second aspect, an embodiment of the present invention provides an apparatus for compressing image data, including:
the mask processing module is used for performing mask operation on data to be compressed to obtain mask data after mask processing, wherein the data to be compressed is data of preset bytes read from a starting address of a data stream to be compressed of an input data stream;
the hash processing module is used for carrying out hash processing on the mask data to obtain a key value of a hash table corresponding to the mask data;
the effective address judging module is used for judging whether an effective address is stored at the position of the key value;
and the compression module is used for reading the reference data stream from the effective address of the input data stream when the effective address is stored at the key value position, and comparing the data stream to be compressed with the reference data stream to obtain a compression result.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the mask processing module is specifically configured to: acquiring a mask of the preset compression error according to the preset compression error; and performing bitwise AND operation on each byte data of the data to be compressed and the mask to obtain masked mask data.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the mask processing module is specifically configured to: judging whether the preset compression error plus 1 is a power of 2; if so, negating the preset compression error bit by bit to obtain the mask of the preset compression error; if not, acquiring the smallest power of 2 larger than the preset compression error, subtracting 1 from it, and then negating the result bit by bit to obtain the mask of the preset compression error; and performing bitwise AND operation on each byte data of the data to be compressed and the mask to obtain masked mask data.
With reference to the second possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, the compression module includes:
a first data-to-be-compared reading unit configured to read data of one byte in order from a start address along a data stream to be compressed as first data to be compared, and move a first pointer pointing to the start address backward by one byte;
a second data-to-be-compared reading unit configured to read data of one byte in order from the effective address along the reference data stream as second data-to-be-compared, and move a second pointer pointing to the effective address backward by one byte;
and the compression unit is used for obtaining a compression result according to the comparison result of the first data to be compared and the second data to be compared.
With reference to the third possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the compression unit is specifically configured to: judging whether the difference value of the first data to be compared and the second data to be compared accords with a preset rule or not; if not, acquiring a termination address according to the address currently pointed by the first pointer; if so, having the first data-to-be-compared reading unit and the second data-to-be-compared reading unit read the first data to be compared and the second data to be compared again according to the first pointer and the second pointer and move the first pointer and the second pointer backward by one byte respectively, and judging whether the difference value of the first data to be compared and the second data to be compared obtained by reading again accords with the preset rule, until the first data to be compared and the second data to be compared do not accord with the preset rule, and then acquiring the termination address according to the address currently pointed by the first pointer; and obtaining a compression result according to the termination address, the starting address and the effective address.
With reference to the fourth possible implementation manner of the second aspect, in a fifth possible implementation manner of the second aspect, the compression unit is specifically configured to: acquiring a matching length according to the termination address and the starting address, acquiring an offset value according to the starting address and the effective address, and acquiring a compression result according to the matching length and the offset value.
With reference to the fourth possible implementation manner of the second aspect, in a sixth possible implementation manner of the second aspect, if the preset compression error plus 1 is an exponential power of 2, the compression unit is specifically configured to: performing mask operation on the first data to be compared to obtain masked first data to be compared; performing mask operation on the second data to be compared to obtain masked second data to be compared; judging whether the difference value of the masked first data to be compared and the masked second data to be compared is 0; if so, determining that the difference value of the first data to be compared and the second data to be compared accords with a preset rule; if not, determining that the difference value of the first data to be compared and the second data to be compared does not accord with a preset rule.
With reference to the fourth possible implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect, if the preset compression error plus 1 is not an exponential power of 2, the compression unit is specifically configured to: judging whether the absolute value of the difference value of the first data to be compared and the second data to be compared is smaller than a preset compression error or not; if so, determining that the difference value of the first data to be compared and the second data to be compared accords with a preset rule; if not, determining that the difference value of the first data to be compared and the second data to be compared does not accord with a preset rule.
With reference to the second aspect and any one of the first to the seventh possible implementation manners of the second aspect, in an eighth possible implementation manner of the second aspect, the apparatus further includes a copying module, where the module is configured to: when no effective address is stored at the position of the key value, storing the initial address at the position of the key value of the hash table, moving a first pointer pointing to the initial address backwards by one byte, and sequentially reading data of preset bytes as new data to be compressed; judging whether a key value position of a hash table corresponding to the new data to be compressed stores an effective address or not; if so, acquiring the termination address currently pointed by the first pointer; if not, the first pointer is continuously moved backwards by one byte, new data to be compressed are read again until an effective address is stored at the key value position of the hash table corresponding to the new data to be compressed, and the termination address currently pointed by the first pointer is obtained; acquiring the copy length according to the termination address and the start address; reading the copied data according to the initial address and the copy length; and obtaining a compression result according to the copied data.
With reference to the second aspect and any one of the sixth to seventh possible implementation manners of the second aspect, in a ninth possible implementation manner of the second aspect, the start address of the data stream to be compressed in the current compression process is the termination address of the previous compression process or a preset start address.
In a third aspect, an embodiment of the present invention provides an apparatus for compressing image data, including a processor and a storage medium, where the storage medium stores instructions, and when the instructions are called by the processor, the processor performs the method for compressing image data according to the first aspect or any one of the first to ninth possible implementation manners of the first aspect.
The image data compression device is capable of implementing the method embodiments of the first aspect, and has the same technical features and technical effects as the method of the first aspect, which are not described in detail herein.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without inventive effort.
FIG. 1 is a flowchart illustrating a first embodiment of a method for compressing image data according to the present invention;
FIG. 2 is a flowchart illustrating a second embodiment of a method for compressing image data according to the present invention;
FIG. 3 is a flowchart illustrating a third embodiment of a method for compressing image data according to the present invention;
FIG. 4 is a flowchart illustrating a fourth embodiment of a method for compressing image data according to the present invention;
FIG. 5 is a schematic structural diagram of a first embodiment of an apparatus for compressing image data according to the present invention;
FIG. 6 is a schematic structural diagram of a second embodiment of an apparatus for compressing image data according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, when a data stream is compressed losslessly, data segments that reappear in the data stream are searched for, the length of each later repeated data segment and the address offset between it and the earlier occurrence are determined, and the later repeated data segment is expressed by the address offset and the length, thereby compressing the data. For example, the data stream is the character string AAABNAAABAABNA, wherein one character occupies one byte. When a repeated data segment must be at least 4 bytes long to be compressed, the character string can be compressed to AAABN (5,4) (8,5), wherein 5 and 8 indicate that the address offsets between the later repeated data segments and the earlier repeated data segments are 5 and 8 bytes, and 4 and 5 indicate that the lengths of the repeated data segments are 4 and 5 bytes. The originally longer character string is thus represented by a shorter one, which saves storage space without any data loss in the compression process. The more repeated data segments a data stream contains, the higher its compression ratio. However, image data is rich in detail and contains few repeated data segments, so its compression ratio is low.
Before compression starts, the minimum number of bytes of a repeated data segment to be compressed needs to be preset; it is recorded as the preset bytes. The larger the number of preset bytes, the harder it is to find identical data segments, which lowers the compression ratio. The smaller the number of preset bytes, the more compression operations are performed, which slows compression, and the compressed data may occupy no less storage space than the original data. In the following embodiments of the present invention, the method for compressing image data is described by taking 4 preset bytes as an example, but the number of preset bytes is not limited thereto.
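How an (offset, length) pair refers back to earlier data can be checked with a small decoder. The sketch below is illustrative only; the token representation (a string for literal data, a tuple for a back-reference) and the name `decode` are assumptions, not part of the prior-art description.

```python
from typing import List, Tuple, Union

Token = Union[str, Tuple[int, int]]  # literal text or (offset, length)


def decode(tokens: List[Token]) -> str:
    """Expand literals and (offset, length) back-references into a string."""
    out: List[str] = []
    for token in tokens:
        if isinstance(token, str):
            out.extend(token)            # literal characters are copied as is
        else:
            offset, length = token
            for _ in range(length):
                # Copy one character from `offset` positions back.
                out.append(out[-offset])
    return "".join(out)


expanded = decode(["AAABN", (5, 4), (8, 5)])
print(expanded, len(expanded))
```

Expanding AAABN (5,4) (8,5) in this way yields a 14-character string from a literal of 5 characters and two back-references.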
The embodiment of the invention provides an error-controllable lossy image data compression method and device, aiming at the problems that the compression ratio of the existing lossless image compression algorithm is low, the number of compression steps of the lossy image compression algorithm is large, the operation is complex, and the compression speed is low.
The following describes in detail the method and apparatus for compressing image data according to an embodiment of the present invention with specific embodiments.
Fig. 1 is a flowchart illustrating a first embodiment of a method for compressing image data according to the present invention. The execution subject of the method is a compression device of image data, which can be implemented by any software and/or hardware. As shown in fig. 1, the method includes:
step 101, performing mask operation on data to be compressed to obtain mask data after mask processing;
step 102, carrying out hash processing on the mask data to obtain a key value of a hash table corresponding to the mask data;
step 103, judging whether the key value position stores an effective address or not; if yes, go to step 104;
step 104, reading a reference data stream from an effective address of an input data stream, and comparing the data stream to be compressed with the reference data stream to obtain a compression result;
the data to be compressed is data of a preset byte read from a start address of the data stream to be compressed of the input data stream.
Illustratively, an image is composed of a plurality of pixel points, the value of each pixel point is image data, and the image data determines the color presented when the image is displayed. Considering that the image details are rich, the value of each pixel point is changed after the image is lossy compressed, but the image error generated by the change is not easy to be perceived by human eyes, so that the image can be lossy compressed to improve the compression ratio. The lossy compression of the image is lossy compression of image data.
Specifically, in step 101, the image data to be compressed is used as the input data stream, and the input data stream is compressed sequentially. Data that has already been compressed is referred to as compressed data, the data obtained by the compression processing is referred to as the compression result, and data that has not yet been compressed is referred to as the data stream to be compressed. When compression processing starts, data of the preset number of bytes is read sequentially from the start address of the data stream to be compressed as the data to be compressed, and a mask operation is performed on the data to be compressed to obtain the masked mask data.
The data to be compressed includes data of the preset number of bytes. Each byte of data represents one item of image data and includes 8 bits, so its value is an integer from 0 to 255. Masking the data to be compressed means masking each byte of data: the low-order bits of each byte are replaced by 0, which reduces the precision of each byte. For example, the binary data before masking are X = 0100_0001 (i.e., the character A) and Y = 0100_0010 (i.e., the character B). Masking the lower two bits of X and Y gives the masked data XX = 0100_0000 and YY = 0100_0000. The precision of the masked data is reduced and XX = YY, that is, the character A is considered to be the same as the character B. By masking the image data, the originally different characters A and B can be regarded as the same data in the subsequent compression process; in other words, similar data within the error range are regarded as the same data, so that the number of repeated data segments increases within the controllable error range and the compression ratio improves.
Specifically, in step 102, hash processing is performed on the mask data obtained in step 101, so that the mask data need not be used directly to search the compressed data for repeated data segments; this reduces memory usage and increases the search speed. Illustratively, a hash algorithm maps the mask data to a value, which is used as the key value, in the hash table, of the data to be compressed corresponding to the mask data. Different key values correspond to different storage spaces in the hash table, and each storage space stores the start address of the corresponding data to be compressed in the data stream to be compressed. Hashing the data to be compressed simplifies the searching and comparing process.
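The masking of the characters A and B can be reproduced with a few lines of code. The helper name `mask_bytes` is hypothetical; the mask value 1111_1100 corresponds to clearing the lower two bits as in the example above.

```python
def mask_bytes(data: bytes, mask: int) -> bytes:
    """Apply a one-byte mask to every byte of the data to be compressed."""
    return bytes(b & mask for b in data)


x = 0b0100_0001          # character A
y = 0b0100_0010          # character B
mask = 0b1111_1100       # lower two bits cleared

xx = x & mask
yy = y & mask
print(format(xx, "08b"), format(yy, "08b"), xx == yy)
print(mask_bytes(b"AAAB", mask) == mask_bytes(b"AABB", mask))
```

After masking, the four-byte windows AAAB and AABB become identical, which is what allows them to be treated as a repeated data segment.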
Specifically, in step 103, it is determined whether an effective address is stored at a key value position of the hash table corresponding to the mask data, and when the effective address is stored at the key value position corresponding to the mask data, it may be considered that data similar to the data to be compressed already exists in the input data stream before the data to be compressed, and the similar data is hashed to obtain a key value identical to the key value corresponding to the data to be compressed. Before the data to be compressed, the similar data is subjected to hash processing to obtain a key value, and the starting address of the similar data is stored at the position of the key value of the hash table. When the effective address is not stored at the key value position corresponding to the mask data, the data to be compressed is considered to appear in the input data stream for the first time, and other similar data do not exist.
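Steps 102 and 103 may be pictured with the sketch below. The embodiments do not specify a hash algorithm, so the multiplicative hash, the table size of 2^12 entries, the marker `INVALID` for an empty position, and the function names are all assumptions made for illustration. Storing the address when the position is empty belongs to the branch taken when no effective address is found.

```python
from typing import List, Optional

TABLE_BITS = 12          # assumed table size: 2**12 positions
INVALID = -1             # assumed marker for "no effective address stored"


def hash_key(masked: bytes) -> int:
    """Map 4 bytes of mask data to a key value (assumed multiplicative hash)."""
    value = int.from_bytes(masked, "little")
    return ((value * 2654435761) & 0xFFFFFFFF) >> (32 - TABLE_BITS)


def lookup_or_store(table: List[int], masked: bytes, start: int) -> Optional[int]:
    """Return the effective address at the key position, or store `start` there."""
    key = hash_key(masked)
    effective = table[key]
    if effective == INVALID:
        table[key] = start   # first occurrence: remember its start address
        return None
    return effective


table = [INVALID] * (1 << TABLE_BITS)
print(lookup_or_store(table, b"@@@@", 0))   # nothing stored yet
print(lookup_or_store(table, b"@@@@", 5))   # address of the earlier occurrence
```

With a real hash function, different mask data can share a key value, which is why a stored address only indicates that similar data may exist and the byte-by-byte comparison of step 104 is still needed.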
Specifically, in step 104, when an effective address is stored at the key value position corresponding to the mask data, it may be considered that a data segment similar to the data to be compressed exists in the compressed data, that the start address of this data segment is the effective address, and that the length of the similar data segment is the preset number of bytes. Further bytes are then read onward along the data to be compressed and along the similar data segment, and it is judged whether they are similar. That is, the reference data stream is read from the effective address of the input data stream, and the data stream to be compressed is compared with the reference data stream. Illustratively, when the input data stream is the character string AAABNAABBN and the data stream to be compressed is the character string AABBN, the method in steps 101 to 103 may be used to confirm that a similar data segment AAAB exists in the input data stream before the data to be compressed AABB, so that the reference data stream AAABNAABBN is obtained. The data stream to be compressed is compared with the reference data stream, the data segments that can be regarded as repeated are determined to be AABBN and AAABN, and compression is then performed to obtain the compression result. Therefore, the character string AAABNAABBN can be compressed into AAABN (5, 5).
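Steps 101 to 104 can be tied together in a compact end-to-end sketch that reproduces the AAABNAABBN example. It is not the definitive implementation: the hash table is replaced by a dictionary keyed by the masked window, positions inside a match are not inserted into the table, and the token format (literal `bytes` or an `(offset, length)` tuple) and the name `compress` are assumptions for illustration.

```python
from typing import Dict, List, Tuple, Union

PRESET_BYTES = 4
Token = Union[bytes, Tuple[int, int]]    # literal bytes or (offset, length)


def compress(data: bytes, mask: int) -> List[Token]:
    table: Dict[bytes, int] = {}         # masked window -> start address
    tokens: List[Token] = []
    literal_start = 0
    pos = 0
    while pos + PRESET_BYTES <= len(data):
        # Step 101: mask the data to be compressed.
        key = bytes(b & mask for b in data[pos:pos + PRESET_BYTES])
        # Steps 102 and 103: look up the key position.
        ref = table.get(key)
        if ref is None:
            table[key] = pos
            pos += 1
            continue
        if literal_start < pos:
            tokens.append(data[literal_start:pos])   # pending copy data
        # Step 104: compare with the reference data stream byte by byte.
        first, second = pos, ref
        while first < len(data) and (data[first] & mask) == (data[second] & mask):
            first += 1
            second += 1
        tokens.append((pos - ref, first - pos))      # (offset, length)
        pos = literal_start = first
    if literal_start < len(data):
        tokens.append(data[literal_start:])
    return tokens


print(compress(b"AAABNAABBN", 0b1111_1100))   # lower two bits masked
print(compress(b"AAABNAABBN", 0b1111_1111))   # no masking, lossless
```

With the lower two bits masked, the sketch emits the literal AAABN followed by the pair (5, 5); without masking, no repeated four-byte window is found and the whole string is emitted as a literal.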
According to the image data compression method provided by the embodiment of the invention, mask operation is firstly carried out on the data to be compressed before the image data is compressed, so that repeated data segments in the image data are increased, the compression ratio of the image data is improved, and meanwhile, as the mask operation is only added in the existing lossless compression process, the compression steps are fewer, the operation is simple, and the compression speed is improved.
Optionally, on the basis of the embodiment shown in fig. 1, the masking operation in step 101 is described in detail. Specifically, performing a masking operation on data to be compressed to obtain masked data after masking processing includes:
step 1011, obtaining a mask of the preset compression error according to the preset compression error;
step 1012, performing bitwise and operation on each byte data of the data to be compressed and the mask to obtain masked mask data.
For example, a user may set an acceptable preset compression error according to the purpose of the image data compression. The larger the preset compression error, the more image data are regarded as the same data during compression; the smaller the preset compression error, the fewer image data are regarded as the same data during compression. For example, when the preset compression error is 1, the binary numbers 0100_0000 and 0100_0001 can be considered to be the same; when the preset compression error is 3, the binary numbers 0100_0000, 0100_0001, 0100_0010 and 0100_0011 can all be considered to be the same. In the specific implementation, the step of regarding a plurality of data as the same data is the masking operation.
Specifically, a mask of the preset compression error is obtained according to the preset compression error input by the user. The mask is one byte in which, usually, the lower bits are 0 and the remaining upper bits are 1; the number of lower bits set to 0 is determined by the preset compression error. For example, when the preset compression error is 3, the lower two bits of the mask are set to 0, so that the 3 data whose original lower two bits are 01, 10 and 11 become, after the masking operation, the same as the data whose lower two bits are 00.
After the mask is obtained, a bitwise AND operation is performed on each byte of the data to be compressed and the mask to obtain the masked mask data. For example, when the mask is 1111_1100 and a byte of the data to be compressed is the binary number 0100_0011, an AND operation is performed on 0100_0011 and the mask 1111_1100 bit by bit to obtain the mask data 0100_0000. Since the AND of any bit with 0 is 0, the essence of the masking operation is to set the lower two bits of the original byte data to 0.
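The bitwise AND described above can be sketched in a few lines of Python. This is only an illustration: the function name `apply_mask` is hypothetical and does not appear in the embodiment, and the mask is assumed to be a single byte.

```python
def apply_mask(data: bytes, mask: int) -> bytes:
    """Bitwise-AND every byte of `data` with the one-byte `mask` (hypothetical helper)."""
    return bytes(b & mask for b in data)


mask = 0b1111_1100                      # lower two bits cleared
masked = apply_mask(bytes([0b0100_0011]), mask)
print(format(masked[0], "08b"))         # 01000000

# 'A' (0100_0001) and 'B' (0100_0010) collapse to the same masked value,
# so the segments AAAB and AABB yield identical mask data.
print(apply_mask(b"AAAB", mask) == apply_mask(b"AABB", mask))  # True
```

Because the two segments produce the same mask data, they would also produce the same key value in the subsequent hash processing.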
Further, the process of obtaining the mask is described in detail with reference to the embodiment shown in fig. 1. Specifically, obtaining a mask of the preset compression error according to the preset compression error includes:
step 10111, judging whether the preset compression error plus 1 is a power of 2; if yes, go to step 10112; if not, go to step 10113;
step 10112, inverting the preset compression error to obtain the mask of the preset compression error;
step 10113, obtaining the minimum value among all powers of 2 larger than the preset compression error, subtracting 1 from the minimum value, and then inverting the result to obtain the mask of the preset compression error.
Specifically, when the mask is obtained according to the preset compression error input by the user, it is first determined whether the preset compression error plus 1 is a power of 2, that is, whether the preset compression error is a value such as 1, 3 or 7 that has the form 2^n - 1, where n is a positive integer.
If so, the preset compression error may be directly inverted to obtain the mask of the preset compression error; for example, 1, 3 and 7 are inverted to obtain the masks 1111_1110, 1111_1100 and 1111_1000.
If not, the minimum value among all data of the form 2^n that are larger than the preset compression error is obtained; for example, when the preset compression error is 2, the minimum value is determined to be 4. Then 1 is subtracted from the minimum value to obtain the corresponding data of the form 2^n - 1, and finally, in the same way, the data of the form 2^n - 1 is inverted to obtain the mask of the preset compression error.
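Steps 10111 to 10113 can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: the function name `mask_from_error` is hypothetical, the mask is assumed to be one byte, and the preset compression error is assumed to lie between 1 and 255.

```python
def mask_from_error(err: int) -> int:
    """Derive the one-byte mask from a preset compression error (hypothetical helper)."""
    if not 1 <= err <= 255:             # assumption: the error fits in one byte
        raise ValueError("preset compression error must be in 1..255")
    if (err + 1) & err == 0:            # step 10111: err + 1 is a power of 2
        low = err                       # step 10112: invert err itself
    else:
        power = 1
        while power <= err:             # step 10113: smallest power of 2 above err
            power <<= 1
        low = power - 1                 # subtract 1 to get the 2^n - 1 form
    return ~low & 0xFF                  # invert and keep a single byte


for err in (1, 3, 7, 2):
    print(err, format(mask_from_error(err), "08b"))
# 1 11111110
# 3 11111100
# 7 11111000
# 2 11111100
```

Note that a preset compression error of 2 yields the same mask as a preset compression error of 3, which is why the masking operation can widen the error range when the error is not of the form 2^n - 1.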
Further, on the basis of the above embodiment, a detailed description is given to the specific compression process in step 104 when there is a data segment that can be regarded as a duplicate in the compressed data. Fig. 2 is a flowchart illustrating a second embodiment of the image data compression method provided by the present invention. As shown in fig. 2, comparing the data stream to be compressed with the reference data stream to obtain a compression result, includes:
step 201, along the data stream to be compressed, sequentially reading data of one byte from the start address as first data to be compared, and moving a first pointer pointing to the start address backward by one byte;
step 202, reading data of one byte from the effective address in sequence along the reference data stream as second data to be compared, and moving a second pointer pointing to the effective address backwards by one byte;
and step 203, obtaining a compression result according to the comparison result of the first data to be compared and the second data to be compared.
Step 201 and step 202 have no chronological sequence and can be executed synchronously.
Specifically, when the key value of the data to be compressed is determined to be the same as the key value of the first preset bytes of data in the reference data stream, it still cannot be fully determined that the first preset bytes of the data stream to be compressed and the first preset bytes of the reference data stream differ by no more than the preset compression error. This is because, when the preset compression error is not a value such as 1, 3 or 7, the masking operation expands the error range, so that data outside the user's preset compression error range are also regarded as the same data. Therefore, the data in the data stream to be compressed and the data in the reference data stream need to be compared byte by byte, and compression is performed according to the comparison result.
In step 201, one byte of data is read along the data stream to be compressed as the first data to be compared, and the first pointer pointing to the data stream to be compressed is then moved backward by one byte. In step 202, one byte of data is read along the reference data stream as the second data to be compared, and the second pointer pointing to the reference data stream is then moved backward by one byte. Moving the pointers one byte at a time in this way allows the data stream to be compressed and the reference data stream to be compared byte by byte.
In step 203, the first data to be compared and the second data to be compared, which are read each time, are compared to obtain similar data segments that meet a preset compression error of a user in the two data streams, and a compression result is obtained according to the similar data segments.
Optionally, the first preset bytes of the data to be compressed and of the reference data stream may also be directly regarded as the same data, and the comparison of the data stream to be compressed with the reference data stream then starts from the 5th byte, so as to accelerate compression.
Further, on the basis of the embodiment shown in fig. 2, a process of obtaining the compression result in step 203 is described in detail. Fig. 3 is a flowchart illustrating a third embodiment of a method for compressing image data according to the present invention. As shown in fig. 3, obtaining a compression result according to a comparison result of the first data to be compared and the second data to be compared includes:
step 301, judging whether a difference value between first data to be compared and second data to be compared accords with a preset rule or not; if yes, go to step 302; if not, go to step 303;
step 302, according to the first pointer and the second pointer, reading again to obtain first data to be compared and second data to be compared, moving the first pointer and the second pointer backward by one byte respectively, and executing step 301 again;
step 303, obtaining a termination address according to the address currently pointed by the first pointer;
and step 304, obtaining a compression result according to the termination address, the starting address and the effective address.
Specifically, in step 301, it is determined whether the first data to be compared and the second data to be compared conform to the preset compression error input by the user. For example, it may be determined whether the difference between the first data to be compared and the second data to be compared meets a preset rule; depending on whether the preset compression error has the form 2^n - 1, the possible implementations are as follows:
one possible implementation is:
when the preset compression error has the form 2^n - 1, judging whether the difference between the first data to be compared and the second data to be compared meets the preset rule comprises:
performing mask operation on the first data to be compared to obtain masked first data to be compared; performing mask operation on the second data to be compared to obtain masked second data to be compared; and judging whether the difference value of the first data to be compared after the mask and the second data to be compared after the mask is 0.
Specifically, according to a preset compression error, masking operation is carried out on first data to be compared and second data to be compared, the masked first data to be compared and the masked second data to be compared are subjected to difference, whether a difference value is 0 or not is determined, and if the difference value is 0, the difference value is considered to be in accordance with a preset rule; and if not, the value is determined not to be in accordance with the preset rule. By performing masking operations on data in both the data stream to be compressed and the reference data stream, the compression ratio may be increased.
Another possible implementation:
when the preset compression error does not have the form 2^n - 1, judging whether the difference between the first data to be compared and the second data to be compared meets the preset rule comprises:
judging whether the absolute value of the difference between the first data to be compared and the second data to be compared is smaller than the preset compression error.
Specifically, the first data to be compared and the second data to be compared are directly subjected to difference to obtain a difference value, whether the absolute value of the difference value is smaller than a preset compression error or not is compared, and if the absolute value of the difference value is smaller than the preset compression error, the difference value is considered to be in accordance with a preset rule; if not, the rule is not in accordance with the preset rule. By doing the difference directly, the compression speed can be increased.
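The two implementations of the preset rule can be put side by side in a short sketch. The function names `is_pow2_minus_1` and `meets_preset_rule` are hypothetical, single-byte data is assumed, and the strict "smaller than" comparison simply follows the wording of the text above.

```python
def is_pow2_minus_1(err: int) -> bool:
    """True when err has the form 2^n - 1 (1, 3, 7, ...), for err >= 1."""
    return (err + 1) & err == 0


def meets_preset_rule(first: int, second: int, err: int) -> bool:
    """Check one byte pair against the preset compression error (hypothetical helper)."""
    if is_pow2_minus_1(err):
        mask = ~err & 0xFF                       # inverting err gives the mask
        return (first & mask) - (second & mask) == 0
    return abs(first - second) < err             # direct difference, as worded in the text


print(meets_preset_rule(0b0100_0000, 0b0100_0011, 3))  # True: same after masking
print(meets_preset_rule(ord("A"), ord("H"), 3))        # False: masked values differ
print(meets_preset_rule(10, 11, 2))                    # True: |10 - 11| < 2
print(meets_preset_rule(10, 12, 2))                    # False: |10 - 12| is not < 2
```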
Specifically, in step 302, when it is determined that the first bytes of the data stream to be compressed and of the reference data stream can be regarded as the same, the second bytes, the third bytes and so on are compared in turn until a byte can no longer be regarded as the same. Illustratively, after it is determined that the first data to be compared and the second data to be compared conform to the preset compression error input by the user, new data are read, according to the first pointer and the second pointer pointing to the data stream to be compressed and the reference data stream, as new first data to be compared and new second data to be compared; the first pointer and the second pointer are each moved backward by one byte, and the new first data to be compared and the new second data to be compared are then compared, that is, step 301 is performed again.
Specifically, in step 303, when the first data to be compared and the second data to be compared do not conform to the preset compression error input by the user, the comparison between the data stream to be compressed and the reference data stream is stopped, and the termination address is obtained according to the address currently pointed to by the first pointer. Optionally, the first data to be compared and the second data to be compared may be read first, and the two pointers are each moved backward by one byte only when the two data conform to the preset compression error input by the user. The start address and the termination address of the data stream to be compressed determine the length of the repeated data segment in the data stream to be compressed.
Specifically, in step 304, the repeated data segments may be encoded according to the end address, the start address and the effective address, and the original data is replaced by the encoded data as the compression result, so that the original data stream to be compressed is compressed, and meanwhile, the current end address is the start address of the next compression.
Optionally, the process of obtaining the compression result in step 304 is described in detail with reference to the embodiment shown in fig. 3. Obtaining a compression result according to the termination address, the start address and the effective address, which specifically comprises:
acquiring a matching length according to the termination address and the start address; obtaining an offset value according to the starting address and the effective address; and obtaining a compression result according to the matching length and the offset value.
Specifically, the number of bytes of the repeated data segment, that is, the matching length, can be obtained by subtracting the start address from the termination address. The effective address in the reference data stream is then subtracted from the start address of the data stream to be compressed to obtain the position offset value between the later repeated data segment and the earlier repeated data segment. Encoding is performed according to the matching length and the offset value, and the code replaces the original data to give the compression result, so that the original data stream to be compressed is compressed.
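The pointer loop of steps 301 to 303 and the length and offset calculation of step 304 can be sketched as below. The sketch makes several assumptions that are not stated in the embodiment: addresses are byte indices into one input buffer, the preset rule is the masked comparison used for errors of the form 2^n - 1, and the function name `match_length_and_offset` is hypothetical.

```python
def match_length_and_offset(stream: bytes, start: int, effective: int, mask: int):
    """Extend a match byte by byte and return (matching length, offset value)."""
    first = start           # first pointer, walks the data stream to be compressed
    second = effective      # second pointer, walks the reference data stream
    while first < len(stream) and (stream[first] & mask) == (stream[second] & mask):
        first += 1          # both pointers move backward (onward) by one byte
        second += 1
    end = first             # termination address: first byte that no longer matches
    return end - start, start - effective


# Input AAABNAABBN: the data to be compressed starts at index 5,
# and the similar segment AAAB was stored at effective address 0.
print(match_length_and_offset(b"AAABNAABBN", 5, 0, 0b1111_1100))  # (5, 5)
```

Since the effective address always precedes the start address, the second pointer stays behind the first pointer and never runs past the end of the buffer.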
Optionally, on the basis of the above embodiment, with reference to fig. 4, a detailed process of compressing image data provided by the present invention is described, and a detailed description is given for a specific compression process when there is no data segment that can be regarded as repeated in the compressed data. Fig. 4 is a schematic flowchart of a fourth embodiment of a method for compressing image data according to the present invention, as shown in fig. 4, the method includes:
step 401, performing mask operation and hash processing on data to be compressed;
step 402, judging whether the key value position stores an effective address; if yes, go to step 403; if not, go to step 404;
step 403, reading first data to be compared in a data stream to be compressed, and reading second data to be compared in a reference data stream;
step 4031, judging whether the difference value of the first data to be compared and the second data to be compared accords with a preset rule or not; if yes, go to step 4032; if not, go to step 4033;
step 4032, reading the new first data to be compared and the new second data to be compared, and executing step 4031 again;
step 4033, obtaining the termination address, and obtaining the compression result according to the termination address, the start address and the effective address;
step 404, storing the start address at the key value position of the hash table, moving a first pointer pointing to the start address backward by one byte, and sequentially reading data of preset bytes as new data to be compressed;
step 4041, determining whether an effective address is stored at a key value position of a hash table corresponding to the new data to be compressed; if not, go to step 4042; if yes, go to step 4043;
step 4042, moving the first pointer backward by one byte, reading new data to be compressed again, and executing step 4041 again;
step 4043, obtaining the current end address pointed by the first pointer;
step 4044, obtaining the copy length according to the end address and the start address; reading the copied data according to the initial address and the copy length; and obtaining a compression result according to the copied data.
Steps 401 to 403 are the same as the corresponding steps in the embodiments shown in fig. 1 to fig. 3 and are not described again here.
Specifically, a data stream to be compressed is determined, where the start address of the data stream to be compressed in the current compression process is the termination address of the previous compression process. Mask and hash processing are then performed, and it is determined whether an effective address is stored at the key value position obtained by the hash processing. When an effective address is stored there, the length of the repeated data segment is determined and compression is performed. When no effective address is stored there, the data cannot be compressed and must be copied, and the length of the copied data also has to be determined. Specifically, after the start address of the data to be compressed is stored at the key value position of the hash table, the first pointer is moved backward by one byte, new data to be compressed are read, and mask and hash processing are performed again to determine whether an effective address exists at the key value position. This judgment is repeated until an effective address exists at the key value position, and the address currently pointed to by the first pointer is taken as the termination address. The data from the start address to the termination address is the data segment that cannot be compressed and must be copied and retained. When copying, the copy length is obtained according to the termination address and the start address; the copy data are read from the data stream to be compressed according to the start address and the copy length; and the read copy data are used as the compression result.
For example, when the input data stream is the character string MXAAABNAABBNHDEFXYZ, the preset compression error is 3, and the data stream to be compressed is the character string AABBNHDEFXYZ, steps 401, 402, 403, 4031, 4032 and 4033 are performed to confirm that a data segment AAAB that can be regarded as the same exists in the input data stream before the data to be compressed AABB, so that the reference data stream AAABNAABBNHDEFXYZ is obtained. After the data stream to be compressed is compared with the reference data stream byte by byte, it is determined that the data segments that can be regarded as repeated are "AABBN" and "AAABN", so the data stream to be compressed AABBNHDEFXYZ can be compressed to (5,5) HDEFXYZ. The termination address of the current compression cycle, that is, the position of the character H, may be the start address of the next compression cycle; optionally, the start address of a compression cycle may also be a preset start address input by the user. Steps 401, 402, 404, 4041, 4042, 4043 and 4044 are then performed to confirm that no similar data segment of at least 4 bytes exists in the input data stream before the data to be compressed HDEFXYZ, so the data segment to be copied is determined to be HDEFXYZ. The input data stream MXAAABNAABBNHDEFXYZ is therefore compressed to MXAAABN (5,5) HDEFXYZ, and storage space is saved.
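The worked example above can be reproduced with a compact sketch of the flow in fig. 4. Several simplifications are assumptions of this sketch and not part of the embodiment: a Python dict keyed by the masked bytes stands in for the hash function and hash table, every position examined without a hit is stored in the table, the preset number of bytes is 4, the masked comparison for errors of the form 2^n - 1 is used as the preset rule, and the output is a plain list of copied byte strings and (matching length, offset value) pairs rather than an encoded stream. The function name `compress` is hypothetical.

```python
def compress(stream: bytes, mask: int, preset_bytes: int = 4) -> list:
    """Sketch of the mask + hash-table compression loop (assumptions in the lead-in)."""
    table = {}      # masked key -> address where that key was first seen
    out = []        # copied bytes objects and (matching length, offset value) tuples
    start = 0       # start address of the current compression cycle
    pos = 0         # first pointer
    while pos + preset_bytes <= len(stream):
        key = bytes(b & mask for b in stream[pos:pos + preset_bytes])
        effective = table.get(key)
        if effective is None:               # steps 404 to 4042: no effective address
            table[key] = pos
            pos += 1
            continue
        if pos > start:                     # step 4044: copy the incompressible run
            out.append(stream[start:pos])
        first, second = pos, effective      # steps 403 to 4032: byte-by-byte compare
        while first < len(stream) and (stream[first] & mask) == (stream[second] & mask):
            first += 1
            second += 1
        out.append((first - pos, pos - effective))   # step 4033
        start = pos = first                 # termination address starts the next cycle
    if start < len(stream):
        out.append(stream[start:])          # trailing data that could not be matched
    return out


print(compress(b"MXAAABNAABBNHDEFXYZ", 0b1111_1100))
# [b'MXAAABN', (5, 5), b'HDEFXYZ']
```

The printed result corresponds to the compressed form MXAAABN (5,5) HDEFXYZ given in the example.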
In another aspect, an embodiment of the present invention further provides an apparatus for compressing image data. The apparatus may be implemented by software and/or hardware, which is not limited by the present invention. The apparatus embodiments correspond one to one to the method embodiments above and have the same technical features and technical effects as the image data compression method in those embodiments, which are not described again here.
Fig. 5 is a schematic structural diagram of a first embodiment of an apparatus for compressing image data according to the present invention, as shown in fig. 5, the apparatus includes:
a mask processing module 501, configured to perform a mask operation on data to be compressed to obtain mask data after the mask operation, where the data to be compressed is data of a preset byte read from a start address of a data stream to be compressed of an input data stream;
a hash processing module 502, configured to perform hash processing on the mask data to obtain a key value of a hash table corresponding to the mask data;
an effective address determining module 503, configured to determine whether an effective address is stored at the key value position;
the compressing module 504 is configured to, when the key value position stores an effective address, read a reference data stream from the effective address of the input data stream, and compare the data stream to be compressed with the reference data stream to obtain a compression result.
Optionally, on the basis of the embodiment shown in fig. 5, the mask processing module 501 is specifically configured to:
acquiring a mask of the preset compression error according to the preset compression error;
and performing bitwise AND operation on each byte data of the data to be compressed and the mask to obtain masked mask data.
Optionally, on the basis of the embodiment shown in fig. 5, the mask processing module 501 is specifically configured to:
judging whether the preset compression error plus 1 is a power of 2;
if so, inverting the preset compression error to obtain the mask of the preset compression error;
if not, obtaining the minimum value among all powers of 2 larger than the preset compression error, subtracting 1 from the minimum value, and then inverting the result to obtain the mask of the preset compression error;
and performing bitwise AND operation on each byte data of the data to be compressed and the mask to obtain masked mask data.
Alternatively, on the basis of the embodiment shown in fig. 5, the structure of the compression module will be described in detail. Fig. 6 is a schematic structural diagram of a second embodiment of the image data compression apparatus provided in the present invention, and as shown in fig. 6, the compression module 504 includes:
a first data-to-be-compared reading unit 601 for sequentially reading data of one byte from a start address along the data stream to be compressed as first data-to-be-compared, and moving a first pointer pointing to the start address backward by one byte;
a second data-to-be-compared reading unit 602 configured to read data of one byte in order from the effective address along the reference data stream as second data-to-be-compared, and move a second pointer pointing to the effective address backward by one byte;
the compressing unit 603 is configured to obtain a compression result according to a comparison result of the first data to be compared and the second data to be compared.
Optionally, on the basis of the embodiment shown in fig. 6, the compressing unit 603 is specifically configured to:
judging whether the difference value of the first data to be compared and the second data to be compared accords with a preset rule or not;
if not, acquiring a termination address according to the address currently pointed by the first pointer;
if yes, the first data to be compared and the second data to be compared are obtained by reading again in the first data reading unit 601 and the second data reading unit 602 according to the first pointer and the second pointer, after the first pointer and the second pointer are respectively moved backward by one byte, whether the difference value between the first data to be compared and the second data to be compared obtained by reading again meets the preset rule is judged, until the first data to be compared and the second data to be compared do not meet the preset rule, and the termination address is obtained according to the current address pointed by the first pointer;
and obtaining a compression result according to the termination address, the starting address and the effective address.
Optionally, on the basis of the embodiment shown in fig. 6, the compressing unit 603 is specifically configured to:
acquiring a matching length according to the termination address and the start address;
obtaining an offset value according to the starting address and the effective address;
and obtaining a compression result according to the matching length and the offset value.
Optionally, on the basis of the embodiment shown in fig. 6, if the preset compression error plus 1 is a power of 2, the compression unit 603 is specifically configured to:
performing mask operation on the first data to be compared to obtain masked first data to be compared;
performing mask operation on the second data to be compared to obtain masked second data to be compared;
and judging whether the difference value of the first data to be compared after the mask and the second data to be compared after the mask is 0.
Optionally, on the basis of the embodiment shown in fig. 6, if the preset compression error plus 1 is not a power of 2, the compression unit 603 is specifically configured to:
judging whether the absolute value of the difference between the first data to be compared and the second data to be compared is smaller than the preset compression error.
Optionally, on the basis of any of the above embodiments, the apparatus further includes a copy module; the copy module is to:
when no effective address is stored at the position of the key value, storing the initial address at the position of the key value of the hash table, moving a first pointer pointing to the initial address backwards by one byte, and sequentially reading data of preset bytes as new data to be compressed;
judging whether a key value position of a hash table corresponding to the new data to be compressed stores an effective address or not;
if so, acquiring the termination address currently pointed by the first pointer;
if not, the first pointer is continuously moved backwards by one byte, new data to be compressed are read again until an effective address is stored at the key value position of the hash table corresponding to the new data to be compressed, and the termination address currently pointed by the first pointer is obtained;
acquiring the copy length according to the termination address and the start address;
reading the copied data according to the initial address and the copy length;
and obtaining a compression result according to the copied data.
Optionally, on the basis of any of the above embodiments, the start address of the data stream to be compressed in the current compression process is the end address or the preset start address in the previous compression process.
Yet another aspect of the embodiments of the present invention provides an apparatus for compressing image data, including a processor and a storage medium, where the storage medium stores instructions which, when called by the processor, are used to execute the method for compressing image data according to any one of the above embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.