US12149265B2 - Efficient update of cumulative distribution functions for image compression - Google Patents
- Publication number
- US12149265B2 (application US17/904,030)
- Authority
- US
- United States
- Prior art keywords
- cdf
- array
- symbol
- raw data
- mixing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/4006—Conversion to or from arithmetic code
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3068—Precoding preceding compression, e.g. Burrows-Wheeler transformation
- H03M7/3071—Prediction
Definitions
- This description relates to image compression and, in particular, the efficient updating of cumulative distribution functions for image compression.
- Compression of color images is performed to reduce the size of files storing images and can be performed, in some implementations, by eliminating redundant information within an image. For example, after transforming color channels from one basis (e.g., RGB) to a luminance-chrominance basis (e.g., YUV coordinates), the values in the transformed color channels can be subtracted from a model of the color value correlation between neighboring pixels to produce residual values in each color channel. These residual values then can be transformed into a frequency-space representation (e.g., discrete cosine transform (DCT), discrete wavelet transform (DWT)) so that high-frequency residual values that have less impact on the image may be eliminated and the size of the image can be reduced accordingly. These transformed residuals then can be quantized to a certain number of bits, and these quantized residuals can be encoded according to an encoding scheme such as entropy encoding.
- Implementations provide an image compression scheme that uses a highly efficient and robust encoder.
- the encoder replaces with codewords an alphabet of symbols, each symbol having a probability of being used according to a probability model.
- the model assigning probability values to the symbols of the alphabet is adaptive so that each time a symbol is observed, the cumulative distribution function (CDF) (i.e., the sum of the probabilities of a specified subsequence of symbols) of the symbols of the alphabet is updated.
- a robust updating procedure includes generating a change to the CDF based on a precomputed mixing CDF, wherein the mixing CDF includes a respective, separate mixing model corresponding to each symbol of the alphabet.
- the mixing CDF in this case is then a two-dimensional array of mixing CDF values.
- a method can include receiving raw data (such as data obtained based on an image) for encoding, the raw data represented by an alphabet of symbols.
- the method can also include initializing a cumulative distribution function (CDF) array representing a CDF evaluated at a plurality of indices.
- the indices may have a predetermined order, e.g. they may be respective different numerical values (e.g. consecutive integers, such as 0 to N) such that the order of the indices is the order of the numerical values, with lower numerical values being earlier in the order.
- We say that one index is “less than” or “more than” another to mean that the first index is respectively earlier or later in the order than the second index; we also refer to a certain index being less than or more than a “threshold index”, which respectively means before or after the threshold index in the order.
- the method can further include, in response to receiving a first symbol of the alphabet representing a first portion of the raw data, updating the CDF array based on a first mixing CDF array and a second mixing CDF array to produce an updated CDF array, the first mixing CDF array having values that are independent of the first symbol, the second mixing CDF array having values based on the symbol, the updated CDF array being used to determine a probability of a second symbol of the alphabet representing a second portion of the raw data.
- the method can further include encoding the first symbol and the second symbol using the updated CDF array to produce a codeword, the codeword, when decoded, reproducing the first and second portions of the raw data.
- This reproduction may not be exact; instead, it may produce data which differs from the first and second portions of the raw data by an amount which satisfies a smallness criterion. For example, the proportion of symbols which are not correctly reproduced may be below a threshold.
- a computer program product comprises a non-transitory storage medium, the computer program product including code that, when executed by processing circuitry of a computing device, causes the processing circuitry to perform a method.
- the method can include receiving raw data for encoding, the raw data represented by an alphabet of symbols.
- the method can also include initializing a cumulative distribution function (CDF) array representing a CDF evaluated at a plurality of indices, each of the plurality of indices representing a symbol of an alphabet representing a portion of the raw data, the CDF at an index of the plurality of indices representing a cumulative sum of probabilities of symbols of the alphabet represented by indices of the plurality of indices less than or equal to the index.
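- As a minimal sketch of the CDF array defined here, assuming hypothetical integer counts per symbol in place of probabilities (the helper names `build_cdf` and `symbol_probability` are illustrative and do not appear in the source, whose own listings are in C; Python is used here for brevity):

```python
from itertools import accumulate


def build_cdf(counts):
    """Return cumulative sums of per-symbol counts.

    cdf[k] is the total count of all symbols with index <= k, so the
    array never decreases and its last entry equals the overall total.
    """
    return list(accumulate(counts))


def symbol_probability(cdf, k):
    """Recover the probability of symbol k from adjacent CDF entries."""
    lower = cdf[k - 1] if k > 0 else 0
    return (cdf[k] - lower) / cdf[-1]
```

- For three symbols with counts 4, 5, and 1, `build_cdf([4, 5, 1])` gives `[4, 9, 10]`, and the probability of the middle symbol is recovered as the difference of neighboring entries divided by the total.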
- the method can further include, in response to receiving a first symbol of the alphabet representing a first portion of the raw data, updating the CDF array based on a first mixing CDF array and a second mixing CDF array to produce an updated CDF array, the first mixing CDF array having values that are independent of the first symbol, the second mixing CDF array having values based on the symbol, the updated CDF array being used to determine a probability of a second symbol of the alphabet representing a second portion of the raw data.
- the method can further include encoding the first symbol and the second symbol using the updated CDF array to produce a codeword, the codeword, when decoded, reproducing the first and second portions of the raw data.
- the controlling circuitry can be configured to receive raw data for encoding, the raw data represented by an alphabet of symbols.
- the controlling circuitry can also be configured to initialize a cumulative distribution function (CDF) array representing a CDF evaluated at a plurality of indices, each of the plurality of indices representing a symbol of an alphabet representing a portion of the raw data, the CDF at an index of the plurality of indices representing a cumulative sum of probabilities of symbols of the alphabet represented by indices of the plurality of indices less than or equal to the index.
- the controlling circuitry can also be configured to, in response to receiving a first symbol of the alphabet representing a first portion of the raw data, update the CDF array based on a first mixing CDF array and a second mixing CDF array to produce an updated CDF array, the first mixing CDF array having values that are independent of the first symbol, the second mixing CDF array having values based on the symbol, the updated CDF array being used to determine a probability of a second symbol of the alphabet representing a second portion of the raw data.
- the controlling circuitry can also be configured to encode the first symbol and the second symbol using the updated CDF array to produce a codeword, the codeword, when decoded, reproducing the first and second portions of the raw data.
- FIG. 1 is a diagram that illustrates an example electronic environment in which improved techniques described herein may be implemented.
- FIG. 2 is a flow chart that illustrates an example method of operating an augmented reality system, according to disclosed implementations.
- FIG. 3 is a diagram illustrating an example of a computer device and a mobile computer device that can be used to implement the described techniques.
- the image compression techniques described herein apply to images that may be encoded using arithmetic encoding techniques. Such images generally include most photographs and images exchanged over a network (e.g., the Internet). Arithmetic encoding is but one technique used on a representation of the image data that has been reduced to improve the compression ratio. As discussed herein, there are several steps needed to reduce the original image data to the quantized residual data encoded using an arithmetic encoder. While other encoding techniques may be applied to the quantized residual data, arithmetic encoding in most cases provides a better improvement to compression ratio over other encoding (e.g., Huffman encoding) techniques.
- the values of the color channels at a pixel in such an image are well-correlated with the values of the color channels in neighboring pixels.
- Well-correlated means there is a predictive model that provides an accurate estimate of the color values of a pixel given the color values of neighboring pixels. Of course, such estimates will not provide an exact prediction in any image. Nevertheless, when a good predictive model is applied to an image, the resulting residuals—the differences between the actual color values at a pixel and those values resulting from the predictive model—will no longer be correlated well. Such poorly correlated residual data can be more efficiently coded than the raw image data.
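- The idea of residuals can be illustrated with a deliberately simple predictor. The sketch below assumes a hypothetical left-neighbor model (each sample is predicted to equal the sample before it); the source does not specify its predictive model, so this is only one possible choice:

```python
def left_neighbor_residuals(row):
    """Predict each sample from its left neighbor; return the differences.

    The first sample has no left neighbor, so it is predicted as 0 and
    passed through unchanged.
    """
    residuals = []
    previous = 0
    for value in row:
        residuals.append(value - previous)
        previous = value
    return residuals


def reconstruct(residuals):
    """Invert the predictor by accumulating the residuals."""
    row = []
    previous = 0
    for r in residuals:
        previous += r
        row.append(previous)
    return row
```

- For a smooth row such as `[100, 101, 103, 103, 102]`, every residual after the first is a small number near zero, which is the property that makes the residuals cheaper to code than the raw samples.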
- the next step is to express this residual data in a way that more naturally provides the ability to filter out data that would likely not be perceived by human observers. For example, data associated with high spatial frequencies most likely will have very small brightness values.
- a transform such as a discrete cosine transform (DCT) or a discrete wavelet transform (DWT) is used to identify the high-frequency content of residual data.
- a low-pass filter may be used to reduce the data size by eliminating the data associated with the highest frequencies.
- the transformed residual data is quantized so that the resulting quantized residual data is represented by a finite number of values. While resulting in a lossy compression, quantizing the transformed residual data has the advantage of significantly improving the compression ratio. Because the quantized residual data is represented by a finite number of values, an encoder may represent the quantized residual data in terms of symbols of a finite alphabet used in a codebook.
- the encoder represents a probability of each symbol of an alphabet appearing in the raw data (i.e., the quantized residual data) as follows.
- the interval [0, 1) represents all symbols of the alphabet appearing in the raw data.
- Each symbol of the alphabet is assigned a subinterval of that interval, just as each symbol is a part of the alphabet. For example, consider an alphabet containing three symbols: a, b, and c. Suppose that the symbol a has a probability of appearing in the raw data of 0.4, the symbol b has a probability of 0.5, and the symbol c has a probability of 0.1.
- the interval is subdivided into a subinterval [0,0.4) for a, a subinterval [0.4,0.9) for b, and [0.9,1.0) for c. If the first symbol observed in a bitstream containing the raw data during encoding is b, then the subinterval selected is [0.4,0.9), as that subinterval represents the symbol b.
- the first 2M bits of a subinterval endpoint will uniquely identify the quantized residual upon decoding.
- if the symbols in a data stream are “bbbc,” the final interval has a length of 0.0125, which is equal to $(0.5)^3 \times 0.1$.
- the encoded data in this case may take the form of a binary representation without decimals, or 110100. This encoder has reduced the raw data from 8 bits to 6.
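- The interval procedure for the “bbbc” example can be traced with exact rational arithmetic. The following is a sketch under the static three-symbol model given above; the names `MODEL`, `narrow`, and `to_bits` are illustrative and not taken from the source:

```python
from fractions import Fraction

# Static model from the example: symbol -> [low, high) subinterval of [0, 1).
MODEL = {
    "a": (Fraction(0), Fraction(4, 10)),
    "b": (Fraction(4, 10), Fraction(9, 10)),
    "c": (Fraction(9, 10), Fraction(1)),
}


def narrow(message, model=MODEL):
    """Return the final [low, high) interval after coding every symbol."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        sym_low, sym_high = model[symbol]
        # Rescale the symbol's subinterval into the current interval.
        low, high = low + width * sym_low, low + width * sym_high
    return low, high


def to_bits(value, n_bits):
    """First n_bits of the binary expansion of a value in [0, 1)."""
    bits = []
    for _ in range(n_bits):
        value *= 2
        bit = int(value >= 1)
        bits.append(str(bit))
        value -= bit
    return "".join(bits)
```

- Here `narrow("bbbc")` yields the interval [13/16, 33/40), i.e. [0.8125, 0.825), whose length is 1/80 = 0.0125, and `to_bits(Fraction(13, 16), 6)` gives `110100`, matching the bit string quoted above.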
- Some arithmetic encoders are adaptive in that they update the symbol probabilities, and hence the respective CDFs, as a symbol is used in an encoding. In many scenarios, it is preferable to update CDFs rather than the probabilities directly because CDFs are easier to work with using integer arithmetic.
- Conventional approaches to updating CDFs in an arithmetic encoder include increasing the CDF at an index corresponding to a symbol, and at indices larger than the symbol.
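- A minimal sketch of such a conventional update, assuming a count-based CDF array in which each entry includes the symbol at its own index (the function name and the `increment` parameter are illustrative):

```python
def conventional_update(cdf, symbol, increment=1):
    """Add `increment` to cdf[k] for every index k >= symbol, in place."""
    for k in range(symbol, len(cdf)):
        cdf[k] += increment
    return cdf
```

- Starting from `[4, 9, 10]` and observing the symbol at index 1 gives `[4, 10, 11]`. In this sketch the last entry, which is the total, grows with every update.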
- a technical solution to the above-described technical problem includes updating the CDF using two one-dimensional mixing CDF arrays: a symbol-dependent array and a symbol-independent array.
- the symbol-dependent array may be a subarray of a larger, fixed array such that the subarray selected depends on the symbol being used.
- a technical advantage of disclosed implementations is that the above-described encoder uses far fewer resources and is accordingly more efficient than an encoder operating according to the conventional approaches.
- FIG. 1 is a diagram that illustrates an example electronic environment 100 in which the above-described technical solution may be implemented.
- the electronic environment 100 includes a computer 120 configured to perform image compression and decompression.
- the computer 120 includes a network interface 122 , one or more processing units 124 , memory 126 , and a display interface 128 .
- the network interface 122 includes, for example, Ethernet adaptors, Token Ring adaptors, and the like, for converting electronic and/or optical signals received from a communication network to electronic form for use by the computer 120 .
- the set of processing units 124 includes one or more processing chips and/or assemblies.
- the memory 126 includes both volatile memory (e.g., RAM) and non-volatile memory, such as one or more ROMs, disk drives, solid state drives, and the like.
- the set of processing units 124 and the memory 126 together form control circuitry, which is configured and arranged to carry out various methods and functions as described herein.
- one or more of the components of the computer 120 can be, or can include processors (e.g., processing units 124 ) configured to process instructions stored in the memory 126 .
- Examples of such instructions as depicted in FIG. 1 include an image manager 130 , a pre-compression manager 140 , an arithmetic coding manager 150 , and a decoding manager 160 (note that in some variants the computer system may include only an arithmetic coding manager 150 for generating codes for decoding by another computer, or only a decoding manager 160 for decoding codes generated by another computer).
- the memory 126 is configured to store various data, which is described with respect to the respective managers that use such data.
- the image manager 130 is configured to receive or acquire image data 132 .
- the image manager 130 is configured to receive or acquire the image data 132 over the network interface 122 , i.e., over a network (such as network 190 ) from the display device 170 .
- the image manager 130 is configured to receive or acquire the image data 132 from local storage (e.g., a disk drive, flash drive, SSD, or the like).
- the image data 132 represents a color image.
- the image data 132 includes a set of pixels, each of the set of pixels having a coordinate within the image and a set of numerical values, each of the set of numerical values representing a value within a color channel.
- the color channels used in the image data 132 are RGB, e.g., RGB data 133 .
- the image manager 130 is also configured to convert the image data 132 from one set of color channels (e.g., RGB data 133 ) to another set of color channels (e.g., YUV data 134 ).
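- The channel conversion can be sketched as below. The source does not say which YUV variant it uses; the BT.601 luma weights and the classic analog U/V scale factors are an assumption made here for illustration, and `rgb_to_yuv` is a hypothetical helper name:

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (assumed BT.601 analog weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v
```

- A gray pixel such as (128, 128, 128) maps to a luminance of about 128 with both chrominance values near zero, which illustrates how the transform concentrates most of the signal in the luminance channel.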
- the encoder may operate directly on a luminance channel, because compression of such luminance data in the YUV channels generally is more efficient than compression of the color data in the RGB channels.
- a pre-compression manager 140 is configured to generate raw data for use by the arithmetic coding manager 150 for encoding.
- the raw data is not the YUV image data 134 itself but rather a derived form of that data configured for an efficient entropy encoding.
- Such implementations rely on the high amount of correlation between values of the color channels in a small neighborhood surrounding a pixel.
- the pre-compression manager 140 is then configured to apply a predictive model representing the correlation and generate residuals, i.e., a difference between the given image values in the YUV data 134 and the values according to the predictive model.
- Such residual data has a lower entropy than the actual image data due to the reduction in correlation between the residual values.
- the residual data 142 represents the residual values as described above.
- the residual data 142 includes triplets of real values indicating a distribution of deviations from a predictive model as a function of spatial coordinate within the image.
- the predictive model is derived based on the YUV data 134 , such as by known methods.
- the pre-compression manager 140 is also configured to, in some implementations, perform a transformation of the residual data 142 in image coordinate space into transformed residual data 143 in image frequency space.
- the transformation is a Fourier transform.
- the transformation is a discrete cosine transform (DCT).
- the transformation is a discrete wavelet transform (DWT).
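- To make the frequency-space step concrete, the following is a naive one-dimensional, unnormalized DCT-II. It is a sketch only: a practical codec would use a fast two-dimensional block transform, and the name `dct_ii` is illustrative:

```python
import math


def dct_ii(samples):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(samples)
    return [
        sum(
            x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(samples)
        )
        for k in range(n_samples)
    ]
```

- A constant input such as `[3, 3, 3, 3]` produces a single nonzero coefficient at k = 0 (the sum of the samples), with all higher-frequency coefficients equal to zero up to floating-point error.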
- the pre-compression manager 140 is also configured to, in some implementations, perform a quantization of the transformed residual data 143 to produce quantized residual data 144 .
- the quantized residual data 144 allows the encoder to achieve higher compression ratios at the expense of reducing the information content in the image.
- the quantization is performed using a fixed quantization matrix (such as a fixed 8×8 quantization matrix) for the luminance and chrominance components of the YUV data 134 .
- the quantization matrix generally reduces or eliminates residual values corresponding to high frequencies.
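- A sketch of matrix-based quantization follows. It assumes a plain divide-and-round scheme and uses a hypothetical 2×2 matrix for brevity rather than the 8×8 matrix mentioned above; the step sizes are invented for illustration:

```python
def quantize(block, qmatrix):
    """Divide each coefficient by its step size and round to an integer."""
    return [
        [round(c / q) for c, q in zip(row, qrow)]
        for row, qrow in zip(block, qmatrix)
    ]


def dequantize(levels, qmatrix):
    """Approximate inverse: multiply each level by its step size."""
    return [
        [level * q for level, q in zip(row, qrow)]
        for row, qrow in zip(levels, qmatrix)
    ]
```

- With larger step sizes assigned to the higher-frequency positions, small high-frequency coefficients quantize to zero, so dequantizing does not recover them. This is the lossy part of the pipeline.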
- the arithmetic encoding manager 150 is configured to perform an arithmetic coding of the quantized residual data 144 to produce encoded data 156 .
- the arithmetic coding manager replaces symbols (representing the quantized residual data 144 in this case) from an alphabet of symbols represented by symbol data 152 with numerical values that use less memory space than the symbols.
- the symbol data 152 represents an alphabet or set of symbols that encompass the possible elemental representations of the quantized residual data 144 . Because the residual data has been quantized, that data 144 only takes on a finite number of values. Each of those values may be represented by a symbol of the alphabet. Moreover, each symbol has a probability of occurring. The probability may be based on historical data or on a more theoretical understanding of the occurrence of symbols in quantized residual data 144 . It is noted that the alphabet—and hence, the CDF—may apply to not only the quantized residual data 144 but to any syntactic element in the bitstream (e.g., block-type, transform type, predictor type, etc.).
- the CDF array satisfies the following properties:
- the arithmetic encoding manager 150 is configured to update the CDF array data 153 to reflect the fact that the symbol having index k is occurring more frequently. This update may be done on all occasions that a symbol is received, or only in some of these cases, e.g. when a certain number of symbols have been received since the last update was done.
- the difficulty with the updating is that the updating operations are performed using integer arithmetic. Accordingly, maintaining the three properties of the CDF array (denoted CDF) during an updating operation described above is not trivial. Moreover, other constraints may be considered as follows:
- Some updating procedures include using a predefined “mixing” CDF representing a particular CDF model.
- a mixing CDF is used to ensure that the properties of the CDF array described above are maintained during an update.
- one such mixing CDF corresponding to the k th symbol is given by the following expression:
- CDF_mixing[k] is the probability one would expect if the symbol corresponding to the index k was repeated dominantly in the bitstream. It is noted that the above mixing CDF is but one example, and other mixing CDFs are possible.
- Because the mixing CDF depends on the symbol, there is a separate mixing CDF model for each symbol of the alphabet. Accordingly, as shown above, the mixing CDF is a two-dimensional array requiring N(N+1) entries. In an encoding operation in which the CDF updating occupies about 40% of the processor resources on average, this data structure representing the mixing CDF may use too many resources.
- the mixing CDF instead can, in some implementations, be decomposed into two one-dimensional arrays: symbol-independent mixing CDF data 154 and symbol-dependent mixing CDF data 155 .
- symbol-independent mixing CDF data 154 represents a symbol-independent mixing CDF array, which is denoted as sym_ind_cdf.
- the symbol-dependent mixing CDF data 155 represents a symbol-dependent mixing CDF array, which is denoted as sym_cdf.
- the array sym_cdf is a subarray of a fixed, one-dimensional array fix_cdf.
- the fixed array fix_cdf has 2N+1 elements and may be defined as follows:
- $$\mathrm{fix\_cdf}[k] = \begin{cases} 0, & k < N, \\ P_0 - n, & k \ge N, \end{cases}$$ where $P_0$ is a normalized sum of the entries of the CDF array and $n$ is the number of symbols of the alphabet having their probabilities being greater than zero, i.e., the number of symbols used. That is, the fixed array is a step function having N as the threshold index; more generally, the fixed array may represent a sigmoidal function.
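- The relationship between the fixed array and the symbol-dependent subarray can be sketched as follows. The source states only that the subarray selected depends on the symbol; the specific offset rule used here (a window of N+1 entries starting at index N minus the symbol index) is an assumption, as are the helper names and the example values of P_0 and n:

```python
def make_fix_cdf(n_symbols, p0, n_used):
    """Step array of 2N+1 entries: 0 below index N, p0 - n_used from N on."""
    return [0] * n_symbols + [p0 - n_used] * (n_symbols + 1)


def sym_cdf_view(fix_cdf, n_symbols, symbol):
    """Select the window of N+1 entries for `symbol`.

    Assumed offset rule: start at N - symbol, so the step from 0 to
    p0 - n_used lands at index `symbol` inside the window.
    """
    start = n_symbols - symbol
    return fix_cdf[start:start + n_symbols + 1]
```

- With N = 4, P_0 = 32768, and n = 4, the window for symbol 1 is `[0, 32764, 32764, 32764, 32764]`. Python slicing copies the entries; in a C implementation the same selection could be a pointer offset into the fixed array, which is what would let one array of 2N+1 entries stand in for a two-dimensional table.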
- the updating procedure may now be described in the following code. Note that, while the following code is written in the C language, the procedure may be written in any language.
- the update to each value of CDF[k] upon receiving the symbol a_i is to increase the current value CDF[k] by an amount which is obtained by performing a rounding operation on a product of the speed value (f) and summation term (delta).
- the summation term is based on the i th element of the symbol-independent mixing CDF array and the i th element of the symbol-dependent mixing CDF array.
- the summation term may be obtained from the sum of the i th element of the symbol-independent mixing CDF array and the i th element of the symbol-dependent mixing CDF array, minus the current value of CDF[k].
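- The update rule described above can be sketched in integer arithmetic. This is not the source's listing: it assumes that the speed value is a power of two, f = 2^(-rate), so that the rounded product becomes an add-and-shift, and that the same index k is used for all three arrays. The example mixing arrays in the usage note (a ramp for the symbol-independent part, a step for the symbol-dependent part) are likewise hypothetical:

```python
def update_cdf(cdf, sym_ind_cdf, sym_cdf, rate):
    """In-place update: cdf[k] += round(f * delta[k]) with f = 2**-rate.

    delta[k] = sym_ind_cdf[k] + sym_cdf[k] - cdf[k], i.e. the distance
    from the current value to the mixing target. Requires rate >= 1.
    """
    half = 1 << (rate - 1)  # rounding offset for round-half-up
    for k in range(len(cdf)):
        delta = sym_ind_cdf[k] + sym_cdf[k] - cdf[k]
        cdf[k] += (delta + half) >> rate
    return cdf
```

- For a uniform four-symbol array `[16, 32, 48, 64]`, a ramp `[1, 2, 3, 4]`, a step `[0, 60, 60, 60]` for the symbol at index 1, and `rate=2`, the result is `[12, 40, 52, 64]`: the observed symbol's share grows from 16 to 28, the total stays at 64, and the array remains strictly increasing.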
- Example code implemented using special Intel SSE4.1 instructions is as follows.
- the arithmetic encoding manager 150 completes the encoding of the quantized residual data 144 according to the interval procedure described above, for example, to produce encoded data 156 .
- the encoded data 156 takes the form of a floating-point number, although in some implementations the encoded data 156 may take the form of a bit string.
- the decoding manager 160 is configured to decode the encoded data 156 to produce decoded quantized residual data 162 , i.e., the quantized residual data 144 .
- the decoding manager 160 in performing the decoding operation on the encoded data 156 , operates in reverse from the arithmetic encoding manager 150 .
- the decoding manager 160 is also configured to produce a lossy version of the original image data 132 in the RGB data format by approximately reversing the operations used by the pre-compression manager 140 : dequantizing the quantized residual data 144 to produce decoded transformed residual data 163 ; it is noted that this dequantization process may not produce the original transformed residual data 143 exactly, but the losses should be as imperceptible as possible.
- An inverse DCT or DWT is applied to the decoded transformed residual data 163 to produce decoded residual data 164 , i.e., in coordinate space representation.
- the decoding manager 160 is further configured to add the predictive model values back to the decoded residual data 164 to produce decoded YUV data 164 , and finally the decoding manager 160 is further configured to transform the YUV channels back to RGB channels to produce decoded RGB data 166 as the product of the decoding process.
- the components (e.g., modules, processing units 124 ) of the computer 120 can be configured to operate based on one or more platforms (e.g., one or more similar or different platforms) that can include one or more types of hardware, software, firmware, operating systems, runtime libraries, and/or so forth.
- the components of the computer 120 can be configured to operate within a cluster of devices (e.g., a server farm). In such an implementation, the functionality and processing of the components of the computer 120 can be distributed to several devices of the cluster of devices.
- the components of the computer 120 can be, or can include, any type of hardware and/or software configured to process attributes.
- one or more portions of the components shown in the components of the computer 120 in FIG. 1 can be, or can include, a hardware-based module (e.g., a digital signal processor (DSP), a field programmable gate array (FPGA), a memory), a firmware module, and/or a software-based module (e.g., a module of computer code, a set of computer-readable instructions that can be executed at a computer).
- the components of the computer 120 can be configured to operate within, for example, a data center (e.g., a cloud computing environment), a computer system, one or more server/host devices, and/or so forth.
- the components of the computer 120 can be configured to operate within a network.
- the components of the computer 120 can be configured to function within various types of network environments that can include one or more devices and/or one or more server devices.
- the network can be, or can include, a local area network (LAN), a wide area network (WAN), and/or so forth.
- the network can be, or can include, a wireless network and/or wired network implemented using, for example, gateway devices, bridges, switches, and/or so forth.
- the network can include one or more segments and/or can have portions based on various protocols such as Internet Protocol (IP) and/or a proprietary protocol.
- the network can include at least a portion of the Internet.
- one or more of the components of the computer 120 can be, or can include, processors configured to process instructions stored in a memory.
- an image manager 130 (and/or a portion thereof), a pre-compression manager 140 (and/or a portion thereof), an arithmetic coding manager 150 (and/or a portion thereof), and a decoding manager 160 (and/or a portion thereof) can be a combination of a processor and a memory configured to execute instructions related to a process to implement one or more functions.
- the memory 126 can be any type of memory such as a random-access memory, a disk drive memory, flash memory, and/or so forth. In some implementations, the memory 126 can be implemented as more than one memory component (e.g., more than one RAM component or disk drive memory) associated with the components of the computer 120. In some implementations, the memory 126 can be a database memory. In some implementations, the memory 126 can be, or can include, a non-local memory. For example, the memory 126 can be, or can include, a memory shared by multiple devices (not shown). In some implementations, the memory 126 can be associated with a server device (not shown) within a network and configured to serve the components of the computer 120. As illustrated in FIG. 1, the memory 126 is configured to store various data, including image data 132, quantized residual data 144, symbol-independent and symbol-dependent mixing CDF data 154 and 155, encoded data 156, and decoded RGB data 166.
- FIG. 2 is a flow chart depicting an example method 200 according to the above-described improved techniques.
- the method 200 may be performed by software constructs described in connection with FIG. 1 , which reside in memory 126 of the computer 120 and are run by the set of processing units 124 .
- the pre-compression manager 140 receives raw data for encoding, the raw data represented by an alphabet of symbols.
- the raw data is the quantized residual data 144 generated by the pre-compression manager 140 based on the image received by the image manager 130 .
- the arithmetic encoding manager 150 initializes a cumulative distribution function (CDF) array (e.g., CDF array data 153 ) representing a CDF evaluated at a plurality of indices, each of the plurality of indices representing a symbol of an alphabet representing a portion of the raw data, the CDF at an index of the plurality of indices representing a cumulative sum of probabilities of symbols of the alphabet represented by indices of the plurality of indices less than or equal to the index.
- the arithmetic encoding manager 150, in response to receiving a first symbol of the alphabet, updates the CDF array based on a first mixing CDF array (e.g., symbol-independent mixing CDF data 154) and a second mixing CDF array (e.g., symbol-dependent mixing CDF data 155) to produce an updated CDF array, the first mixing CDF array having values that are independent of the first symbol, the second mixing CDF array having values based on the first symbol.
- the updated CDF array may be used to determine a probability of a second symbol of the alphabet representing a second portion of the raw data.
- the arithmetic encoding manager 150 encodes the first symbol and the second symbol using the updated CDF array to produce a codeword, the codeword, when decoded, reproducing the raw data.
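The initialization step and the probability lookup described for method 200 can be sketched in C as follows. This is only an illustrative sketch: the uniform starting distribution, the value of `PMAX`, and the function names are assumptions, not details taken from the description. The sketch follows the convention that `cdf[0]` is 0 and `cdf[N]` is `PMAX`, so the probability of symbol `k` is the difference of two adjacent entries.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration: 16 symbols, probabilities scaled to 2^14. */
enum { N = 16, PMAX = 1 << 14 };

/* Fill cdf[0..N] with a uniform distribution, so cdf[i] = i * PMAX / N.
 * A uniform start is an assumption; any valid CDF could be used instead. */
static void cdf_init_uniform(uint16_t cdf[N + 1]) {
    for (int i = 0; i <= N; i++) {
        cdf[i] = (uint16_t)((i * PMAX) / N);
    }
}

/* Probability of symbol k, scaled by PMAX: the step between adjacent entries. */
static int cdf_symbol_prob(const uint16_t cdf[N + 1], int k) {
    assert(k >= 0 && k < N);
    return cdf[k + 1] - cdf[k];
}
```

After `cdf_init_uniform`, every symbol has the same scaled probability `PMAX / N`; an arithmetic coder would read `cdf[k]` and `cdf[k + 1]` to find the interval for symbol `k`.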
- FIG. 3 illustrates an example of a generic computer device 300 and a generic mobile computer device 350 , which may be used with the techniques described here.
- Computing device 300 is one example configuration of the computer 120 of FIG. 1.
- computing device 300 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers.
- Computing device 350 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, and other similar computing devices.
- the components shown here, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed in this document.
- Computing device 300 includes a processor 302 , memory 304 , a storage device 306 , a high-speed interface 308 connecting to memory 304 and high-speed expansion ports 310 , and a low speed interface 312 connecting to low speed bus 314 and storage device 306 .
- The components 302, 304, 306, 308, 310, and 312 are interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate.
- the processor 302 can process instructions for execution within the computing device 300 , including instructions stored in the memory 304 or on the storage device 306 to display graphical information for a GUI on an external input/output device, such as display 316 coupled to high speed interface 308 .
- multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory.
- multiple computing devices 300 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
- the memory 304 stores information within the computing device 300 .
- the memory 304 is a volatile memory unit or units.
- the memory 304 is a non-volatile memory unit or units.
- the memory 304 may also be another form of computer-readable medium, such as a magnetic or optical disk.
- the storage device 306 is capable of providing mass storage for the computing device 300 .
- the storage device 306 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations.
- a computer program product can be tangibly embodied in an information carrier.
- the computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above.
- the information carrier is a computer- or machine-readable medium, such as the memory 304 , the storage device 306 , or memory on processor 302 .
- the high speed controller 308 manages bandwidth-intensive operations for the computing device 300 , while the low speed controller 312 manages lower bandwidth-intensive operations. Such allocation of functions is exemplary only.
- the high-speed controller 308 is coupled to memory 304 , display 316 (e.g., through a graphics processor or accelerator), and to high-speed expansion ports 310 , which may accept various expansion cards (not shown).
- low-speed controller 312 is coupled to storage device 306 and low-speed expansion port 314.
- the low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
- the computing device 300 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 320 , or multiple times in a group of such servers. It may also be implemented as part of a rack server system 324 . In addition, it may be implemented in a personal computer such as a laptop computer 322 . Alternatively, components from computing device 300 may be combined with other components in a mobile device (not shown), such as device 350 . Each of such devices may contain one or more of computing device 300 , 350 , and an entire system may be made up of multiple computing devices 300 , 350 communicating with each other.
- implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof.
- These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
- the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
- the systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components.
- the components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Description
- CDF[0] = 0. (The probability of a symbol outside of the alphabet occurring is zero. This holds after updating.)
- CDF[N] = PMAX. This holds after updating.
- CDF[k] ≥ CDF[k−1] when k > 0. That is, no probability is negative.
In some implementations, N = 16; that is, there are 16 symbols in the alphabet used in the arithmetic encoding manager 150.
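The three conditions on the CDF array listed above can be checked mechanically. The following C function is a minimal sketch of such a check; the function name and the 16-bit element type are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when cdf[0..n] satisfies the three stated conditions:
 * cdf[0] == 0, cdf[n] == pmax, and cdf[k] >= cdf[k-1] for every k > 0. */
static bool cdf_is_valid(const uint16_t *cdf, size_t n, uint16_t pmax) {
    assert(cdf != NULL);
    if (cdf[0] != 0 || cdf[n] != pmax) {
        return false;
    }
    for (size_t k = 1; k <= n; k++) {
        if (cdf[k] < cdf[k - 1]) {
            return false;  /* a decreasing step would mean a negative probability */
        }
    }
    return true;
}
```

A check like this is useful as a debug assertion after each update, since the description states that the first two conditions hold after updating.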
- A symbol in an alphabet that is never seen, i.e., that has a probability equal to zero, maintains a probability of zero after an update in which that symbol is not observed.
- Conversely, symbols that have been seen maintain a non-zero probability after updating.
Nevertheless, an update performed by the arithmetic encoding manager 150, in some implementations, uses a speed value (a positive real number) that is indicative of a number of steps taken for a symbol to increase its probability if that symbol were the only symbol encoded in the bitstream.
where δkj is the Kronecker symbol (i.e., equal to zero unless k=j), i is an integer between 1 and N, and u is a fixed, empirically chosen parameter. As shown, CDFmixing[k] is the probability one would expect if the symbol corresponding to the index k were repeated dominantly in the bitstream. It is noted that the above mixing CDF is but one example, and other mixing CDFs are possible.
where P0 is a normalized sum of the entries of the CDF array and n is the number of symbols of the alphabet whose probabilities are greater than zero, i.e., the number of symbols used. That is, the fixed array is a step function having N as the threshold index; more generally, the fixed array may represent a sigmoidal function. The symbol-dependent mixing CDF array sym_cdf is then a subarray of fix_cdf as follows: when a symbol having index k is observed, then sym_cdf[i]=fix_cdf[N−1−k+i]. In this way, the symbol dependence of the symbol-dependent mixing CDF array is expressed in the first element of the subarray of the fixed array.
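The subarray trick can be illustrated with a short C sketch. It builds a fixed step array once and then obtains the symbol-dependent array for any observed index k as a pointer into it, with no per-symbol copying. The array length of 2N, the function names, and the step height passed as a parameter are assumptions for illustration; the actual step height depends on P0 and n as defined above.

```c
#include <assert.h>
#include <stdint.h>

enum { N = 16 };  /* alphabet size used in some implementations */

/* Build the fixed array as a step function with threshold index N:
 * zero below index N, `step` from index N onward. The length 2N is an
 * assumption chosen so that every subarray of N + 1 entries stays in bounds. */
static void build_fix_cdf(uint16_t fix_cdf[2 * N], uint16_t step) {
    for (int j = 0; j < 2 * N; j++) {
        fix_cdf[j] = (j >= N) ? step : 0;
    }
}

/* Symbol-dependent view: sym_cdf[i] == fix_cdf[N - 1 - k + i]. */
static const uint16_t *sym_cdf_for(const uint16_t fix_cdf[2 * N], int k) {
    assert(k >= 0 && k < N);
    return &fix_cdf[N - 1 - k];
}
```

With this layout, the view for symbol k is zero for indices up to and including k and equal to the step height for indices above k, which is the CDF shape of a distribution concentrated on symbol k.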
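- Input: CDF, N, k (the index of the observed symbol), f (a speed value as described above).
- Output: An updated CDF.
- const int* sym_cdf = &fix_cdf[N - 1 - k];  /* symbol-dependent subarray of the fixed array */
- for (int i = 0; i < N; i++) {
-   int delta = sym_cdf[i] + sym_ind_cdf[i] - CDF[i];  /* distance from the mixing target */
-   CDF[i] += (int)(delta * f) >> 16;  /* move toward the target at speed f */
- }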
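A self-contained version of the scalar update may make the listing easier to experiment with. In this sketch the speed f is assumed to be a fixed-point fraction scaled by 2^16 (so 32768 means 0.5), which is what the shift by 16 suggests; the function name, the 32-bit element type, and the 64-bit intermediate product are illustrative choices rather than details from the description.

```c
#include <assert.h>
#include <stdint.h>

/* Move cdf[0..n-1] toward the target sym_cdf[i] + sym_ind_cdf[i].
 * f is the speed as a fraction scaled by 2^16 (assumed representation).
 * Entry cdf[n] is left untouched, as in the listing above.
 * The right shift of a negative product assumes an arithmetic shift,
 * which is what common compilers implement for signed integers. */
static void cdf_update(int32_t *cdf, const int32_t *sym_cdf,
                       const int32_t *sym_ind_cdf, int n, int32_t f) {
    assert(f >= 0 && f <= 65536);
    for (int i = 0; i < n; i++) {
        int32_t delta = sym_cdf[i] + sym_ind_cdf[i] - cdf[i];
        cdf[i] += (int32_t)(((int64_t)delta * f) >> 16);
    }
}
```

For example, with a four-symbol CDF {0, 16, 32, 48, 64}, a symbol-dependent array {0, 0, 32, 32, 32} for k = 1, a symbol-independent array {0, 8, 16, 24, 32}, and f = 32768, the update yields {0, 12, 40, 52, 64}: the scaled probability of symbol 1 grows from 16 to 28 while the end points stay fixed.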
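- __m128i A = _mm_loadu_si128((const __m128i*)sym_cdf);      /* symbol-dependent mixing CDF */
- __m128i B = _mm_loadu_si128((const __m128i*)sym_ind_cdf);  /* symbol-independent mixing CDF */
- __m128i C = _mm_loadu_si128((const __m128i*)CDF);          /* current CDF */
- __m128i D = _mm_add_epi16(A, B);                           /* mixing target */
- __m128i E = _mm_sub_epi16(D, C);                           /* delta = target - CDF */
- __m128i F = _mm_mulhi_epi16(E, f);                         /* (delta * f) >> 16 in each lane */
- __m128i G = _mm_add_epi16(C, F);                           /* updated CDF */
- _mm_storeu_si128((__m128i*)CDF, G);
- Here, f is a vector holding a 16-bit-precision representation of the speed f in each lane. These instructions are but one example and other instructions may be possible.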
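To see why the vector sequence matches the scalar loop, it can help to write out what each 16-bit lane computes. The following portable C sketch emulates the lanes without intrinsics: `_mm_mulhi_epi16` keeps the high 16 bits of a signed 16x16-bit product, which is the same as `(delta * f) >> 16`. The helper names are hypothetical, and the sketch assumes that all intermediate values fit in a signed 16-bit integer and that right shifts of negative values are arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane equivalent of _mm_mulhi_epi16: high 16 bits of the signed product. */
static int16_t mulhi_i16(int16_t a, int16_t b) {
    return (int16_t)(((int32_t)a * (int32_t)b) >> 16);
}

/* Emulate the add / sub / mulhi / add sequence one lane at a time.
 * A 128-bit register holds eight 16-bit lanes, so `lanes` would be 8
 * for one pass of the vector code. */
static void cdf_update_lanes(int16_t *cdf, const int16_t *sym_cdf,
                             const int16_t *sym_ind_cdf, int lanes, int16_t f) {
    assert(f >= 0);
    for (int i = 0; i < lanes; i++) {
        int16_t d = (int16_t)(sym_cdf[i] + sym_ind_cdf[i]);  /* add_epi16 */
        int16_t e = (int16_t)(d - cdf[i]);                   /* sub_epi16 */
        cdf[i] = (int16_t)(cdf[i] + mulhi_i16(e, f));        /* mulhi + add */
    }
}
```

Because f must itself fit in a signed 16-bit lane, this formulation limits the speed to values below 0.5 when f is scaled by 2^16; that is a property of the sketch's assumed scaling, not a statement from the description.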
Claims (21)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2020/070236 WO2022010531A1 (en) | 2020-07-06 | 2020-07-06 | Efficient update of cumulative distribution functions for image compression |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20230085142A1 (en) | 2023-03-16 |
| US12149265B2 (en) | 2024-11-19 |
Family
ID=71895317
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/904,030 Active 2041-01-05 US12149265B2 (en) | 2020-07-06 | 2020-07-06 | Efficient update of cumulative distribution functions for image compression |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US12149265B2 (en) |
| EP (1) | EP4052472A1 (en) |
| CN (1) | CN114846806A (en) |
| WO (1) | WO2022010531A1 (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12301832B2 (en) | 2022-02-03 | 2025-05-13 | Tencent America LLC | Methods, devices, and storage medium for multi-symbol arithmetic coding |
| CN116347092B (en) * | 2023-03-08 | 2025-08-08 | 伟光有限公司 | Video frame decoding method, chip and storage medium |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5650783A (en) * | 1995-02-10 | 1997-07-22 | Fujitsu Limited | Data coding/decoding device and method |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| AU2013101210A4 (en) * | 2013-09-11 | 2013-10-10 | Huang, Xu PROF | Enhancing Quality of Magnetic Resonance (MR) Image Based on Wavelet Algorithm |
| US10142635B2 (en) * | 2015-12-18 | 2018-11-27 | Blackberry Limited | Adaptive binarizer selection for image and video coding |
-
2020
- 2020-07-06 US US17/904,030 patent/US12149265B2/en active Active
- 2020-07-06 EP EP20750148.7A patent/EP4052472A1/en not_active Withdrawn
- 2020-07-06 WO PCT/US2020/070236 patent/WO2022010531A1/en not_active Ceased
- 2020-07-06 CN CN202080089257.2A patent/CN114846806A/en active Pending
Non-Patent Citations (5)
| Title |
|---|
| Fenwick, "A New Data Structure for Cumulative Frequency Tables", Software-Practice and Experience, Wiley & Sons, Bognor Regis, GB; vol. 24, No. 3, Mar. 1, 1994, pp. 327-336. |
| International Search Report and Written Opinion for PCT Application No. PCT/US2020/070236, mailed on Apr. 9, 2021, 11 pages. |
| Moffat, "An Improved Data Structure for Cumulative Probability Tables", Software-Practice and Experience, Wiley & Sons, Bognor Regis, GB; vol. 29, No. 7, Jun. 1, 1999, pp. 647-659. |
| Said, "Introduction To Arithmetic Coding—Theory and Practice", Hewlett Packard Laboratories Report, Apr. 21, 2004, 67 pages. |
| Yue, et al., "An Overview of Core Coding Tools in the AV1 Video Codec", Picture Coding Symposium (PCS); IEEE, Jun. 24, 2018, pp. 41-45. |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4052472A1 (en) | 2022-09-07 |
| CN114846806A (en) | 2022-08-02 |
| WO2022010531A1 (en) | 2022-01-13 |
| US20230085142A1 (en) | 2023-03-16 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | AS | Assignment | Owner name: GOOGLE LLC, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MASSIMINO, PASCAL;RABAUD, VINCENT;SIGNING DATES FROM 20200630 TO 20200705;REEL/FRAME:060850/0745 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: EX PARTE QUAYLE ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO EX PARTE QUAYLE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | ZAAB | Notice of allowance mailed | Free format text: ORIGINAL CODE: MN/=. |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | CC | Certificate of correction | |