WO2002007438A1 - Generalized lapped biorthogonal transform embedded inverse discrete cosine transform and low bit-rate video sequence coding artifact removal
- Publication number
- WO2002007438A1 (application PCT/US2001/022368)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- idct
- transform
- dct
- discrete cosine
- image
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/527—Global motion vector estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/86—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
Definitions
- DCT discrete cosine transform
- Picture quality can be enhanced by various methods of post-processing.
- Existing post-processing approaches include MAP estimation, projection onto convex sets (POCS), and linear/nonlinear filtering.
- MAP estimation and POCS based algorithms are iterative and complicated algorithms. Each step involves forward and inverse transforms due to constraints in different domains. The high computational complexity of these algorithms prohibits their application to real time video sequence decoding.
- Existing filtering based post-processing algorithms involve a number of decision steps to detect the occurrence, level, and type of degradation, and to choose the corresponding filter for enhancement. Propagation of these decision steps to the next frame is often required.
- the disclosed system includes a generalized lapped biorthogonal transform embedded inverse discrete cosine transform (ge-IDCT), as an alternative to the inverse discrete cosine transform (IDCT) within a system for still image compression.
- the ge-IDCT takes advantage of the DCT front end of the generalized lapped biorthogonal transform (GLBT), such that it can be used in inverse transforming the DCT coefficients. With the nonlinear weighting in the embedded lapped transform domain, the ge-IDCT can reconstruct the signal with alleviated blockishness. Additional complexity imposed by the replacement of the IDCT by the ge-IDCT is trivial thanks to an efficient lattice structure.
- the disclosed ge-IDCT may be applied in the JPEG still image compression standard.
- the disclosed system improves the picture quality of video frames encoded at relatively low-bit rates by reducing the effects of both blocking and ringing artifacts.
- the disclosed system includes two picture post-processing methods to reduce the anomalies caused by these artifacts.
- the disclosed system operates to apply a lapped orthogonal transform-embedded inverse discrete cosine transform (le-IDCT), as a substitute for the usual inverse DCT.
- le-IDCT lapped orthogonal transform-embedded inverse discrete cosine transform
- the disclosed system may be embodied to include a nonlinear robust filter to be applied to the decoded picture frame.
- the disclosed system advantageously provides marked improvement in terms of both objective and subjective image quality.
- the computation overhead incurred due to the disclosed procedures is quite moderate, and real-time implementations may be embodied in hardware, software, firmware, or some combination thereof, executing on common desktop computer systems.
- FIG. 1 shows a flowgraph of GLBT, in which the analysis FB and the synthesis FB represent the forward and the inverse transforms, respectively;
- Fig. 2 shows a flowgraph of the DCT and the ge-IDCT, in which the ge-IDCT works in the case where the signal is processed in the DCT domain, and frequency weighting is employed in the embedded GLBT domain;
- Fig. 3 shows the detailed lattice structure of an analysis FB, including the first stage (a) with a DCT front end, and also showing each stage (b) ;
- Fig. 4 shows the detailed lattice structure of a synthesis FB, including the last stage (a) with an IDCT rear end, and each stage (b) ;
- Fig. 5 shows non-overlapping transforms (a) and overlapping transforms (b) ;
- Fig. 6 illustrates improvement of the ge-IDCT with nonlinear weighting in PSNR at various quality factors, (PSNR of the proposed methods) - (PSNR of JPEG) in [dB] vs. quality factor, for (a) airplane, (b) Barbara, (c) Lena, and (d) peppers images;
- Fig. 7 illustrates improvement of the ge-IDCT with nonlinear weighting in MSDS at various quality factors, (MSDS of the proposed methods) - (MSDS of JPEG) in [dB] vs. quality factor, for (a) airplane, (b) Barbara, (c) Lena, and (d) peppers images;
- Fig. 8 shows a portion (a) of the Lena test image, compressed by JPEG at quality factor 15, and an associated edge map (b);
- Fig. 9 shows blocking artifact removal by MAP estimation and the proposed method, including image (a) by the MAP estimation, edge map (b) by the MAP estimation with line process, image (c) by the proposed method, and edge map (d) by the proposed method, and further showing that most of the texture in Lena's hat is missing in the MAP estimate due to over-smoothing;
- Fig. 11 illustrates blocking artifact removal by a deblocking filter and by the disclosed method, where (a) shows the image by the deblocking filter, (b) shows the edge map generated by the deblocking filter, (c) shows image generated by the disclosed method, and (d) shows edge map generated by the proposed method, and showing that the image processed by the deblocking filter is still relatively blockish due to under-smoothing;
- Fig. 12 shows a design example of the GenLOT, including (a) impulse response, and (b) frequency response;
- Fig. 13 shows detailed lattice structure of an embodiment of the disclosed le-IDCT;
- Fig. 14 shows schematics of a modified video sequence decoder, wherein “le” represents a part of the le-IDCT that precedes the IDCT, “RF” represents the robust filter, and “s” is a switch;
- Fig. 19 is a table showing PSNR of compressed and processed sequences, in dB, at 24kb/s, I Frames every 100 frames, wherein results of the disclosed system are represented by the "proposed" column values;
- Fig. 20 is a table showing MSDS of compressed and processed sequences, at 24kb/s, I Frames every 100 frames, wherein results of the disclosed system are represented by the "proposed" column values;
- Fig. 21 is a table showing comparison of video coding artifact removal algorithms in PSNR, in dB, at 24kb/s, I Frames every 100 frames, (DF: deblocking filter in Annex J), wherein results of the disclosed system are represented by the "proposed" column values;
- FIG. 22 is a table showing comparison of video coding artifact removal algorithms in MSDS, at 24kb/s, I Frames every 100 frames, (DF: deblocking filter in Annex J) , wherein results of the disclosed system are represented by the "proposed" column values;
- Fig. 23 is a table showing comparison of average run time complexity of video coding artifact removal algorithms on I Frame, in [sec] , (DF: deblocking filter in Annex J) , wherein results of the disclosed system are represented by the "proposed" column values; and
- Fig. 24 is a table showing comparison of average run time complexity of video coding artifact removal algorithms on P Frame, in [sec], (DF: deblocking filter in Annex J) , wherein results of the disclosed system are represented by the "proposed" column values.
- the disclosed system embodies a method of utilizing a lapped transform in such a way that modification only in the decoder section of existing systems is required. Existing encoders may be used without any modification to supply standard bit streams.
- the disclosed method is compliant with the current image/video compression standards that employ the DCT.
- the generalized lapped biorthogonal transform (GLBT) is the most general form of lapped transforms.
- the GLBT is a linear phase perfect reconstruction filter bank (LPPRFB) based on the LP propagating lattice structure.
- LPPRFB linear phase perfect reconstruction filter bank
- the DCT is often used as the front end of the GLBT for its fast and efficient implementations. The DCT front end allows the GLBT to be used in inverse transforming the DCT coefficients.
- the DCT coefficients can be regarded as intermediate results of the GLBT with the DCT front end.
- the disclosed system may complete the rest of the stages in the analysis filter bank (FB) , followed by the synthesis FB to reconstruct the signal.
- This operation is called the GLBT embedded inverse DCT (ge-IDCT) .
- the disclosed ge-IDCT provides an excellent opportunity to process the signal.
- the DCT coefficients are processed already, by the quantization operation for example, the signal can be reprocessed in the embedded lapped transform domain to abate impairment of image quality.
- the blocking artifacts in image/video compression are degradation introduced by coarse quantizations of the DCT coefficients.
- the disclosed system employs nonlinear weighting of lapped transform coefficients.
- the disclosed ge-IDCT with nonlinear weighting may be applied in the JPEG still image compression standard.
- the IDCT of the standard decoder is simply replaced by the ge-IDCT with nonlinear weighting.
- Section I(A) below introduces the GLBT.
- Section I(B) below presents the disclosed ge-IDCT that can be paired with the forward DCT.
- Section I(C) presents the disclosed nonlinear weighting that reduces the blockishness in reconstructed images.
- Section V addresses the design of the ge-IDCT.
- the ge-IDCT is applied to still image compression.
- the GLBT is a lapped transform defined as an LPPRFB with the polyphase transfer matrix (PTM), given by equation (1) .
- the first stage E_0 is an M-channel LPPRFB with no delay element, which can be factored as shown in equation (2), in which I and J are the [M/2 x M/2] identity and reversal matrices, and the matrices U_0 and V_0 are [M/2 x M/2] invertible matrices.
- the PTM of each stage G_i(z) is given by equation (3).
- the matrices U_i and V_i are [M/2 x M/2] invertible matrices.
- the matrix Λ(z) has the delay element z^-1.
- the filter lengths increase by M with the delay element of each stage, so the total length of each filter becomes KM.
- the analysis FB is an M-channel FB, hence (K - 1)M taps of each filter lap over onto the samples in previous blocks.
- the disclosed system factorizes each Φ_i matrix as shown in equation (4), where U_ij and V_ij are orthogonal matrices and Γ_i and Δ_i are diagonal matrices.
- the PTM of the synthesis FB is given as shown in equation (5), such that the relationships in equation (6) hold true, and hence the perfect reconstruction (PR) .
- the inverse matrices involve only the transposition of orthogonal matrices and the inversion of diagonal matrices, both of which are trivial.
- the matrices in the PTM are subject to design procedure. These matrices, or their equivalent Givens rotation angles, are optimized for better properties such as coding gain and stopband attenuation.
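The remark about trivial inverses can be illustrated with a small numerical sketch. It assumes that equation (4), which is not reproduced in this text, has the form U = U0 Γ U1 with orthogonal U0 and U1 and a diagonal Γ; the 2 x 2 size, the Givens angles, and the diagonal entries below are hypothetical values chosen only for illustration.

```python
import math

def givens(theta):
    """2 x 2 planar (Givens) rotation, an orthogonal matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical parameters for one invertible block U = U0 * Gamma * U1.
theta0, theta1 = 0.3, -1.1
gamma = [1.5, 0.8]

U0, U1 = givens(theta0), givens(theta1)
U = matmul(matmul(U0, diag(gamma)), U1)

# The inverse needs only transposes and reciprocals of the diagonal entries.
U_inv = matmul(matmul(transpose(U1), diag([1.0 / g for g in gamma])),
               transpose(U0))
identity = matmul(U_inv, U)  # numerically the 2 x 2 identity
```

Parameterizing the orthogonal factors by rotation angles keeps every candidate invertible during optimization, as long as the diagonal entries stay nonzero.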
- the flowgraph 10 of the analysis FB 12 and the synthesis FB 14 of the GLBT are given in Fig. 1.
- When K > 1, the data blocks of the GLBT overlap each other.
- the basis functions of the GLBT have shapes that decay smoothly to zero.
- the GLBT's may be applied in image compression applications to substitute for the forward and inverse DCT.
- the quantization operation may be applied to the GLBT coefficients in various schemes. Experimental results show improved image quality with less blocking artifact even at a high compression ratio.
- the front end of the first stage becomes the DCT.
- the first stage can be written as shown in equation (8), where E_dct is a matrix each row of which is a DCT basis function. This approach may be taken in order to exploit fast implementations of the DCT available both in software and hardware.
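As a minimal sketch of the matrix E_dct described above, the following builds the orthonormal DCT-II matrix directly from its definition and checks that the IDCT is its transpose. The block size of 8 and the sample values are assumptions for illustration; a practical decoder would use a fast DCT rather than this direct matrix product.

```python
import math

def dct_matrix(m):
    """Return the m x m orthonormal DCT-II matrix; row k is the kth basis function."""
    rows = []
    for k in range(m):
        scale = math.sqrt(1.0 / m) if k == 0 else math.sqrt(2.0 / m)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * m))
                     for n in range(m)])
    return rows

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

E_dct = dct_matrix(8)
block = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]  # hypothetical pixels
coeffs = matvec(E_dct, block)                # forward DCT
restored = matvec(transpose(E_dct), coeffs)  # IDCT is the transpose of E_dct
```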
- the ge-IDCT doesn't look very attractive when processing of the signal is neglected, since the same signal is returned albeit via longer operations. However, the ge-IDCT provides an excellent opportunity to process the signal in the embedded GLBT domain, where the basis functions have much better properties.
- signals may be processed in the lapped transform domains.
- the disclosed ge-IDCT can be embodied such that the processing of the signal is still in the DCT domain.
- the disclosed system can re-process the signal in the embedded lapped transform domain to alleviate harm done by the DCT domain processing.
- the disclosed system may be used to address the blockishness introduced by coarse quantization of the DCT coefficients.
- In image compression, such coarse quantization results in an annoying discontinuity between the data blocks, which is called the blocking artifact. Since the blocking artifact is the result of the independent processing of blocks, it is natural to use information on neighboring blocks in the decoding process to eliminate the blocking artifact. Lapped transforms are excellent examples of such attempts.
- the ge-IDCT makes neighboring block information available to a decoding process in the same way that a lapped transform does.
- the disclosed system uses this neighboring block information to reduce the blocking artifacts.
- Fig. 5 shows how the basis functions of the non-overlapping transforms 30 and the overlapping transforms 32 are laid out across the entire image.
- the non-overlapping blocks 30 in Fig. 5 are eight pixels long, whereas the overlapping blocks 32 are 16 pixels long, with eight pixels overlapped.
- the blocking artifact is a step at the boundary of two adjacent DCT blocks.
- the location of the step corresponds to the center of the GLBT blocks.
- the center of the overlapping blocks 32 in Fig. 5 aligns with the boundaries of two adjacent non-overlapping blocks 30.
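The alignment described above can be checked with a short sketch. It assumes a one-dimensional signal of 32 samples, 8-sample DCT blocks, and 16-sample lapped blocks that advance by 8 samples; the function names are hypothetical.

```python
def nonoverlapping_blocks(length, m=8):
    """Half-open (start, end) ranges of the m-sample DCT blocks."""
    return [(start, start + m) for start in range(0, length - m + 1, m)]

def overlapping_blocks(length, m=8, k=2):
    """Half-open ranges of lapped blocks of k*m samples, advancing by m samples."""
    size = k * m
    return [(start, start + size) for start in range(0, length - size + 1, m)]

dct_blocks = nonoverlapping_blocks(32)
lapped_blocks = overlapping_blocks(32)

# Interior boundaries between adjacent DCT blocks.
boundaries = [end for (_, end) in dct_blocks[:-1]]
# Centers of the lapped blocks.
centers = [(start + end) // 2 for (start, end) in lapped_blocks]
```

With these settings the two lists coincide, which is why a step at a DCT block boundary sits at the center of a lapped block.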
- the step at this location is going to be represented as a linear combination of odd-symmetric GLBT basis functions.
- M is even
- the energy of the odd-symmetric GLBT coefficients may be used as a measure of the blocking effect.
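To see why a boundary step maps onto odd-symmetric basis functions, the sketch below splits a 16-sample window, centered on a block boundary, into its even-symmetric and odd-symmetric parts. The window values and the step height of 4 are hypothetical. Because each basis function of a linear phase filter bank is either symmetric or antisymmetric about the window center, the odd part of the window can only contribute to the odd-symmetric coefficients.

```python
def split_symmetric(x):
    """Split x into even- and odd-symmetric parts about the window center."""
    r = x[::-1]
    even = [(a + b) / 2.0 for a, b in zip(x, r)]
    odd = [(a - b) / 2.0 for a, b in zip(x, r)]
    return even, odd

def energy(x):
    return sum(v * v for v in x)

# A 16-sample window centered on a DCT block boundary, with a step of height 4.
window = [100.0] * 8 + [104.0] * 8
even, odd = split_symmetric(window)

# The even part is a constant (pure DC); the whole step lives in the odd part.
step_energy = energy(odd)
```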
- the goal is to detect a small step due to the blocking artifact. It can be safely assumed that the energy of such a step is fairly small, so any large amount of energy must be due to real structures in the image.
- the blocking artifact is detected by checking the condition shown in equation (16), where F_k is the kth GLBT coefficient and ε_k is the threshold of energy.
- the fourth and fifth coefficients correspond to the first two odd-symmetric basis functions.
- Other odd-symmetric basis functions represent filtering with relatively high pass-bands. They are excluded for this reason.
- the blocking artifact has been detected by investigating the energy of the first two odd-symmetric coefficients.
- the blockishness is due to small but excessive energy in those coefficients.
- the blockishness can be mitigated simply by reducing the energy.
- the odd-symmetric coefficients are weighted with the diagonal weighting matrix shown in equations (17) and (18) .
- the weighting scheme is nonlinear due to the function L .
- the use of nonlinear weighting provides selective removal of the blocking artifact without affecting the real structure of the image. Note that it is still possible that the small energy in F_4 and F_5 is not actually due to the blocking artifact. In this case, the shape of the GLBT basis functions, along with the fact that the energy is small, ensures that no discernible degradation is introduced.
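The detection and weighting steps can be sketched as follows. Equations (16) to (18) are not reproduced in this text, so the details here are assumptions: the test is taken to be "squared coefficient below the threshold", the weight applied on detection is a hypothetical `attenuation` factor, and the coefficients F_4 and F_5 are assumed to sit at list positions 4 and 5.

```python
def weight_odd_coefficients(coeffs, eps, indices=(4, 5), attenuation=0.0):
    """Attenuate selected odd-symmetric coefficients whose energy is below eps.

    Assumed rule: small but nonzero energy is treated as a blocking artifact
    and scaled by `attenuation`; larger energy is treated as real image
    structure and left untouched.
    """
    out = list(coeffs)
    for k in indices:
        if out[k] * out[k] < eps:
            out[k] *= attenuation
    return out

# Hypothetical lapped transform coefficients for two blocks.
smooth_block = [120.0, -3.0, 0.5, 0.0, 1.5, -0.8, 0.2, 0.1]
edge_block = [120.0, -3.0, 0.5, 0.0, 9.0, -0.8, 0.2, 0.1]

deblocked_smooth = weight_odd_coefficients(smooth_block, eps=4.0)
deblocked_edge = weight_odd_coefficients(edge_block, eps=4.0)
unchanged = weight_odd_coefficients(smooth_block, eps=0.0)
```

In this sketch a threshold of zero disables the weighting entirely, so the output matches an unweighted inverse transform.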
- the DCT and the ge-IDCT with the disclosed nonlinear frequency weighting can be paired as shown in equations (19) and (20) .
- the ge-IDCT used in place of the IDCT can reconstruct the signal with alleviated blockishness.
- the disclosed nonlinear weighting has only one parameter: the threshold of energy ε used in detecting the blocking artifact.
- the threshold is determined as the F_4 value when the input image is as shown in equation (21). This F_4 value corresponds to the energy due to a small step at the adjacent block boundary.
- ε can thus be determined so that a step of a given size can be detected and eliminated.
- the selection of the threshold ε at various step sizes is determined off-line, and the results are stored in a look-up table.
- the disclosed weighting scheme uses the quality of the reconstructed signal as an input to look up the corresponding threshold ε from the table.
- the parameter is internal and there is no external parameter that a user must supply.
- Applications that employ DCT usually have parameters that control the bit rate and hence the quality.
- the threshold ε can be chosen in terms of those parameters. For example, quality factor in JPEG, QP in H.263, and mquan in MPEG can be used in parameter selection.
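A minimal sketch of such a table look-up, keyed on the JPEG quality factor, might look like this. The bounds and threshold values are placeholders invented for illustration and are not taken from the source; the only behavior drawn from the text is that the threshold is zero for quality factors above 50.

```python
import bisect

# Hypothetical off-line table: upper bound on quality factor -> threshold.
# The numbers are placeholders, not values from the source.
QUALITY_BOUNDS = [10, 20, 30, 40, 50]
THRESHOLDS = [64.0, 36.0, 16.0, 9.0, 4.0]

def lookup_threshold(quality_factor):
    """Return the energy threshold for a JPEG quality factor.

    Quality factors above the last bound return 0.0, which disables
    the nonlinear weighting.
    """
    i = bisect.bisect_left(QUALITY_BOUNDS, quality_factor)
    return THRESHOLDS[i] if i < len(THRESHOLDS) else 0.0
```

The same structure would apply to QP in H.263 or mquan in MPEG, with a table built for that parameter.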
- the GLBT may be implemented in a fast and efficient manner thanks to the lattice structure.
- the disclosed ge-IDCT inherits the efficiency of the GLBT.
- the additional computational complexity imposed by replacing the IDCT with the ge-IDCT is fairly small.
- some operations can be saved because the weighting is only on odd-symmetric coefficients. For example, the complexity is reduced by half using equations (22) and (23) .
- the matrix multiplication operations can be implemented efficiently by the planar rotations through CORDIC.
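For readers unfamiliar with CORDIC, the following is a generic rotation-mode sketch, not the patent's implementation. It rotates a point by a sequence of fixed micro-rotations whose tangents are powers of two; in hardware the multiplications by 2**-i are bit shifts, while this floating-point version only illustrates the iteration. The iteration count of 24 is an arbitrary choice.

```python
import math

def cordic_rotate(x, y, angle, iterations=24):
    """Rotate (x, y) by `angle` radians (|angle| <= pi/2) with CORDIC steps."""
    gain = 1.0
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0       # direction of this micro-rotation
        factor = 2.0 ** -i                  # a bit shift in fixed-point hardware
        x, y = x - d * y * factor, y + d * x * factor
        z -= d * math.atan(factor)          # angle still left to rotate
        gain *= math.sqrt(1.0 + factor * factor)
    # Each micro-rotation stretches the vector; divide the total gain out.
    return x / gain, y / gain

rx, ry = cordic_rotate(1.0, 0.0, 0.6)
```

The gain depends only on the iteration count, so a fixed-point design can precompute it once.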
- the disclosed weighting works with various GLBT's. Embodiments may employ integer parameters to reduce the complexity further. Other operations such as W and Λ are trivial. And the operation in equation
- the design of an illustrative embodiment of the ge-IDCT is disclosed.
- the first step is to design a GLBT according to the desired properties.
- the next step is to embed the designed GLBT into the ge-IDCT.
- the following criteria are considered.
- the coding gain of a transform is defined as shown in
- the coding gain measures the energy compaction or decorrelation of signal from the transform. In compression applications, high coding gain is needed so that we can represent an image with a smaller number of coefficients at low bit rates. In designing the ge-IDCT, high coding gain helps isolate the frequency components responsible for steps at the center of the basis functions.
- the stopband attenuation is defined as shown in equation (25) .
- the stopband attenuation is a classical criterion for FB design.
- Low stopband attenuation helps decorrelation of signal and decreases aliasing between bands .
- Low stopband attenuation also means smooth basis functions.
- Consider the ith band filter h_i with a low pass band. The Fourier transform of the filter, H_i(e^jω), tells us not only the frequency response of the filter but also the shape of the filter's impulse response, i.e. the basis function. The less energy in the stopband, the cleaner the frequency components of the basis function.
- reducing the stopband attenuations means preventing high frequency components. And hence, the basis functions become smoother.
- Smooth basis functions are desired in order to prevent degradation of image quality by the modifications of the lapped transform coefficients.
- In deblocking, we weight some coefficients. When some of the basis functions are de-emphasized by the weighting, other basis functions become relatively prominent. Any oscillatory behavior of the now-prominent basis functions can degrade image quality.
- For both the analysis FB and the synthesis FB, we desire the following properties.
- the GLBT is designed through the optimization of equation (26), whose weighting factors set the relative importance of the coding gains and the stopband attenuations.
- the optimization is over the parameters of the matrices in the lattice structure. Since the GLBT is a biorthogonal transform, the basis functions of the analysis FB and the synthesis FB are different. The GLBT can be designed with different properties for different FB's. In particular, we emphasize the smoothness of the synthesis FB basis functions by trading off the cost functions through these weighting factors. Once a GLBT with desired properties is designed, it is embedded into a ge-IDCT via equation (20).
- the disclosed ge-IDCT with frequency weighting may be applied to the JPEG still image compression standard.
- a set of images may be coded by Independent JPEG group's codec at various quality factors, and decoded by the standard JPEG decoder and by the disclosed ge-IDCT with nonlinear weighting. Images may be compressed at quality factors less than 50.
- PSNR peak signal-to-noise ratio
- MSDS mean square difference of slopes
- the PSNR improvement of the ge-IDCT with nonlinear weighting is shown in plots 40, 42, 44 and 46 of Fig. 6 for the airplane, Barbara, Lena, and peppers images respectively.
- the plots in Fig. 6 show equation (28) in dB.
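The quantity plotted in Fig. 6, (PSNR of the proposed method) - (PSNR of JPEG), can be sketched as below. Equation (28) is not reproduced in this text, so the sketch uses the standard PSNR definition for 8-bit images with a peak value of 255; the pixel values are hypothetical and stand in for flattened images.

```python
import math

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / mse)

def psnr_improvement(original, proposed, baseline):
    """PSNR of the proposed decoder minus PSNR of the baseline decoder, in dB."""
    return psnr(original, proposed) - psnr(original, baseline)

# Hypothetical pixel values for illustration only.
original = [100.0, 102.0, 104.0, 106.0]
jpeg_decoded = [98.0, 104.0, 101.0, 108.0]
proposed_decoded = [99.0, 103.0, 103.0, 107.0]

gain_db = psnr_improvement(original, proposed_decoded, jpeg_decoded)
```

A positive value means the proposed decoder is closer to the original than the baseline decoder.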
- the images decoded by the ge-IDCT show consistent improvement over the images decoded by the standard JPEG at all the quality factors.
- the MSDS improvement of the disclosed methods is shown in Fig. 7 for the same images.
- the plots 50, 52, 54, 56 in Fig. 7 show equation (29), where the negative values indicate reduced MSDS and hence reduced blockishness at block boundaries.
- the disclosed methods reduce the MSDS.
- the results are consistent throughout all the test images at various image qualities. For a quality factor above 50, the threshold ε is set at zero, so the results of the disclosed scheme are identical to the results of the standard JPEG decoder.
- Fig. 8 shows a part of the Lena image compressed by JPEG at quality factor 15.
- the image 60 in Fig. 8 shows severe blocking artifact, which is confirmed by false edges in the edge map 62 in Fig. 8.
- Fig. 9 shows comparison between the MAP estimation and the disclosed method. Both methods remove the blocking artifact effectively. Differences lie in preservation of details and texture. Because the disclosed method applies nonlinear weighting only on the specific frequency components, it shows superb preservation of details and texture. The differences are shown clearly on Lena's hat. As shown in Fig.
- image (a) 70 is the result of MAP estimation
- edge map (b) 72 the result of MAP estimation with line process
- image (c) 74 the result of the disclosed method
- edge map (d) 76 the result of the disclosed method. Note that most of the texture in Lena's hat is missing in the MAP estimate due to over-smoothing.
- the image 78 and the edge map 80 in Fig. 10 show severe blocking artifacts.
- Fig. 11 shows comparison between the deblocking option and the disclosed method.
- the implementation of the deblocking filter modifies only four pixels near the block boundaries, two pixels on each side.
- Fig. 11 includes the image (a) 90 by the deblocking filter, edge map (b) 92 generated by the deblocking filter, image (c) 94 generated by the disclosed method, and edge map (d) 96 generated by the proposed method, thus showing that the image processed by the deblocking filter is still relatively blockish due to under-smoothing.
- the deblocking filter fails to eliminate blockishness.
- the disclosed method modifies the coefficients of the lapped transform basis functions, which are twice the DCT block length, 16 pixels long to be specific. The disclosed method removes blockishness in smooth regions effectively.
- the disclosed system includes the ge-IDCT, which can be paired with the forward DCT.
- the ge-IDCT inverse transforms the DCT coefficients available at decoders. This aspect is important, because it means there is no incompatibility introduced by replacement of the IDCT by the ge-IDCT.
- the disclosed inverse transform exploits the lapped transform domain weighting to reconstruct the signal with alleviated blockishness.
- the ge-IDCT is based on the lattice structures, which leads to fast and efficient implementation.
- the additional computational complexity imposed by the new inverse transform is trivial.
- Experiments with the JPEG still image compression standard have confirmed the validity of the disclosed transforms.
- the ge-IDCT has been shown to provide better performance than more complex algorithms, at low computational complexity.
- the ge-IDCT is a competitive alternative to the IDCT in mid to low bit rate still image/video sequence compression applications.
- the first technique replaces the conventional inverse DCT (IDCT) of a decoder in order to reduce blockishness. It is referred to herein as the lapped orthogonal transform embedded IDCT (le-IDCT) .
- the second disclosed technique is a non-linear data adaptive robust filter based on the Maximum Likelihood (ML) model parameter estimation, and is referred to herein as the robust filter. The disclosed robust filter is applied to alleviate the ringing artifact.
- these two disclosed techniques generally do not require changes in the encoder, or in the bit-stream, and hence may conveniently be standard compliant.
- Computational complexities of the disclosed techniques are moderate and amenable to real-time implementation within a desktop PC environment.
- the disclosed le-IDCT and robust filter are designed carefully such that their use does not degrade major structures of the image. This advantageous property is considered the robustness provided.
- the le-IDCT achieves such robustness by use of selective smoothing through non-linear weighting on only a couple of coefficients.
- the robust filter achieves its robustness by clustering samples into three clusters and using only the samples in one cluster. Having such robust components as those disclosed herein is beneficial in artifact removal algorithms because it simplifies the way they tap into the decoder.
- Some of the existing post-processing algorithms use linear filtering to eliminate artifacts. Linear filters may degrade images when they are applied in wrong places. Such existing algorithms have to detect and retain precise locations of artifacts. These additional detecting and bookkeeping steps significantly complicate the implementation of such existing algorithms.
- Sections II(A) and II(B) below present the le-IDCT and the robust filter for removal of blocking artifacts and ringing artifacts, respectively.
- In Section II(C) below, both the le-IDCT and the robust filter are applied to an H.263+ video sequence.
- In Section II(D), the disclosed method is compared to the deblocking option of H.263+ Annex J in terms of objective and subjective picture quality as well as run time complexity.
- the blocking artifact is a consequence of independent processing of adjacent blocks of image pixels. Better quality images can be achieved by processing adjacent blocks simultaneously. Good examples of simultaneous adjacent block processing techniques are lapped transforms, in which adjacent processing blocks overlap each other. These overlapping transform blocks, along with the use of gracefully decaying longer basis functions ensure the reconstructed image is blocking artifact free even at very low bit rates.
- the generalized lapped orthogonal transform (GenLOT) is the general form of lapped orthogonal transforms (LOT's).
- the le-IDCT in the disclosed system is based on the GenLOT. Essentially, the disclosed system utilizes the fact that the first stage of the GenLOT can be replaced by the DCT matrix. Below, the GenLOT is reviewed, and the le-IDCT described.
- the Generalized Lapped Orthogonal Transform is the general form of lapped orthogonal transforms (LOT's).
- the GenLOT is defined as a linear phase paraunitary filter bank (LPPUFB) with a polyphase transfer matrix (PTM) given by equation (31).
- the first stage E_0 is an LPPRFB with no delay element, and can be factored as shown in equation (32), where I is the identity matrix and J is the reversal matrix.
- the PTM of each stage G_i(z) is given by equation (33).
- the matrix Λ(z) contains the delay element z^-1.
- the filter lengths of the GenLOT increase with the delay element at each ith stage.
- the matrices U_i and V_i are orthogonal matrices.
- the first stage E_0 becomes the DCT matrix.
- An apparent advantage of having the DCT first stage is to exploit fast and efficient implementation.
- Another advantage is to make use of the GenLOT in the inversion of standard DCT coefficients.
- the analysis FB is the same as the DCT. But the synthesis FB is carried out by completing what's left of the analysis FB in equation (34), followed by a diagonal weighting matrix A and the synthesis FB in equation (35) .
- the le-IDCT provides an excellent opportunity to process a signal in the embedded lapped transform domain, where the basis functions have much better properties.
- Nonlinear Weighting. The disclosed le-IDCT can be used to eliminate blocking artifacts introduced by coarse quantization of the DCT coefficients. This can be accomplished by choosing appropriate weighting in the diagonal matrix A. As an example of deblocking, let us consider the
- the detailed lattice structure of the GenLOT is given in equations (39) and (40) .
- Fig. 12 shows an example of the impulse responses 100 and the frequency responses 102 of the GenLOT. This lapped transform can be embedded into the le-IDCT via equation (38) .
- Let F_k be the kth GenLOT coefficient and ε_k be a threshold of energy.
- the weighting matrix can be chosen as shown in equations (41) and (42).
- the weighting scheme is nonlinear due to the function O. Use of nonlinear weighting provides selective removal of the blocking artifact without affecting the real structure of the image.
- the GenLOT has fast and efficient implementation thanks to the lattice structure.
- the disclosed le-IDCT inherits the efficiency of the GenLOT. Additional computational complexity imposed by replacing the IDCT with the le-IDCT is fairly small.
- the detailed lattice structure of the le-IDCT is shown in Fig. 13. Some operations can be saved because the weighting is only on odd-symmetric coefficients. The complexity is reduced by half using equation (43), where A_odd is a diagonal matrix with the weights for only odd-symmetric coefficients.
- the matrix multiplication operation can be implemented efficiently by the planar rotations through CORDIC. Other operations such as W and A are trivial.
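
To make the idea of a planar rotation computed by CORDIC concrete, here is a minimal floating-point Python sketch of the standard rotation-mode CORDIC iteration. It is a generic textbook version, not the patent's lattice implementation; the function name `cordic_rotate` and the iteration count are assumptions, and a hardware version would use fixed-point shifts and adds instead of floating-point multiplications.

```python
import math


def cordic_rotate(x, y, angle, iterations=24):
    """Rotate (x, y) counter-clockwise by `angle` radians using CORDIC.

    Converges for |angle| up to roughly 1.74 rad (the sum of all atan(2**-i)).
    """
    # Elementary angles and the accumulated gain of the micro-rotations.
    atans = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # Micro-rotation by +/- atan(2**-i); only shifts and adds in hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atans[i]
    return x / gain, y / gain  # undo the constant CORDIC gain


rx, ry = cordic_rotate(1.0, 0.0, math.pi / 4)
```

Each lattice stage that is factored into such rotations needs only this one primitive, which is why the rotation form suits hardware.
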
- the operation ] shown in equation (38) is just the IDCT. All the fast implementations of the IDCT, in either software or hardware, are still applicable. It is noted that the operation ] is not additional. It is an operation a decoder has to perform during the standard decoding process. Only the operations that precede the ] are additional.
- the disclosed nonlinear weighting has only one parameter. It is a relatively simple deblocking algorithm not only in terms of the computations but also in terms of the number of parameters.
- the parameter is the threshold of energy ε used in detecting the blocking artifact.
- the threshold is determined as the F_4 value when the input is as shown in equation (44). It is the energy corresponding to a small step at the adjacent block boundary. Then ε can be determined such that a step of a given size can be detected and eliminated.
- the threshold ε for various step sizes is determined off-line, and the results are stored in a table.
- the disclosed weighting scheme uses the quality of the reconstructed signal as an input to look up the corresponding threshold ε from the table.
- the parameter is internal and there is no external parameter to be supplied by the user.
- Video compression applications that employ DCT usually have parameters that control the bit-rate and hence the quality.
- the threshold ε can be chosen in terms of those parameters. For example, QP in H.263+ and mquan in MPEG can be used in parameter selection.
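
The off-line table and run-time lookup described above might be organized as in the following Python sketch. Everything specific here is hypothetical: the function names, the nearest-level lookup policy, and above all the stand-in `lambda` used to fill the table, which is a placeholder and not a threshold formula from the disclosure.

```python
import bisect


def build_threshold_table(quality_levels, threshold_for):
    """Off-line step: tabulate one threshold per quality level (e.g. QP).

    threshold_for is a caller-supplied routine; it is a hypothetical hook for
    whatever off-line procedure produces the threshold.
    """
    levels = sorted(quality_levels)
    return levels, [threshold_for(q) for q in levels]


def lookup_threshold(table, quality):
    """Run-time step: return the threshold of the nearest tabulated level."""
    levels, thresholds = table
    pos = bisect.bisect_left(levels, quality)
    if pos == 0:
        return thresholds[0]
    if pos == len(levels):
        return thresholds[-1]
    before, after = levels[pos - 1], levels[pos]
    nearest = pos if (after - quality) < (quality - before) else pos - 1
    return thresholds[nearest]


# Placeholder table: the squared level stands in for a real off-line result.
table = build_threshold_table([4, 8, 16], lambda q: float(q * q))
eps = lookup_threshold(table, 8)
```

Keying the table on a parameter the decoder already has, such as QP, is what keeps the threshold internal, so the user supplies nothing.
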
- the disclosed system operates by replacing a rippled surface with a flat surface to remove the ringing artifact.
- the disclosed system attempts to fit a flat surface model to the compressed image as necessary.
- a flat surface model consists of the number of surfaces, grayscale values of each surface, and corresponding surface information. These parameters are estimated from a given compressed image.
- a flat surface model is applied locally to small regions of the image.
- a [w x w] window centered at the (i, j)th pixel slides through the compressed image g pixel by pixel to pick the samples G.
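
A minimal Python sketch of picking the samples G with a sliding window follows. It assumes an odd window size w and replicates border pixels when the window extends past the image edge; the border policy and the function name `window_samples` are assumptions, since the text does not say how borders are handled.

```python
def window_samples(image, i, j, w):
    """Return the w*w grayscale samples around pixel (i, j), row by row.

    image is a list of rows; w is assumed odd. Pixels outside the image are
    replaced by the nearest border pixel (an assumed border policy).
    """
    half = w // 2
    rows, cols = len(image), len(image[0])
    samples = []
    for di in range(-half, half + 1):
        for dj in range(-half, half + 1):
            r = min(max(i + di, 0), rows - 1)
            c = min(max(j + dj, 0), cols - 1)
            samples.append(image[r][c])
    return samples


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
G = window_samples(image, 1, 1, 3)
```

With this row-by-row ordering, the center pixel of the window is always the sample at index `(w * w) // 2`.
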
- Our flat surface model consists of the number of surfaces K, the grayscale values of surface ⁇, and the surface information z.
- the surface information z is a [w x w] matrix with its elements taking values in {1, ..., K}.
- the grayscale values of each surface form a [K x 1] vector ⁇ .
- the flat surface model image F of size [w x w] can be written as shown in equation (45), where 1 is a vector valued indicator function.
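
The flat surface model can be pictured with a short Python sketch: each pixel of the model image takes the grayscale value of the surface its label points to, written here as a sum over surfaces of value times indicator. Equation (45) is not reproduced in this text, so this is an interpretation of the description; the names `flat_surface_image`, `labels`, and `values` are illustrative.

```python
def flat_surface_image(labels, values):
    """Build the w x w model image from surface labels and grayscale values.

    labels: w x w matrix with entries in 1..K (the surface information z).
    values: list of K grayscale values, one per surface.
    """
    K = len(values)
    return [
        # Sum over surfaces of value * indicator(label == k); exactly one
        # indicator is 1 per pixel, so the pixel takes that surface's value.
        [sum(values[k - 1] * (label == k) for k in range(1, K + 1))
         for label in row]
        for row in labels
    ]


z = [[1, 1, 2],
     [1, 2, 2],
     [3, 3, 3]]
F = flat_surface_image(z, [10, 200, 90])
```
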
- the center pixel of F, denoted by F_c, is taken as the (i, j)th pixel of the ringing artifact free image.
- The parameter estimation problem is shown in equation (46), where G is incomplete data with z missing.
- The estimation problem with the complete data (G, z) can be written as shown in equation (47), which can be solved by the k-means algorithm.
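
For readers unfamiliar with it, the k-means algorithm on scalar grayscale samples alternates between assigning each sample to its nearest cluster center and recomputing each center as the mean of its members. The Python sketch below is a generic one-dimensional version with 0-based labels; the function name, the fixed iteration count, and the example data are assumptions for illustration.

```python
def kmeans_1d(samples, centers, iterations=10):
    """Generic 1-D k-means: returns (centers, labels) with 0-based labels."""
    centers = list(centers)
    labels = [0] * len(samples)
    for _ in range(iterations):
        # Assignment step: nearest center wins (lowest index on ties).
        labels = [
            min(range(len(centers)), key=lambda k: abs(s - centers[k]))
            for s in samples
        ]
        # Update step: each center becomes the mean of its members.
        for k in range(len(centers)):
            members = [s for s, lab in zip(samples, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = sum(members) / len(members)
    return centers, labels


centers, labels = kmeans_1d([10, 12, 11, 200, 198, 202], centers=[0, 255])
```

In terms of the model above, the returned labels play the role of the surface information z and the returned centers play the role of the surface grayscale values.
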
- the number of surfaces K has to be determined from the samples G before the estimation of the probability density P[G ⁇ ⁇ ] . It can be determined by a hierarchical clustering algorithm with a criterion of merit. A simple alternative is to fix the number of surfaces.
- a three-cluster model whose cluster centers are determined by a simple rule is used in one embodiment. Given the samples G, the cluster centers are initialized as shown in equation (48), where G_c denotes the grayscale value of the center pixel in the window. Furthermore, the number of iterations in the k-means algorithm is set to one. The estimate is still an ML estimate under the probability density P[G ⁇ ⁇ ] approximated by the simplified k-means algorithm.
- the center pixel of the window F_c is taken as the (i, j)th pixel of the ringing artifact free image. Therefore, the grayscale value that the center pixel of F takes is the only parameter of interest.
- the result of the three-cluster model is non-iterative in nature.
- We denote the robust filter as the mapping from the samples G to the estimate F_c. It is robust in the sense that major edges are preserved. This is because pixels belonging to the other side of an edge will be clustered into another cluster and will not be used to estimate the current pixel value.
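
The robust filter described above can be sketched in Python as a single k-means pass with three clusters, returning the mean of the cluster that contains the center pixel. Equation (48) is not reproduced in this text, so the initialization used here (minimum sample, center pixel value, maximum sample) is purely an assumption chosen for illustration, as is the function name `robust_filter_pixel`.

```python
def robust_filter_pixel(samples, center_index):
    """Estimate the center pixel from window samples with one k-means pass.

    Assumed initialization (not equation (48)): the three cluster centers
    start at the minimum sample, the center pixel value, and the maximum.
    """
    g_c = samples[center_index]
    centers = [min(samples), g_c, max(samples)]

    # Single assignment step: nearest center wins (lowest index on ties).
    labels = [
        min(range(3), key=lambda k: abs(s - centers[k])) for s in samples
    ]

    # Only the cluster holding the center pixel matters for the output.
    center_label = labels[center_index]
    members = [s for s, lab in zip(samples, labels) if lab == center_label]
    return sum(members) / len(members)


# 3 x 3 window, row by row: a dark region next to a bright edge.
window = [10, 12, 200, 11, 13, 198, 9, 12, 202]
estimate = robust_filter_pixel(window, center_index=4)
```

In this example the bright samples near 200 fall into a different cluster and never enter the average, which is the edge-preserving behavior the text describes.
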
- Equation (53), where the conditional mean is defined by equation (54) and |A(i, j; a)| is the number of pixels in the set A(i, j; a).
- Another advantage of the disclosed robust filter is its robustness. It removes the ringing artifact without degrading the major structures in the image. Consequently, it does not need any pre-steps to detect the regions with the ringing artifact or a carry to convey information throughout the decoder.
- the robust filter can be applied strictly as a post-processing.
- the disclosed techniques have been applied to the coding artifact removal of H.263+ compressed sequences.
- the application of the disclosed system into the decoder is quite simple.
- the IDCT for I Frames is replaced by the le-IDCT, and the robust filter is applied to every frame as post-processing.
- the modification of this embodiment is depicted in Fig. 14, wherein "le" 108 represents the part of the le-IDCT that precedes the IDCT, "RF" represents the robust filter, and "s" 112 is a switch.
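
The decoder modification can be summarized by a small Python sketch in which the switch routes only I Frames through the extra "le" stage before the ordinary IDCT, and the robust filter runs on every frame. All function and parameter names are hypothetical, the stages are passed in as callables, and motion compensation for P Frames is deliberately left out.

```python
def reconstruct_frame(coeffs, frame_type, le_stage, idct, robust_filter):
    """Reconstruct one frame from its dequantized DCT coefficients.

    le_stage, idct and robust_filter are caller-supplied callables standing
    in for the "le" block, the standard IDCT, and the "RF" block of Fig. 14.
    """
    if frame_type == "I":
        # Switch "s": only I Frames pass through the part preceding the IDCT.
        coeffs = le_stage(coeffs)
    frame = idct(coeffs)          # unchanged standard decoding step
    return robust_filter(frame)   # post-processing applied to every frame


# Stub stages used only to show the data flow.
double = lambda c: [2 * v for v in c]
plus_one = lambda c: [v + 1 for v in c]
minus_one = lambda f: [v - 1 for v in f]

i_frame = reconstruct_frame([1, 2], "I", double, plus_one, minus_one)
p_frame = reconstruct_frame([1, 2], "P", double, plus_one, minus_one)
```
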
- the test bench is based on H.263+ v3.0 released by The University of British Columbia.
- the set of test sequences consists of container, foreman, hall, and news in QCIF format ([144 x 176] frame size).
- the exception is the foreman sequence which is compressed at 48 kb/s.
- in Fig. 15, the frame-by-frame improvement in PSNR of the disclosed methods over the baseline H.263+ decoder is shown.
- the plots 120 and 122 show equation (55) for each frame in dB.
- For the foreman sequence the improvement is moderate.
- for the hall sequence, the improvement is consistent for every frame.
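
A per-frame PSNR improvement in dB, of the kind plotted above, can be computed as in this Python sketch. It uses the standard PSNR definition for 8-bit images and takes the improvement to be the difference between the two PSNR values; equation (55) is not reproduced in this text, so that reading, the function names, and the tiny example frames are assumptions.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Standard PSNR in dB between two equally sized grayscale frames."""
    squared_error = 0.0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for a, b in zip(ref_row, dec_row):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def psnr_gain(reference, baseline, improved):
    """PSNR of the improved decoder output minus PSNR of the baseline, in dB."""
    return psnr(reference, improved) - psnr(reference, baseline)


original = [[100, 100], [100, 100]]
baseline = [[104, 96], [104, 96]]   # mean squared error 16
improved = [[102, 98], [102, 98]]   # mean squared error 4
gain_db = psnr_gain(original, baseline, improved)
```

A positive value means the modified decoder is closer to the original frame than the baseline decoder.
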
- MSDS is a measure of blockishness that gauges the severity of the blocking artifact.
- the plots 130 and 132 show equation (56) for each frame.
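
MSDS is commonly expanded as the mean squared difference of slopes: at each block boundary, the slope across the boundary is compared with the average of the slopes just inside the two neighboring blocks. The one-dimensional Python sketch below follows that common definition; equation (56) is not reproduced in this text, so the exact form used in the disclosure may differ, and the function name and the 8-sample block size are assumptions.

```python
def msds_1d(signal, block_size=8):
    """Mean squared difference of slopes over the block boundaries of a row."""
    total = 0.0
    count = 0
    for b in range(block_size, len(signal), block_size):
        if b - 2 < 0 or b + 1 >= len(signal):
            continue  # not enough samples on one side of this boundary
        across = signal[b] - signal[b - 1]      # slope across the boundary
        left = signal[b - 1] - signal[b - 2]    # slope inside the left block
        right = signal[b + 1] - signal[b]       # slope inside the right block
        diff = across - 0.5 * (left + right)
        total += diff * diff
        count += 1
    return total / count if count else 0.0


ramp = list(range(16))          # smooth ramp: no blockiness
step = [0] * 8 + [4] * 8        # flat blocks with a jump at the boundary
```

A smooth ramp scores zero because the slope across the boundary matches the slopes on either side, while two flat blocks separated by a jump score the squared jump height.
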
- the image (a) 140 in Fig. 17 suffers from both the blocking artifact and mosquito noise.
- the blocking artifact is most severe in smooth areas of floors and walls, and the ringing artifact is prominent around edges and around the moving person in the center of the frame.
- the image (b) 142 in Fig. 17 shows effective removal of both artifacts.
- the edge maps (c) 144 for image (a) 140 and (d) 146 for image (b) 142 in Fig. 17 validate the removal of blockishness in the image.
- the deblocking filter of Annex J operates with some other advanced options. These options are not available in a baseline implementation. Both the encoder and the decoder have to be equipped with such advanced options. For fair comparison, the disclosed method is applied with the same options that the deblocking filter uses.
- Table III 164 of Fig. 21 and Table IV 166 of Fig. 22 show comparison of methods in PSNR and MSDS. All of the reported numbers are comparable.
- the image (b) 152 in Fig. 18 is the result of the disclosed method. The same options in Annex J are used except its deblocking filter. The result shows effective removal of the coding artifacts.
- the edge maps (c) 154 and (d) 156 corresponding to images (a) 150 and (b) 152 respectively in Fig. 18 validate the claim.
- the computational complexity of the disclosed algorithms in terms of run time is investigated.
- the algorithms are written in straightforward C and embedded into the decoder. They are tested on a 333 MHz dual Pentium PC with 512 MB RAM and a SCSI hard drive running Windows 2000. The purpose of this comparison is to demonstrate that these algorithms can be applied to H.263+ in real time without assembly coding or manual optimization effort. Only the speed optimization option of Microsoft Visual C++ 5.0 is enabled.
- the average run times of I and P frames for each sequence are summarized in Table V 168 of Fig. 23 and Table VI 170 of Fig. 24 respectively.
- the current implementation can decode both I Frames and P Frames at the rate of 20 Frames per second.
- the frame rates can be improved further by reducing the overhead due to data movements in the current implementation.
- the video sequence coding artifacts, namely the blocking artifact and mosquito noise, are suppressed significantly by incorporating the disclosed methods into the decoder.
- ROM or CD-ROM disks readable by a computer I/O attachment
- information alterably stored on writable storage media (e.g. floppy disks and hard drives)
- information conveyed to a computer through communication media, for example using baseband signaling or broadband signaling techniques, including carrier wave signaling techniques, such as over computer or telephone networks via a modem.
- while the illustrative embodiments may be implemented in computer software, the functions within the illustrative embodiments may alternatively be embodied in part or in whole using hardware components such as Application Specific Integrated Circuits, Field Programmable Gate Arrays, or other hardware, or in some combination of hardware components and software components.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001273510A AU2001273510A1 (en) | 2000-07-17 | 2001-07-17 | Generalized lapped biorthogonal transform embedded inverse discrete cosine transform and low bit rate video sequence coding artifact removal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US21860000P | 2000-07-17 | 2000-07-17 | |
US60/218,600 | 2000-07-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2002007438A1 true WO2002007438A1 (fr) | 2002-01-24 |
Family
ID=22815726
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2001/022368 WO2002007438A1 (fr) | Generalized lapped biorthogonal transform embedded inverse discrete cosine transform and low bit rate video sequence coding artifact removal | 2000-07-17 | 2001-07-17 |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2001273510A1 (fr) |
WO (1) | WO2002007438A1 (fr) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7136536B2 (en) | 2004-12-22 | 2006-11-14 | Telefonaktiebolaget L M Ericsson (Publ) | Adaptive filter |
CN100348049C (zh) * | 2002-03-27 | 2007-11-07 | Microsoft Corporation | System and method for progressively transforming and coding digital data |
US7305139B2 (en) | 2004-12-17 | 2007-12-04 | Microsoft Corporation | Reversible 2-dimensional pre-/post-filtering for lapped biorthogonal transform |
US7369709B2 (en) | 2003-09-07 | 2008-05-06 | Microsoft Corporation | Conditional lapped transform |
US7412102B2 (en) | 2003-09-07 | 2008-08-12 | Microsoft Corporation | Interlace frame lapped transform |
US7428342B2 (en) | 2004-12-17 | 2008-09-23 | Microsoft Corporation | Reversible overlap operator for efficient lossless data compression |
WO2008122232A1 (fr) * | 2007-04-10 | 2008-10-16 | Huawei Technologies Co., Ltd. | Method and apparatus for point-wise linear transformation of a digital image |
US7471726B2 (en) | 2003-07-15 | 2008-12-30 | Microsoft Corporation | Spatial-domain lapped transform in digital media compression |
US7471850B2 (en) | 2004-12-17 | 2008-12-30 | Microsoft Corporation | Reversible transform for lossy and lossless 2-D data compression |
EP2201778A2 (fr) * | 2007-09-26 | 2010-06-30 | Hewlett-Packard Company | Processing an input image to reduce compression-related artifacts |
US8036274B2 (en) | 2005-08-12 | 2011-10-11 | Microsoft Corporation | SIMD lapped transform-based digital media encoding/decoding |
US8238675B2 (en) | 2008-03-24 | 2012-08-07 | Microsoft Corporation | Spectral information recovery for compressed image restoration with nonlinear partial differential equation regularization |
US8447591B2 (en) | 2008-05-30 | 2013-05-21 | Microsoft Corporation | Factorization of overlapping tranforms into two block transforms |
US8897359B2 (en) | 2008-06-03 | 2014-11-25 | Microsoft Corporation | Adaptive quantization for enhancement layer video coding |
US9313509B2 (en) | 2003-07-18 | 2016-04-12 | Microsoft Technology Licensing, Llc | DC coefficient signaling at small quantization step sizes |
US9967561B2 (en) | 2006-05-05 | 2018-05-08 | Microsoft Technology Licensing, Llc | Flexible quantization |
US10554985B2 (en) | 2003-07-18 | 2020-02-04 | Microsoft Technology Licensing, Llc | DC coefficient signaling at small quantization step sizes |
CN115334314A (zh) * | 2022-10-14 | 2022-11-11 | Xinxiang University | Method for compressing and reconstructing high-definition, low-rank, high-dimensional television signal data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5838377A (en) * | 1996-12-20 | 1998-11-17 | Analog Devices, Inc. | Video compressed circuit using recursive wavelet filtering |
US5850482A (en) * | 1996-04-17 | 1998-12-15 | Mcdonnell Douglas Corporation | Error resilient method and apparatus for entropy coding |
US6101279A (en) * | 1997-06-05 | 2000-08-08 | Wisconsin Alumni Research Foundation | Image compression system using block transforms and tree-type coefficient truncation |
-
2001
- 2001-07-17 AU AU2001273510A patent/AU2001273510A1/en not_active Abandoned
- 2001-07-17 WO PCT/US2001/022368 patent/WO2002007438A1/fr active Application Filing
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100348049C (zh) * | 2002-03-27 | 2007-11-07 | Microsoft Corporation | System and method for progressively transforming and coding digital data |
US7471726B2 (en) | 2003-07-15 | 2008-12-30 | Microsoft Corporation | Spatial-domain lapped transform in digital media compression |
US9313509B2 (en) | 2003-07-18 | 2016-04-12 | Microsoft Technology Licensing, Llc | DC coefficient signaling at small quantization step sizes |
US10063863B2 (en) | 2003-07-18 | 2018-08-28 | Microsoft Technology Licensing, Llc | DC coefficient signaling at small quantization step sizes |
US10554985B2 (en) | 2003-07-18 | 2020-02-04 | Microsoft Technology Licensing, Llc | DC coefficient signaling at small quantization step sizes |
US10659793B2 (en) | 2003-07-18 | 2020-05-19 | Microsoft Technology Licensing, Llc | DC coefficient signaling at small quantization step sizes |
US7369709B2 (en) | 2003-09-07 | 2008-05-06 | Microsoft Corporation | Conditional lapped transform |
US7412102B2 (en) | 2003-09-07 | 2008-08-12 | Microsoft Corporation | Interlace frame lapped transform |
US7551789B2 (en) | 2004-12-17 | 2009-06-23 | Microsoft Corporation | Reversible overlap operator for efficient lossless data compression |
US7471850B2 (en) | 2004-12-17 | 2008-12-30 | Microsoft Corporation | Reversible transform for lossy and lossless 2-D data compression |
US7428342B2 (en) | 2004-12-17 | 2008-09-23 | Microsoft Corporation | Reversible overlap operator for efficient lossless data compression |
US7305139B2 (en) | 2004-12-17 | 2007-12-04 | Microsoft Corporation | Reversible 2-dimensional pre-/post-filtering for lapped biorthogonal transform |
US7136536B2 (en) | 2004-12-22 | 2006-11-14 | Telefonaktiebolaget L M Ericsson (Publ) | Adaptive filter |
US8036274B2 (en) | 2005-08-12 | 2011-10-11 | Microsoft Corporation | SIMD lapped transform-based digital media encoding/decoding |
US9967561B2 (en) | 2006-05-05 | 2018-05-08 | Microsoft Technology Licensing, Llc | Flexible quantization |
WO2008122232A1 (fr) * | 2007-04-10 | 2008-10-16 | Huawei Technologies Co., Ltd. | Method and apparatus for point-wise linear transformation of a digital image |
CN101874409B (zh) * | 2007-09-26 | 2012-09-19 | Hewlett-Packard Development Company | Processing an input image to reduce compression-related artifacts |
EP2201778A4 (fr) * | 2007-09-26 | 2011-10-12 | Hewlett Packard Co | Processing an input image to reduce compression-related artifacts |
EP2201778A2 (fr) * | 2007-09-26 | 2010-06-30 | Hewlett-Packard Company | Processing an input image to reduce compression-related artifacts |
US8238675B2 (en) | 2008-03-24 | 2012-08-07 | Microsoft Corporation | Spectral information recovery for compressed image restoration with nonlinear partial differential equation regularization |
US8447591B2 (en) | 2008-05-30 | 2013-05-21 | Microsoft Corporation | Factorization of overlapping tranforms into two block transforms |
US8897359B2 (en) | 2008-06-03 | 2014-11-25 | Microsoft Corporation | Adaptive quantization for enhancement layer video coding |
US9185418B2 (en) | 2008-06-03 | 2015-11-10 | Microsoft Technology Licensing, Llc | Adaptive quantization for enhancement layer video coding |
US9571840B2 (en) | 2008-06-03 | 2017-02-14 | Microsoft Technology Licensing, Llc | Adaptive quantization for enhancement layer video coding |
US10306227B2 (en) | 2008-06-03 | 2019-05-28 | Microsoft Technology Licensing, Llc | Adaptive quantization for enhancement layer video coding |
CN115334314A (zh) * | 2022-10-14 | 2022-11-11 | Xinxiang University | Method for compressing and reconstructing high-definition, low-rank, high-dimensional television signal data |
Also Published As
Publication number | Publication date |
---|---|
AU2001273510A1 (en) | 2002-01-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6983079B2 (en) | Reducing blocking and ringing artifacts in low-bit-rate coding | |
US6360024B1 (en) | Method and apparatus for removing noise in still and moving pictures | |
WO2002007438A1 (fr) | Generalized lapped biorthogonal transform embedded inverse discrete cosine transform and low bit rate video sequence coding artifact removal | |
Triantafyllidis et al. | Blocking artifact detection and reduction in compressed data | |
Yang et al. | Removal of compression artifacts using projections onto convex sets and line process modeling | |
Shen et al. | Review of postprocessing techniques for compression artifact removal | |
US9786066B2 (en) | Image compression and decompression | |
US7403665B2 (en) | Deblocking method and apparatus using edge flow-directed filter and curvelet transform | |
Zhu et al. | Second-order derivative-based smoothness measure for error concealment in DCT-based codecs | |
EP1882236B1 (fr) | Procede et dispositif de filtrage des bruits dans le codage video | |
US6760487B1 (en) | Estimated spectrum adaptive postfilter and the iterative prepost filtering algirighms | |
US20050013359A1 (en) | Spatial-domain lapped transform in digital media compression | |
Shirani et al. | Reconstruction of baseline JPEG coded images in error prone environments | |
CA2227495C (fr) | Codeur video utilisant la transposition de pixels | |
US20050196053A1 (en) | Method and apparatus for error concealment for JPEG 2000 compressed images and data block-based video data | |
Tsaig et al. | Variable projection for near-optimal filtering in low bit-rate block coders | |
Vo et al. | Quality enhancement for motion JPEG using temporal redundancies | |
Nakajima et al. | A pel adaptive reduction of coding artifacts for MPEG video signals | |
Jeong et al. | A practical projection-based postprocessing of block-coded images with fast convergence rate | |
US7065212B1 (en) | Data hiding in communication | |
Guleryuz | A nonlinear loop filter for quantization noise removal in hybrid video compression | |
JPH10229559A (ja) | ブロック化による影響を軽減する方法およびフィルタ | |
Jeon et al. | Blocking artifacts reduction in image coding based on minimum block boundary discontinuity | |
Kapinaiah et al. | Block DCT to wavelet transcoding in transform domain | |
Yang et al. | Low bit rate video sequence coding artifact removal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: JP |