KR20130057402A - Method and apparatus for multi-view color and depth videos decoding - Google Patents
- Publication number: KR20130057402A
- Application number: KR1020120133063A
- Authority
- KR
- South Korea
- Prior art keywords
- depth image
- current
- pixel
- quantization
- residual signal
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/128—Adjusting depth or disparity
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/24—Systems for the transmission of television signals using pulse code modulation
Abstract
Disclosed are a multi-view video decoding method and apparatus. The multi-view video decoding method includes receiving and entropy decoding quantization information of a current depth image, and obtaining a quantized residual signal of the current depth image based on the quantization information. The quantization information includes flag information indicating whether spatial axis quantization is performed on the current depth image.
Description
The present invention relates to a method for encoding / decoding an image, and more particularly, to a method and apparatus for encoding / decoding a multiview video including a color image and a depth image.
ITU-T's Video Coding Experts Group (VCEG) and ISO/IEC's Moving Picture Experts Group (MPEG) have formed the Joint Collaborative Team on Video Coding (JCT-VC), and standardization of HEVC (High Efficiency Video Coding), the next-generation video compression standard after H.264/AVC, is in progress. Meanwhile, for efficient compression of multiview images and synthesis of virtual view images, the MPEG 3DV group is carrying out compression standardization of multiview color/depth images based on the existing H.264/AVC as well as on the HEVC of JCT-VC.
In order to enable the synthesis of the virtual view image using the depth image, the 3DV group is in the process of standardizing a technology for compressing and transmitting not only a multiview color image but also depth image information. Accordingly, research on high-efficiency compression techniques considering the characteristics of depth image is expected to be actively conducted.
The present invention provides a method and apparatus for multiview video encoding / decoding that can improve encoding / decoding efficiency of a multiview video image.
The present invention provides a method and apparatus for encoding / decoding a depth image capable of improving the accuracy of the depth image.
The present invention provides a quantization method and apparatus capable of improving the accuracy of a depth image.
The present invention provides a filtering method and apparatus capable of preserving edge regions of a depth image and improving image quality.
According to an aspect of the present invention, a multi-view video decoding method is provided. The method includes receiving and entropy decoding quantization information of a current depth image and acquiring a quantized residual signal of the current depth image based on the quantization information, wherein the quantization information includes flag information indicating whether spatial axis quantization is performed on the current depth image.
When spatial axis quantization is performed on the current depth image, the quantization information includes difference value information on the quantized residual signal of the current depth image, and the difference value for the quantized residual signal may be a difference between the quantized residual signal of a current pixel in the current depth image and the quantized residual signal of a neighboring pixel positioned around the current pixel.
The acquiring of the quantized residual signal may include predicting a residual signal from the neighboring pixel and adding the predicted residual signal and the difference value for the quantized residual signal of the current depth image.
The neighboring pixel may be an upper pixel located above the current pixel when the current pixel is located in the first column of the current depth image, and may be a left pixel located to the left of the current pixel when the current pixel is located in a region other than the first column of the current depth image.
The flag information indicating whether to perform the spatial axis quantization may be information encoded and transmitted on a transform unit (TU) basis.
According to another aspect of the present invention, a multi-view video decoding apparatus is provided. The apparatus includes an entropy decoding unit that receives and entropy decodes quantization information about a current depth image, and an inverse quantization unit that obtains a quantized residual signal of the current depth image based on the quantization information. The quantization information includes flag information indicating whether spatial axis quantization is performed on the current depth image.
When spatial axis quantization is performed on the current depth image, the quantization information includes difference value information on the quantized residual signal of the current depth image, and the difference value for the quantized residual signal may be a difference between the quantized residual signal of a current pixel in the current depth image and the quantized residual signal of a neighboring pixel positioned around the current pixel.
The inverse quantization unit may predict a residual signal from the neighboring pixel, and obtain the quantized residual signal of the current depth image by adding the predicted residual signal and the difference value for the quantized residual signal of the current depth image.
The neighboring pixel may be an upper pixel located above the current pixel when the current pixel is located in the first column of the current depth image, and may be a left pixel located to the left of the current pixel when the current pixel is located in a region other than the first column of the current depth image.
The flag information indicating whether to perform the spatial axis quantization may be information encoded and transmitted on a transform unit (TU) basis.
According to another aspect of the present invention, a multi-view video decoding method is provided. The method includes receiving and entropy decoding a bitstream and performing filtering using an anisotropic median filter on a current depth image reconstructed based on the entropy decoded signal.
The performing of the filtering using the anisotropic median filter may include determining whether a current pixel area in the current depth image is an edge area; when the current pixel area is an edge area, classifying the pixels in the current pixel area into a plurality of groups based on a filtering target pixel value in the current pixel area; and determining the filtering target pixel value as one of the median values calculated from the classified groups, based on a difference between each of the median values and the filtering target pixel value.
In the determining of whether the current pixel area is an edge area, a difference between the pixel values in the current pixel area and a median value calculated from neighboring pixels positioned around the current pixel area may be compared with a preset threshold.
In the classifying into the plurality of groups, the pixels in the current pixel area having a value less than or equal to the filtering target pixel value may be classified into a first group, and the pixels in the current pixel area having a value greater than or equal to the filtering target pixel value may be classified into a second group.
In the determining of the filtering target pixel value, of a first median value calculated from the first group and a second median value calculated from the second group, the median value having the smaller difference from the filtering target pixel value may be determined as the filtering target pixel value.
The method may further include storing the current depth image filtered using the anisotropic median filter in an image buffer.
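The classification and selection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the window is a flat list of pixel values taken from the current pixel area, the function name is illustrative only, and ties between the two group medians are broken toward the first group.

```python
import statistics

def anisotropic_median(window, target):
    """Sketch of the described filtering step for one pixel: split the
    window pixels into a group <= the filtering target value and a group
    >= it, then return the group median closer to the target value."""
    first_group = [p for p in window if p <= target]
    second_group = [p for p in window if p >= target]
    m1 = statistics.median(first_group)
    m2 = statistics.median(second_group)
    # choose the median whose difference from the target pixel is smaller
    return m1 if abs(m1 - target) <= abs(m2 - target) else m2
```

On a window straddling a depth edge, e.g. `[10, 12, 11, 200, 210, 205, 12, 11, 208]`, a target pixel of 11 stays on the near side and a target pixel of 205 is pulled only toward the far-side median, so the edge is not blurred across.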
According to another aspect of the present invention, a multi-view video decoding apparatus is provided. The apparatus includes an entropy decoding unit for receiving and entropy decoding a bitstream and a filter unit for performing filtering using an anisotropic median filter on the current depth image reconstructed based on the entropy decoded signal.
The image quality of the depth image may be improved by applying an anisotropic median filter to address the image degradation that may occur in the edge region of the reconstructed depth image. In addition, by applying the spatial axis quantization method to the depth image, it is possible to reduce errors that may occur in regions including edges, preserve the edge regions, and improve rate-distortion performance.
FIG. 1 is a block diagram schematically illustrating an apparatus for decoding a multiview video image according to an embodiment of the present invention.
FIG. 2 is a block diagram schematically illustrating an apparatus for decoding a multiview depth image according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an encoding structure for inter-view prediction of a multiview image to which the present invention is applied.
FIG. 4 is a diagram illustrating an example of a reference structure for inter-view prediction of the view V 2 shown in FIG. 3.
FIG. 5 is a diagram illustrating an example of a reference structure for inter-view prediction of the view V 1 shown in FIG. 3.
FIG. 6 is a flowchart schematically illustrating a spatial axis quantization method according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating a method of inverse quantization in a spatial domain according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating a process of obtaining a quantized residual signal by applying a pixel-based prediction method according to an embodiment of the present invention.
FIG. 9 is a flowchart illustrating a method of filtering by applying an anisotropic median filter according to an embodiment of the present invention.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the invention is not intended to be limited to the particular embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for like elements in describing each drawing.
The terms first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as a second component, and similarly, the second component may also be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of a plurality of related listed items.
It is to be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that there are no intervening elements.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting of the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, the terms "comprise" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification is present, and it is to be understood that they do not exclude in advance the possibility of the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and duplicate descriptions of the same components are omitted.
High Efficiency Video Coding (HEVC)-based 3D video coding technology covers all processes for acquiring, processing, transmitting, and playing back 3D video including depth images of each view as well as multi-view video images. The depth image is an image representing 3D distance information of an object existing in the image, and a pixel value of the depth image gives the depth information of the corresponding pixel. Since the accuracy of the depth image determines the quality of the virtual intermediate-view image synthesized using the depth image, it is important to generate an accurate depth image.
Therefore, the 3D video decoder according to the present invention is designed to decode not only a multiview video image but also a depth image. In addition, the 3D video encoder is composed of two layers to encode the multi-view video image and the depth image, respectively, and each layer can perform coding using all the tools of HEVC as well as an inter-view prediction method based on inter-view correlation. The 3D video decoder according to the present invention may perform the decoding process by a procedure opposite to that of the 3D video encoder.
Hereinafter, a 3D video decoder according to the present invention will be described. The 3D video decoder according to the present invention may be composed of two layers, a decoder of a multiview video image and a decoder of a multiview depth image. In this case, the multiview video image may be a color image. Therefore, the 3D video decoder according to the present invention may be used to decode a 2 viewpoint / 3 viewpoint video image and a 2 viewpoint / 3 viewpoint depth image.
FIG. 1 is a block diagram schematically illustrating an apparatus for decoding a multiview video image according to an embodiment of the present invention.
Referring to FIG. 1, the apparatus 100 for decoding a multiview video image includes an
The bitstreams V 0 , V 1 , and V 2 of the encoded image may be input to the apparatus 100 for decoding a multiview video image. Each of the plurality of bitstreams V 0 , V 1 , and V 2 may be an image obtained at a different view. For example, the bitstream V 0 may be a base view image, and the base view is a view to which an independently encoded image belongs. In addition, the bitstreams V 1 and V 2 may be extended view images, and the extended view is a view to which an image encoded using information of the base view belongs.
The bitstreams V 0 , V 1 , and V 2 input to the decoding apparatus 100 of a multiview video image may be decoded by a procedure opposite to that of the 3D video encoder, and the 3D video encoder may encode the multiview video image using, for example, HEVC technology.
The
The
The
The
In the intra mode, the
The prediction block may be added to the residual block to generate a reconstruction block. The reconstruction block may be provided to the
The
In the case of the base view image, such as the bitstream V 0 , inter-view prediction is not performed and inter prediction / intra prediction is performed. Since the inter-view prediction is performed in the case of the extended view image such as the bitstreams V 1 and V 2 , image information of another view may be referred to. Therefore, in the case of the extended view, prediction may be performed by referring to the reference picture from the image buffer unit at another view. This reference structure for inter-view prediction will be described in more detail with reference to FIGS. 3 to 5.
As described above, the apparatus for decoding a multiview video image according to the embodiment of the present invention illustrated in FIG. 1 is shown as performing decoding of one base view image and decoding of two extended view images; however, decoding may also be performed on two or more extended view images.
FIG. 2 is a block diagram schematically illustrating an apparatus for decoding a multiview depth image according to an embodiment of the present invention.
Referring to FIG. 2, the apparatus 200 for decoding a multiview depth image includes an
The bitstreams D 0 , D 1 , and D 2 of the encoded depth image may be input to the apparatus 200 for decoding a multiview depth image. Each of the plurality of bitstreams D 0 , D 1 , and D 2 may be depth images obtained at different views. For example, the bitstream D 0 may be a base view image, and the base view is a view to which an image to be encoded independently belongs. In addition, the bitstreams D 1 and D 2 may be extended view images, and the extended view is a view to which an image encoded using information of the base view belongs.
The bitstreams D 0 , D 1 , and D 2 input to the decoding apparatus 200 of the multi-view depth image may be decoded by a procedure opposite to that of encoding the depth image in the 3D video encoder. The multiview depth image may be encoded by using the HEVC technique.
The
Also, the
That is, the quantization related information includes flag information indicating whether spatial axis quantization is performed. In addition, when spatial axis quantization is performed, the quantization related information may include difference value information about a quantized residual signal.
The
The
The
In the intra mode, the
The prediction block may be added to the residual block to generate a reconstruction block. The reconstruction block may be provided to the
For example, when the encoder performs encoding according to HEVC, the filter may be an in-loop filter. In addition, when encoding is performed according to HEVC, a deblocking filter may be applied to remove blocking artifacts on a coding unit (CU) or prediction unit (PU) basis. If the encoder performs spatial axis quantization, blocking artifacts do not occur in the spatial domain. Therefore, in the present invention, a filter capable of removing noise while maintaining edge components, for example, the generally known bilateral filter, can be used.
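As a rough illustration of the edge-preserving filtering mentioned here, the following is a 1-D sketch of the generally known bilateral filter (not the patent's own filter); the window radius and sigma values are arbitrary illustrative choices.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_s=1.0, sigma_r=10.0):
    """Weights each neighbor by spatial distance AND value difference,
    so smooth regions are denoised while large jumps (edges) survive."""
    out = []
    n = len(signal)
    for i in range(n):
        wsum, vsum = 0.0, 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
            w_range = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_r ** 2))
            w = w_space * w_range
            wsum += w
            vsum += w * signal[j]
        out.append(vsum / wsum)
    return out
```

Applied to a step signal such as `[0, 0, 0, 100, 100, 100]`, the large value difference across the step drives the range weight toward zero, so the edge stays sharp while noise within each flat region would be averaged out.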
In addition, the apparatus 200 for decoding a multiview depth image according to an embodiment of the present invention may use an anisotropic median filter to improve the accuracy of the edge region in the reconstructed depth image that has passed through the in-loop filter.
The
In the case of the base view image such as the bitstream D 0 , inter-view prediction is not performed and inter prediction / intra prediction is performed. Since the inter-view prediction is performed in the case of the extended view image such as the bitstreams D 1 and D 2 , image information of another view may be referred to. Therefore, in the case of the extended view, prediction may be performed by referring to the reference picture from the image buffer unit at another view. This reference structure for inter-view prediction will be described in more detail with reference to FIGS. 3 to 5.
The decoding apparatus of the multi-view depth image according to the embodiment of the present invention illustrated in FIG. 2 is shown as performing decoding of one base view depth image and decoding of two extended view depth images; however, two or more extended view depth images may also be decoded.
FIG. 3 is a diagram illustrating an encoding structure for inter-view prediction of a multiview image to which the present invention is applied.
Referring to FIG. 3, the three views V 0 , V 1 , and V 2 may be different views. The view V 0 is a view that is encoded without prediction from another view and may be a base view or an I view. The views V 1 and V 2 are extended views that are predictively encoded with reference to other views; the view V 2 is a P view (Predictive view) that is predictively encoded with reference to only a single view, and the view V 1 is a B view (Interpolative view) that is predictively encoded with reference to both of the other two views.
Each picture is divided into an I picture (Intra picture), a P picture (Predictive picture), and a B picture (Interpolative picture) according to its encoding type. The I picture encodes the image itself without inter-picture prediction, the P picture is predictively encoded using reference pictures only in the forward direction, and the B picture is predictively encoded using reference pictures in both the forward and backward directions.
As shown in FIG. 3, the views (V 1 , V 2 ) other than the base view V 0 may be encoded by cross-referencing images obtained at the different views (V 0 , V 1 , V 2 ), and the encoded images may be transmitted to the decoders illustrated in FIGS. 1 and 2. At this time, the view V 0 , which is the base view transmitted to the decoder, does not perform inter-view prediction but only inter or intra prediction between images or within an image. The views V 1 and V 2 , which are extended views, perform inter-view prediction using a reference picture stored in the image buffer unit according to the reference structure shown in FIG. 3 to decode the picture. Here, the arrows indicate reference relationships between the images.
FIG. 4 is a diagram illustrating an example of a reference structure for inter-view prediction of the view V 2 shown in FIG. 3.
Referring to FIG. 4, the view V 2 may perform inter-view prediction with reference to an image acquired at the view V 0 . For example, the image B 6 of the view V 2 may perform inter-view prediction based on a reference picture list 0 for forward prediction and a reference picture list 1 for backward prediction, by referring to the image of the view V 0 included in the reference picture lists 0 and 1. That is, the reference picture lists 0 and 1 include the picture B 6 of the view V 0 , and the image B 6 of the view V 2 may be predicted with reference to it.
In this case, when the reference pictures in the reference picture list 1 are insufficient, the reference pictures included in the reference picture list 0 may be copied and used by employing the generalized P and B (GPB) concept of HEVC.
FIG. 5 is a diagram illustrating an example of a reference structure for inter-view prediction of the view V 1 shown in FIG. 3.
Referring to FIG. 5, the view V 1 may perform inter-view prediction with reference to an image acquired at the view V 0 and an image acquired at the view V 2 . For example, by placing both the image B 6 of the view V 0 and the image B 6 of the view V 2 in the reference picture lists 0 and 1, the picture B 6 of the view V 1 may be predicted with reference to them.
On the other hand, since the decoded depth image is used for synthesizing the virtual view, when the accuracy of the depth image is improved, the quality of the synthesized virtual view image may also be improved. Since the human visual system mainly recognizes three-dimensional depth through binocular parallax around sharp edges, distortion in the edge region may reduce the image quality of the three-dimensional video image and may degrade the three-dimensional effect. Therefore, the subjective image quality of the virtual viewpoint image may be improved by minimizing the edge region distortion of the depth image.
When encoding the image, an error occurs in the entire region when the residual signal is converted and quantized in the frequency domain, thereby reducing the quality of the depth image. Accordingly, the present invention provides a spatial axis quantization method that can reduce the error caused by the frequency axis quantization and preserve the edge region in the depth image.
FIG. 6 is a flowchart schematically illustrating a spatial axis quantization method according to an embodiment of the present invention. The method of FIG. 6 may be performed by a 3D video encoder (hereinafter, referred to as an 'encoder').
Referring to FIG. 6, the encoder obtains a residual signal by performing a prediction process on a current depth image (S600). The current depth image may be predicted on a coding unit or prediction unit basis, and the residual signal is the difference between the prediction target block in the current depth image and the predicted block.
The encoder determines whether to spatially quantize the residual signal (S610). That is, the encoder determines whether to perform frequency axis quantization by transforming the residual signal or to perform spatial axis quantization without transforming the residual signal. This can be adaptively selected according to which method has higher efficiency in the process of performing rate-distortion optimization (RDO) on a transform unit basis for the current depth image.
Whether to perform such spatial axis quantization is determined based on the transform unit in the current depth image, and the information about the determined result may be encoded using a flag and then signaled to the decoder. For example, one bit (eg, spatial_quantization_enable_flag) for each transformation unit may be used to indicate whether to perform spatial axis quantization.
If it is determined in step S610 that spatial axis quantization is to be performed, the encoder generates a quantized residual signal by performing spatial axis quantization on the residual signal (S620). The spatial axis quantization may be applied in synchronization with a transform unit split flag on a transform unit basis. Further, the number of quantization representation levels and their representative values in the spatial domain are determined according to the absolute error amount in the frequency domain for each quantization parameter. In this case, the representative value in the spatial domain may be set based on the variance of the error generated in the reconstructed depth image. That is, the spatial axis quantizer may be designed to match the amount of errors generated by the spatial axis quantization to the amount of errors generated by the frequency axis quantization. Since the quantizer for each quantization parameter is defined identically in both the encoder and the decoder, the encoder does not need to transmit information about the quantizer to the decoder.
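A toy sketch of the spatial-domain quantizer described in step S620 follows. The representative level table here is purely illustrative; per the text, the real table would be derived per quantization parameter so that the spatial-domain error matches the frequency-domain error, and it would be known identically to both encoder and decoder.

```python
def spatial_quantize(residual_block, levels):
    """Map each residual sample to the index of the nearest
    representative level (illustrative spatial-axis quantization)."""
    def nearest(v):
        return min(range(len(levels)), key=lambda k: abs(levels[k] - v))
    return [[nearest(v) for v in row] for row in residual_block]

def spatial_dequantize(index_block, levels):
    """Inverse quantization: replace each index by its representative value."""
    return [[levels[k] for k in row] for row in index_block]
```

With levels `[-8, -2, 0, 2, 8]`, a residual sample of 3 is reconstructed as 2; since no transform is involved, the quantization error stays local to each pixel rather than spreading across the block.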
The encoder generates a difference value for the residual signal in units of pixels based on the quantized residual signal (S630). The difference value for the quantized residual signal is a difference value between the quantized residual signal of the current pixel in the current depth image and the quantized residual signal of neighboring pixels positioned around the current pixel.
For example, if the current pixel is located in the first column of the current depth image, the pixel located above the current pixel may be determined as the neighboring pixel. If the current pixel is located in a region other than the first column of the current depth image, the pixel located to the left of the current pixel may be determined as the neighboring pixel.
Accordingly, the encoder may calculate a difference value for the residual signal quantized in units of pixels using the current pixel and the neighboring pixels with respect to the current depth image, and may encode the same and transmit the encoded value to the decoder.
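The pixel-based differencing of step S630 with the neighbor rule above can be sketched as follows; treating the top-left sample as predicted from 0 is an assumption made for illustration, since the text does not specify that case.

```python
def residual_differences(quantized):
    """Encoder-side sketch: for each quantized residual sample, subtract
    the prediction -- the upper pixel in the first column, the left pixel
    elsewhere -- and emit the per-pixel difference values."""
    h, w = len(quantized), len(quantized[0])
    diffs = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x == 0:
                pred = quantized[y - 1][0] if y > 0 else 0  # first column: upper pixel
            else:
                pred = quantized[y][x - 1]                  # elsewhere: left pixel
            diffs[y][x] = quantized[y][x] - pred
    return diffs
```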
FIG. 7 is a flowchart illustrating a method of inverse quantization in a spatial domain according to an embodiment of the present invention. The method of FIG. 7 may be performed by the decoder of the multi-view depth image illustrated in FIG. 2 (hereinafter, referred to as the 'decoder').
Referring to FIG. 7, the decoder entropy decodes the received bitstream to obtain quantization related information (S700). The quantization related information includes flag information indicating whether spatial axis quantization is performed on the current depth image. In the case where spatial axis quantization is performed in the 3D video encoder, difference information on the quantized residual signal is included together with the flag information.
The decoder determines whether spatial axis quantization is performed on the current depth image based on the quantization related information (S710). That is, the decoder can find out the quantization method performed by the encoder using flag information indicating whether to perform spatial axis quantization. For example, it may be determined whether to perform spatial axis quantization based on the value "0" or "1" of the flag spatial_quantization_enable_flag.
As a result of the determination in step S710, when it is determined that the encoder transformed the residual signal into the frequency domain and quantized it, the decoder performs inverse quantization based on the entropy decoded transform coefficients, and inversely transforms the dequantized transform coefficients to obtain a residual signal (S720).
If it is determined in step S710 that the encoder quantized the residual signal in the spatial domain without transformation, the decoder obtains the quantized residual signal by performing inverse quantization based on the quantization information, that is, the difference value information for the quantized residual signal (S730).
In this case, the quantized residual signal may have redundancy, unlike the quantization coefficients in the transformed frequency domain. Therefore, the quantized residual signal q' according to the present invention may be determined by Equation 1 below. In addition, since the difference information about the quantized residual signal is a value calculated on a pixel basis, the quantized residual signal q' may be calculated for each pixel in the current depth image.

[Equation 1]

q' = p + q

where q is the difference value for the quantized residual signal obtained by entropy decoding, and p is the residual signal predicted from the neighboring pixel.
As described above, the difference value q for the quantized residual signal is the difference between the quantized residual signal of the current pixel in the current depth image and the quantized residual signal of the neighboring pixel positioned around the current pixel. Therefore, according to Equation 1, the decoder predicts the residual signal p from the neighboring pixel, and adds to it the difference value q for the quantized residual signal transmitted from the encoder, thereby obtaining the quantized residual signal q' for the current pixel.
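The per-pixel reconstruction of Equation 1 can be sketched as follows: scan the block, form the prediction p from the already reconstructed neighbor (upper pixel in the first column, left pixel elsewhere), and add the entropy-decoded difference q. The zero prediction for the top-left sample is an illustrative assumption not specified in the text.

```python
def reconstruct_quantized_residuals(diffs):
    """Decoder-side sketch of Equation 1 (q' = p + q) applied per pixel."""
    h, w = len(diffs), len(diffs[0])
    q = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x == 0:
                p = q[y - 1][0] if y > 0 else 0  # first column: upper pixel
            else:
                p = q[y][x - 1]                  # elsewhere: left pixel
            q[y][x] = p + diffs[y][x]            # Equation 1: q' = p + q
    return q
```

For example, a difference block `[[4, 2], [1, 0]]` reconstructs to the quantized residuals `[[4, 6], [5, 5]]`.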
FIG. 8 is a diagram illustrating a process of obtaining a quantized residual signal by applying a pixel-based prediction method according to an embodiment of the present invention.
Referring to FIG. 8, the quantized residual signal q' of each pixel in the current depth image may be obtained by predicting the residual signal from a neighboring pixel that has already been reconstructed and adding the transmitted difference value to the prediction.

For example, if the current pixel is located in the first column of the current depth image, the residual signal may be predicted from the upper pixel located on top of the current pixel; if the current pixel is located in a region other than the first column, the residual signal may be predicted from the left pixel located to the left of the current pixel.
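The pixel-based prediction described above can be sketched as a raster-order loop. The function below is an illustrative reconstruction with hypothetical names; it assumes a zero predictor for the top-left pixel, which the specification does not state explicitly.

```python
# Illustrative sketch of spatial-axis dequantization with pixel-based
# prediction (Equation 1: q' = p + q). Not the normative implementation.

def reconstruct_quantized_residual(diff, width, height):
    """diff[y][x] holds the entropy-decoded difference value q per pixel.

    Returns q'[y][x] = p + q, where the predictor p is the quantized
    residual of the upper pixel for the first column and of the left
    pixel elsewhere. The very first pixel uses p = 0 (an assumption).
    """
    qr = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if x == 0:
                p = qr[y - 1][0] if y > 0 else 0  # first column: upper pixel
            else:
                p = qr[y][x - 1]                  # otherwise: left pixel
            qr[y][x] = p + diff[y][x]
    return qr
```

Because each prediction uses only already-reconstructed quantized residuals, the decoder can recover q' for every pixel from the transmitted difference values alone.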
As described above, the spatial axis quantization method according to an embodiment of the present invention can improve rate-distortion performance when applied to an area including an edge, and can reduce the error caused by quantization in the frequency domain. Accordingly, the image quality of the depth image may be improved.
In addition, in order to improve the quality of the depth image, the present invention provides a method for removing the blurring that may occur in a depth image reconstructed by the decoder and the ringing artifacts that occur in edge regions of the image.
FIG. 9 is a flowchart illustrating a filtering method using an anisotropic median filter according to an embodiment of the present invention. The method of FIG. 9 may be performed by the decoder for the multi-view depth image illustrated in FIG. 2 (hereinafter referred to as the 'decoder'). In addition, the anisotropic median filter may be applied to the reconstructed depth image as an in-loop filter.

As described above, the anisotropic median filter may remove noise in a specific direction by replacing the pixels in a region with the median value of the pixels in that region. For example, the decoder may generate a reconstructed depth image by adding the residual signal obtained based on the above-described spatial axis quantization to the prediction value obtained by predicting the depth image. In this case, filtering may be performed by applying the anisotropic median filter to the reconstructed depth image.
Referring to FIG. 9, the decoder determines whether a current pixel area in the reconstructed depth image is an edge area (S900). The current pixel area refers to the area of the reconstructed depth image to which the anisotropic median filter is to be applied.
In this case, whether the current pixel area is an edge area may be determined by comparing a preset threshold with a value based on the differences between the pixel values in the current pixel area and the median value calculated from the surrounding pixels located around the current pixel area. For example, a decision such as Equation 2 below may be used.

[Equation 2]

S_Dev = Σ_i | w_i − m |

Here, m is the median value of the surrounding pixels located around the current pixel area, and w_i is the reconstructed pixel value at position i in the current pixel area. When the value S_Dev calculated by Equation 2 is greater than the preset threshold, the current pixel area may be determined to be an edge area. In this case, the anisotropic median filter may be applied to the current pixel area.
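The edge decision of step S900 can be illustrated with a short sketch. The function and variable names here are hypothetical, and the sum-of-absolute-differences form of S_Dev is a reconstruction from the surrounding description, not a verbatim copy of the patent's Equation 2.

```python
def is_edge_region(window, neighbor_median, threshold):
    """Sum the absolute differences between each reconstructed pixel value
    w_i in the current pixel area and the median of the surrounding
    pixels; the area is treated as an edge when S_Dev exceeds the
    preset threshold."""
    s_dev = sum(abs(w - neighbor_median) for w in window)
    return s_dev > threshold
```

A flat region yields a small S_Dev and is left unfiltered, while a region straddling a depth discontinuity yields a large S_Dev and is passed to the median filtering steps.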
If it is determined that the current pixel area is an edge area, the decoder classifies the pixels in the current pixel area into a plurality of groups based on the value of the pixel to which the anisotropic median filter is to be applied (hereinafter referred to as the 'filtering target pixel value') (S910). In this case, the median values of the pixels included in the classified groups may be used as replacement values for the filtering target pixel.

For example, as shown in Equation 3 below, the pixels in the current pixel area may be classified into two groups: pixels in the current pixel area having a value less than or equal to the filtering target pixel value are classified into a first group R_H, and pixels in the current pixel area having a value greater than or equal to the filtering target pixel value are classified into a second group R_L.

[Equation 3]

R_H = { w_i | w_i ≤ w_cur },  R_L = { w_i | w_i ≥ w_cur }
Here, w i is a reconstructed pixel value at position i in the current pixel area, and w cur is a pixel value of the pixel to be filtered.
The decoder determines the filtering target pixel value based on the median values of the classified pixel groups in the current pixel area (S920). That is, the filtering target pixel value is determined based on the difference between the filtering target pixel value and each of the median values calculated from the classified groups, and is determined as one of those median values.
For example, the process of determining the filtering target pixel value is shown in Equation 4 below.

[Equation 4]

w_cur = med(R_H), if |w_cur − med(R_H)| ≤ |w_cur − med(R_L)|
w_cur = med(R_L), otherwise

Here, med is a function that outputs the median value of the input pixel values, and w_cur is the pixel value of the pixel to be filtered.

As shown in Equation 4, when the filtering target pixel value is closer to the first median value med(R_H) than to the second median value med(R_L), the filtering target pixel value is replaced with the first median value med(R_H); conversely, the anisotropic median filter may be applied to the edge region of the current depth image by replacing the filtering target pixel value with the second median value med(R_L).
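Steps S910 and S920 can be sketched together as follows. This is an illustrative reading of Equations 3 and 4 with hypothetical names, not the patent's normative implementation; the group naming follows the text, where R_H collects values less than or equal to the filtering target value.

```python
import statistics

def anisotropic_median_filter_pixel(window, w_cur):
    """Classify pixels in the current pixel area against the filtering
    target value w_cur (Equation 3) and replace w_cur with the closer
    of the two group medians (Equation 4)."""
    r_h = [w for w in window if w <= w_cur]  # first group R_H
    r_l = [w for w in window if w >= w_cur]  # second group R_L
    med_h = statistics.median(r_h) if r_h else w_cur
    med_l = statistics.median(r_l) if r_l else w_cur
    # Pick the median closer to the filtering target pixel value.
    return med_h if abs(w_cur - med_h) <= abs(w_cur - med_l) else med_l
```

On a window straddling a depth edge, the filtered pixel snaps to the median of the side it already belongs to, which is what suppresses ringing without blurring the edge itself.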
The current depth image filtered by applying the above-described anisotropic median filter may be stored in an image buffer and then used as a reference image. In addition, since the anisotropic median filter uses the surrounding pixels of each pixel, no additional information for the anisotropic median filter needs to be signaled.
Hereinafter, a high level syntax to which the above-described technique of the present invention is applied is shown.
Table 1 below shows a sequence parameter set (SPS) for the base view color image.
spatial_quantization_enable_flag indicates whether spatial axis quantization according to the present invention described above is performed. For example, the encoder may set the value of spatial_quantization_enable_flag to "0" or "1" according to whether spatial axis quantization is performed and transmit the same to the decoder.
Table 2 below shows a subset sequence parameter set for the color image and the depth map of the enhanced view.
As described above, spatial_quantization_enable_flag indicates whether spatial axis quantization according to the present invention described above is performed. For example, the encoder may set the value of spatial_quantization_enable_flag to "0" or "1" according to whether spatial axis quantization is performed and transmit the same to the decoder.
color_video_flag indicates whether the current image is a color image or a depth image.
color_inter_view_prediction_pictures_first_flag indicates whether inter-view prediction is performed for the color image. When inter-view prediction is performed, reference image lists for the color image are generated using color_num_anchor_refs_list0, color_num_anchor_refs_list1, color_num_non_anchor_refs_list0, and color_num_non_anchor_refs_list1.
depth_inter_view_prediction_pictures_first_flag indicates whether inter-view prediction is performed for the depth image. When inter-view prediction is performed, reference image lists for the depth image are generated using depth_num_anchor_refs_list0, depth_num_anchor_refs_list1, depth_num_non_anchor_refs_list0, and depth_num_non_anchor_refs_list1.
Table 3 below shows information that may be included in the prefix network abstraction layer (NAL).
Table 4 below shows a picture parameter set (PPS).
The high level syntax may be added to the bitstream and transmitted from the encoder to the decoder. The decoder may parse the information included in the high level syntax from the received bitstream at the same level as the encoder, and may decode the bitstream using a procedure inverse to that of the encoder.
The foregoing description is merely illustrative of the technical idea of the present invention, and various changes and modifications may be made by those skilled in the art without departing from the essential characteristics of the present invention. Therefore, the embodiments disclosed in the present invention are intended to illustrate rather than limit the scope of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims, and all technical ideas within the scope of the claims should be construed as being included in the scope of the present invention.
Claims (17)
Receiving and entropy decoding quantization information of a current depth image; and

obtaining a quantized residual signal of the current depth image based on the quantization information,

wherein the quantization information includes flag information indicating whether spatial axis quantization is performed on the current depth image.
When spatial axis quantization is performed on the current depth image, the quantization information includes difference value information for the quantized residual signal of the current depth image,

wherein the difference value for the quantized residual signal is the difference between the quantized residual signal of the current pixel in the current depth image and the quantized residual signal of the neighboring pixels located around the current pixel.
The acquiring of the quantized residual signal comprises:

predicting a residual signal from the neighboring pixels; and

adding the predicted residual signal and the difference value for the quantized residual signal of the current depth image.
The neighboring pixel is:

an upper pixel located on top of the current pixel when the current pixel is located in the first column of the current depth image; and

a left pixel located to the left of the current pixel when the current pixel is located in a region other than the first column of the current depth image.
The flag information indicating whether the spatial axis quantization is performed is encoded and transmitted on a transform unit (TU) basis.
A dequantization unit configured to obtain a quantized residual signal of the current depth image based on the quantization information,
And the quantization information includes flag information indicating whether to perform spatial axis quantization on the current depth image.
When spatial axis quantization is performed on the current depth image, the quantization information includes difference value information for the quantized residual signal of the current depth image,

wherein the difference value for the quantized residual signal is the difference between the quantized residual signal of the current pixel in the current depth image and the quantized residual signal of the neighboring pixels located around the current pixel.
The inverse quantization unit predicts a residual signal from the neighboring pixels, and adds the predicted residual signal and the difference value for the quantized residual signal of the current depth image to obtain the quantized residual signal of the current depth image.
The neighboring pixel is:

an upper pixel located on top of the current pixel when the current pixel is located in the first column of the current depth image; and

a left pixel located to the left of the current pixel when the current pixel is located in a region other than the first column of the current depth image.
And the flag information indicating whether to perform the spatial axis quantization is encoded and transmitted on a transform unit (TU) basis.
And performing filtering using an anisotropic median filter on the current depth image reconstructed based on the entropy decoded signal.
The performing of the filtering using the anisotropic median filter comprises:
Determining whether a current pixel area in the current depth image is an edge area;
If the current pixel area is an edge area, classifying pixels in the current pixel area into a plurality of groups based on a value of a pixel to be filtered in the current pixel area; And
Determining the filtering target pixel value based on a difference between the filtering target pixel value and each of the median values calculated from the classified groups,

wherein the filtering target pixel value is determined as one of the median values calculated from the classified groups.
The determining of whether the current pixel area is an edge area comprises comparing a preset threshold with a value based on the differences between the pixel values in the current pixel area and the median value calculated from neighboring pixels located around the current pixel area.
In the step of classifying into a plurality of groups,
Classify pixels in the current pixel area having a value less than or equal to the filtering target pixel value into a first group,
And classifying pixels in the current pixel area having a value greater than or equal to the filtering target pixel value into a second group.
In the determining of the filtering target pixel value,

one of a first median value calculated from the first group and a second median value calculated from the second group is determined as the filtering target pixel value.
And storing the current depth image filtered using the anisotropic median filter in an image buffer.
And a filter unit configured to perform filtering using an anisotropic median filter on the current depth image reconstructed based on the entropy decoded signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2012/009938 WO2013077650A1 (en) | 2011-11-23 | 2012-11-22 | Method and apparatus for decoding multi-view video |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020110122687 | 2011-11-23 | ||
KR20110122687 | 2011-11-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20130057402A true KR20130057402A (en) | 2013-05-31 |
Family
ID=48665137
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020120133063A KR20130057402A (en) | 2011-11-23 | 2012-11-22 | Method and apparatus for multi-view color and depth videos decoding |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20130057402A (en) |
2012-11-22: KR application KR1020120133063A filed; published as KR20130057402A (en); status: not active, Application Discontinuation
Legal Events
Date | Code | Title | Description
---|---|---|---
 | A201 | Request for examination |
 | E902 | Notification of reason for refusal |
 | E601 | Decision to refuse application |