WO2016061743A1 - Segmental prediction for video coding - Google Patents
- Publication number: WO2016061743A1 (application PCT/CN2014/089040)
- Authority: WIPO (PCT)
Classifications
- H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals, in particular:
- H04N19/593 — predictive coding involving spatial prediction techniques
- H04N19/105 — selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/136 — adaptive coding affected or controlled by incoming video signal characteristics or properties
- H04N19/182 — adaptive coding characterised by the coding unit, the unit being a pixel
- H04N19/46 — embedding additional information in the video signal during the compression process
- H04N19/597 — predictive coding specially adapted for multi-view video sequence encoding
Definitions
- In another embodiment, OffIdxU, the DLT index offset, is coded instead of OffU.
- In another embodiment, if segmental prediction is applied, a flag is signaled to indicate whether Off or OffIdx is zero for all segments in the block. If so, no Off or OffIdx is signaled for any segment in the block and all of them are inferred to be 0.
- If the flag indicates that at least one Off or OffIdx for a segment in the block is not zero, and the Off or OffIdx values for all segments before the last one are signaled as 0, then the Off or OffIdx for the last segment cannot be 0. In that case, Off-1 or OffIdx-1 is coded for the last segment instead of Off or OffIdx, and the decoder assigns the decoded value plus 1 to the Off or OffIdx of the last segment.
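The signaling rule above can be sketched as follows. This is a minimal illustration with assumed helper names, not the patent's actual bitstream syntax: when every earlier offset in the block is signaled as 0 (and the all-zero flag is off), the last offset is known to be non-zero, so its value minus 1 is coded and the decoder adds 1 back.

```python
# Hypothetical helper names; sketch of the "last offset minus 1" rule.
def code_last_offset(off_last, earlier_all_zero):
    # Encoder side: when all earlier offsets were 0, off_last != 0,
    # so off_last - 1 can be coded instead.
    return off_last - 1 if earlier_all_zero else off_last

def decode_last_offset(coded, earlier_all_zero):
    # Decoder side: mirror of the encoder rule.
    return coded + 1 if earlier_all_zero else coded
```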
- In another embodiment, the DLT index offset OffIdxU for segment U is calculated by the encoder by subtracting the DLT index of EU from the DLT index of the average value of all the original pixel values in segment U. In a formula way, OffIdxU = f(AU) - f(EU), where AU is the average value of all the original pixel values in segment U and f represents a function mapping a depth value to a DLT index.
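A minimal sketch of the OffIdxU derivation, assuming an illustrative DLT (the table contents are an example, not from the patent): f maps a depth value to its DLT index, and the index difference is coded instead of the raw value offset.

```python
# Example DLT with five valid depth values (illustrative only).
dlt = [50, 108, 110, 112, 200]

def f(depth):
    # Map a valid depth value to its DLT index.
    return dlt.index(depth)

def dlt_index_offset(a_u, e_u):
    # OffIdxU = f(AU) - f(EU)
    return f(a_u) - f(e_u)
```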
- The segmental prediction method can be enabled or disabled adaptively.
- The encoder can signal to the decoder explicitly whether the segmental prediction method is used, or the decoder can derive the decision implicitly in the same way as the encoder.
- In one embodiment, residues of a block are not signaled and are inferred to be all 0 if segmental prediction is applied in the block.
- the segmental prediction method can be applied on coding tree unit (CTU) , coding unit (CU) , prediction unit (PU) or transform unit (TU) .
- the encoder can send the information of whether to use the segmental prediction method to the decoder in video parameter set (VPS) , sequence parameter set (SPS) , picture parameter set (PPS) , slice header (SH) , CTU, CU, PU, or TU.
- The segmental prediction method can be applied only to CUs of particular sizes. For example, it can be applied only to a CU larger than 8x8; in another example, only to a CU smaller than 64x64.
- The segmental prediction method can be applied only to CUs with a particular PU partition. For example, it can be applied only to a CU with 2Nx2N partition.
- The segmental prediction method can be applied only to CUs with a particular coding mode. For example, it can be applied only to a CU coded in IBC mode.
- The segmental prediction method can be applied only to CUs coded in InterSDC mode.
- The number of segments in the segmental prediction method can be chosen adaptively.
- The encoder can signal the number of segments to the decoder explicitly when the segmental prediction method is used, or the decoder can derive the number implicitly in the same way as the encoder.
- the encoder can send the information of how many segments to the decoder in video parameter set (VPS) , sequence parameter set (SPS) , picture parameter set (PPS) , slice header (SH) , CTU, CU, PU, or TU where the segmental prediction method is used.
- the encoder can send the information of the offsets for each segment to the decoder in video parameter set (VPS) , sequence parameter set (SPS) , picture parameter set (PPS) , slice header (SH) , CTU, CU, PU, or TU where the segmental prediction method is used.
- the encoder can send the information of whether to use the segmental prediction method to the decoder in a CU coded with InterSDC mode.
- the encoder can send the information of how many segments to the decoder in a CU coded with InterSDC mode.
- the encoder can send the information of the offsets or DLT index offsets for each segment to the decoder in a CU coded with InterSDC and in which the segmental prediction method is used.
- the segmental prediction method can be applied to the texture component. It can also be applied to depth components in 3D video coding.
- the segmental prediction method can be applied to the luma component. It can also be applied to chroma components.
- the decision of whether to use segmental prediction method can be made separately for each component, with information signaled separately. Or, the decision of whether to use segmental prediction method can be made together for all components, with a single piece of information signaled.
- The number of segments when the segmental prediction method is used can be controlled separately for each component, with information signaled separately.
- Alternatively, the number of segments can be controlled together for all components, with a single piece of information signaled.
- The offsets for each segment when the segmental prediction method is used can be decided separately for each component, with information signaled separately.
- Alternatively, the offsets can be decided together for all components, with a single piece of information signaled.
- an embodiment of the present invention can be a circuit integrated into a video compression chip or program codes integrated into video compression software to perform the processing described herein.
- An embodiment of the present invention may also be program codes to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
- the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) .
- processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
- The software code or firmware code may be developed in different programming languages and in different formats or styles.
- The software code may also be compiled for different target platforms.
- Different code formats, styles and languages of software code, and other means of configuring code to perform the tasks in accordance with the invention, will not depart from the spirit and scope of the invention.
Abstract
A segmental prediction coding method is proposed. By classifying prediction pixels into different segments and applying different treatments to the segments, prediction coding can be made more efficient.
Description
FIELD OF INVENTION
The invention relates generally to video/image processing.
Prediction plays a critical role in video coding. When coding or decoding a block, a prediction block, generated by intra-prediction or inter-prediction, is obtained first, before the residues are produced at the encoder or the reconstruction samples are reconstructed at the decoder.
Besides inter-prediction and intra-prediction, a new prediction mode named intra-block copy (IBC) has been adopted in the screen content coding (SCC) profile of the high efficiency video coding (HEVC) standard. IBC takes advantage of reduplicated content within a picture. As depicted in Fig. 1, when IBC is applied, a reference block in the current picture is copied to the current block as the prediction. The reference block is located by a block-copying vector (BV). The samples in the reference block must already have been reconstructed before the current block is coded or decoded.
Inter simplified depth coding (InterSDC) is adopted into 3D-HEVC as a special prediction mode for depth coding. When InterSDC is used, a normal inter-prediction is performed for the current block first. Then a coded offset is added to each pixel in the prediction block. Suppose P(i,j) represents the prediction value at pixel position (i, j) after performing the normal inter-prediction and Offset is the offset coded for this block. Then the final prediction value at pixel position (i, j) is P(i,j)+Offset. With InterSDC mode, no residues are coded; thus the final prediction value is output as the reconstructed value.
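The InterSDC prediction described above can be sketched as follows; the helper name is an assumption for illustration, not part of 3D-HEVC:

```python
# Sketch of InterSDC: one coded offset is added to every pixel of the
# inter-predicted block, and the result is output directly as the
# reconstruction, since no residues are coded in InterSDC mode.
def inter_sdc_reconstruct(pred_block, offset):
    # pred_block holds the inter-prediction values P(i,j).
    return [[p + offset for p in row] for row in pred_block]

recon = inter_sdc_reconstruct([[100, 102], [101, 103]], 5)
```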
Depth lookup table (DLT) is adopted into 3D-HEVC. Since often only a few distinct values appear in the depth component, the DLT signals those valid values from the encoder to the decoder. When a CU is coded in intra simplified depth coding (SDC) mode or depth map modeling (DMM) mode, the DLT is used to map a valid depth value to a DLT index, which is much easier to compress. Fig. 2 demonstrates an example of the DLT approach. The DLT is signaled in the picture parameter set (PPS); how to derive the DLT when encoding is left as an encoder issue.
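The DLT mapping can be sketched as follows, with an assumed table of five valid depth values (matching the flavor of Fig. 2, not its actual numbers):

```python
# Only the depth values that actually occur are signaled; each depth
# value is then coded as its (much smaller) index in the table.
valid_depths = [50, 108, 110, 112, 200]   # illustrative DLT
depth_to_index = {v: i for i, v in enumerate(valid_depths)}

def dlt_index(depth):
    return depth_to_index[depth]          # depth value -> DLT index

def dlt_value(index):
    return valid_depths[index]            # DLT index -> depth value
```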
Since prediction values come from reconstructed pixels, there are distortions between the prediction values and the original values, even if the original pixels in the current block and the original pixels in the reference block are exactly the same. Because the reconstructed signal generally loses high-frequency information, the quality of prediction deteriorates more when there are sharp pixel value changes in the reference block. Fig. 3 and Fig. 4 demonstrate two examples where there are two and three segments, respectively, with sharp edges in a block.
SUMMARY OF THE INVENTION
In light of the previously described problems, a segmental prediction method is proposed. The prediction block is processed by a segmental process before it is used to obtain the residues at an encoder or the reconstruction at a decoder. The segmental prediction method comprises classifying the pixels in the prediction block into different categories, each named a ‘segment’, wherein the pixels in a segment can be adjacent or not; treating the pixels in different segments in different ways; and forming a new prediction block after the treatment. The new prediction block is used to obtain the residues at the encoder or the reconstruction at the decoder.
Other aspects and features of the invention will become apparent to those with ordinary skill in the art upon review of the following descriptions of specific embodiments.
The invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
Fig. 1 is a diagram illustrating intra block copying;
Fig. 2 is a diagram illustrating an example of DLT where five valid values appear in depth samples;
Fig. 3 is a diagram illustrating a block with two segments with sharp sample value changes;
Fig. 4 is a diagram illustrating a block with three segments with sharp sample value changes;
Fig. 5 is a diagram illustrating an exemplary segmental prediction architecture at the decoder;
Fig. 6 is a diagram illustrating an exemplary segmental process architecture;
Fig. 7 is a diagram illustrating an exemplary treatment for a segment. After the treatment, pixels in the segment can hold different values;
Fig. 8 is a diagram illustrating an exemplary treatment for a segment. After the treatment, pixels in the segment hold only one value.
Fig. 9 is a diagram illustrating four corners (painted in black) in a block.
Fig. 10 is a diagram illustrating special positions (painted in black) in a block used to calculate EU.
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
A segmental prediction method is proposed.
In one embodiment, the prediction block is processed by a segmental process before it is used to get the residues at the encoder or the reconstruction at the decoder. In another embodiment, the prediction block is processed by a segmental process, and then the modified prediction block is output as the reconstruction without adding residues. Fig. 5 demonstrates the segmental prediction architecture at the decoder.
The prediction block can be obtained by intra-prediction, inter-prediction, intra-block copy prediction or any combination of them.
For example, a part of the prediction block can be obtained by inter-prediction, and another part of the prediction block can be obtained by intra-block copy prediction.
In the segmental process, there are generally two steps, as depicted in Fig. 6. The first step is called ‘classification’, in which the pixels in a prediction block are classified into different categories, each named a ‘segment’. Pixels in a segment can be adjacent or not. The second step is called ‘treatment’, in which pixels in different segments are treated in different ways. Finally, all the pixels after the treatment form a new prediction block, which is then output. The number of segments can be any positive integer, such as 1, 2, 3, etc.
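The two-step process can be sketched as a minimal pipeline, where the classification and treatment rules are assumptions supplied by the caller:

```python
# 'classification' assigns each prediction pixel a segment index;
# 'treatment' maps each pixel to a new value depending on its segment.
def segmental_process(pred_block, classify, treat):
    seg_idx = [[classify(p) for p in row] for row in pred_block]
    return [[treat(p, s) for p, s in zip(row, srow)]
            for row, srow in zip(pred_block, seg_idx)]

# Example: two segments split at a threshold of 128;
# each segment gets its own offset in the treatment step.
new_block = segmental_process(
    [[10, 200], [20, 210]],
    classify=lambda p: 0 if p > 128 else 1,
    treat=lambda p, s: p + (3 if s == 0 else -3),
)
```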
In one embodiment, the prediction values can be classified according to their values.
In another embodiment, the prediction values can be classified according to their positions.
In still another embodiment, the prediction values can be classified according to their gradients. Gradients can be obtained by applying operators such as Sobel, Roberts, and Prewitt.
In still another embodiment, classification is not applied if the number of segments is one.
In still another embodiment, the prediction block is classified into two segments in the classification step. A pixel is classified according to its relationship with a threshold number T. In one example, a pixel is classified into segment 0 if its value is larger than T; otherwise it is classified into segment 1. In another example, a pixel is classified into segment 0 if its value is larger than or equal to T; otherwise it is classified into segment 1.
In still another embodiment, T is calculated as a function of all the pixel values in the prediction block. In a formula way, T=f (P) , where P represents all the pixel values in the prediction block.
In still another embodiment, T is calculated as the average value of all the pixel values in the prediction block.
In still another embodiment, T is calculated as the middle value of all the pixel values in the prediction block.
In still another embodiment, T is calculated as the average value of four corner values in the prediction block. As an example depicted in Fig. 9, T is calculated as the average value of the four pixels painted in black color.
In still another embodiment, T is calculated as the average value of the minimum and the maximum pixel value in the prediction block. In a formula way, T= (Vmax+Vmin) /2, where Vmax and Vmin are the maximum and the minimum pixel value in the prediction block respectively.
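The threshold choices in the embodiments above can be sketched as follows (each operates on the prediction block as a 2-D list; `statistics` is the Python standard library):

```python
import statistics

def t_average(block):
    # T = average of all prediction values in the block.
    vals = [p for row in block for p in row]
    return sum(vals) / len(vals)

def t_middle(block):
    # T = middle (median) value of all prediction values.
    return statistics.median(p for row in block for p in row)

def t_corners(block):
    # T = average of the four corner pixels (as in Fig. 9).
    return (block[0][0] + block[0][-1] + block[-1][0] + block[-1][-1]) / 4

def t_minmax(block):
    # T = (Vmax + Vmin) / 2.
    vals = [p for row in block for p in row]
    return (max(vals) + min(vals)) / 2
```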
In still another embodiment, the prediction block is classified into M (M>2) segments in the classification step. A pixel is classified according to its relationship with M-1 threshold numbers T1<=T2<=...<=TM-1. In one example where M is equal to 3, a pixel is classified into segment 0 if its value is smaller than T1; it is classified into segment 2 if its value is larger than T2; otherwise it is classified into segment 1. In another example, a pixel is classified into segment 0 if its value is smaller than or equal to T1; it is classified into segment 2 if its value is larger than T2; otherwise it is classified into segment 1.
In still another embodiment, Tk is calculated as a function of all the pixel values in the prediction block, where k is from 1 to M-1. In a formula way, Tk=fk (P), where P represents all the pixel values in the prediction block.
In still another embodiment where M is equal to 3, Tk is calculated as
T1= (T+Vmin) /2
and
T2= (Vmax+T) /2,
where T is the average value of all the pixel values in the prediction block, and Vmax and Vmin are the maximum and the minimum pixel value in the prediction block respectively.
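The M = 3 case above can be sketched as follows (thresholds per the formulas above; the function name is illustrative):

```python
# T is the block average; T1 and T2 bisect [Vmin, T] and [T, Vmax].
def three_segment_index(pixel, vals):
    t = sum(vals) / len(vals)
    t1 = (t + min(vals)) / 2      # T1 = (T + Vmin) / 2
    t2 = (max(vals) + t) / 2      # T2 = (Vmax + T) / 2
    if pixel < t1:
        return 0
    if pixel > t2:
        return 2
    return 1

vals = [0, 100, 200]              # T = 100, T1 = 50, T2 = 150
```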
In one embodiment as depicted in Fig. 7, an offset OffU is added to each pixel in a segment, denoted as segment U, in the treatment process to get the new prediction value. In a formula way, Vnew=Vold+OffU, where Vold and Vnew are the pixel values before and after the treatment respectively.
In another embodiment as depicted in Fig. 8, all pixels in a segment, denoted as segment U, possess the same value VU after the treatment process. An offset OffU is added to an estimated value EU to obtain VU. In a formula way, VU=EU+OffU.
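The two treatments of Fig. 7 and Fig. 8 can be sketched as follows; the estimator passed to the flattening treatment is an assumption supplied by the caller:

```python
# Fig. 7 style: add OffU per pixel, so pixels keep distinct values.
def treat_offset(seg_pixels, off_u):
    return [v + off_u for v in seg_pixels]

# Fig. 8 style: every pixel becomes the single value VU = EU + OffU.
def treat_flatten(seg_pixels, off_u, estimate):
    e_u = estimate(seg_pixels)    # e.g. the segment average
    return [e_u + off_u] * len(seg_pixels)
```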
In another embodiment, EU is calculated as a function of all the pixel values in segment U. In a formula way, EU=f (PU) , where PU represents all the pixel values in segment U.
In still another embodiment, EU is calculated as the average value of all the pixel prediction values in segment U.
In still another embodiment, EU is calculated as the middle value of all the pixel prediction values in segment U.
In still another embodiment, EU is calculated as the average value of the minimum and the maximum pixel prediction value in segment U. In a formula way, EU= (VUmax+VUmin) /2, where VUmax and VUmin are the maximum and the minimum pixel prediction value in segment U respectively.
In still another embodiment, EU is calculated as the mode value of the pixel values in segment U. The mode value is defined as the value that appears most often in segment U.
An exemplary procedure to get the mode value of the pixel values in segment U is as follows. Suppose MinV and MaxV are the minimal and maximal possible pixel values respectively. When the bit depth is 8, MinV is 0 and MaxV is 255. For i from MinV to MaxV, a variable Count[i] is initialized to 0. For each pixel in segment U, Count[v] is incremented, where v is the pixel value. Finally, m is output as the mode value if Count[m] is the largest among Count[i] with i from MinV to MaxV.
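The counting procedure above can be sketched directly:

```python
# Count[i] per possible value; the value with the largest count is the
# mode (ties resolve to the smallest value in this sketch).
def segment_mode(seg_pixels, min_v=0, max_v=255):
    count = [0] * (max_v - min_v + 1)
    for v in seg_pixels:
        count[v - min_v] += 1
    best = max(range(len(count)), key=lambda i: count[i])
    return best + min_v
```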
In another embodiment, EU is calculated based on pixels which are at certain special positions and are in segment U.
In another embodiment, EU is calculated as the mode value of the pixels which are at certain special positions and are in segment U. The mode value is defined as the value that appears most often in segment U. For example, EU is calculated as the mode value of the pixels in segment U which satisfy (x%2==0) && (y%2==0), where (x, y) represents the position. In one example as depicted in Fig. 10, EU is calculated as the mode value of the pixels which are at the positions painted in black and in segment U. An exemplary algorithm can be described as follows.
A variable sampleCount[j][k] is set equal to 0 for all k from 0 to (1 << BitDepthY) - 1, and all j from 0 to nSegNum[xCb][yCb] - 1.
A variable mostCount[j] is set equal to 0 for all j from 0 to nSegNum[xCb][yCb] - 1.
A variable segPred[j] is set equal to 1 << (BitDepthY - 1) for all j from 0 to nSegNum[xCb][yCb] - 1.
For y in the range of 0 to nTbS - 1, inclusive, the following applies:
For x in the range of 0 to nTbS - 1, inclusive, the following applies:
When y % 2 == 0 && x % 2 == 0, the following applies:
j = segIdx[x][y].
sampleCount[j][refSamples[x][y]]++.
When sampleCount[j][refSamples[x][y]] > mostCount[j], mostCount[j] is set equal to sampleCount[j][refSamples[x][y]] and segPred[j] is set equal to refSamples[x][y].
In the algorithm above, refSamples[x][y] represents the sample value at position (x, y) in the block, segIdx[x][y] represents the segment index for position (x, y), and segPred[j] represents the required Ej for segment j.
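The per-segment subsampled-mode algorithm above can be sketched as follows. Names are illustrative; row-major `[y][x]` indexing is used for readability (the text writes refSamples[x][y]), and the block is assumed square of side nTbS.

```python
def segment_mode_subsampled(ref_samples, seg_idx, n_seg, bit_depth=8):
    """For each segment j, set seg_pred[j] to the most frequent sample value
    among positions with x % 2 == 0 and y % 2 == 0 (the 'black' positions).

    ref_samples, seg_idx: nTbS x nTbS arrays, indexed [y][x].
    n_seg: number of segments in the block (nSegNum).
    """
    n = len(ref_samples)                        # nTbS (square block assumed)
    sample_count = [[0] * (1 << bit_depth) for _ in range(n_seg)]
    most_count = [0] * n_seg
    seg_pred = [1 << (bit_depth - 1)] * n_seg   # default: mid-range value
    for y in range(0, n, 2):
        for x in range(0, n, 2):
            j = seg_idx[y][x]
            v = ref_samples[y][x]
            sample_count[j][v] += 1
            if sample_count[j][v] > most_count[j]:
                most_count[j] = sample_count[j][v]
                seg_pred[j] = v
    return seg_pred
```

A segment that contains no sampled position keeps the mid-range default, matching the default-EU embodiment below.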
In another embodiment, a default EU is used if no pixels at the special positions belong to segment U. For example, if no pixel at the black positions depicted in Fig. 10 belongs to segment U, then a default value is assigned to EU. The default value can be 0, 128, 255, 1 << (bit_depth - 1), (1 << bit_depth) - 1, or any other valid integer.
In one embodiment, the offset OffU for segment U can be signaled explicitly from the encoder to the decoder, or it can be derived implicitly by the decoder.
In one embodiment, the offset OffU for segment U is calculated by the encoder according to the pixel original values and the pixel prediction values in segment U. For example, the offset OffU for segment U is calculated by the encoder by subtracting the average value of all the pixel prediction values in segment U from the average value of all the pixel original values in segment U.
In another embodiment, the offset OffU for segment U is calculated by the encoder by subtracting EU from the average value of all the pixel original values in segment U.
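A minimal sketch of the encoder-side offset derivation just described, computing OffU as the difference between the segment's average original value and its average prediction value. Rounding the difference to an integer is an assumption; the text does not specify it.

```python
def segment_offset(orig_vals, pred_vals):
    """OffU = mean(original values in segment U) - mean(prediction values).

    Rounding to the nearest integer is an assumption made here for
    illustration; a real encoder may quantize the offset differently.
    """
    avg_orig = sum(orig_vals) / len(orig_vals)
    avg_pred = sum(pred_vals) / len(pred_vals)
    return round(avg_orig - avg_pred)

print(segment_offset([100, 102, 104], [96, 98, 100]))  # -> 4
```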
In still another embodiment, OffIdxU instead of OffU is coded. OffIdxU is the DLT index offset.
In still another embodiment, a flag is signaled to indicate whether Off or OffIdx for all segments are zero in the block if segmental prediction is applied. If the condition holds, then no Off or OffIdx for segments in the block is signaled and all Off or OffIdx values are implied to be 0.
In still another embodiment, if the flag indicates that at least one Off or OffIdx for a segment in the block is not zero when segmental prediction is applied, and all Off or OffIdx values for segments before the last segment are signaled as 0, then the Off or OffIdx for the last segment cannot be 0. In this case, Off - 1 or OffIdx - 1, instead of Off or OffIdx, should be coded for the last segment, and the decoded value plus 1 is assigned to Off or OffIdx for the last segment.
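The minus-one trick above can be sketched as follows. Both function names are hypothetical; the sketch assumes the not-all-zero flag has already been signaled, so when every earlier offset is zero the last one is known to be nonzero.

```python
def encode_last_offset(offsets):
    """Value actually coded for the last segment's offset.

    If all offsets before the last are zero, the last offset is known to be
    nonzero (the not-all-zero flag was signaled), so its value minus 1 is
    coded; otherwise the offset is coded as-is.
    """
    *head, last = offsets
    if all(o == 0 for o in head):
        assert last != 0, "flag guarantees the last offset is nonzero"
        return last - 1          # decoder adds 1 back
    return last

def decode_last_offset(coded, head):
    """Inverse mapping at the decoder, given the earlier offsets `head`."""
    return coded + 1 if all(o == 0 for o in head) else coded
```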
In still another embodiment, VU is calculated as VU = g(f(EU) + OffIdxU), where f represents a function mapping a depth value to a DLT index, and g represents a function mapping a DLT index to a depth value. f(EU) + OffIdxU should be clipped to a valid DLT index.
In another embodiment, the DLT index offset OffIdxU for segment U is calculated by the encoder by subtracting the DLT index of EU from the DLT index of the average value of all the pixel original values in segment U. Expressed as a formula, OffIdxU = f(AU) - f(EU), where AU is the average value of all the original pixel values in segment U and f represents a function mapping a depth value to a DLT index.
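The DLT-domain computation VU = g(f(EU) + OffIdxU) can be illustrated with a toy depth lookup table. The table contents and the nearest-entry definition of f are invented for this sketch; an actual DLT is built from the depth values occurring in the sequence.

```python
# Toy depth lookup table (values invented for illustration).
DLT = [0, 16, 48, 96, 160, 255]

def f(depth):
    """Map a depth value to a DLT index (nearest entry, an assumption)."""
    return min(range(len(DLT)), key=lambda i: abs(DLT[i] - depth))

def g(idx):
    """Map a DLT index back to a depth value, clipping to a valid index."""
    return DLT[max(0, min(idx, len(DLT) - 1))]

def predict_vu(eu, off_idx_u):
    """VU = g(f(EU) + OffIdxU): apply the offset in the DLT index domain."""
    return g(f(eu) + off_idx_u)

print(predict_vu(50, 1))  # f(50) = 2, 2 + 1 = 3 -> DLT[3] = 96
```

Working in the index domain lets a small coded offset jump between depth values that are far apart numerically but adjacent in the table.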
In one embodiment, the segmental prediction method can be enabled or disabled adaptively. The encoder can explicitly send the information of whether to use the segmental prediction method to the decoder, or the decoder can implicitly derive whether to use the segmental prediction method in the same way as the encoder.
In another embodiment, the residues of a block are not signaled and are implied to be all 0 if segmental prediction is applied to the block.
In still another embodiment, the segmental prediction method can be applied on coding tree unit (CTU) , coding unit (CU) , prediction unit (PU) or transform unit (TU) .
In still another embodiment, the encoder can send the information of whether to use the segmental prediction method to the decoder in video parameter set (VPS) , sequence parameter set (SPS) , picture parameter set (PPS) , slice header (SH) , CTU, CU, PU, or TU.
In still another embodiment, the segmental prediction method can only be applied to CUs with particular sizes. For example, it can only be applied to a CU with size larger than 8x8. In another example, it can only be applied to a CU with size smaller than 64x64.
In still another embodiment, the segmental prediction method can only be applied to CUs with particular PU partitions. For example, it can only be applied to a CU with 2Nx2N partition.
In still another embodiment, the segmental prediction method can only be applied to CUs with particular coding modes. For example, it can only be applied to a CU with IBC mode.
In still another embodiment, the segmental prediction method can only be applied to CUs with InterSDC mode.
In one embodiment, the number of segments in the segmental prediction method is adaptive. The encoder can explicitly send the information of how many segments to the decoder when the segmental prediction method is used, or the decoder can implicitly derive the number in the same way as the encoder.
In still another embodiment, the encoder can send the information of how many segments to the decoder in video parameter set (VPS), sequence parameter set (SPS), picture parameter set (PPS), slice header (SH), CTU, CU, PU, or TU where the segmental prediction method is used.
In still another embodiment, the encoder can send the information of the offsets for each segment to the decoder in video parameter set (VPS) , sequence parameter set (SPS) , picture parameter set (PPS) , slice header (SH) , CTU, CU, PU, or TU where the segmental prediction method is used.
In still another embodiment, the encoder can send the information of whether to use the segmental prediction method to the decoder in a CU coded with InterSDC mode.
In still another embodiment, the encoder can send the information of how many segments to the decoder in a CU coded with InterSDC mode.
In still another embodiment, the encoder can send the information of the offsets or DLT index offsets for each segment to the decoder in a CU coded with InterSDC and in which the segmental prediction method is used.
In still another embodiment, the segmental prediction method can be applied to the texture component. It can also be applied to depth components in 3D video coding.
In still another embodiment, the segmental prediction method can be applied to the luma component. It can also be applied to chroma components.
In still another embodiment, the decision of whether to use segmental prediction method can be made separately for each component, with information signaled separately. Or, the decision of whether to use segmental prediction method can be made together for all components, with a single piece of information signaled.
In still another embodiment, the number of segments when the segmental prediction method is used can be controlled separately for each component, with information signaled separately. Or, the number of segments when the segmental prediction method is used can be controlled together for all components, with a single piece of information signaled.
In still another embodiment, the offsets for each segment when the segmental prediction method is used can be decided separately for each component, with information signaled separately. Or, the offsets for each segment when the segmental prediction method is used can be decided together for all components, with a single piece of information signaled.
The methods described above can be used in a video encoder as well as in a video decoder. Embodiments of the segmental prediction methods according to the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be a circuit integrated into a video compression chip or program codes integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program codes to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. To the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art) . Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
Claims (15)
- A method of segmental prediction coding, in which a prediction block is processed by a segmental process before being used to get residues at an encoder or to get a reconstruction at a decoder, comprising: classifying pixels in the prediction block into different categories, named 'segments', wherein the pixels in a segment are either adjacent or not; treating the pixels in different segments in different ways; and forming a new prediction block after treatment of all the pixels, wherein the new prediction block is used to get the residues at the encoder or to get the reconstruction at the decoder.
- The method as claimed in claim 1, wherein prediction values are classified according to values, positions, or gradients.
- The method as claimed in claim 2, wherein the prediction block is classified into two segments, and a pixel is classified according to its relationship with a threshold number T.
- The method as claimed in claim 3, wherein T is calculated as a function of all the pixel values in the prediction block, and the function comprises one of the following calculations: T is calculated as an average value of all the pixel values in the prediction block; T is calculated as a middle value of all the pixel values in the prediction block; T is calculated as the average value of some pixels in the prediction block; T is calculated as the average value of four corner pixels in the prediction block; and T is calculated as an average value of a minimum and a maximum pixel value in the prediction block, T = (Vmax + Vmin) / 2, where Vmax and Vmin are the maximum and the minimum pixel values in the prediction block, respectively.
- The method as claimed in claim 2, wherein the prediction block is classified into M (M>2) segments, and a pixel is classified according to its relationship with M–1 threshold numbers T1<=T2<=...<=TM-1.
- The method as claimed in claim 5, wherein Tk is calculated as a function of all the pixel values in the prediction block, where k is from 1 to M-1, Tk = fk(P), where P represents all the pixel values in the prediction block.
- The method as claimed in claim 5, wherein M is equal to 3, a pixel is classified into segment 0 if its value is smaller than T1; it is classified into segment 2 if its value is larger than T2; otherwise it is classified into segment 1, where T1 = (T + Vmin) / 2 and T2 = (Vmax + T) / 2, T is an average value of all the pixel values in the prediction block, and Vmax and Vmin are the maximum and the minimum pixel values in the prediction block, respectively.
- The method as claimed in claim 1, wherein an offset OffU is added to a pixel in a segment, denoted as segment U, in the treatment process to get the new prediction value, Vnew = Vold + OffU, where Vold and Vnew are the pixel values before and after treatment, respectively.
- The method as claimed in claim 1, wherein all pixels possess a same value VU in a segment, denoted as segment U, after treatment, and an offset OffU is added to an estimated value EU to obtain VU, VU=EU+OffU.
- The method as claimed in claim 9, wherein EU is calculated as a function of all the pixel values in segment U, EU = f(PU), where PU represents all the pixel values in segment U, and the functions include but are not limited to: EU is calculated as an average value of all the pixel prediction values in segment U; EU is calculated as a middle value of all the pixel prediction values in segment U; EU is calculated as an average value of a minimum and a maximum pixel prediction value in segment U, EU = (VUmax + VUmin) / 2, where VUmax and VUmin are the maximum and the minimum pixel prediction values in segment U, respectively; and EU is calculated as the mode value of the pixel values in segment U, wherein the mode value is defined as the value that appears most often in segment U.
- The method as claimed in claim 9, wherein EU is calculated as a function of some pixel values in segment U, EU = f(PU), where PU represents some pixel values at some special positions in segment U.
- The method as claimed in claim 11, wherein EU is calculated based on pixels which are at some special positions as well as are in the segment U.
- The method as claimed in claim 12, wherein EU is calculated as the mode value of the pixels that are at some special positions in segment U; for example, EU is calculated as the mode value of the pixels in segment U that satisfy (x % 2 == 0) && (y % 2 == 0), where (x, y) represents the position.
- The method as claimed in claim 12, wherein a default EU is used if no pixels are at the special positions in segment U.
- The method as claimed in claim 14, wherein the default value can be 0, 128, 255, 1 << (bit_depth - 1), (1 << bit_depth) - 1, or any other valid integer.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2014/089040 WO2016061743A1 (en) | 2014-10-21 | 2014-10-21 | Segmental prediction for video coding |
PCT/CN2015/082074 WO2015196966A1 (en) | 2014-06-23 | 2015-06-23 | Method of segmental prediction for depth and texture data in 3d and multi-view coding systems |
CN201580001847.4A CN105556968B (en) | 2014-06-23 | 2015-06-23 | The device and method of predictive coding in three-dimensional or multi-view video coding system |
US15/032,205 US10244258B2 (en) | 2014-06-23 | 2015-06-23 | Method of segmental prediction for depth and texture data in 3D and multi-view coding systems |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2014/089040 WO2016061743A1 (en) | 2014-10-21 | 2014-10-21 | Segmental prediction for video coding |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/087094 Continuation-In-Part WO2016044979A1 (en) | 2014-06-23 | 2014-09-22 | Segmental prediction for video coding |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016061743A1 true WO2016061743A1 (en) | 2016-04-28 |
Family
ID=55760039
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2014/089040 WO2016061743A1 (en) | 2014-06-23 | 2014-10-21 | Segmental prediction for video coding |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2016061743A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080247657A1 (en) * | 2000-01-21 | 2008-10-09 | Nokia Corporation | Method for Encoding Images, and an Image Coder |
US20100303149A1 (en) * | 2008-03-07 | 2010-12-02 | Goki Yasuda | Video encoding/decoding apparatus |
US20130028321A1 (en) * | 2010-04-09 | 2013-01-31 | Sony Corporation | Apparatus and method for image processing |
CN103227921A (en) * | 2013-04-03 | 2013-07-31 | 华为技术有限公司 | HEVC (high efficiency video coding) intra-frame prediction method and device |
CN103780910A (en) * | 2014-01-21 | 2014-05-07 | 华为技术有限公司 | Method and device for determining block segmentation mode and optical prediction mode in video coding |
Legal Events
Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14904232; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 14904232; Country of ref document: EP; Kind code of ref document: A1 |