CN106688238B - Improved reference pixel selection and filtering for intra-depth map coding - Google Patents


Info

Publication number
CN106688238B
Authority
CN
China
Prior art keywords
partition
sample
current block
samples
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201480057368.XA
Other languages
Chinese (zh)
Other versions
CN106688238A (en)
Inventor
顾舟叶
郑建铧
林楠
张臣雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority claimed from PCT/US2014/060873 external-priority patent/WO2015057947A1/en
Publication of CN106688238A publication Critical patent/CN106688238A/en
Application granted granted Critical
Publication of CN106688238B publication Critical patent/CN106688238B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video codec configured to receive a current block and a plurality of neighboring pixels, wherein the current block includes a first partition and a second partition, select at least one reference pixel from the plurality of neighboring pixels, and predict a plurality of pixels located in the second partition based on the reference pixel.

Description

Improved reference pixel selection and filtering for intra-depth map coding
Cross application of related applications
This application claims priority to prior applications, namely U.S. provisional patent application No. 61/892,342, entitled "Simplified DC Predictor for Depth Map Intra Coding Mode," by Zhouye Gu et al., filed on October 17, 2013, and U.S. provisional patent application No. 61/923,124, entitled "Neighboring Reference Pixel Selection for Depth Map Intra Coding," by Zhouye Gu et al., filed on January 2, 2014, both of which are incorporated herein by reference in their entirety.
Statement regarding research or development sponsored by federal government
Not applicable.
Reference to microfilm appendix
Not applicable.
Background
Digital video capabilities can be applied to a number of devices such as digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, and video conferencing devices. Digital video devices implement various video compression techniques to more efficiently transmit and receive digital video information. Some video compression techniques are described in several standards including Moving Picture Experts Group (MPEG)-2, MPEG-4, International Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T) H.263, ITU-T H.264/MPEG-4 Part 10, Advanced Video Coding (AVC), and extensions of these standards, all of which are incorporated herein by reference. New video standards also continue to emerge and evolve. For example, the High Efficiency Video Coding (HEVC) standard, sometimes referred to as H.265, is currently being developed by the Joint Collaborative Team on Video Coding (JCT-VC), jointly established by MPEG and ITU-T.
Disclosure of Invention
In one embodiment, the present invention includes a video codec configured to receive a current block and a plurality of neighboring pixels, wherein the current block includes a first partition and a second partition, select one or more reference pixels from the plurality of neighboring pixels, and predict a plurality of pixels located in the second partition based on the reference pixels.
In another embodiment, the invention includes an apparatus for video encoding, comprising a processor to: receiving a current block including a first partition and a second partition, wherein the first partition includes at least top-right, top-left, and bottom-left corner samples of the current block; selecting one reference sample from an upper-right neighboring block of the current block and a lower-left neighboring block of the current block; predicting samples of a second partition with the reference samples selected from the upper-right neighboring block and the lower-left neighboring block.
In yet another embodiment, the present invention includes a method for intra prediction for 3-dimensional HEVC (3D-HEVC), the method comprising: receiving a plurality of neighboring samples with respect to a current block, wherein the neighboring samples include a first sample located at the bottom-right corner of a bottom-left neighboring block and a second sample located at the bottom-right corner of a top-right neighboring block; receiving a partition mode indicating a partitioning of the current block into partition 0 and partition 1, wherein partition 0 includes at least the top-right, top-left, and bottom-left corner samples of the current block; and selecting one of the first sample and the second sample as a reference sample for intra prediction of partition 1.
The above and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.
Drawings
For a more complete understanding of the present invention, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
Fig. 1 is a diagram illustrating an intra prediction scheme for 35 intra prediction modes allowed by the HEVC software model (HM);
FIGS. 2A to 2C are schematic diagrams of an embodiment of a wedgelet partitioning scheme;
FIGS. 3A to 3C are schematic diagrams of an embodiment of a contour partitioning scheme;
FIG. 4A is a schematic diagram of a block divided into two regions P1 and P2;
FIG. 4B is a schematic diagram of the three constant partition values (CPVs) generated for FIG. 4A;
FIGS. 5A and 5B are schematic diagrams of an exemplary wedgelet partition and an exemplary contour partition, respectively;
FIGS. 6A to 6D are schematic diagrams of four different partition patterns, referred to as cases 1 to 4, respectively;
FIG. 7A is a diagram of an embodiment of a depth map intra prediction scheme;
FIGS. 7B-7C are schematic diagrams of a reference sample selection scheme further illustrating the principles underlying the scheme in FIG. 7A;
FIGS. 8A and 8B are schematic diagrams of additional embodiments of reference sample selection schemes;
FIGS. 9A and 9B show simulation results obtained by reference sample selection and filtering techniques;
FIG. 10 is a schematic diagram of an embodiment of a video encoder;
FIG. 11 is a schematic diagram of an embodiment of a video decoder;
FIG. 12 is a flow diagram of an embodiment of a method of intra prediction;
FIG. 13 is a schematic diagram of a general-purpose computer system.
Detailed Description
It should be understood at the outset that although illustrative implementations of one or more embodiments are provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The present invention should in no way be limited to the illustrative embodiments, drawings, and techniques illustrated below, including the exemplary designs and embodiments illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
Video compression techniques perform spatial prediction and/or temporal prediction to reduce or remove the redundancy inherent in video sequences. For block-based video coding, a video frame or slice is divided or partitioned into a plurality of blocks, called macroblocks or coding units (CUs). A CU in the HEVC standard serves a purpose similar to that of a macroblock in the H.264 standard, except that a CU is not restricted to a fixed size. A CU of an intra-coded frame or slice, also known as an intra frame (I), may be coded using spatial prediction with respect to neighboring CUs located in the same frame or slice. CUs of an inter-coded frame or slice, also known as a predicted frame (P) or a bi-directional predicted frame (B), may use spatial prediction with respect to neighboring CUs located in the same frame or slice, or temporal prediction with respect to other reference frames.
Reference sample selection and filtering techniques are disclosed herein to improve the quality of intra prediction when compressing or encoding depth map frames. In 3D-HEVC, a block currently being encoded (i.e., a current block) may be partitioned into two regions or partitions according to a particular intra prediction mode. In the embodiments disclosed herein, one or more reference samples are selected for intra prediction based on the way the two regions are partitioned. For example, if the first partition occupies the top-right, top-left, and bottom-left corner samples of the current block (this pattern is sometimes referred to as case 1), the reference sample for predicting the second partition is either a first sample located at the bottom-right corner of the bottom-left neighboring block or a second sample located at the bottom-right corner of the top-right neighboring block. Further, absolute differences between neighboring samples are calculated and compared to determine which reference sample to use for deriving the mean value (sometimes referred to as the direct current (DC) value) of the second partition. In other embodiments of the present disclosure, multiple reference samples may be filtered prior to intra prediction, and whether to filter the reference samples depends on various factors, such as the intra prediction mode, the current block size, and a signaling flag.
More broadly, the present invention addresses issues raised against 3D-HEVC after its committee draft (CD). These issues include the depth modeling modes (DMMs), the depth lookup table (DLT), prediction mode selection, view synthesis optimization, and/or HEVC extensions. The present invention also contemplates improving coding techniques by studying the trade-off between coding gain and computational complexity. The disclosed intra prediction embodiments have been tested to measure Bjontegaard Delta rate (BD-rate) performance. Simulation results show that the disclosed DC prediction techniques can improve coding efficiency under the 3D-HEVC random access (under common test conditions (CTC)) and all-intra configurations.
A picture or video frame may contain a large number of samples or pixels (e.g., a 1920 x 1080 frame has 2,073,600 pixels). Thus, independently encoding and decoding (hereinafter simply coding) each pixel is cumbersome and inefficient. To improve coding efficiency, a video frame is decomposed into rectangular blocks, which serve as basic units for coding, prediction, transform, quantization, and other processing. For example, a typical N x N block contains N^2 pixels, where N is an integer greater than 1 and often a multiple of 4. Compared with the macroblocks used in previous coding standards, HEVC introduces new block concepts. For example, a coding unit (CU) refers to a subdivision of a video frame into rectangular blocks of equal or variable size. Depending on the inter or intra prediction mode, a CU contains one or more prediction units (PUs), each serving as a basic unit of prediction. For example, a 64 x 64 CU may be symmetrically split into four 32 x 32 PUs for intra prediction. As another example, a 64 x 64 CU may be asymmetrically split into a 16 x 64 PU and a 48 x 64 PU for inter prediction. Similarly, a PU may contain one or more transform units (TUs), each serving as a basic unit for transform and/or quantization. For example, a 32 x 32 PU may be symmetrically split into four 16 x 16 TUs. Multiple TUs of one PU may share one prediction mode but be transformed separately. Herein, the term "block" may generally refer to any macroblock, CU, PU, or TU, depending on the context.
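As a quick arithmetic illustration of the block sizes mentioned above, the following Python sketch (function names are ours, not from the standard) reproduces the example CU-to-PU splits:

```python
# Illustrative sketch only: pixel counts and the example CU -> PU splits
# described above, with block sizes modeled as (width, height) tuples.

def pixel_count(n):
    """An N x N block contains N^2 pixels."""
    return n * n

def symmetric_split(cu_size, parts_per_side=2):
    """Split a square CU into equal square PUs, e.g. 64x64 -> four 32x32."""
    pu = cu_size // parts_per_side
    return [(pu, pu)] * (parts_per_side * parts_per_side)

def asymmetric_split(cu_size, left_width):
    """Split a CU vertically into two PUs, e.g. 64x64 -> 16x64 + 48x64."""
    return [(left_width, cu_size), (cu_size - left_width, cu_size)]

print(pixel_count(8))            # 64 pixels in an 8x8 block
print(symmetric_split(64))       # four 32x32 PUs
print(asymmetric_split(64, 16))  # a 16x64 PU and a 48x64 PU
```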
The block is encoded by an inter prediction encoding method or an intra prediction encoding method. In intra prediction, if a neighboring block in the same frame as a current block has been encoded, the neighboring block may be used to predict the current block (e.g., a PU that is currently being encoded). Typically, the blocks of a frame are encoded in left-to-right, top-to-bottom, or zigzag scan order. Thus, the current block may be predicted by one or more reference samples, which may be located in any adjacent block above or to the left of the current block. Block sizes in intra-prediction coding vary widely, e.g., between a smaller size (e.g., 4 x 4) and a larger size (e.g., 128 x 128). The direction of intra prediction (i.e., the direction from an already encoded block to the current block) determines the mode of intra prediction.
Fig. 1 shows an intra prediction scheme 100 illustrating the 35 intra prediction modes allowed by the HEVC software model (HM). The 35 modes include 33 directional intra prediction modes, a planar mode, and a DC mode. As shown in FIG. 1, each of the 33 directional modes corresponds to an extrapolation direction at an angle between 45 degrees and -135 degrees, measured clockwise from the horizontal. The 33 directional modes thus span roughly 180 degrees with an angular resolution of approximately 5.625 degrees. The angle refers to the extrapolation direction from the reference pixels to the pixels of the current block. In the non-directional modes, namely the planar mode (denoted mode 0) and the DC mode (denoted mode 1), particular neighboring pixel samples are used to predict the pixels of the current block. 3D-HEVC intra codes the depth map using both the conventional HEVC intra prediction modes and the new DMM modes. In particular, 3D-HEVC employs various partition-based depth map coding methods, including the depth modeling modes (DMMs), segment-wise DC coding (SDC, previously referred to as simplified depth coding, also SDC_DMM1), the region boundary chain (RBC) mode, and chain coding. Neighboring reference pixels may be used in both the conventional and the novel intra prediction modes.
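To make the angular layout concrete, the sketch below (our own illustration; the HM actually tabulates a per-mode displacement parameter rather than uniform angles, so this uniform spacing is only an approximation) maps a directional mode index to its approximate extrapolation angle:

```python
# Hedged sketch: the 33 angular modes (indices 2..34) span roughly 180
# degrees; with 32 steps between the first and last direction, the angular
# resolution is 180/32 = 5.625 degrees. Modes 0 (planar) and 1 (DC) are
# non-directional.

def approx_mode_angle(mode):
    if mode in (0, 1):               # planar / DC: no direction
        return None
    step = 180.0 / 32                # ~5.625 degrees per step
    return 45.0 - (mode - 2) * step  # mode 2 -> 45 deg, mode 34 -> -135 deg

print(approx_mode_angle(2))    # 45.0
print(approx_mode_angle(34))   # -135.0
print(180.0 / 32)              # 5.625
```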
In the HEVC intra prediction coding process, various factors may affect and reduce prediction accuracy. For example, as the size of a PU increases, the prediction accuracy for pixels located farther away from the reference pixels may decrease. Reduced prediction accuracy results in more residual data, which in turn results in more data to be encoded, transmitted, and/or stored. To improve prediction accuracy, a smoothing filter may be applied to the reference pixels (predictor pixels) used for block prediction. However, in some cases better results are achieved without applying any smoothing filter to the reference pixels. Current standards use adaptive intra smoothing techniques for texture frame coding.
In 3D-HEVC, the same adaptive intra smoothing techniques may be used for intra coding the depth map. Depth maps are characterized by sharp edges (representing object boundaries) and large areas of nearly constant or slowly varying sample values (representing object interiors). While the HEVC intra prediction and transform coding techniques are well suited to the nearly constant regions, they can cause severe coding artifacts at sharp edges. These artifacts are visible in synthesized intermediate views and degrade the video quality. To better represent depth map edges, four new intra prediction modes were designed specifically for depth map intra coding. These four modes are referred to as the depth modeling modes (DMMs), or DMM1-DMM4 for short. The DMMs may be integrated as an alternative to the conventional intra prediction modes in HEVC. With the DMMs, the reference sample selection scheme used for texture frame coding may not always be suitable for depth frame coding, so a new reference pixel selection algorithm for depth map intra prediction is needed. In the new depth intra prediction modes (e.g., DMM1-4), filtered or unfiltered neighboring reference pixels can be used for intra prediction. As in the conventional intra prediction modes, a residual approximately representing the difference from the original depth signal may be coded by transform coding and transmitted.
In all four DMM modes, a depth block is approximated by a model that partitions the area of the block into two parts, partitions, or regions, where each rectangular or irregularly shaped region is represented by a constant value. The information required for this model comprises two elements: partition information specifying the region each sample belongs to, and region value information specifying a constant value for the samples of the corresponding region. The region value is sometimes referred to as a constant partition value (CPV). Wedgelet partitioning and contour partitioning are two different methods of partitioning a current block into two regions.
FIGS. 2A to 2C show an example of a wedgelet partitioning scheme, in which two regions, labeled P1 and P2, are separated by a straight line. The separation line is determined by a start point S and an end point E, both located on different borders of the current block. For the continuous signal space 200 shown in FIG. 2A, the separation line can be described by a straight-line equation. FIG. 2B illustrates the partitioning of the discrete sample space 230. Here, the block consists of an array of uB x vB samples, where uB denotes the block width and vB denotes the block height. The start sample labeled S, the end sample labeled E, and the thick line between them correspond to boundary samples. Although the separation line can still be described by a line equation, the definition of regions P1 and P2 is different in the discrete sample space 230: only complete samples can be part of either of the two regions. In order to employ wedgelet block partitioning in the coding process, the partition information is stored in the form of a partition pattern. This pattern consists of an array of size uB x vB, where each element contains the binary information whether the corresponding sample belongs to region P1 or P2. Regions P1 and P2 may be represented by white and black samples, respectively, in block 250 shown in FIG. 2C. In one embodiment, the white samples in region P1 are assigned a partition value of 1 and the black samples in region P2 a partition value of 0, or vice versa.
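The binary wedgelet pattern described above can be sketched with a simple line-side test. Note that 3D-HEVC actually obtains these patterns from a pre-computed lookup table, so this cross-product formulation is only an illustration of the binary pattern, not the normative derivation:

```python
# Hedged sketch: assign each sample of a uB x vB block to P1 or P2
# depending on which side of the line through start point S and end
# point E it falls (sign of the 2-D cross product).

def wedge_pattern(uB, vB, S, E):
    sx, sy = S
    ex, ey = E
    pattern = []
    for y in range(vB):
        row = []
        for x in range(uB):
            # Sign of the cross product gives the side of line S -> E.
            side = (ex - sx) * (y - sy) - (ey - sy) * (x - sx)
            row.append(1 if side > 0 else 0)  # 1 -> P1, 0 -> P2 (or vice versa)
        pattern.append(row)
    return pattern

pat = wedge_pattern(4, 4, (0, 0), (3, 3))  # diagonal separation line
for row in pat:
    print(row)
```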
FIGS. 3A to 3C show an embodiment of a contour partitioning scheme. An irregular line separating two regions cannot be easily described by a geometric function. In particular, FIG. 3A shows a continuous signal space 310, in which the two regions P1 and P2 may be of arbitrary shape. FIG. 3B shows the transition from the continuous signal space 310 to a discrete sample space 330. FIG. 3C shows the partition pattern of block 350, in which the binary information is represented as white and black samples. A region, e.g., P2, may even consist of multiple disconnected parts. Apart from the partition pattern, contour partitioning has properties very similar to wedgelet partitioning. To use contour block partitioning in the coding process, however, the partition pattern is derived for each block from the signal of a reference block (e.g., as shown in FIG. 3C). Due to the lack of a functional description of the region separation line, contour partitioning involves no search for a best-matching partition and no pattern lookup table.
FIG. 4A illustrates the partitioning of a block 400 into two regions P1 and P2. FIG. 4B shows the three CPV types along the dashed line A-A, namely the original, predicted, and delta CPVs. Whether wedgelet or contour partitioning is used, the second type of information required for modeling the signal of a depth block is the CPV of each of the two regions. The CPV coding method is identical for all four modes described above, as it does not distinguish between partition types but simply assumes that a partition pattern is given for the current depth block. FIG. 4B illustrates that the original CPVs are calculated as the mean value of the samples covered by the corresponding region. Although the original CPVs yield the best approximation for a given partition, they are not available at the decoder, since the original samples are not transmitted. The predicted CPVs, on the other hand, do not require the original samples at the decoder; instead, they can be derived from information available at the decoder, namely neighboring samples located in the left neighboring block and/or the above neighboring block. For example, FIG. 4A illustrates that the top reference sample row or the left reference sample column includes a first sample, a last sample, and a middle sample. When predicting the DC value of a partition, two of these three samples (from the top row or the left column) may be selected, depending on the case, to generate one DC prediction (using the reference samples as the mean of the partition). Note that no more than two samples are required to compute a DC prediction at a time.
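As a sketch of the "original CPV" computation described above, i.e., the mean of the samples covered by each region (function and variable names are ours, not from the 3D-HEVC text):

```python
# Hedged sketch: compute the original CPV of each region as the average
# of the depth samples that the region covers.

def original_cpvs(block, pattern):
    """block, pattern: equal-sized 2-D lists; pattern holds 0/1 region ids."""
    sums = [0, 0]
    counts = [0, 0]
    for row_b, row_p in zip(block, pattern):
        for sample, region in zip(row_b, row_p):
            sums[region] += sample
            counts[region] += 1
    # Integer mean per region; None if a region covers no samples.
    return [s // c if c else None for s, c in zip(sums, counts)]

block = [[10, 10], [10, 50]]
pattern = [[0, 0], [0, 1]]             # bottom-right sample is region 1
print(original_cpvs(block, pattern))   # [10, 50]
```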
FIG. 5A shows an exemplary wedgelet partition 500, in which the two parts of the partition, denoted P0 and P1, are separated by a thick straight line. FIG. 5B shows an exemplary contour partition 530, in which the two parts P0 and P1 are separated by a non-straight line. In 3D-HEVC, partition-based methods, including DMM, SDC_DMM1, and chain coding, are applied to depth map intra coding. In a partition-based depth map intra coding method, a depth block may be divided into two parts or regions. Each part, partition, or region in FIGS. 5A and 5B is represented by a single DC value (denoted DC0 or DC1). The DC value of each partition is predicted from one or two reconstructed neighboring reference samples. Further, each partition (P0 and P1) can be further compensated by a delta DC/residual value. The residual values are signaled in the bitstream.
The choice of which reference samples to use for intra prediction may depend on the partition pattern (e.g., the location of the boundary line). FIGS. 6A to 6D show four different partition patterns 600, 630, 650, and 670, respectively. The partition patterns 600, 630, 650, and 670 are also sometimes referred to as cases 1 to 4, respectively. The main difference between these four patterns or cases lies in the partition values of the three corner samples of the current block, namely the top-left, top-right, and bottom-left corner samples. Specifically, in FIG. 6A, all corner samples 602, 604, and 606 are located in the same partition P0. In FIG. 6B, corner sample 632 is located in partition P0, while corner samples 634 and 636 are located in partition P1. In FIG. 6C, corner samples 652 and 654 are located in partition P0, while corner sample 656 is located in partition P1. In FIG. 6D, corner samples 672 and 676 are located in partition P0, while corner sample 674 is located in partition P1. The current block may be of any suitable size; for example, the current block used in FIGS. 6A to 6D is 8 x 8, with its sample values denoted c[x][y], where x = 0..7 and y = 0..7.
In FIGS. 6A to 6D, at most two reconstructed neighboring reference samples may be used to predict each of the partitions P0 and P1. In general, to optimize the accuracy of depth map intra prediction, the reference sample or samples are selected to lie as close as possible to as many samples of the partition as possible. Assume that the reconstructed neighboring reference samples are denoted p[i][j], where i = -1, j = -1..7 and i = 0..7, j = -1. Assume further that the partition value of the top-left corner sample (c[0][0]) is X, where X is 0 or 1, and that the partition pattern is bPattern[x][y], where x = 0..N-1, y = 0..N-1, N denotes the size of the current block, and B denotes the sample bit depth. Based on these assumptions, the predicted DC values (e.g., DC_X and DC_(1-X)) can be obtained by the following operations:
Set bT = (bPattern[0][0] != bPattern[N-1][0]) ? 1 : 0, and set bL = (bPattern[0][0] != bPattern[0][N-1]) ? 1 : 0
If bT is equal to bL:
DC_X = (p[-1][0] + p[0][-1]) >> 1 (1)
DC_(1-X) = bL ? (p[-1][N-1] + p[N-1][-1]) >> 1 : 2^(B-1) (2)
Otherwise:
DC_X = bL ? p[(N-1)>>1][-1] : p[-1][(N-1)>>1] (3)
DC_(1-X) = bL ? p[-1][N-1] : p[N-1][-1] (4)
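The operations above, equations (1)-(4), can be sketched in Python under the stated assumptions (an illustrative rendering, not the normative 3D-HEVC text; neighboring samples are held in a dict keyed by (x, y), and the function name is ours):

```python
def predict_dc(p, bPattern, N, B=8):
    """Sketch of equations (1)-(4): derive the two partition DC predictors.

    p        -- dict of reconstructed neighboring samples, e.g. p[(-1, 0)]
    bPattern -- binary partition pattern, indexed bPattern[x][y]
    Returns (DC_X, DC_{1-X}) with X = bPattern[0][0].
    """
    bT = 1 if bPattern[0][0] != bPattern[N - 1][0] else 0  # top-right corner differs
    bL = 1 if bPattern[0][0] != bPattern[0][N - 1] else 0  # bottom-left corner differs
    if bT == bL:
        dc_x = (p[(-1, 0)] + p[(0, -1)]) >> 1                                   # (1)
        dc_1x = ((p[(-1, N - 1)] + p[(N - 1, -1)]) >> 1) if bL else 1 << (B - 1)  # (2)
    else:
        dc_x = p[((N - 1) >> 1, -1)] if bL else p[(-1, (N - 1) >> 1)]           # (3)
        dc_1x = p[(-1, N - 1)] if bL else p[(N - 1, -1)]                        # (4)
    return dc_x, dc_1x

# Case 1 (FIG. 6A): all three corner samples in the same partition, so
# bT == bL == 0 and DC_{1-X} falls back to the mid-range default 2^(B-1) = 128.
pattern = [[0] * 4 for _ in range(4)]
p = {(-1, 0): 100, (0, -1): 104}
print(predict_dc(p, pattern, 4))   # (102, 128)
```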
Based on the operations listed above, those skilled in the art will appreciate that FIG. 6A uses reference samples p[-1][0] and p[0][-1] to predict partition P0 (note that partition P1 lacks a reference sample). FIG. 6B uses reference samples p[-1][0] and p[0][-1] to predict partition P0, and reference samples p[-1][7] and p[7][-1] to predict partition P1. FIG. 6C uses reference sample p[3][-1] to predict partition P0 and reference sample p[-1][7] to predict partition P1. FIG. 6D uses reference sample p[-1][3] to predict partition P0 and reference sample p[7][-1] to predict partition P1. Referring to FIG. 6A, since partition P1 has no directly neighboring samples, the conventional method uses a preset DC value to predict partition P1. As a result, partition P1 in FIG. 6A does not employ any neighboring reference sample. Embodiments disclosed herein address this problem by extending the selection of reference samples to pixels located in extended neighboring blocks (e.g., coding units). In short, when performing DC prediction for a DMM partition or a region boundary chain (RBC) partition, the reference pixel used for predicting the DC value of each partition may be selected from the extended pixels of the left reference block or the extended pixels of the above reference block.
FIG. 7A illustrates an embodiment of a depth map intra prediction scheme 700 for selecting reference samples to predict partition P1 in FIG. 6A. As described above, to predict the top-left part of the current block 710 (e.g., partition P0, shown as white samples), the scheme 700 uses reference pixels from the above neighboring block 720 and the left neighboring block 730, namely reference pixel 722 (e.g., p[-1][0]) and reference pixel 732 (e.g., p[0][-1]). Further, based on the availability of neighboring reference pixels, the scheme 700 predicts the bottom-right region (e.g., partition P1, shown as black samples) using extended neighboring blocks. In one embodiment, if the current block is of size N, the scheme 700 first checks whether pixel 742 (e.g., p[-1][2*N-1]) and pixel 752 (e.g., p[2*N-1][-1]) are available. Pixel 742 is located at the bottom-right corner of the bottom-left neighboring block 740, and pixel 752 is located at the bottom-right corner of the top-right neighboring block 750. As shown in FIG. 7A, both blocks 740 and 750 are diagonal neighbors of the current block 710.
If both pixels 742 and 752 are available, the scheme 700 calculates the absolute value of the difference between pixels 742 and 722, denoted abs(p[-1][2*N-1] - p[-1][0]). Similarly, the scheme 700 calculates the absolute value of the difference between pixels 752 and 732, denoted abs(p[2*N-1][-1] - p[0][-1]). The scheme 700 then compares these two absolute values. If abs(p[-1][2*N-1] - p[-1][0]) is greater than abs(p[2*N-1][-1] - p[0][-1]), pixel 742 (i.e., p[-1][2*N-1]) is used as the DC predictor of partition P1. Otherwise, if abs(p[-1][2*N-1] - p[-1][0]) is less than or equal to abs(p[2*N-1][-1] - p[0][-1]), pixel 752 (i.e., p[2*N-1][-1]) is used as the DC predictor of partition P1. Further, if only one of the pixels 742 and 752 is available, the available one is used as the reference pixel. If neither pixel 742 nor pixel 752 is available, they are padded from the closest available neighboring pixel.
In one embodiment, the DC estimation algorithm described in scheme 700 is expressed as the following operation:
If p[-1][2*N-1] and p[2*N-1][-1] are both available:
DC_(1-X) = abs(p[-1][2*N-1] - p[-1][0]) > abs(p[2*N-1][-1] - p[0][-1]) ? p[-1][2*N-1] : p[2*N-1][-1] (5)
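Scheme 700, including equation (5) and the availability fallbacks described above, can be sketched as follows (illustrative names, not from the 3D-HEVC text; missing dict keys model unavailable samples):

```python
# Hedged sketch of scheme 700 / equation (5): pick the extended reference
# sample used as the DC predictor of partition P1 in case 1.

def select_p1_reference(p, N):
    bl = p.get((-1, 2 * N - 1))   # bottom-right of the bottom-left neighbor
    tr = p.get((2 * N - 1, -1))   # bottom-right of the top-right neighbor
    if bl is not None and tr is not None:
        vert = abs(bl - p[(-1, 0)])   # difference along the left column
        horz = abs(tr - p[(0, -1)])   # difference along the top row
        return bl if vert > horz else tr          # equation (5)
    if bl is not None:                # only one available: use it
        return bl
    if tr is not None:
        return tr
    return None   # neither available: pad from nearest available neighbor

# Near-horizontal boundary (FIG. 7B): the left-column sample tracks P1.
N = 8
p = {(-1, 0): 50, (0, -1): 52, (-1, 2 * N - 1): 200, (2 * N - 1, -1): 54}
print(select_p1_reference(p, N))   # 200
```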
FIGS. 7B and 7C further explain the principle behind operation (5) with reference to sample selection schemes 770 and 790, respectively. FIGS. 7B and 7C are similar to FIG. 7A, both belonging to case 1, wherein three of the four corner pixels of the current block belong to the same partition P0. FIG. 7B shows the extreme case of an almost horizontal boundary between partitions P0 and P1. In FIG. 7B, sample 772 (p[0][-1]) is adjacent to the top-left partition P0, and sample 774 (p[15][-1]) is adjacent to partition P0 but far from the bottom-right partition P1, while sample 778 is relatively close to partition P1. Since partition boundaries are typically used to reflect sharp edges, samples on the same side of boundary line 771 may have similar luminance/chrominance values, while samples on opposite sides of boundary line 771 may have quite different values. Therefore, judging from the direction of boundary 771, the value of p[15][-1] is likely similar to that of p[0][-1], whereas the value of p[-1][15] is likely different from that of p[-1][0]. Since reference sample 778 (i.e., p[-1][15], co-located with pixel 742 in FIG. 7A) lies on the same side of boundary line 771 as partition P1, p[-1][15] should be selected to predict the DC value of partition P1. Equation (5) captures this principle by computing and comparing the absolute differences abs(p[-1][15] - p[-1][0]) and abs(p[15][-1] - p[0][-1]), because in FIG. 7B abs(p[-1][15] - p[-1][0]) is most likely greater than abs(p[15][-1] - p[0][-1]).
Similarly, FIG. 7C shows the other extreme case, an almost vertical boundary between partitions P0 and P1. Since partition boundaries are typically used to reflect sharp edges, samples on the same side of boundary line 791 may have similar luminance/chrominance values, while samples on opposite sides of boundary line 791 may have quite different values. Thus, the value of p[15][-1] is likely different from that of p[0][-1], whereas the value of p[-1][15] is likely similar to that of p[-1][0]. Since the reference sample p[15][-1] (co-located with sample 752 in FIG. 7A) lies on the same side of boundary line 791 as partition P1, p[15][-1] should be selected to predict the DC value of partition P1. Equation (5) captures this principle by computing and comparing the absolute differences abs(p[-1][15] - p[-1][0]) and abs(p[15][-1] - p[0][-1]), because in FIG. 7C abs(p[15][-1] - p[0][-1]) is most likely greater than abs(p[-1][15] - p[-1][0]).
As described above with respect to FIG. 6A, schemes 700, 770, and 790 apply to case 1, in which the three corner samples 602, 604, and 606 all belong to the same partition P0. In this case, equation (5) can be equivalently expressed as three separate operations:
vertAbsDiff=Abs(p[-1][0]-p[-1][nTbS*2-1]) (6)
horAbsDiff=Abs(p[0][-1]-p[nTbS*2-1][-1]) (7)
dcValBR=(horAbsDiff>vertAbsDiff)?p[nTbS*2-1][-1]:p[-1][nTbS*2-1] (8)
Those skilled in the art will appreciate that the variables used in operations (6)-(8) are specified by the HEVC standard, which is incorporated herein by reference. According to the depth partition value derivation and assignment process in the HEVC standard, the following operations and procedures may be used to encode the current block. Note that the inputs to the encoding process are: (1) the neighboring samples p[x][y], with x = -1, y = -1..nTbS*2-1 and x = 0..nTbS*2-1, y = -1; (2) a binary array partitionPattern[x][y], with x, y = 0..nTbS-1, specifying a partitioning of the prediction block into partition 0 and partition 1; (3) a sample location (xTb, yTb) specifying the top-left sample of the current block relative to the top-left sample of the current picture; and (4) the variable nTbS, specifying the transform block size. The output of this encoding process is the predicted samples predSamples[x][y], with x, y = 0..nTbS-1.
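As a concrete illustration of operations (6)-(8), the following Python sketch selects the bottom-right DC value from the extended reference row and column. The names p_top, p_left, and dc_val_br are illustrative, not from the standard text; the standard indexes a single neighboring-sample array p[x][y].

```python
def dc_val_br(p_top, p_left, nTbS):
    """Sketch of operations (6)-(8): pick the DC value for the bottom-right
    partition from the extended reference row/column.

    p_top[x]  plays the role of p[x][-1], x = 0..nTbS*2-1 (row above the block)
    p_left[y] plays the role of p[-1][y], y = 0..nTbS*2-1 (column left of the block)
    """
    vert_abs_diff = abs(p_left[0] - p_left[nTbS * 2 - 1])  # operation (6)
    hor_abs_diff = abs(p_top[0] - p_top[nTbS * 2 - 1])     # operation (7)
    # Operation (8): take the far reference on the side with the larger jump,
    # i.e. the side that the partition boundary crosses.
    if hor_abs_diff > vert_abs_diff:
        return p_top[nTbS * 2 - 1]
    return p_left[nTbS * 2 - 1]
```

For the almost-horizontal boundary of Fig. 7B, the left column jumps from P0-like to P1-like values, so the function returns the bottom-left reference, matching the selection of p[-1][15] described above.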
In one embodiment, the variables vertEdgeFlag and horEdgeFlag may be derived as described below:
vertEdgeFlag=(partitionPattern[0][0]!=partitionPattern[nTbS-1][0]) (I-61)
horEdgeFlag=(partitionPattern[0][0]!=partitionPattern[0][nTbS-1]) (I-62)
the variables dcValBR and dcValLT may be derived as described below:
if vertEdgeFlag is equal to horEdgeFlag, the following applies:
The variable dcValBR can be derived as explained below:
if horEdgeFlag is equal to 1, the following applies:
dcValBR=((p[-1][nTbS-1]+p[nTbS-1][-1])>>1)(I-63)
otherwise, (horEdgeFlag equal to 0), the following applies:
vertAbsDiff=Abs(p[-1][0]-p[-1][nTbS*2-1]) (I-64)
horAbsDiff=Abs(p[0][-1]-p[nTbS*2-1][-1]) (I-65)
dcValBR=(horAbsDiff>vertAbsDiff)?p[nTbS*2-1][-1]:p[-1][nTbS*2-1] (I-66)
The variable dcValLT can be derived as described below:
dcValLT=(p[-1][0]+p[0][-1])>>1 (I-67)
Otherwise, (horEdgeFlag does not equal vertEdgeFlag), the following applies:
dcValBR=horEdgeFlag?p[-1][nTbS-1]:p[nTbS-1][-1] (I-68)
dcValLT=horEdgeFlag?p[(nTbS-1)>>1][-1]:p[-1][(nTbS-1)>>1] (I-69)
The prediction sample value predSamples [ x ] [ y ] can be derived as follows:
For x in this range 0 to (nTbS-1), the following applies:
For y in this range of 0 to (nTbS-1), the following applies:
The variables predDcVal and dcOffset can be derived as explained below:
predDcVal=(partitionPattern[x][y]==partitionPattern[0][0])? dcValLT:dcValBR (I-70)
dcOffset=DcOffset[xTb][yTb][partitionPattern[x][y]] (I-71)
If DltFlag[nuh_layer_id] is equal to 0, the following applies:
predSamples[x][y]=predDcVal+dcOffset (I-72)
Otherwise (DltFlag[nuh_layer_id] is equal to 1), the following applies:
predSamples[x][y]=Idx2DepthValue[Clip1Y(DepthValue2Idx[predDcVal]+dcOffset)] (I-73)
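The derivation (I-61) through (I-69) above can be sketched as follows, assuming DltFlag equal to 0 and representing the neighboring samples as a dictionary keyed by (x, y). The function name and the container choice are illustrative, not from the standard.

```python
def derive_dc_values(p, partition_pattern, nTbS):
    """Sketch of derivations (I-61)-(I-69).

    p maps (x, y) -> sample value, covering the extended neighboring row
    (y = -1) and column (x = -1); partition_pattern is an nTbS x nTbS
    list of lists of 0/1 values, indexed [x][y] as in the text.
    """
    vert_edge_flag = partition_pattern[0][0] != partition_pattern[nTbS - 1][0]  # (I-61)
    hor_edge_flag = partition_pattern[0][0] != partition_pattern[0][nTbS - 1]   # (I-62)

    if vert_edge_flag == hor_edge_flag:
        if hor_edge_flag:
            # Boundary crosses both the top row and the left column: (I-63)
            dc_val_br = (p[-1, nTbS - 1] + p[nTbS - 1, -1]) >> 1
        else:
            # Case 1: compare the jumps along each side, (I-64)-(I-66)
            vert_abs_diff = abs(p[-1, 0] - p[-1, nTbS * 2 - 1])
            hor_abs_diff = abs(p[0, -1] - p[nTbS * 2 - 1, -1])
            dc_val_br = (p[nTbS * 2 - 1, -1] if hor_abs_diff > vert_abs_diff
                         else p[-1, nTbS * 2 - 1])
        dc_val_lt = (p[-1, 0] + p[0, -1]) >> 1                                  # (I-67)
    else:
        dc_val_br = p[-1, nTbS - 1] if hor_edge_flag else p[nTbS - 1, -1]       # (I-68)
        dc_val_lt = (p[(nTbS - 1) >> 1, -1] if hor_edge_flag
                     else p[-1, (nTbS - 1) >> 1])                               # (I-69)
    return dc_val_lt, dc_val_br
```

Each sample is then predicted as dcValLT or dcValBR according to (I-70), depending on which partition it falls in.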
It is noted that the reference pixels that can be used for DC prediction are not limited to the positions shown in Figs. 7A-7C. A weighted combination of any reference pixels can be used for DC prediction. A preferred combination is given below:
DC = a1*p[x1][y1] + a2*p[x2][y2] + a3*p[x3][y3] + ... + an*p[xn][yn],
where a1...an are weighting coefficients, p[x1][y1]...p[xn][yn] are neighboring reference pixels, and DC is the predicted value of the partition.
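This weighted combination can be sketched as follows (a minimal illustration; the weights are assumed to sum to 1):

```python
def weighted_dc(refs, weights):
    """Weighted-combination DC predictor: refs are the neighboring reference
    sample values p[x1][y1]..p[xn][yn], weights the coefficients a1..an
    (assumed to sum to 1 so the result stays in the sample range)."""
    return sum(a * r for a, r in zip(weights, refs))
```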
In one embodiment, p[-1][N-1] and p[N-1][-1] are used by the following algorithm to predict the lower-right partition P1:
DC_right-bottom = a1*p[-1][N-1] + a2*p[N-1][-1],
where a1 and a2 are weighting coefficients.
In this embodiment, reference pixels in the top-right and bottom-left blocks can be used for DC prediction in any of the cases in Figs. 6A-6D (not limited to case 1). A preferred algorithm is given below:
DC = a1*p[x1][y1] + a2*p[x2][y2] + a3*p[x3][y3] + ... + aM*p[xM][yM],
where xM and yM may be larger than the block size N.
Referring to case 3 in Fig. 6C, in one embodiment, a weighted average of p[-1][0], p[(N-1)>>1][-1], p[0][-1], and p[N-1][-1] may be used according to the following algorithm to predict the DC value of the upper partition P0:
DC_up = a1*p[-1][0] + a2*p[(N-1)>>1][-1] + a3*p[0][-1] + a4*p[N-1][-1],
where a1, a2, a3, and a4 are weighting coefficients.
For example, p[N-1][-1] and p[N][-1] are used according to the following algorithm to predict the DC value for case 4 in Fig. 6D:
DC_right = a1*p[N][-1] + a2*p[N-1][-1].
In addition to scheme 700, alternative schemes may be used to select one or more best reference pixels to predict the partition P1. Figs. 8A and 8B illustrate embodiments of reference sample selection schemes 800 and 850, respectively. Although the partition patterns in Figs. 8A and 8B do not look the same, they still belong to case 1 (as in Fig. 6A). They share a common feature: assuming a block size of N x N, the top-right sample (c[N-1][0]) has the same partition value as the top-left sample (c[0][0]), and the bottom-left sample (c[0][N-1]) likewise has the same partition value as the top-left sample (c[0][0]). Accordingly, in one embodiment, the DC value of the samples in the contour partition P1 (the black samples in Figs. 8A and 8B) can be determined in the same manner as for the lower-right partition P1 in Fig. 6A. However, the special features of the partition patterns 800 and 850 make other reference pixels better DC reference samples.
Fig. 8A shows a sub-case (Case 1a) of case 1, in which c[0][0], c[7][0], and c[0][7] have the same partition value, but that value differs from the partition value of c[3][0]. Fig. 8B shows another sub-case (Case 1b) of case 1, in which c[0][0], c[7][0], and c[0][7] have the same partition value, but that value differs from the partition value of c[0][3]. In practice, if the partition value of any top-row sample between c[0][0] and c[N-1][0] differs from the binary partition value of c[0][0], as shown in Fig. 8A, it means the partition pattern is predicted according to the DMM4 mode (typically, DMM1-DMM4 are all available for case 1). Similarly, if the partition value of any left-column sample between c[0][0] and c[0][N-1] differs from that of c[0][0], as shown in Fig. 8B, the partition pattern is also predicted according to the DMM4 mode. In view of the differences between Figs. 8A-8B and Fig. 6A, a reference pixel between p[0][-1] and p[N][-1], or between p[-1][0] and p[-1][N] (e.g., reference pixel p[3][-1] or p[-1][3]), can provide a better DC prediction value. In one embodiment of scheme 800, the DC value of the samples in the contour partition P1 is set to the value of a middle-top reference pixel, p[(N-1)>>1][-1] or p[N>>1][-1]. Similarly, in one embodiment of scheme 850, the DC value of partition P1 is set to the value of a middle-left reference pixel, p[-1][(N-1)>>1] or p[-1][N>>1].
In one embodiment, if the partition value of the middle-top sample (c[(N-1)>>1][0]) in the current block differs from that of the top-left sample (c[0][0]), as shown in Fig. 8A, the middle-top reference pixel (p[(N-1)>>1][-1]) is selected as the DC prediction value of the contour partition P1. Alternatively, if the partition value of the middle-left sample (c[0][(N-1)>>1]) in the left column of the current block differs from that of the top-left sample (c[0][0]), as shown in Fig. 8B, the middle-left reference pixel (p[-1][(N-1)>>1]) is selected as the DC value of the contour partition P1. Although not shown in Fig. 8A or 8B, if the partition value of the middle-left sample (c[0][(N-1)>>1]) is the same as that of the middle-top sample (c[(N-1)>>1][0]), the average of the middle-left reference pixel (p[-1][(N-1)>>1]) and the middle-top reference pixel (p[(N-1)>>1][-1]) can be used as the DC prediction value of the contour partition P1.
The embodiment of the DC estimation algorithm for case 1 above can be described by the following operations:
Set bTM = (bPattern[0][0] != bPattern[(N-1)>>1][0]) ? 1 : 0, and (9)
Set bLM = (bPattern[0][0] != bPattern[0][(N-1)>>1]) ? 1 : 0. (10)
If bTM is not equal to bLM,
DC_right-bottom = bLM ? p[-1][(N-1)>>1] : p[(N-1)>>1][-1]; (11)
otherwise,
DC_right-bottom = (p[-1][(N-1)>>1] + p[(N-1)>>1][-1]) >> 1. (12)
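Operations (9)-(12) can be sketched as follows (the names b_pattern and dc_right_bottom are illustrative; p maps (x, y) coordinates to neighboring reference samples):

```python
def dc_right_bottom(p, b_pattern, N):
    """Sketch of operations (9)-(12) for the case 1 sub-cases of Figs. 8A/8B.

    b_pattern[x][y] holds the binary partition values of the current block;
    p maps (x, y) to neighboring reference samples.
    """
    mid = (N - 1) >> 1
    bTM = 1 if b_pattern[0][0] != b_pattern[mid][0] else 0  # (9): top row changes
    bLM = 1 if b_pattern[0][0] != b_pattern[0][mid] else 0  # (10): left column changes
    if bTM != bLM:
        # (11): take the middle reference on the side where the change occurs
        return p[-1, mid] if bLM else p[mid, -1]
    # (12): otherwise average the two middle references
    return (p[-1, mid] + p[mid, -1]) >> 1
```

For the pattern of Fig. 8A (change along the top row only), the function returns the middle-top reference p[(N-1)>>1][-1], as described above.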
Case 1 shown in Fig. 6A may include another case (denoted as case 2), in which the partition value of the middle-left sample (c[0][(N-1)>>1]) is the same as that of the middle-top sample (c[(N-1)>>1][0]), and both c[0][(N-1)>>1] and c[(N-1)>>1][0] are equal to c[0][0]. In another embodiment of this case, the method described above (e.g., combining operation (5)) is still used to predict the lower-right partition P1.
As described above, any of the 35 intra prediction modes shown, as well as the new depth intra prediction modes, may be used for intra prediction of the 3D-HEVC depth map. When the decoder performs intra depth map prediction, decoded boundary samples of neighboring blocks are used as reference data for spatial prediction in regions where inter prediction is not performed. All TUs within a PU may use the same associated intra prediction mode for the luma component and the two chroma components. The encoder selects the best luma intra prediction mode for each PU from the 35 options, namely the 33 directional prediction modes, 1 DC mode, and 1 planar mode shown in Fig. 1. For the luma component, neighboring reference samples may be filtered prior to the intra prediction process. The filtering is controlled by the given intra prediction mode and/or the transform block size. For example, the filtering rules may be designed such that if the intra prediction mode is DC or the transform block size is equal to 4x4, neighboring samples are not filtered; further, if the distance between the given intra prediction mode and the vertical mode (or horizontal mode) is greater than a predefined threshold, filtering is enabled. Table 1 details an example of predefined thresholds, where nT represents the block (e.g., transform unit) size.
TABLE 1. Predefined thresholds for various transform block sizes

            nT=8   nT=16   nT=32
Threshold      7       1       0
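The filtering rule described above, together with the thresholds of Table 1, can be sketched as follows (a non-normative illustration; mode indices 1 = DC, 10 = horizontal, and 26 = vertical follow the HEVC convention):

```python
def filter_neighbours(intra_mode, nT):
    """Sketch of the mode/size filtering rule described above: no filtering
    for the DC mode or 4x4 blocks; otherwise filter when the mode is farther
    from both the vertical and horizontal modes than the Table 1 threshold."""
    DC_MODE, HOR_MODE, VER_MODE = 1, 10, 26
    thresholds = {8: 7, 16: 1, 32: 0}  # Table 1
    if intra_mode == DC_MODE or nT == 4:
        return False
    dist = min(abs(intra_mode - HOR_MODE), abs(intra_mode - VER_MODE))
    return dist > thresholds[nT]
```

Note how the thresholds loosen with block size: at nT=32 every non-DC mode is filtered, while at nT=8 only modes at least 8 steps from both the horizontal and vertical directions are.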
Any suitable filter design may be used to perform the reference sample filtering. For example, neighboring sample filtering may be performed using [1,2,1] filtering or bilinear filtering. In one embodiment, the bilinear filter may be used if all of the following conditions are satisfied:
strong_intra_smoothing_enable_flag is equal to 1;
the transform block size is equal to 32;
Abs(p[-1][-1]+p[nT*2-1][-1]-2*p[nT-1][-1])<(1<<(BitDepthY-5)); and
Abs(p[-1][-1]+p[-1][nT*2-1]-2*p[-1][nT-1])<(1<<(BitDepthY-5)).
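The four conditions can be checked with a sketch like the following (p maps (x, y) to neighboring samples; the function and argument names are illustrative):

```python
def use_bilinear(p, nT, bit_depth_y, strong_smoothing_enabled):
    """Sketch of the four bilinear-filter conditions listed above: the flag is
    set, the block is 32x32, and both reference edges are nearly linear (the
    second difference at each midpoint is below 1 << (BitDepthY - 5))."""
    if not strong_smoothing_enabled or nT != 32:
        return False
    thresh = 1 << (bit_depth_y - 5)
    top_flat = abs(p[-1, -1] + p[nT * 2 - 1, -1] - 2 * p[nT - 1, -1]) < thresh
    left_flat = abs(p[-1, -1] + p[-1, nT * 2 - 1] - 2 * p[-1, nT - 1]) < thresh
    return top_flat and left_flat
```

A perfectly linear ramp of reference samples has a zero second difference, so it passes the flatness tests and selects the bilinear filter.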
The filtering process for the above-mentioned neighboring samples may be performed in depth map intra coding. Embodiments of reference sample filtering disclosed herein (including below) may enable more accurate DC prediction and may improve depth map coding performance. In an embodiment, whether filtering or smoothing is applied to reference pixels for depth map intra prediction depends on the intra prediction mode (e.g., direction or angle) and/or the block size. For example, if the width and height of a PU are less than or equal to 8 (i.e., PU_width/height <= 8), filtering of the neighboring samples may be skipped to maintain sharp edges in the neighboring samples. PU_width/height is a variable indicating the PU/block size, representing either width or height. If the width and height values of a block are not equal, the block is not square; otherwise, if the width and height values of a block are equal, the block is square.
The PU width/height used for making reference sample filtering decisions is not limited to PU_width/height <= 8. That is, any predefined range of PU widths/heights (e.g., a <= PU_width/height <= b; or PU_width/height <= a and b <= PU_width/height, where a and b are integers and b > a) can be used to make the reference sample filtering decision. For example, if a = 4 and b = 16, the filtering rule may be: if 4 <= PU_width/height <= 16, no filtering is applied. For another example, if a = 4 and b = 16, the filtering rule may be: if PU_width/height <= 4 or 16 <= PU_width/height, no filtering is applied. For another example, if a = 4 and b = 8, the filtering rule may be: if 4 <= PU_width/height <= 8, no filtering is applied.
In one embodiment, a specific intra prediction mode is preset to always use filtered reference samples for intra prediction. For example, the planar mode and the DC mode may always use filtered reference samples for intra prediction. Optionally, the filtering rule may be designed such that if the index of the intra prediction mode is within a certain predefined range, the filtering process for the reference pixels can be skipped. For example, if the intra prediction mode ID is 0 or 1, the reference pixel filtering process may be skipped. In another embodiment, the reference sample filtering decision depends on a combination of PU width/height and intra mode. For example, if 4 <= PU_width/height <= 16 and the current prediction mode is the planar or DC mode, reference sample filtering may be skipped.
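The combined PU-size/mode example above can be sketched as follows (a and b are the configurable bounds from the text; the function name is illustrative):

```python
def skip_depth_reference_filtering(pu_size, intra_mode, a=4, b=16):
    """Sketch of the combined rule: skip reference-sample filtering when
    a <= PU width/height <= b AND the mode is planar (0) or DC (1)."""
    PLANAR, DC = 0, 1
    return a <= pu_size <= b and intra_mode in (PLANAR, DC)
```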
Furthermore, the choice of filter for smoothing may be mode-dependent. That is, not all intra prediction modes need to use the same filter. Any suitable filter coefficients may be used for neighboring sample filtering. In other words, a weighted combination of any reference pixels can be used for neighboring sample filtering for depth map intra prediction. A preferred combination is given below:
P[x3][y3] = a1*p[x1][y1] + a2*p[x2][y2] + a3*p[x3][y3] + ... + an*p[xn][yn],
where P[x3][y3] is the filtered reference pixel, p[x1][y1], p[x2][y2], p[x3][y3], ..., p[xn][yn] are the reference pixels before filtering, and a1, a2, a3, ..., an are coefficients summing to 1.
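As one concrete instance of such a weighted combination, a [1,2,1] smoothing of a row (or column) of reference samples, with coefficients 1/4, 2/4, 1/4 and integer rounding, can be sketched as:

```python
def smooth_reference_row(samples):
    """[1,2,1] smoothing of a row/column of reference samples, one example of
    the weighted combinations above (coefficients sum to 1: 1/4 + 2/4 + 1/4).
    End samples are left unfiltered, as there is no neighbor on one side."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        # +2 implements round-to-nearest before the >>2 division by 4
        out[i] = (samples[i - 1] + 2 * samples[i] + samples[i + 1] + 2) >> 2
    return out
```

A linear ramp passes through unchanged (up to rounding), while an isolated spike is halved, which is the smoothing behavior intended for natural video but the reason the text skips filtering on sharp depth edges.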
In one embodiment, if multiple reference pixels are selected as the prediction values for a block partition, a simple pixel filter, such as a filter with coefficients [1,1] or [1,2,1], may be applied to the multiple reference pixels. In addition, for the reference pixel filtering decision of an intra prediction mode, the reference pixels are not limited to a single row and a single column of the neighboring blocks. Multiple rows, columns, and combinations thereof from one or more neighboring reference blocks may be used for intra prediction. A preferred combination is given below:
P[x3][y3] = a1*p[x1][y1] + a2*p[x2][y2] + a3*p[x3][y3] + ... + aM*p[xM][yM],
where P[x3][y3] is the filtered reference pixel, and xM and yM may be less than -1, indicating more than one row and column of the neighboring reference block.
Further, the decision whether to employ smoothing may be based on a combination of the intra prediction mode, the PU/block size, and a smoothing flag. In one embodiment, a flag indicating the filter on/off (or enabled/disabled) state is transmitted in a bitstream. For example, setting the flag to "0" indicates that filtering is off or disabled, and setting the flag to "1" indicates that filtering is on or enabled, or vice versa. The flag may be transmitted in block-level syntax or picture-layer syntax. Alternatively, the signaling flag may be implicitly derived, in which case no flag bit is needed in the bitstream. For example, if reference pixel filtering is turned off for depth maps in the encoder and/or decoder (codec), no flag needs to be transmitted in the picture-layer or block syntax. Otherwise, if reference pixel filtering is on, the flag may be inferred from certain PU characteristics, such as the PU size and/or the intra prediction mode used by the PU. For example, the depth reference pixel filtering flag may be inferred to be on or enabled when the intra prediction mode is a directional prediction mode (indices 2-34), and off or disabled in a non-directional mode (the planar and DC modes).
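The implicit-derivation example above can be sketched as follows (mode indices 0 = planar, 1 = DC, 2-34 = directional, per the HEVC convention; the function name is illustrative):

```python
def depth_filter_flag(intra_mode):
    """Sketch of the implicit derivation example: the depth reference-pixel
    filtering flag is inferred on for directional modes (indices 2-34) and
    off for the non-directional planar (0) and DC (1) modes."""
    return 2 <= intra_mode <= 34
```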
The partition value that can be used to check whether a block partition is a contour partition is not limited to the middle-top partition value (c[(N-1)>>1][0]) or the middle-left partition value (c[0][(N-1)>>1]); any other suitable samples and weighted combinations thereof may be used to determine whether a block partition is a contour partition. For example, if in Fig. 6A the partition value of c[(N-1)>>2][0] is not equal to that of c[0][0], the block partition P1 is classified as a contour partition.
In one embodiment, the partition values can be used to check not only the contour partition of case 1 in Fig. 6A, but also contour partitions in cases 2-4 in Figs. 6B-6D. DC prediction can then be performed accordingly. For example, if in case 3 of Fig. 6C the middle-top partition value (c[(N-1)>>1][0]) differs from the top-left partition value (c[0][0]), the partition of the block belongs to the contour partition P1. In this situation, the middle-top reference pixel value (p[(N-1)>>1][-1]) should no longer be used as the DC prediction value of the upper partition P0. Instead, the average of the top-right reference pixel (p[N-1][-1]) and the top-left reference pixel (p[0][-1]) may be used as the DC prediction value of the upper partition P0.
Figs. 9A and 9B show simulation results obtained with the reference sample selection and filtering techniques described herein. Specifically, Fig. 9A shows BD-rate results for the 3-view case under the all-intra condition only, and Fig. 9B shows BD-rate results for the 3-view case under the common test conditions (CTC). Based on these results, the described depth map reference pixel selection and filtering techniques can improve coding efficiency under the 3D-HEVC random access (under CTC) and all-intra configurations. For example, as shown in Fig. 9A, the video peak signal-to-noise ratio (PSNR)/total bit rate drops by 0.03% on average, and the synthesized PSNR/total bit rate drops by 0.01% on average. As shown in Fig. 9B, the video PSNR/total bit rate drops by 0.1% on average, and the synthesized PSNR/total bit rate drops by 0.1% on average.
Fig. 10 illustrates an embodiment of a video encoder 1000 that may implement the disclosed intra prediction embodiments (e.g., schemes 700, 770, 790, 800, and 850 described above). It is noted that the disclosed embodiments may also be implemented by other video codecs. The video encoder 1000 includes a rate-distortion optimization (RDO) module 1010, a prediction module 1020, a transform module 1030, a quantization module 1040, an entropy encoder 1050, an inverse quantization module 1060, an inverse transform module 1070, and a reconstruction module 1080, arranged as shown in Fig. 10. In operation, the video encoder 1000 may receive input video comprising a sequence of video frames (or slices). Here, a frame may refer to a predicted frame (P-frame), an intra-coded frame (I-frame), or a bi-predicted frame (B-frame). Similarly, a slice may refer to a P slice, an I slice, or a B slice.
The RDO module 1010 may be used to make logical decisions for one or more other modules. For example, based on one or more previously encoded frames, the RDO module 1010 may determine how a frame (or slice) currently being encoded is partitioned into multiple coding units (CUs), and how a CU is partitioned into one or more prediction units (PUs) and transform units (TUs). CUs, PUs, and TUs are various types of blocks used in HEVC. In addition, the RDO module 1010 may determine how the current frame is to be predicted. The current frame may be predicted by inter- and/or intra-prediction. For intra prediction, where there are multiple prediction modes or directions available in HEVC (e.g., the 35 modes and the DMM modes in Fig. 1), the RDO module 1010 may determine the best mode. For example, the RDO module 1010 may calculate a sum of absolute errors (SAE) for each prediction mode and select the prediction mode that yields the smallest SAE.
The prediction module 1020 may perform inter prediction using one or more reference frames, or intra prediction using reference pixels in the current frame. In an embodiment, the prediction module 1020 is configured to calculate a prediction block for a current block from the input video. The prediction block includes a plurality of prediction samples, and each prediction sample may be generated based on a plurality of reconstructed samples of (decoded) neighboring blocks located to the left of and above the current block.
After the prediction block is generated, the prediction block may be subtracted from the current block, or vice versa, to generate a residual block. The residual block may be fed into the transform module 1030, which transforms the residual samples into a matrix of transform coefficients. The transform may be a two-dimensional orthogonal transform, such as a discrete cosine transform (DCT). The matrix of transform coefficients is then quantized by the quantization module 1040 before being fed into the entropy encoder 1050. The quantization module 1040 may scale the transform coefficients and round them, which may reduce the number of non-zero transform coefficients. In this way, the compression ratio can be increased. The entropy encoder 1050 scans and encodes the quantized transform coefficients into an encoded bitstream. Further, to facilitate continuous coding of blocks, the quantized transform coefficients are also fed into the inverse quantization module 1060 to recover the original scale of the transform coefficients. The inverse transform module 1070 may then perform the inverse operation of the transform module 1030 and generate a noisy version of the original residual block. The residual block is then fed into the reconstruction module 1080, which may generate reconstructed luma and/or chroma samples for future intra prediction. The reconstructed samples may be filtered before they are used for intra prediction, if desired.
Fig. 10 is a simplified depiction of a video encoder and may present only some of the modules of a video encoder. Those skilled in the art will appreciate that other modules (e.g., filters, scanners, and transmitters), although not shown in Fig. 10, may be used to assist in video encoding. Furthermore, depending on the encoding scheme, some modules of the video encoder may be omitted. For example, in lossless encoding of certain video content, no information loss is allowed, and thus the quantization module 1040 and the inverse quantization module 1060 may be omitted. For another example, if the residual block is encoded directly without being transformed into transform coefficients, the transform module 1030 and the inverse transform module 1070 may be omitted. Furthermore, before being transmitted from the encoder, the encoded bitstream may include other information, such as video resolution, frame rate, block partition information (sizes and coordinates), and prediction modes, so that the encoded sequence of video frames can be properly decoded by a video decoder.
Fig. 11 illustrates an embodiment of a video decoder 1100 for implementing the disclosed intra prediction embodiments. The video decoder 1100 corresponds to the video encoder 1000 and includes an entropy decoder 1110, an inverse quantization module 1120, an inverse transform module 1130, a prediction module 1140, and a reconstruction module 1150, arranged as shown in Fig. 11. In operation, the entropy decoder 1110 receives an encoded bitstream containing information on a sequence of video frames and decodes the bitstream into an uncompressed format. A matrix of quantized transform coefficients is then generated and fed into the inverse quantization module 1120. The inverse quantization module 1120 may be the same as or similar to the inverse quantization module 1060 of Fig. 10. The output of the inverse quantization module 1120 is then fed into the inverse transform module 1130, which transforms the coefficients into the residual values of a residual block. In addition, the entropy decoder 1110 also decodes information including the prediction mode (e.g., a directional intra prediction mode) of the current block. Based on the prediction mode, the prediction module 1140 may generate a prediction block for the current block.
In an embodiment, the prediction module 1140 is configured to apply the disclosed reference sample selection and/or filtering embodiments. The prediction block includes a plurality of prediction samples, and each prediction sample may be generated based on a plurality of reconstructed samples located in (decoded) neighboring blocks to the left of and above the current block. After the prediction block for the current block is generated, the reconstruction module 1150 may combine the residual block with the prediction block to generate a reconstructed block. In addition, to facilitate continuous decoding, some samples of the reconstructed block may also be used as reference pixels for intra prediction of future blocks in the same video slice or frame.
Fig. 12 is a flowchart of an embodiment of an intra prediction method 1200 performed by a 3D-HEVC video codec (e.g., the encoder 1000 or the decoder 1100). The method 1200 begins at step 1210, in which a processor or transceiver in the codec receives a plurality of neighboring samples of a current block. The neighboring samples include a first sample located at the bottom-right corner of the bottom-left neighboring block (e.g., sample 742 in Fig. 7A) and a second sample located at the bottom-right corner of the top-right neighboring block (e.g., sample 752 in Fig. 7A). In step 1220, the processor or transceiver receives a partition pattern indicating a partitioning of the current block into partition 0 (P0) and partition 1 (P1). To satisfy case 1 (as in Figs. 6A, 7A-7C, and 8A-8B), partition 0 includes at least the top-right, top-left, and bottom-left corner samples of the current block. In step 1230, the processor selects one of the first sample and the second sample as a reference sample for intra prediction of partition 1.
In one embodiment, the neighboring samples are denoted as p[x][y], with x = -1, y = -1..nTbS*2-1 and x = 0..nTbS*2-1, y = -1, where nTbS indicates the size of the current block. The first and second samples are denoted p[-1][nTbS*2-1] and p[nTbS*2-1][-1], respectively. In step 1230, the reference sample is selected based on a first absolute difference between samples p[-1][0] and p[-1][nTbS*2-1] and a second absolute difference between samples p[0][-1] and p[nTbS*2-1][-1]. Further, the first and second absolute differences may be denoted as vertAbsDiff and horAbsDiff, respectively, and calculated by operations (6) and (7) above. In one embodiment, the reference sample is used as a DC value, denoted dcValBR, calculated by operation (8) above.
The above-described schemes may be implemented on any general-purpose computer system, such as a computer or network component, with sufficient processing power, storage resources, and network throughput capability to handle the necessary workload placed upon it. Fig. 13 shows a schematic diagram of a general-purpose computer system 1300 suitable for implementing one or more embodiments disclosed herein, e.g., the video encoder 1000, the video decoder 1100, and the intra prediction method 1200. The computer system 1300 includes a processor 1302 (which may be referred to as a central processing unit or CPU) that communicates with storage devices including a secondary memory 1304, a read-only memory (ROM) 1306, a random access memory (RAM) 1308, a transmitter/receiver 1310, and input/output (I/O) devices 1312. Although the processor 1302 is illustrated as a single processor, it is not so limited and may comprise multiple processors. The processor 1302 may be implemented as one or more central processing unit (CPU) chips, cores (e.g., multi-core processors), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and/or digital signal processors (DSPs). The processor 1302 may be used to implement any of the schemes described herein, e.g., the schemes 700, 770, 790, 800, and 850 and the method 1200. The processor 1302 may be implemented using hardware, software, or a combination of both. The processor 1302 includes a prediction module 1303, which is similar to the prediction module 1020. The prediction module 1303 may implement the reference sample selection/filtering techniques disclosed herein for depth map intra prediction in 3D-HEVC.
The secondary storage 1304, which typically comprises one or more disk drives or tape drives, is used for non-volatile storage of data and as an overflow data storage device if the RAM 1308 is not large enough to hold all working data. The secondary storage 1304 may be used to store programs that are loaded into the RAM 1308 when such programs are selected for execution. The ROM 1306 is used to store instructions, and perhaps data, that are read during program execution. The ROM 1306 is a non-volatile memory device that typically has a small memory capacity relative to the larger memory capacity of the secondary storage 1304. The RAM 1308 is used to store volatile data and perhaps to store instructions. Access to both the ROM 1306 and the RAM 1308 is typically faster than access to the secondary storage 1304. The secondary storage 1304, the ROM 1306, and/or the RAM 1308 may be non-transitory computer-readable media that do not include transitory, propagating signals. Any of the secondary storage 1304, the ROM 1306, or the RAM 1308 may be referred to as memory, or these modules may be collectively referred to as memory.
Transmitter/receiver 1310 may serve as an output and/or input device for computer system 1300. For example, if transmitter/receiver 1310 is used as a transmitter, it may transmit data out of computer system 1300. If transmitter/receiver 1310 is used as a receiver, it may transmit data into computer system 1300. The transmitter/receiver 1310 may take the form of: modems, modem banks, ethernet cards, Universal Serial Bus (USB) interface cards, serial interfaces, token ring cards, Fiber Distributed Data Interface (FDDI) cards, Wireless Local Area Network (WLAN) cards, and wireless transceiver cards such as Code Division Multiple Access (CDMA), global system for mobile communications (GSM), Long Term Evolution (LTE), Worldwide Interoperability for Microwave Access (WiMAX), and/or other air interface protocol wireless transceiver cards, among other well-known network devices. Transmitter/receiver 1310 may enable processor 1302 to communicate with the internet or one or more intranets. The I/O devices 1312 may include a video monitor, a Liquid Crystal Display (LCD), a touch screen display, or other type of video display for displaying video, and may also include a video recording device for capturing video. The I/O devices 1312 may include one or more keyboards, mice, trackballs, or other well-known input devices.
It should be appreciated that by programming and/or loading executable instructions onto the computer system 1300, at least one of the processor 1302, the RAM 1308, and the ROM 1306 is changed, transforming part of the computer system 1300 into a particular machine or apparatus, e.g., a video codec having the novel functionality taught by the present disclosure. It is fundamental to the electrical engineering and software engineering arts that functionality that can be implemented by loading executable software into a computer can be converted to a hardware implementation by well-known design rules. The decision between implementing a concept in software or in hardware typically hinges on considerations of the stability of the design and the number of units to be produced, rather than any issues involved in translating from the software domain to the hardware domain. Generally, a design that is still subject to frequent change may be preferred to be implemented in software, because re-spinning a hardware implementation is more expensive than re-spinning a software design. Generally, a design that is stable and will be produced in large volume may be preferred to be implemented in hardware, e.g., in an application-specific integrated circuit (ASIC), because for large production runs the hardware implementation may be less expensive than the software implementation. Often a design may be developed and tested in software form and later transformed, by well-known design rules, into an equivalent hardware implementation in an ASIC that hardwires the instructions of the software. In the same manner that a machine controlled by a new ASIC is a particular machine or apparatus, likewise a computer that has been programmed and/or loaded with executable instructions may be viewed as a particular machine or apparatus.
It is to be understood that any of the processes in the present invention may be implemented by a processor (e.g., a general-purpose CPU within a computer system) in a computer system (e.g., the video encoder 1000 or the decoder 1100) executing a computer program. In this case, the computer program product may be provided to a computer or network device using any type of non-transitory computer-readable medium. The computer program product may be stored in a non-transitory computer-readable medium in a computer or network device. Non-transitory computer-readable media include any type of tangible storage media. Examples of non-transitory computer-readable media include magnetic storage media (e.g., floppy disks, magnetic tape, and hard disk drives), magneto-optical storage media (e.g., magneto-optical disks), compact disc read-only memory (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), digital versatile discs (DVD), Blu-ray (registered trademark) discs (BD), and semiconductor memory devices (e.g., mask ROM, programmable ROM (PROM), erasable PROM, flash ROM, and RAM). The computer program product may also be provided to a computer or network device using any type of transitory computer-readable medium. Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer-readable medium can provide a program to a computer via a wired communication line (e.g., an electric wire or an optical fiber) or a wireless communication line.
While various specific embodiments of the invention have been described, it should be understood that the disclosed systems and methods may be embodied in many other specific forms without departing from the spirit or scope of the invention. The present examples are to be considered illustrative rather than restrictive, and the invention is not to be limited to the details given herein. For example, various elements or components may be combined or integrated into another system, or certain features may be omitted or not implemented.
Furthermore, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled, directly coupled, or communicating with each other may also be indirectly coupled or communicating through some interface, device, or intermediate component, whether electrically, mechanically, or otherwise. Other alterations, substitutions, and alternative examples will now be apparent to those skilled in the art without departing from the spirit and scope of the disclosure.

Claims (20)

1. A video codec configured to:
Receiving a current block and a plurality of neighboring pixels, wherein the current block comprises a first partition and a second partition, and wherein the first partition comprises samples at the upper-right, upper-left, and lower-left corners of the current block;
Selecting at least one reference pixel from the plurality of neighboring pixels, wherein the plurality of neighboring pixels comprise pixels in an upper-right neighboring block of the current block and a lower-left neighboring block of the current block;
Predicting a plurality of pixels located in the second partition based on the reference pixel.
2. The video codec of claim 1, further configured to determine whether to filter the reference pixels prior to predicting the pixels of the second partition, wherein the determination is based on at least one of an intra prediction mode of the current block and a size of the current block.
3. The video codec of claim 2, further configured to generate a filtered reference pixel by filtering a reference pixel when the size of the current block is within a predefined range, wherein a linear combination of a plurality of reference pixels spanning rows and columns is used as the filtered reference pixel, and a plurality of pixels of the second partition are predicted by the filtered reference pixel.
4. The video codec of claim 2, further configured to generate a filtered reference pixel by filtering the reference pixel when the index of the intra prediction mode is within a predefined range, wherein a linear combination of a plurality of reference pixels is used as the filtered reference pixel, and a plurality of pixels of the second partition are predicted by the filtered reference pixel.
5. The video codec of claim 2, further configured to write a binary flag into a bitstream including coding information of the current block, wherein the binary flag indicates whether the current block is filtered before intra prediction, and wherein the binary flag determines whether a corresponding decoder receiving the bitstream uses filtering when decoding the current block.
6. The video codec of claim 1, wherein the first partition comprises at least top-right, top-left, and bottom-left corner pixels of the current block, wherein the second partition comprises at least one pixel in a top row of the current block, and wherein one of two pixels in a middle of a bottom row of an upper neighboring block with respect to the current block is selected as a reference pixel.
7. The video codec of claim 1, wherein the first partition comprises at least top-right, top-left, and bottom-left corner pixels of the current block, wherein the second partition comprises at least one pixel in a left-most column of the current block, and wherein one of two pixels in a middle of a right-most column of a left-side neighboring block with respect to the current block is selected as a reference pixel.
8. An apparatus for video encoding, comprising:
A processor to:
Receiving a current block comprising a first partition and a second partition, wherein the first partition comprises at least top-right, top-left, and bottom-left corner samples of the current block;
Selecting one reference sample from an upper-right neighboring block of the current block and a lower-left neighboring block of the current block;
Predicting samples of the second partition with the reference samples selected from the upper-right neighboring block and the lower-left neighboring block.
9. The apparatus of claim 8, wherein the selected reference sample is a first sample located at a bottom-right corner of the bottom-left neighboring block or a second sample located at a bottom-right corner of the top-right neighboring block.
10. The apparatus of claim 9, wherein the current block has a left-side neighboring block and an upper-side neighboring block, wherein a third sample is located at an upper-right corner of the left-side neighboring block, and wherein a fourth sample is located at a lower-left corner of the left-side neighboring block, and wherein the selecting a reference sample comprises:
Calculating a first absolute difference value between the first sample and the third sample;
Calculating a second absolute difference between the second sample and the fourth sample.
11. The apparatus of claim 10, wherein the first absolute difference is greater than the second absolute difference, and wherein the first sample is selected as a reference sample because the first absolute difference is greater than the second absolute difference.
12. The apparatus of claim 10, wherein the first absolute difference is equal to or less than the second absolute difference, and wherein the second sample is selected as a reference sample because the first absolute difference is equal to or less than the second absolute difference.
13. The apparatus of claim 10, wherein the predicting the samples of the second partition comprises setting the reference sample as a mean value of the second partition, and wherein the processor is further configured to calculate a plurality of residual samples representing differences between the mean value and each sample of the second partition.
14. The apparatus of claim 13, wherein the current block, the upper-right neighboring block, the lower-left neighboring block, the left-side neighboring block, and the upper neighboring block are all located in a depth map frame used in 3-dimensional high performance video coding (3D-HEVC), wherein samples of the second partition are predicted according to a Depth Modeling Mode (DMM), the processor further configured to:
Transforming the plurality of residual samples to generate a plurality of transform coefficients;
Quantizing the plurality of transform coefficients to generate a plurality of quantized transform coefficients;
Entropy encoding at least a portion of the plurality of quantized transform coefficients to generate an encoded bitstream.
15. A method for intra prediction in three-dimensional high performance video coding (3D-HEVC), the method comprising:
Receiving a plurality of neighboring samples of a current block, wherein the neighboring samples include a first sample located at a lower-right corner of a lower-left neighboring block and a second sample located at a lower-right corner of an upper-right neighboring block;
Receiving a partition mode indicating that the current block is partitioned into partition 0 and partition 1, wherein the partition 0 includes at least samples of upper-right, upper-left, and lower-left corners of the current block;
Selecting one of the first sample and the second sample as a reference sample for intra-predicting the partition 1.
16. The method of claim 15, wherein the neighboring samples are denoted p[x][y], with x = -1 and y = -1..nTbS*2-1, and with x = 0..nTbS*2-1 and y = -1, wherein nTbS indicates the size of the current block, wherein the first and second samples are denoted p[-1][nTbS*2-1] and p[nTbS*2-1][-1], respectively, and wherein the reference sample is selected based on a first absolute difference between samples p[-1][0] and p[-1][nTbS*2-1] and a second absolute difference between samples p[0][-1] and p[nTbS*2-1][-1].
17. The method of claim 16, wherein the first and second absolute differences are denoted vertAbsDiff and horAbsDiff, respectively, and wherein the reference sample is calculated as a DC average, denoted dcValBR, by the following operation:
dcValBR=(horAbsDiff>vertAbsDiff)?p[nTbS*2-1][-1]:p[-1][nTbS*2-1].
18. The method of claim 17, wherein vertAbsDiff and horAbsDiff are calculated by the following operations:
vertAbsDiff=Abs(p[-1][0]-p[-1][nTbS*2-1]);
horAbsDiff=Abs(p[0][-1]-p[nTbS*2-1][-1]).
19. The method of claim 17, wherein the partition pattern is a binary array comprising pattern values denoted partitionPattern[x][y], where x, y = 0..nTbS-1, wherein the partition 0 comprises the upper-right, upper-left, and lower-left corner samples as indicated by two binary variables, vertEdgeFlag and horEdgeFlag, both being equal to 0, and wherein vertEdgeFlag and horEdgeFlag are derived by the following operations:
vertEdgeFlag=(partitionPattern[0][0]!=partitionPattern[nTbS-1][0]),
horEdgeFlag=(partitionPattern[0][0]!=partitionPattern[0][nTbS-1]).
20. The method of claim 17, further comprising intra-predicting partition 0 using a second DC average, denoted dcValLT, calculated by:
dcValLT=(p[-1][0]+p[0][-1])>>1,
Where, for x, y = 0..nTbS-1, the intermediate variables predDcVal and dcOffset are defined and derived by the following operations:
predDcVal=(partitionPattern[x][y]==partitionPattern[0][0])?dcValLT:dcValBR,
dcOffset=DcOffset[xTb][yTb][partitionPattern[x][y]],
wherein (xTb, yTb) indicates the position of the top-left sample of the current block relative to the top-left sample of the depth map that includes the current block, and wherein the prediction sample values of the current block, denoted predSamples[x][y] with x, y = 0..nTbS-1, are derived based on predDcVal and dcOffset.
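The reference-sample selection and DC prediction recited in claims 15-20 can be sketched as follows. This is an illustrative reading of the claims rather than the normative 3D-HEVC specification text; the dictionary-based neighbor-sample container, the helper names, and the test data are assumptions made for the sketch (the claims' dcOffset term is omitted).

```python
def dmm_dc_values(p, nTbS):
    """Derive the two DC predictors for an nTbS x nTbS DMM-partitioned block.

    `p` maps (x, y) -> reconstructed neighboring sample value, following the
    claims' p[x][y] notation: the row above the block sits at y == -1 with
    x = 0..nTbS*2-1, and the column to the left sits at x == -1 with
    y = 0..nTbS*2-1.
    """
    # Absolute differences along the left column and top row (claim 18).
    vertAbsDiff = abs(p[(-1, 0)] - p[(-1, nTbS * 2 - 1)])
    horAbsDiff = abs(p[(0, -1)] - p[(nTbS * 2 - 1, -1)])

    # DC value for partition 1: take the far sample from the side showing the
    # larger variation (claim 17).
    if horAbsDiff > vertAbsDiff:
        dcValBR = p[(nTbS * 2 - 1, -1)]   # bottom-right of upper-right neighbor
    else:
        dcValBR = p[(-1, nTbS * 2 - 1)]   # bottom-right of lower-left neighbor

    # DC value for partition 0: average of the two samples adjacent to the
    # top-left corner (claim 20).
    dcValLT = (p[(-1, 0)] + p[(0, -1)]) >> 1
    return dcValLT, dcValBR


def predict_block(partitionPattern, p, nTbS):
    """Fill predSamples[x][y] with the DC value of each sample's partition."""
    dcValLT, dcValBR = dmm_dc_values(p, nTbS)
    return [
        [dcValLT if partitionPattern[x][y] == partitionPattern[0][0] else dcValBR
         for y in range(nTbS)]
        for x in range(nTbS)
    ]
```

For a 4x4 block whose top-row neighbors jump from 10 to 50 while the left-column neighbors stay near 10, horAbsDiff dominates, so partition 1 is predicted from the far top-row sample (50) and partition 0 from the corner average (10).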
CN201480057368.XA 2013-10-17 2014-10-16 Improved reference pixel selection and filtering for intra-depth map coding Active CN106688238B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201361892342P 2013-10-17 2013-10-17
US61/892,342 2013-10-17
PCT/US2014/060873 WO2015057947A1 (en) 2013-10-17 2014-10-16 Improved reference pixel selection and filtering for intra coding of depth map

Publications (2)

Publication Number Publication Date
CN106688238A CN106688238A (en) 2017-05-17
CN106688238B true CN106688238B (en) 2019-12-17

Family

ID=58857853

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480057368.XA Active CN106688238B (en) 2013-10-17 2014-10-16 Improved reference pixel selection and filtering for intra-depth map coding

Country Status (1)

Country Link
CN (1) CN106688238B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107592538B (en) * 2017-09-06 2019-07-23 华中科技大学 A method of reducing stereoscopic video depth map encoder complexity
US10856010B2 (en) 2017-09-08 2020-12-01 FG Innovation Company Limited Device and method for coding video data based on multiple reference lines
CN115474042A (en) * 2017-10-20 2022-12-13 韩国电子通信研究院 Image encoding method, image decoding method, and recording medium storing bit stream
CN116156166A (en) * 2017-10-31 2023-05-23 三星电子株式会社 Image encoding method, image decoding method and apparatus thereof
CN116456084A (en) 2018-12-25 2023-07-18 Oppo广东移动通信有限公司 Decoding prediction method, device and computer storage medium
CN111435993B (en) 2019-01-14 2022-08-26 华为技术有限公司 Video encoder, video decoder and corresponding methods
SG11202106215XA (en) 2019-03-12 2021-07-29 Guangdong Oppo Mobile Telecommunications Corp Ltd Intra-frame prediction method and apparatus, and computer-readable storage medium
WO2021054807A1 (en) * 2019-09-19 2021-03-25 엘지전자 주식회사 Image encoding/decoding method and device using reference sample filtering, and method for transmitting bitstream

Citations (3)

Publication number Priority date Publication date Assignee Title
CN101184237A (en) * 2006-11-16 2008-05-21 汤姆森许可贸易公司 Method of transcoding data from the MPEG2 standard to an MPEG4 standard
CN101527843A (en) * 2008-03-07 2009-09-09 瑞昱半导体股份有限公司 Device for decoding video block in video screen and related method thereof
CN102934441A (en) * 2010-05-25 2013-02-13 Lg电子株式会社 New planar prediction mode

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8660380B2 (en) * 2006-08-25 2014-02-25 Nvidia Corporation Method and system for performing two-dimensional transform on data value array with reduced power consumption


Non-Patent Citations (1)

Title
Geometry-Adaptive Block Partitioning for Intra Prediction in Image&Video Coding;Congxia Dai et al;《2007 IEEE International Conference on Image Processing》;20071231;第2-3部分以及附图2-3 *

Also Published As

Publication number Publication date
CN106688238A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
US10129542B2 (en) Reference pixel selection and filtering for intra coding of depth map
CN106688238B (en) Improved reference pixel selection and filtering for intra-depth map coding
US11265540B2 (en) Apparatus and method for applying artificial neural network to image encoding or decoding
TWI634777B (en) Method of searching reference patches
KR20230113257A (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
EP4221204A1 (en) Image signal encoding/decoding method and apparatus therefor
KR102138828B1 (en) Method and Apparatus for image encoding
KR20130085392A (en) Method and apparatus for encoding and decoding video to enhance intra prediction process speed
CN111373749A (en) Method and apparatus for low complexity bi-directional intra prediction in video encoding and decoding
US11991378B2 (en) Method and device for video coding using various transform techniques
JP2023090929A (en) Video decoding method, video decoding apparatus, and storage medium
CN110832854B (en) Method and apparatus for intra prediction using interpolation
CN111448798A (en) Method and apparatus for block shape based video encoding and decoding
KR20220098114A (en) Image encoding method/apparatus, image decoding method/apparatus and and recording medium for storing bitstream
CN111971963A (en) Image encoding and decoding, image encoder, and image decoder
US11962764B2 (en) Inter-prediction method and video decoding apparatus using the same
CN108432247B (en) Method and apparatus for predicting residual signal
US20230269399A1 (en) Video encoding and decoding using deep learning based in-loop filter
US20210289202A1 (en) Intra prediction method and apparatus for performing adaptive filtering on reference pixel
US20240179324A1 (en) Method and apparatus for video coding using an improved in-loop filter
US12034921B2 (en) Apparatus and method for applying artificial neural network to image encoding or decoding
US20240007623A1 (en) Block splitting structure for efficient prediction and transform, and method and appartus for video encoding and decoding using the same
US20240179326A1 (en) Method and apparatus for video coding using nonrectangular block splitting structure
WO2023208131A1 (en) Efficient geometric partitioning mode video coding
US20230300325A1 (en) Video coding method and apparatus using intra prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant