WO2016142977A1 - Video encoding device, video encoding method, and video encoding program - Google Patents

Video encoding device, video encoding method, and video encoding program

Info

Publication number
WO2016142977A1
Authority
WO
WIPO (PCT)
Prior art keywords
prediction mode
block size
layer
prediction
image
Prior art date
Application number
PCT/JP2015/006395
Other languages
English (en)
Japanese (ja)
Inventor
健太 徳満
慶一 蝶野
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社 filed Critical 日本電気株式会社
Priority to JP2017504305A priority Critical patent/JPWO2016142977A1/ja
Publication of WO2016142977A1 publication Critical patent/WO2016142977A1/fr

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/187 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/33 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution

Definitions

  • the present invention relates to a video coding apparatus, a video coding method, and a video coding program based on block division of a quadtree structure using a scalable coding system.
  • the inventors of the present invention proposed a new video encoding technique in which the processing load is distributed in Japanese Patent Application No. 2013-185994.
  • an input image is encoded by a pre-stage video encoder (hereinafter also referred to as a pre-stage) and a post-stage video encoder (hereinafter also referred to as a post-stage).
  • the video encoding technique is characterized by having encoded data conversion / merging means for using the encoded data obtained in the previous stage for encoding in the subsequent stage.
  • the encoded data conversion / merging means converts the encoded data so that the block division shape obtained in the previous stage can be used in the subsequent stage, as shown in FIG. For example, when the intra prediction mode and the intra prediction direction are the same, those blocks are merged into a large block.
  • FIG. 9 (b) shows an example in which the conversion means in the encoded data conversion / merging means expands the block size (doubling it vertically and horizontally). It also shows an example in which the lower-left divided blocks in the block have the same intra prediction direction and are merged into one block (a block of x pixels × y pixels).
  • SHVC: Scalable High-efficiency Video Coding
  • HEVC: High Efficiency Video Coding
  • the low-resolution video obtained by downsampling the input image is encoded as a low-resolution layer (lowest layer, or BL: Base Layer), and the input image is encoded as a high-resolution layer (upper layer, or EL: Enhancement Layer).
  • Each frame of an image having a resolution corresponding to BL and each frame of an image having a resolution corresponding to EL are divided and encoded into coding tree units (CTU: Coding Tree Unit).
  • Each CU is predicted by being divided into prediction units (PU: Prediction Unit).
  • the prediction error of each CU is divided into transform units (TU: Transform Unit) in a quadtree structure, and is subjected to frequency conversion.
  • the maximum-size CU and the minimum-size CU are referred to as the LCU (Largest Coding Unit: maximum coding unit) and the SCU (Smallest Coding Unit: minimum coding unit), respectively.
  • CU is a coding unit for intra prediction / interframe prediction.
  • inter-layer prediction which is a kind of inter-frame prediction, can also be used.
  • intra prediction, interframe prediction, and interlayer prediction will be described.
  • Intra prediction is prediction in which a prediction image is generated from a reconstructed image of an encoding target frame.
  • FIG. 10 is an explanatory diagram showing angular intra prediction, which is a type of intra prediction.
  • an intra prediction signal is generated by extrapolating the reconstructed pixels around the encoding target block in any of the 33 types of directions shown in FIG.
  • a CU that uses intra prediction is referred to as an intra CU.
  • Inter-frame prediction is prediction based on an image of a reconstructed frame (reference picture) having a display time different from that of an encoding target frame.
  • inter-frame prediction is also referred to as inter prediction.
  • FIG. 11 is an explanatory diagram illustrating an example of inter-frame prediction.
  • the motion vector MV = (mv_x, mv_y) indicates the translation of the reconstructed image block of the reference picture relative to the encoding target block.
  • an inter prediction signal is generated based on a reconstructed image block of a reference picture (using pixel interpolation if necessary).
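The motion-compensated prediction described above can be sketched as follows. This is a minimal illustration, not the method of the source: it assumes integer-pel motion vectors (no pixel interpolation) and clamps coordinates at the picture border. The function name `inter_predict` and the sample data are hypothetical.

```python
def inter_predict(reference, x, y, size, mv):
    """Return the size x size prediction block for the target block at (x, y).

    reference: 2-D list of reconstructed reference-picture samples.
    mv: (mv_x, mv_y), an integer-pel translation (assumption: no interpolation).
    """
    mv_x, mv_y = mv
    height, width = len(reference), len(reference[0])
    block = []
    for row in range(size):
        # Clamp so that displaced blocks near the border stay inside the picture.
        ry = min(max(y + row + mv_y, 0), height - 1)
        line = []
        for col in range(size):
            rx = min(max(x + col + mv_x, 0), width - 1)
            line.append(reference[ry][rx])
        block.append(line)
    return block


# Toy 8x8 reference picture whose sample at (row r, column c) is 10*r + c.
reference = [[10 * r + c for c in range(8)] for r in range(8)]
prediction = inter_predict(reference, x=2, y=2, size=2, mv=(1, -1))
```

Here the 2x2 target block at (2, 2) is predicted from the reference block one sample to the right and one sample up.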
  • AMVP: Advanced Motion Vector Prediction
  • AMVP is a technique for predicting a motion vector from a motion vector of a reference picture so that the difference between the motion vectors is minimized.
  • in AMVP, a set consisting of a reference picture index, an AMVP index associated with an AMVP predicted motion vector, and the difference from that AMVP predicted motion vector is transmitted.
  • the merge mode is a technique that uses the motion vector of the reference picture as it is. In the merge mode, a merge flag indicating that merge prediction is enabled and a merge candidate index associated with the motion vector to be used are transmitted.
  • Inter-layer prediction is inter prediction using an upsampled image of a reconstructed frame of a coded BL.
  • FIG. 12 is an explanatory diagram showing inter-layer prediction.
  • an inter-layer prediction signal is generated by inter-frame prediction of an upsampled image obtained by up-sampling a reconstructed frame of an encoded BL to the same resolution as an EL frame.
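As a rough sketch of inter-layer prediction, the code below upsamples a reconstructed BL frame to EL resolution and reads the prediction block from it. Three things are assumptions for illustration only: the 2x spatial ratio, nearest-neighbour upsampling (real codecs use interpolation filters), and taking the co-located block (zero motion). The names `upsample_2x` and `inter_layer_predict` are hypothetical.

```python
def upsample_2x(frame):
    """Nearest-neighbour 2x upsampling (assumption; real codecs use filters)."""
    out = []
    for row in frame:
        wide = [sample for sample in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def inter_layer_predict(bl_reconstructed, x, y, size):
    """Predict the EL block at (x, y) from the co-located upsampled BL samples."""
    upsampled = upsample_2x(bl_reconstructed)
    return [upsampled[y + r][x:x + size] for r in range(size)]


bl = [[1, 2],
      [3, 4]]
prediction = inter_layer_predict(bl, x=2, y=0, size=2)
```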
  • a CU using intra prediction is referred to as an intra CU
  • a CU using inter prediction is referred to as an inter CU
  • a CU using inter layer prediction is referred to as an inter layer CU.
  • a frame encoded only by the intra CU is called an I frame (or I picture).
  • a frame including not only an intra CU but also an inter CU and an inter-layer CU is called a P frame (or P picture).
  • a frame that is encoded including inter CUs that use not just one reference picture but two reference pictures simultaneously for the inter prediction of a block is called a B frame (or B picture).
  • BL encoder: low-resolution layer HEVC encoder
  • EL encoder: high-resolution layer HEVC encoder
  • a multiplexer 110.
  • the downsampler 109 supplies a low resolution image (BL image) obtained by downsampling the input image to the BL encoder 100A.
  • the BL encoder 100A includes an estimator 101A, a predictor 102A, a frequency converter 103A, a quantizer 104A, an inverse quantization / inverse frequency converter 105A, a buffer 106A, and an entropy encoder 107A.
  • Each CTU of the BL image frame is divided into variable-size CUs based on the quadtree structure.
  • the prediction error of each CU of the BL image is divided into variable-size TUs based on the quadtree structure, similar to CTU.
  • the estimator 101A determines, for example, a CU partition shape that minimizes the coding cost, that is, a CU quadtree structure. Further, the estimator 101A determines a PU block prediction parameter and a TU quadtree structure for each CTU of the BL image.
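One way to picture how an estimator might pick the CU quadtree with the lowest coding cost is a recursive comparison of "keep the block whole" against "split into four". The sketch below is an assumption-laden illustration: the `cost` callback, the toy cost function, and the exhaustive search are hypothetical and are not taken from the source.

```python
def best_partition(x, y, size, min_size, cost):
    """Return (total_cost, leaves) for the cheapest quadtree partition.

    cost(x, y, size) is a hypothetical per-CU coding cost supplied by the caller.
    leaves is a list of (x, y, size) tuples, one per CU.
    """
    whole = (cost(x, y, size), [(x, y, size)])
    if size <= min_size:
        return whole
    half = size // 2
    split_cost, split_leaves = 0, []
    for dy in (0, half):
        for dx in (0, half):
            c, leaves = best_partition(x + dx, y + dy, half, min_size, cost)
            split_cost += c
            split_leaves += leaves
    # Keep the block whole unless splitting is strictly cheaper.
    return whole if whole[0] <= split_cost else (split_cost, split_leaves)


def toy_cost(x, y, size):
    # Toy model: large CUs anchored at the picture origin are expensive.
    return size * size if (x, y) == (0, 0) and size > 8 else size


total, cus = best_partition(0, 0, 32, 8, toy_cost)
```

With this toy cost, only the top-left quadrant is split down to 8x8 CUs; the other three quadrants stay as 16x16 CUs.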
  • the predictor 102A generates a prediction signal for the CU of the BL image based on the CU quadtree structure and the PU block prediction parameter determined by the estimator 101A.
  • the prediction signal is generated based on the above-described intra prediction or inter prediction.
  • the frequency converter 103A performs frequency conversion on the prediction error image obtained by subtracting the prediction signal from the image signal of the BL image based on the TU quadtree structure determined by the estimator 101A.
  • the quantizer 104A quantizes the frequency-transformed prediction error image (orthogonal transform coefficient).
  • the quantized orthogonal transform coefficient is referred to as a coefficient level.
  • a coefficient level having a value other than 0 is called a significant coefficient level.
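The relationship between orthogonal transform coefficients, coefficient levels, and significant coefficient levels can be illustrated with a uniform quantizer. The fixed step size and round-to-nearest rule are assumptions for illustration; the source does not specify the quantizer design.

```python
def quantize(coefficients, step):
    """Uniform quantization of transform coefficients to coefficient levels.

    Rounding to nearest with a fixed step is an assumption for illustration.
    """
    return [int(round(c / step)) for c in coefficients]


def significant_levels(levels):
    """Return the coefficient levels whose value is not 0."""
    return [level for level in levels if level != 0]


levels = quantize([50.0, -13.5, 3.0, 0.4], step=8)
significant = significant_levels(levels)
```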
  • the entropy encoder 107A entropy-encodes the cu_split_flag indicating the CTU quadtree structure of the BL image, the block prediction parameters of the PUs, the split_transform_flag indicating the TU quadtree structure, and the coefficient levels, and outputs the bitstream of the BL image (the low-resolution layer (BL) sub-bitstream).
  • a group of parameters for entropy encoding is referred to as an encoding parameter.
  • the inverse quantization / inverse frequency converter 105A inversely quantizes the coefficient level. Further, the inverse quantization / inverse frequency converter 105A performs inverse frequency transform on the inversely quantized orthogonal transform coefficient.
  • the prediction signal is added to the reconstructed prediction error image of the BL image obtained by the inverse frequency transform, and the result is supplied to the buffer 106A as a reconstructed image of the BL image.
  • the buffer 106A stores the reconstructed image of the BL image for subsequent encoding processing.
  • the EL encoder 300B includes an estimator 101B, a predictor 102B, a frequency converter 103B, a quantizer 104B, an inverse quantization / inverse frequency converter 105B, a buffer 106B, an entropy encoder 107B, an upsampler 108, a prediction mode determiner 111, and a block size determiner 112.
  • the front-stage video encoder and the rear-stage video encoder based on the proposed video encoding technology correspond to the BL encoder 100A and the EL encoder 300B in FIG.
  • the function of the encoded data conversion / merging means is realized by a combination of the block size determiner 112 and the prediction mode determiner 111 in FIG.
  • Each CTU of the frame of the input image (EL image) input to the EL encoder 300B is divided into variable-size CUs based on the quadtree structure.
  • the prediction error of each CU of the EL image is divided into variable-size TUs based on the quadtree structure, similarly to the CTU.
  • the block size determiner 112 determines the CU quadtree structure of each CTU of the EL image so that it is the same as the block division shape of the BL image determined by the estimator 101A (the division shape contained in the image area of the BL image corresponding to the CTU of the EL image to be processed).
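Reusing the BL division shape in the EL can be sketched as a coordinate scaling of the BL CUs. The 2x spatial ratio between layers and the `(x, y, size)` CU representation are assumptions for illustration, and the sketch ignores the case where a scaled CU would exceed the maximum CU size.

```python
def map_bl_cus_to_el(bl_cus, scale=2):
    """Scale BL CUs (x, y, size) to EL coordinates so the division shape matches.

    scale=2 assumes the EL has twice the BL resolution in each dimension.
    """
    return [(x * scale, y * scale, size * scale) for x, y, size in bl_cus]


# Hypothetical division of a 32x32 BL area corresponding to one 64x64 EL CTU.
bl_cus = [(0, 0, 16), (16, 0, 8), (24, 0, 8), (16, 8, 8), (24, 8, 8),
          (0, 16, 16), (16, 16, 16)]
el_cus = map_bl_cus_to_el(bl_cus)
```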
  • the prediction mode determiner 111 determines the prediction mode of each CU of the EL image so that it is the same as the prediction mode, determined by the estimator 101A, of the CU of the BL image (the CU in the image area of the BL image corresponding to the CU of the EL image to be processed). For example, when the prediction mode of the CU of the BL image is intra prediction, the prediction mode determiner 111 sets the prediction mode of the CU of the EL image corresponding to the image region of that CU of the BL image to intra prediction. When the prediction mode of the CU of the BL image is inter prediction, the prediction mode of the CU of the EL image is set to inter prediction.
  • the estimator 101B determines the PU block prediction parameter and the TU quadtree structure of each CU of the EL image.
  • the predictor 102B generates a prediction signal for the CU image signal of the EL image based on the CU quadtree structure determined by the block size determiner 112 and the PU block prediction parameter determined by the estimator 101B.
  • the frequency converter 103B performs frequency conversion on the prediction error image obtained by subtracting the prediction signal from the image signal of the EL image based on the TU quadtree structure determined by the estimator 101B.
  • the quantizer 104B quantizes the frequency-transformed prediction error image (orthogonal transform coefficient).
  • the entropy encoder 107B entropy-encodes the cu_split_flag indicating the CTU quadtree structure of the EL image, the block prediction parameters of the PUs, the split_transform_flag indicating the TU quadtree structure, and the coefficient levels, and generates the bitstream of the EL image (the EL sub-bitstream).
  • the inverse quantization / inverse frequency converter 105B performs inverse quantization on the coefficient level. Further, the inverse quantization / inverse frequency converter 105B performs inverse frequency conversion on the inversely quantized orthogonal transform coefficient.
  • the prediction signal is added to the reconstructed prediction error image obtained by the inverse frequency transform, and the result is supplied to the buffer 106B as a reconstructed image.
  • the buffer 106B stores the reconstructed image of the EL image and the reconstructed image of the BL image upsampled by the upsampler 108 for subsequent encoding processing.
  • the multiplexer 110 multiplexes the BL bit stream and the EL bit stream to generate a scalable bit stream.
  • based on the above-described operation, the video encoding device generates a scalable bit stream from the input image.
  • the video encoding apparatus described above makes the prediction mode of the upper layer (EL in the above example) the same as the prediction mode of the corresponding lower layer. When the lower layer is the lowest layer (BL in the above example), inter-layer prediction cannot be used in encoding that lower layer, so its prediction mode is either the intra prediction mode or the inter prediction mode. Consequently, only the intra prediction mode or the inter prediction mode can be selected in the layer directly above the lowest layer. As a result, the compression performance that spatial scalable coding gains from selecting inter-layer prediction cannot be fully exploited, and image quality deteriorates.
  • it is conceivable to exploit the compression performance of spatial scalable coding by replacing the prediction mode determiner 111 with a double-search prediction mode determiner that compresses with both the lower-layer prediction mode and the inter-layer prediction mode and selects the prediction mode with the higher coding efficiency. Such a configuration, however, requires compression in both prediction modes, so the amount of computation increases.
  • an object of the present invention is to provide a video encoding device, a video encoding method, and a video encoding program based on quadtree block division that can compress an upper-layer image with high image quality while suppressing an increase in the amount of computation.
  • the video encoding device based on block division of the quadtree structure includes:
  • block size determining means for determining, when an upper-layer image area is encoded, the block size of that image area using the block size of the lower-layer image area corresponding to it;
  • prediction mode determining means for determining the prediction mode of the upper-layer image area using the prediction mode of the lower-layer image area corresponding to it;
  • inter-layer prediction means for performing inter-layer prediction using a reconstructed image of the lower layer, and inter prediction means for performing inter prediction using a reconstructed image of the upper layer. When the prediction mode of the lower-layer image area is the intra prediction mode, the prediction mode determining means determines the prediction mode of the upper-layer image area to be the inter-layer prediction mode.
  • the video encoding method based on block division of the quadtree structure, when an upper-layer image area is encoded, determines the block size of that image area using the block size of the lower-layer image area corresponding to it, determines the prediction mode of the upper-layer image area using the prediction mode of the lower-layer image area corresponding to it, performs inter-layer prediction using a reconstructed image of the lower layer, and performs inter prediction using a reconstructed image of the upper layer. When the prediction mode is determined and the prediction mode of the lower-layer image area is the intra prediction mode, the prediction mode of the upper-layer image area is determined to be the inter-layer prediction mode.
  • the video encoding program based on block division of the quadtree structure causes a computer, when an upper-layer image area is encoded, to execute: a process of determining the block size of the upper-layer image area using the block size of the lower-layer image area corresponding to it; a process of determining the prediction mode of the upper-layer image area using the prediction mode of the lower-layer image area corresponding to it; a process of performing inter-layer prediction using a reconstructed image of the lower layer; a process of performing inter prediction using a reconstructed image of the upper layer; and, when the prediction mode is determined and the prediction mode of the lower-layer image area is the intra prediction mode, a process of determining the prediction mode of the upper-layer image area to be the inter-layer prediction mode.
  • FIG. 1 is a block diagram showing a first embodiment of a video encoding apparatus according to the present invention.
  • the configuration of a video encoding apparatus according to the first embodiment that outputs a bit stream using each frame of a digitized video as an input image will be described.
  • the video encoding apparatus of the present embodiment exploits the tendency of coding efficiency to decrease in the order of inter prediction, inter-layer prediction, and intra prediction: when the prediction mode of the lower layer is the intra prediction mode, the high image quality prediction mode determiner selects the inter-layer prediction mode, which has higher coding efficiency, as the prediction mode of the upper layer.
  • like the video encoding apparatus shown in FIG. 13, the video encoding apparatus includes a BL encoder 100A that encodes the BL, an EL encoder 100B that encodes the EL, a downsampler 109, and a multiplexer 110.
  • the configuration and operation of the BL encoder 100A in the present embodiment are the same as the configuration and operation of the BL encoder 100A in the video encoding device shown in FIG.
  • the configuration and operation of the EL encoder 100B in the present embodiment are different from the configuration and operation of the EL encoder 300B in the video encoding device shown in FIG.
  • the configuration and operation of the block size determiner 112 are the same as those in the EL encoder 300B shown in FIG.
  • the prediction mode determiner 111 is replaced with a high image quality prediction mode determiner 113.
  • the high image quality prediction mode determiner 113 receives as input the prediction mode (determined by the estimator 101A) of the CU of the BL image in the image area corresponding to the CU of the EL image to be processed, and determines the prediction mode of each CU of the EL image. Specifically, because coding efficiency tends to rank as inter prediction > inter-layer prediction > intra prediction, when the prediction mode of the CU of the BL image is the inter prediction mode, the high image quality prediction mode determiner 113 judges that inter prediction also has the highest coding efficiency for the EL image and sets the prediction mode of the CU of the EL image to be processed to the inter prediction mode.
  • when the prediction mode of the CU of the BL image is the intra prediction mode, the high image quality prediction mode determiner 113 sets the prediction mode of the CU of the EL image to be processed to the inter-layer prediction mode.
  • FIG. 2 is an explanatory diagram for explaining the operation of the high image quality prediction mode determiner 113.
  • in the image area (corresponding area) of the BL image corresponding to the CU of the EL image to be processed, the upper-right block (CU) is an inter prediction mode CU, and the other blocks are intra prediction mode CUs. Therefore, in the example shown in FIG. 2B, the high image quality prediction mode determiner 113 determines, in the image region (encoding target region) of the EL image to be processed, the prediction mode of the upper-right CU to be the inter prediction mode and the prediction modes of the other CUs to be the inter-layer prediction mode.
  • in step S101, the high image quality prediction mode determiner 113 determines whether or not the prediction mode of each CU of the input BL image is intra prediction.
  • when the prediction mode of the CU of the BL is intra prediction, the process proceeds to step S102.
  • when the prediction mode of the CU of the BL is not intra prediction, the process proceeds to step S103.
  • in step S102, the high image quality prediction mode determiner 113 outputs the inter-layer prediction mode as the prediction mode of the CU of the EL.
  • in step S103, the high image quality prediction mode determiner 113 outputs the inter prediction mode as the prediction mode of the CU of the EL.
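Steps S101 to S103 amount to a per-CU mapping from the BL prediction mode to the EL prediction mode. A minimal sketch follows; the string constants and the function name `determine_el_mode` are hypothetical, and the sample input simply mirrors a case where one of four BL CUs is inter predicted.

```python
INTRA, INTER, INTER_LAYER = "intra", "inter", "inter_layer"


def determine_el_mode(bl_mode):
    """Map the prediction mode of a BL CU to the mode of the corresponding EL CU."""
    # Step S101: is the BL CU intra predicted?
    if bl_mode == INTRA:
        # Step S102: output the inter-layer prediction mode.
        return INTER_LAYER
    # Step S103: otherwise output the inter prediction mode.
    return INTER


bl_modes = [INTRA, INTER, INTRA, INTRA]
el_modes = [determine_el_mode(mode) for mode in bl_modes]
```

Because the decision is a single comparison per CU, no trial compression in more than one mode is involved.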
  • the predictor 102B generates a prediction signal for the image signal of each CU of the EL image based on the prediction mode determined by the high image quality prediction mode determiner 113 (notified via the estimator 101A), the CU quadtree structure determined by the block size determiner 112, and the PU block prediction parameter determined by the estimator 101B. That is, the predictor 102B has a function of performing inter-layer prediction using a lower-layer reconstructed image (stored in the buffer 106A) and a function of performing inter prediction using an upper-layer reconstructed image.
  • the video encoding apparatus can compress the image of the upper layer with higher image quality than the general technique while keeping the calculation amount of the prediction mode determination unit at the same level as the calculation amount by the general technique.
  • when the prediction mode of the CU of the lower-layer image is the inter-layer prediction mode, the high image quality prediction mode determiner 113 sets the prediction mode of the CU of the image being processed to the inter-layer prediction mode. This is because, although inter prediction has higher coding efficiency for the EL image, estimating a motion vector suitable for inter prediction is likely to consume a large amount of processing.
  • even when the prediction mode of the CU of the lower-layer image is the intra prediction mode, the high image quality prediction mode determiner 113 sets the prediction mode of the CU of the image of the layer being processed to the intra prediction mode if inter-layer prediction is not used for that layer.
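The additional rules for a lower layer that itself uses inter-layer prediction, and for a layer where inter-layer prediction is not used, can be folded into one decision function. This is a sketch under assumptions: the names are hypothetical, and the combination "lower layer uses inter-layer prediction while the current layer does not" is not described in the text, so the sketch rejects it rather than guessing.

```python
def determine_mode(lower_mode, inter_layer_used=True):
    """Choose the CU prediction mode of the layer being processed.

    lower_mode: mode of the corresponding lower-layer CU
                ("intra", "inter", or "inter_layer").
    inter_layer_used: whether inter-layer prediction is used for this layer.
    """
    if lower_mode == "inter":
        return "inter"
    if lower_mode == "inter_layer":
        if not inter_layer_used:
            # Not covered by the description; refuse instead of inventing a rule.
            raise ValueError("combination not described")
        return "inter_layer"
    if lower_mode == "intra":
        return "inter_layer" if inter_layer_used else "intra"
    raise ValueError("unknown prediction mode: " + str(lower_mode))


modes = [determine_mode(m) for m in ("intra", "inter", "inter_layer")]
fallback = determine_mode("intra", inter_layer_used=False)
```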
  • Embodiment 2. In the first embodiment described above, selecting the inter-layer prediction that minimizes the number of bits is not always guaranteed.
  • the CU block size of the EL image area can be determined so as to minimize the number of CUs.
  • FIG. 4 (b) is a block diagram showing a second embodiment of the video encoding device according to the present invention.
  • with reference to FIG. 4B, the configuration of a video encoding apparatus according to the second embodiment that outputs a bit stream using each frame of a digitized video as an input image will be described.
  • the video encoding apparatus of the present embodiment includes a BL encoder 100A that encodes the BL, an EL encoder 200B that encodes the EL, a downsampler 109, and a multiplexer 110.
  • the configuration and operation of the BL encoder 100A in the present embodiment are the same as the configuration and operation of the BL encoder 100A in the first embodiment shown in FIG.
  • The block size determiner 112 in the EL encoder 300B shown in FIG. 13 is replaced with a high image quality block size determiner 114.
  • The remaining blocks of the EL encoder, up to the upsampler 108 and the high image quality prediction mode determiner 113, are the same as the estimator 101B, the predictor 102B, the frequency converter 103B, the quantizer 104B, the inverse quantization / inverse frequency converter 105B, the buffer 106B, the entropy encoder 107B, the upsampler 108, and the high image quality prediction mode determiner 113 in the first embodiment shown in FIG.
  • The high image quality block size determiner 114 takes as input the CU quadtree structure (determined by the estimator 101A) of the BL image in the image area corresponding to the CU of the EL image to be processed and the prediction mode of each CU (also determined by the estimator 101A), and outputs the block size of each CU of the EL image. Unlike the block size determiner 112 in the first embodiment, the high image quality block size determiner 114 can minimize the number of intra prediction mode CUs included in the CTU of the EL image, based on the operation described later.
  • The prediction mode of the CU of the EL image is determined by the high image quality prediction mode determiner 113.
  • The high image quality prediction mode determiner 113 sets the inter-layer prediction mode as the prediction mode of the CU of the EL image.
  • The high image quality block size determiner 114 makes it possible to perform inter-layer prediction with the largest possible block size instead of inter-layer prediction with a fine block size (see the right side of FIG. 5B).
  • The block sizes (side lengths) of the largest CU (LCU: Largest Coding Unit) and the smallest CU (SCU: Smallest Coding Unit) are 64 and 8, respectively.
  • The high image quality block size determiner 114 determines the provisional CU division of the EL image so that it has the same block division shape as the BL image contained in the image region of the BL image corresponding to the CTU of the EL image to be processed.
  • The prediction mode (Epred) of each provisional CU of the EL image is determined so as to be the same as the prediction mode of the CU in the image region of the BL image corresponding to that CU of the EL image.
  • Condition (1): The temporary block size Eb of the sub-block is the same as Kb.
  • Condition (2): The temporary prediction mode Epred of the sub-block is the intra prediction mode.
  • When the conditions are satisfied, the high image quality block size determiner 114 updates the block size Eb of the four corresponding sub-blocks to Ob. When they are not satisfied, the high image quality block size determiner 114 maintains the block sizes Eb of the four corresponding sub-blocks.
  • Through the processing in step S202 described above, the block size of the EL image CUs for which the high image quality prediction mode determiner 113 selects the inter-layer prediction mode can be enlarged.
  • In step S204, the high image quality block size determiner 114 updates Kb to Kb × 2, and proceeds to step S202.
  • The high image quality block size determiner 114 can thereby minimize the number of CUs covering the image regions in the CTU of the EL image whose Epred is the intra prediction mode.
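The merging procedure of steps S202 and S204 can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the CTU is modeled as a grid of SCU-sized cells, each holding its provisional block size Eb and prediction mode Epred; Ob is assumed to be twice Kb (the size of the block formed by four merged sub-blocks); and the loop is assumed to stop once Kb reaches the LCU size. The function names `merge_intra_cus` and `count_cus` are hypothetical.

```python
def merge_intra_cus(eb, epred, scu=8, lcu=64):
    """Merge groups of four intra sub-blocks of size Kb into one block of size Ob.

    eb[y][x]    : provisional block size Eb of the CU covering SCU cell (x, y)
    epred[y][x] : provisional prediction mode Epred ("intra" or "inter") of that CU
    Returns a new grid of block sizes; the inputs are not modified.
    """
    n = lcu // scu                      # SCU cells per CTU side
    eb = [row[:] for row in eb]         # work on a copy
    kb = scu                            # start from the smallest CU size
    while kb < lcu:                     # assumed stop condition: Kb reached the LCU
        ob = kb * 2                     # assumed: Ob is the parent block size
        step = ob // scu                # cells per side of a candidate merged block
        for y0 in range(0, n, step):
            for x0 in range(0, n, step):
                cells = [(y, x)
                         for y in range(y0, y0 + step)
                         for x in range(x0, x0 + step)]
                # Conditions (1) and (2) must hold for all four sub-blocks,
                # i.e. for every cell of the candidate block.
                if all(eb[y][x] == kb and epred[y][x] == "intra" for y, x in cells):
                    for y, x in cells:
                        eb[y][x] = ob   # update Eb of the four sub-blocks to Ob
        kb = ob                         # step S204: Kb <- Kb x 2
    return eb


def count_cus(eb, scu=8):
    """Count CUs: a CU of size e covers (e / scu)^2 cells of the grid."""
    return round(sum((scu * scu) / (e * e) for row in eb for e in row))
```

Calling `merge_intra_cus` on a CTU whose cells are all 8×8 intra CUs collapses it into a single 64×64 CU, while a single non-intra cell blocks merging only along its own chain of ancestors in the quadtree.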
  • When the prediction mode of the BL CU in the image region corresponding to the CU of the EL image is the intra prediction mode, the high image quality prediction mode determiner 113 sets the inter-layer prediction mode as the prediction mode of the EL image CU. Therefore, the video encoding apparatus according to the present embodiment can process the image area of the inter-layer prediction mode included in the CTU of the EL image with the minimum number of CUs.
  • The CTU can be predicted between layers with the largest possible block size, rather than with a fine block size.
  • With inter-layer prediction using a large block size, it is possible to increase the compression efficiency by reducing the number of bits in the CTU parameter group. That is, the upper layer can be compressed with higher image quality than in the first embodiment.
  • The preferred relationship among the SCU and LCU sizes of the lower layer (LL_SCU, LL_LCU) and of the enhancement layer (EL_SCU: Enhancement Layer SCU, EL_LCU: Enhancement Layer LCU) is supplementarily described below.
  • The block size determination means (the block size determiner 112 and the high image quality block size determiner 114) in each of the above embodiments determines the block size of the upper layer image area using the block size of the lower layer image area. For this reason, the relationship that EL_LCU is greater than or equal to RR × LL_LCU must be satisfied. That is, the video encoding device of each of the above embodiments may be provided with means for automatically setting the value of EL_LCU (or LL_LCU) according to the set LL_LCU (or EL_LCU) so as to satisfy the relationship between LL_LCU and EL_LCU.
  • Similarly, the SCU size must satisfy the relationship that EL_SCU is greater than or equal to RR × LL_SCU. That is, the video encoding device of each of the above embodiments may be provided with means for automatically setting the value of EL_SCU (or LL_SCU) according to the set LL_SCU (or EL_SCU) so as to satisfy the relationship between LL_SCU and EL_SCU.
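The size constraints stated above (the enhancement layer size must be at least RR times the lower layer size, for both the LCU and the SCU) and the automatic setting they suggest can be sketched as below. This is a hedged illustration only: the text does not define RR in this section, so treating it as a numeric scale factor between layers is an assumption, as are the function names and the set of allowed block sizes (8 to 64, matching the SCU and LCU sizes used in the example of the second embodiment).

```python
def satisfies_size_relation(ll_size, el_size, rr):
    """Check the stated relation: EL size >= RR x LL size."""
    return el_size >= rr * ll_size


def auto_el_size(ll_size, rr, allowed=(8, 16, 32, 64)):
    """Return the smallest allowed EL block size that satisfies the relation,
    or None when no allowed size is large enough.

    `allowed` is an assumed set of legal block sizes for this sketch.
    """
    for size in sorted(allowed):
        if satisfies_size_relation(ll_size, size, rr):
            return size
    return None
```

The same helper applies to either the LCU or the SCU; when it returns `None`, the lower layer size would have to be reduced instead, which corresponds to setting LL_LCU (or LL_SCU) from the enhancement layer value.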
  • The block size determination means may divide the upper layer image area into the minimum number of blocks such that the prediction is the same as with the block size of the lower layer image area.
  • Although each of the above embodiments can be configured by a hardware circuit, it can also be realized by a computer program.
  • The information processing system shown in FIG. 7 includes a processor 1001, a program memory 1002, a storage medium 1003 for storing video data, and a storage medium 1004 for storing a bitstream.
  • The storage medium 1003 and the storage medium 1004 may be separate storage media, or may be storage areas within the same storage medium.
  • A magnetic storage medium such as a hard disk can be used as the storage medium.
  • The program memory 1002 stores a program for realizing the function of each block shown in FIG. 1B or FIG. The processor 1001 then executes processing according to the program stored in the program memory 1002, thereby realizing the function of the video encoding device shown in FIG. 1B or the function of the video encoding device shown in FIG.
  • FIG. 8 (b) is a block diagram showing a main part of a video encoding apparatus based on block division of a quadtree structure according to the present invention.
  • The video encoding apparatus includes: block size determining means 11 (for example, realized by the block size determiner 112 shown in FIG. 1 or the high image quality block size determiner 114 shown in FIG.) for determining the block size of the upper layer image area, using the block size of the lower layer image area corresponding to the upper layer image area, in order to encode the upper layer image area; prediction mode determination means 12 (for example, realized by the high image quality prediction mode determiner 113 shown in FIGS.) for determining the prediction mode of the upper layer image region using the prediction mode of the lower layer image region; inter-layer prediction means 13 (for example, realized by the predictor 102B shown in FIGS. 1 to 4) that performs inter-layer prediction using the reconstructed image of the lower layer; and inter prediction means 14 (for example, realized by the predictor 102B shown in FIGS. 1 and 4B) that performs inter prediction using the reconstructed image of the upper layer.
  • When the prediction mode of the lower layer image region is the intra prediction mode, the prediction mode determination means 12 determines the prediction mode of the upper layer image region as the inter-layer prediction mode.
  • 11 Block size determination means
  • 12 Prediction mode determination means
  • 13 Inter-layer prediction means
  • 14 Inter prediction means
  • 100A BL encoder
  • 100B, 200B EL encoder
  • 101A, 101B Estimator
  • 102A, 102B Predictor
  • 103A, 103B Frequency converter
  • 104A, 104B Quantizer
  • 105A, 105B Inverse quantization / inverse frequency converter
  • 106A, 106B Buffer
  • 107A, 107B Entropy encoder
  • 108 Upsampler
  • 109 Downsampler
  • 110 Multiplexer
  • 111 Prediction mode determiner
  • 112 Block size determiner
  • 113 High image quality prediction mode determiner
  • 114 High image quality block size determiner
  • 1001 Processor
  • 1002 Program memory
  • 1003, 1004 Storage medium

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video image encoding device according to the present invention includes: block size determination means 11 for determining the block size of an upper layer image area, using the block size of a lower layer image area corresponding to the upper layer image area, in order to encode the upper layer image area; prediction mode determination means 12 for determining the prediction mode for the upper layer image area using the prediction mode for the lower layer image area corresponding to the upper layer image area; inter-layer prediction means 13 for performing inter-layer prediction using a lower layer reconstructed image; and inter prediction means 14 for performing inter prediction using an upper layer reconstructed image. When the prediction mode for the lower layer image area is an intra prediction mode, the prediction mode determination means determines the prediction mode for the upper layer image area to be the inter-layer prediction mode.
PCT/JP2015/006395 2015-03-09 2015-12-22 Dispositif de codage d'image vidéo, procédé de codage d'image vidéo, et programme de codage d'image vidéo WO2016142977A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2017504305A JPWO2016142977A1 (ja) 2015-03-09 2015-12-22 映像符号化装置、映像符号化方法および映像符号化プログラム

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2015045619 2015-03-09
JP2015-045619 2015-03-09

Publications (1)

Publication Number Publication Date
WO2016142977A1 (fr) 2016-09-15

Family

ID=56880207

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/006395 WO2016142977A1 (fr) 2015-03-09 2015-12-22 Dispositif de codage d'image vidéo, procédé de codage d'image vidéo, et programme de codage d'image vidéo

Country Status (2)

Country Link
JP (1) JPWO2016142977A1 (fr)
WO (1) WO2016142977A1 (fr)

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
H. SCHWARZ ET AL.: "Description of scalable video coding technology proposal by Fraunhofer HHI (Configuration A)", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG 16 WP3 AND ISO/IEC JTC1/SC29/WG11 11TH MEETING, 10 October 2012 (2012-10-10), Shanghai, CN, pages 6 - 29, XP030054922 *
JIANLE CHEN ET AL.: "Description of scalable video coding technology proposal by Qualcomm (configuration 2)", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG 16 WP3 AND ISO/IEC JTC1/SC29/WG11 11TH MEETING, 10 October 2012 (2012-10-10), Shanghai, CN, pages 1 - 21, XP030054931 *
SEBASTIEN LASSERRE ET AL.: "Description of the scalable video coding technology proposal by Canon Research Centre France", JOINT COLLABORATIVE TEAM ON VIDEO CODING(JCT-VC) OF ITU-T SG 16 WP3 AND ISO/IEC JTC1/SC29/WG11 11TH MEETING, 10 October 2012 (2012-10-10), Shanghai, CN, pages 35 - 42,60-63, XP030054873 *

Also Published As

Publication number Publication date
JPWO2016142977A1 (ja) 2017-12-21

Similar Documents

Publication Publication Date Title
JP6526292B2 (ja) 符号化装置、復号装置、符号化方法、復号方法、及びプログラム
JP4991699B2 (ja) ビデオ信号のスケーラブルなエンコーディング方法およびデコーディング方法
JP5504336B2 (ja) スケーラブルビデオ符号化方法、符号器及びコンピュータプログラム
US20150010081A1 (en) Apparatus and method for encoding/decoding images for intra-prediction
WO2010004939A1 (fr) Dispositif de codage d'image, dispositif de décodage d'image, procédé de codage d'image et procédé de décodage d'image
US20060233250A1 (en) Method and apparatus for encoding and decoding video signals in intra-base-layer prediction mode by selectively applying intra-coding
US20150085933A1 (en) Method and apparatus for encoding multi-view images, and method and apparatus for decoding multi-view images
JP6495268B2 (ja) パラメータセット内のビューidビット深度のシグナリング
US20200228831A1 (en) Intra prediction mode based image processing method, and apparatus therefor
WO2012167539A1 (fr) Procédé et dispositif de traitement de modes de prédiction intra-trame
JP2016501483A (ja) ビデオコーディングにおけるhevc拡張用の多重レイヤの低複雑度サポート
KR20130107861A (ko) 인터 레이어 인트라 예측 방법 및 장치
JP2022513457A (ja) Vvcにおける色変換のための方法及び機器
CN110999290A (zh) 使用跨分量线性模型进行帧内预测的方法和装置
US11706449B2 (en) Method and device for intra-prediction
JP2022535859A (ja) Mpmリストを構成する方法、クロマブロックのイントラ予測モードを取得する方法、および装置
JP6479776B2 (ja) リサンプリングプロセスにおける中間データのダイナミックレンジ制御
WO2015190078A1 (fr) Dispositif de codage vidéo, procédé de codage vidéo et support d'enregistrement
JP2017073598A (ja) 動画像符号化装置、動画像符号化方法及び動画像符号化用コンピュータプログラム
WO2016142977A1 (fr) Dispositif de codage d'image vidéo, procédé de codage d'image vidéo, et programme de codage d'image vidéo
US10743009B2 (en) Image processing apparatus and image processing method
KR20200004348A (ko) 타겟 영역 수정을 통해 비디오 신호를 처리하는 방법 및 장치
KR20130055317A (ko) 순차적으로 후보 블록을 선정하여 블록 병합을 이용한 영상 부호화/복호화 장치 및 방법
JP6635197B2 (ja) 映像符号化装置、映像符号化方法およびプログラム
CN116998153A (zh) 基于多个预测模式的交叉通道预测

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15884475

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017504305

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15884475

Country of ref document: EP

Kind code of ref document: A1