WO2013161689A1 - Moving picture decoding device and moving picture encoding device - Google Patents

Moving picture decoding device and moving picture encoding device

Info

Publication number
WO2013161689A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
layer
decoding
unit
frame
Prior art date
Application number
PCT/JP2013/061588
Other languages
English (en)
Japanese (ja)
Inventor
Hisao Kumai (久雄 熊井)
Tomoyuki Yamamoto (山本 智幸)
Tomoko Aono (友子 青野)
Norio Ito (伊藤 典男)
Original Assignee
Sharp Kabushiki Kaisha (シャープ株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sharp Kabushiki Kaisha (シャープ株式会社)
Publication of WO2013161689A1

Classifications

    • H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/187 Adaptive coding characterised by the coding unit, the unit being a scalable video layer
    • H04N19/30 Coding using hierarchical techniques, e.g. scalability
    • H04N19/513 Motion estimation or motion compensation; processing of motion vectors
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/61 Transform coding in combination with predictive coding

Definitions

  • The present invention relates to a moving picture decoding apparatus that decodes encoded data that has been hierarchically encoded, and a moving picture encoding apparatus that generates encoded data by hierarchically encoding an image.
  • In order to transmit or record moving images efficiently, a moving image encoding device that generates encoded data by encoding a moving image, and a moving image decoding device (decoding device) that generates a decoded image by decoding that encoded data, are used.
  • Hierarchical encoding is used in which moving images are encoded hierarchically according to a required data rate.
  • Hierarchical coding systems include the ISO/IEC and ITU-T standard H.264/AVC Annex G Scalable Video Coding (SVC) (Non-Patent Document 1).
  • In SVC, encoded data can be organized into two layers (hierarchies): a base layer (lower layer) and an enhancement layer (upper layer).
  • SVC also supports spatial scalability, temporal scalability, and SNR scalability.
  • In spatial scalability, an image obtained by down-sampling an original image to a desired resolution is used as the lower layer and is encoded with H.264/AVC.
  • In the enhancement layer, inter-layer prediction is performed in order to remove redundancy between layers.
  • Examples of inter-layer prediction are motion information prediction, in which information related to motion prediction is predicted from lower-layer information at the same time instant, and prediction performed from an image obtained by up-sampling the decoded image of the lower layer at the same time instant (Non-Patent Document 2).
  • Non-Patent Document 3 describes a method adopted in HM (HEVC TestModel) software as an encoding method.
  • In hybrid transmission (Hybrid Delivery), a receiving terminal that has received data distributed over a plurality of routes can display the data received from those routes on the screen as a single piece of content by synchronizing, superimposing, synthesizing, or the like.
  • When data distributed by hybrid transmission is scalable coded (SVC), H.264/AVC is assumed as the encoding method for all layers. In practice, however, MPEG-2 or H.264/AVC may be used for the base layer while HEVC is used for the enhancement layer, so the encoding methods of the layers can differ.
  • The present invention has been made in view of the above-described problems, and an object thereof is to realize a moving picture decoding apparatus and the like that, in scalable coding, can use the encoding information used for encoding the base layer when encoding or decoding the enhancement layer, even when the base layer coding scheme and the enhancement layer coding scheme are different.
  • In order to solve the above problems, a video decoding device according to one aspect of the present invention is a video decoding device that decodes encoded data composed of a plurality of layers having different encoding methods, and includes: first layer motion vector decoding means for decoding a motion vector used for decoding a first layer of the plurality of layers; intermediate motion vector deriving means for deriving an intermediate motion vector based on the motion vector decoded by the first layer motion vector decoding means; and second layer motion vector deriving means for deriving, with reference to the intermediate motion vector derived by the intermediate motion vector deriving means, a motion vector used for decoding a second layer of the plurality of layers.
  • Since the second layer motion vector is derived based on the motion vector included in the first layer, the amount of code can be reduced and the encoding efficiency can be improved.
  • Moreover, since the intermediate motion vector deriving means is provided, it is possible to cope with cases where various encoding methods are used for the first layer and the second layer.
  • The intermediate motion vector deriving means may obtain the intermediate motion vector by converting the motion vector decoded by the first layer motion vector decoding means into a value normalized to a predetermined frame separation.
  • According to the above configuration, the motion vector of the first layer is converted into a format that does not depend on the encoding method of the first layer, so the motion vector of the first layer can be used for decoding the second layer even when the encoding method of the second layer differs from that of the first layer. Thereby, encoding efficiency can be improved.
  • Converting to a value normalized to a predetermined frame separation means converting the decoded motion vector into a value whose length corresponds to the time distance of a predetermined number of frames (for example, one frame) or a predetermined number of fields (for example, one field), as sketched below.
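  • As an illustration only (not the embodiment's implementation; the function name and per-component scaling are assumptions), this normalization can be sketched in Python:

        def normalize_to_one_frame(mv, td, tb=1.0):
            """mv: (x, y) motion vector decoded in the first layer.
            td: temporal distance between the frame the vector belongs to
                and the frame it references, in frames (or fields).
            tb: the predetermined separation, e.g. one frame.
            Returns the intermediate motion vector spanning tb."""
            mvx, mvy = mv
            return (mvx * tb / td, mvy * tb / td)

        # Example: a vector (+8, -4) that points 2 frames away becomes
        # (+4, -2) when normalized to a one-frame separation.
        assert normalize_to_one_frame((8, -4), td=2) == (4.0, -2.0)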
  • In the video decoding device, the first layer may include motion vectors derived using a plurality of fields for one frame, and the intermediate motion vector deriving means may derive the intermediate motion vector based on the motion vector derived using, among the plurality of fields of the first layer frame corresponding to the processing target frame of the second layer, the field corresponding to the processing target frame.
  • According to the above configuration, when motion vectors derived using a plurality of fields for one frame are included in the first layer, a motion vector can be derived from the corresponding field. Thereby, a motion vector can be derived appropriately.
  • The intermediate motion vector deriving means may derive the intermediate motion vector for the motion vector used for the decoding target frame of the second layer based on a motion vector in the first layer frame corresponding to the processing target frame.
  • According to the above configuration, the motion vector can be derived from the frame corresponding to the processing target frame. Thereby, a motion vector can be derived appropriately.
  • When no motion vector in the prediction direction required by the second layer is included in the first layer frame corresponding to the decoding target frame of the second layer, the intermediate motion vector deriving means may derive the intermediate motion vector based on a motion vector included in the reference frame nearest to that first layer frame in the encoding order.
  • According to the above configuration, the motion vector can be derived appropriately even in such a case.
  • A video decoding device according to another aspect of the present invention is a video decoding device that decodes encoded data composed of a plurality of layers having different encoding methods, and includes: first layer motion vector decoding means for decoding a motion vector used for decoding a first layer of the plurality of layers; and second layer motion vector deriving means for deriving a motion vector used for decoding a second layer of the plurality of layers based on the motion vector decoded by the first layer motion vector decoding means and the reference relationship of the reference frames of the second layer.
  • According to this configuration as well, since the second layer motion vector is derived based on the motion vector included in the first layer, the amount of code can be reduced and the encoding efficiency can be improved.
  • The first layer may include motion vectors derived using a plurality of fields for one frame, and the second layer motion vector deriving means may derive the second layer motion vector for the decoding target frame of the second layer based on the motion vector derived using, among the plurality of fields of the first layer frame corresponding to the processing target frame, the field corresponding to the processing target frame.
  • According to the above configuration, when motion vectors derived using a plurality of fields for one frame are included in the first layer, a motion vector can be derived from the corresponding field. Thereby, a motion vector can be derived appropriately.
  • The second layer motion vector deriving means may derive the second layer motion vector for the decoding target frame of the second layer based on a motion vector in the first layer frame corresponding to the processing target frame.
  • According to the above configuration, the motion vector can be derived from the frame corresponding to the processing target frame. Thereby, a motion vector can be derived appropriately.
  • A video encoding apparatus according to one aspect of the present invention is a video encoding apparatus that generates encoded data composed of a plurality of layers having different encoding methods, each layer including a prediction residual that is the difference between an original image and a predicted image, and includes: intermediate motion vector deriving means for deriving an intermediate motion vector based on a motion vector used for decoding of the first layer; and second layer motion vector deriving means for deriving, with reference to the intermediate motion vector derived by the intermediate motion vector deriving means, a motion vector used for generating the predicted image for generating the second layer encoded data.
  • Since the motion vector of the second layer is derived based on the motion vector used for decoding of the first layer, the amount of code can be reduced and the encoding efficiency can be improved. Further, by deriving the intermediate motion vector, it is possible to cope with cases where various encoding methods are used for the first layer and the second layer.
  • A video encoding apparatus according to another aspect of the present invention is a video encoding apparatus that generates encoded data composed of a plurality of layers having different encoding methods, each layer including a prediction residual that is the difference between an original image and a predicted image, and includes second layer motion vector deriving means for deriving a motion vector used for generating the predicted image for generating the second layer encoded data, based on the motion vector used for decoding the first layer and the reference relationship of the reference frames of the second layer.
  • Since the second layer motion vector is derived based on the motion vector included in the first layer, the amount of code can be reduced and the encoding efficiency can be improved.
  • As described above, the moving picture decoding apparatus according to the present invention is a moving picture decoding apparatus that decodes encoded data composed of a plurality of layers having different encoding schemes, and includes: first layer motion vector decoding means for decoding a motion vector used for decoding a first layer of the plurality of layers; intermediate motion vector deriving means for deriving an intermediate motion vector based on the motion vector decoded by the first layer motion vector decoding means; and second layer motion vector deriving means for deriving, with reference to the intermediate motion vector derived by the intermediate motion vector deriving means, a motion vector used for decoding a second layer of the plurality of layers.
  • Since the second layer motion vector is derived based on the motion vector included in the first layer, the amount of code can be reduced and the encoding efficiency can be improved.
  • Moreover, since the intermediate motion vector deriving means is provided, there is an effect that it is possible to cope with cases where various encoding methods are used for the first layer and the second layer.
  • Likewise, the moving picture decoding apparatus according to the present invention is a moving picture decoding apparatus that decodes encoded data composed of a plurality of layers having different encoding schemes, and includes second layer motion vector deriving means for deriving a motion vector used for decoding a second layer of the plurality of layers based on a motion vector applied to a first layer of the plurality of layers.
  • Since the second layer motion vector is derived based on the motion vector included in the first layer, the amount of code can be reduced and the encoding efficiency can be improved.
  • The moving image encoding apparatus according to the present invention is a video encoding apparatus that generates encoded data composed of a plurality of layers having different encoding methods, each layer including a prediction residual that is the difference between an original image and a predicted image, and includes: intermediate motion vector deriving means for deriving an intermediate motion vector based on a motion vector used for decoding of the first layer; and second layer motion vector deriving means for deriving, with reference to the intermediate motion vector derived by the intermediate motion vector deriving means, a motion vector used for generating the predicted image for generating the second layer encoded data.
  • The video encoding apparatus according to the present invention is a video encoding apparatus that generates encoded data composed of a plurality of layers having different encoding methods, each layer including a prediction residual that is the difference between an original image and a predicted image, and includes second layer motion vector deriving means for deriving a motion vector used for generating the predicted image for generating the second layer encoded data, based on a motion vector used for decoding the first layer.
  • Since the second layer motion vector is derived based on the motion vector included in the first layer, the amount of code can be reduced and the encoding efficiency can be improved.
  • FIG. 1 is a block diagram showing the main configuration of the moving image decoding apparatus according to an embodiment of the present invention, followed by a diagram for explaining the outline of the embodiment.
  • Another figure illustrates the configuration of encoded data according to an embodiment of the present invention, where (a) shows the configuration of the picture layer of the encoded data, (b) shows the configuration of the slice layer included in the picture layer, (c) shows the configuration of the macroblock layer included in the slice layer, and (d) shows the configuration of the block layer included in the macroblock layer.
  • A further figure explains field prediction and frame prediction in a frame structure, where (a) shows field prediction and (b) shows frame prediction.
  • One figure shows the configurations of a transmitting apparatus equipped with the moving picture encoding apparatus in (a) and a receiving apparatus equipped with the moving picture decoding apparatus in (b); another figure shows the configurations of a recording apparatus equipped with the moving picture encoding apparatus in (a) and a reproduction apparatus equipped with the moving picture decoding apparatus in (b).
  • (Embodiment 1) The moving picture decoding apparatus 1 according to the present embodiment can, when decoding encoded data that has been scalable coded (SVC), use the encoding information used when decoding the base layer (first layer) when decoding the enhancement layer (second layer), even if the encoding method of the base layer and the encoding method of the enhancement layer are different.
  • That is, the base layer motion prediction information and prediction mode information are converted into a format that does not depend on the base layer encoding method, so that they can be referred to as enhancement layer motion vectors without performing processing dependent on the base layer encoding method in the enhancement layer.
  • The following describes a case where, in transmitting ultra-high-definition video (moving image, 4k video data), the video is scalable coded as follows: in the base layer, the 4k video data is down-scaled and interlaced, encoded with MPEG-2 or H.264/AVC, and transmitted over the television broadcasting network; in the enhancement layer, the 4k video is encoded with HEVC (progressive) and transmitted over the Internet.
  • However, the encoding methods are not limited to these.
  • (Encoded data #1 (a, b)) Prior to the detailed description of the devices, the data structure of the encoded data #1 (a, b), generated by the video encoding device 2 and decoded by the video decoding device 1, will be described.
  • The encoded data #1 includes a base layer (encoded data #1b) and an enhancement layer (encoded data #1a).
  • The base layer and the enhancement layer may be supplied to the video decoding device 1 via different transmission paths, or may be supplied to the video decoding device 1 via the same transmission path.
  • For example, in hybrid transmission, a bit stream including the base layer (encoded data #1b) is transmitted by a broadcast wave, and a bit stream including the enhancement layer (encoded data #1a) is transmitted over an Internet communication network.
  • The base layer is encoded with, for example, MPEG-2 or H.264/AVC, and the enhancement layer is encoded with, for example, HEVC (High Efficiency Video Coding), the successor to H.264/MPEG-4 AVC.
  • FIG. 3 is a diagram showing a data structure (encoded data # 1a) of the enhancement layer of encoded data # 1.
  • the encoded data # 1a illustratively includes a sequence and a plurality of pictures constituting the sequence.
  • FIG. 3 shows a hierarchical structure of data in the encoded data # 1a.
  • FIGS. 3A to 3E respectively show the sequence layer defining the sequence SEQ, the picture layer defining a picture PICT, the slice layer defining a slice S, the tree block layer defining a tree block TBLK, and the CU layer defining a coding unit (CU) included in the tree block TBLK.
  • (Sequence layer) In the sequence layer, a set of data referred to by the video decoding device 1 for decoding a sequence SEQ to be processed (hereinafter also referred to as a target sequence) is defined.
  • The sequence SEQ includes a sequence parameter set SPS (Sequence Parameter Set), a picture parameter set PPS (Picture Parameter Set), an adaptive parameter set APS (Adaptation Parameter Set), pictures PICT1 to PICTNP (NP is the total number of pictures included in the sequence SEQ), and supplemental enhancement information SEI (Supplemental Enhancement Information).
  • sequence parameter set SPS a set of encoding parameters referred to by the video decoding device 1 for decoding the target sequence is defined.
  • In the picture parameter set PPS, a set of encoding parameters referred to by the video decoding device 1 for decoding each picture in the target sequence is defined.
  • A plurality of PPSs may exist; in that case, one of the plurality of PPSs is selected for each picture in the target sequence.
  • The adaptive parameter set APS defines a set of encoding parameters that the moving image decoding apparatus 1 refers to in order to decode each slice in the target sequence. A plurality of APSs may exist; in that case, one of the plurality of APSs is selected for each slice in the target sequence.
  • (Picture layer) In the picture layer, a set of data referred to by the video decoding device 1 for decoding a picture PICT to be processed (hereinafter also referred to as a target picture) is defined. As shown in FIG. 3B, the picture PICT includes a picture header PH and slices S1 to SNS (NS is the total number of slices included in the picture PICT).
  • the picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine a decoding method of the target picture.
  • the encoding parameter group is not necessarily included directly in the picture header PH, and may be included indirectly, for example, by including a reference to the picture parameter set PPS.
  • (Slice layer) In the slice layer, a set of data referred to by the video decoding device 1 for decoding a slice S to be processed (also referred to as a target slice) is defined. As shown in FIG. 3C, the slice S includes a slice header SH and a sequence of tree blocks TBLK1 to TBLKNC (NC is the total number of tree blocks included in the slice S).
  • the slice header SH includes a coding parameter group that the moving image decoding apparatus 1 refers to in order to determine a decoding method of the target slice.
  • Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH.
  • Slice types that can be designated by the slice type designation information include (1) an I slice that uses only intra prediction at the time of encoding, (2) a P slice that uses unidirectional prediction or intra prediction at the time of encoding, and (3) a B slice that uses unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding.
  • the slice header SH may include a reference to the picture parameter set PPS (pic_parameter_set_id) and a reference to the adaptive parameter set APS (aps_id) included in the sequence layer.
  • the slice header SH includes an ALF parameter FP that is referred to by an adaptive filter provided in the video decoding device 1. Details of the ALF parameter FP will be described later.
  • (Tree block layer) In the tree block layer, a set of data referred to by the video decoding device 1 for decoding a tree block TBLK to be processed (hereinafter also referred to as a target tree block) is defined. Note that the tree block may also be referred to as a coding tree block (CTB) or a largest coding unit (LCU).
  • The tree block TBLK includes a tree block header TBLKH and coding unit information CU1 to CUNL (NL is the total number of pieces of coding unit information included in the tree block TBLK).
  • The tree block TBLK is divided into partitions that specify the block size for each process of intra prediction, inter prediction, and transform.
  • the above partition of the tree block TBLK is divided by recursive quadtree partitioning.
  • the tree structure obtained by this recursive quadtree partitioning is hereinafter referred to as a coding tree.
  • a partition corresponding to a leaf that is a node at the end of the coding tree is referred to as a coding node.
  • Since the coding node is the basic unit of the encoding process, the coding node is hereinafter also referred to as a coding unit (CU).
  • The coding unit information (hereinafter referred to as CU information) CU1 to CUNL is information corresponding to each coding node (coding unit) obtained by recursively dividing the tree block TBLK into quadtrees.
  • the root of the coding tree is associated with the tree block TBLK.
  • the tree block TBLK is associated with the highest node of the tree structure of the quadtree partition that recursively includes a plurality of encoding nodes.
  • The size of each coding node is half the height and width of the coding node to which it directly belongs (that is, the partition of the node one layer above that coding node).
  • The size of the tree block TBLK and the sizes that each coding node can take depend on the size designation information of the minimum coding node and on the difference in hierarchical depth between the maximum and minimum coding nodes, both included in the sequence parameter set SPS of the encoded data #1. For example, when the size of the minimum coding node is 8 × 8 pixels and the difference in hierarchical depth between the maximum and minimum coding nodes is 3, the size of the tree block TBLK is 64 × 64 pixels, and the size of a coding node can take any of four sizes: 64 × 64, 32 × 32, 16 × 16, and 8 × 8 pixels. A worked sketch of this relationship follows.
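  • The size relationship just described can be checked with the following minimal Python sketch (the function name is illustrative, not part of the embodiment):

        def tblk_and_cu_sizes(min_cu_size, depth_diff):
            """Tree block size and allowed CU sizes from the SPS values."""
            tblk_size = min_cu_size << depth_diff            # each level doubles the size
            cu_sizes = [min_cu_size << d for d in range(depth_diff + 1)]
            return tblk_size, cu_sizes

        # Minimum node 8x8 with a depth difference of 3 gives a 64x64 tree
        # block and CU sizes of 8, 16, 32 or 64, matching the example above.
        print(tblk_and_cu_sizes(8, 3))  # (64, [8, 16, 32, 64])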
  • The tree block header TBLKH includes encoding parameters referred to by the video decoding device 1 in order to determine the decoding method of the target tree block. Specifically, as shown in FIG. 3D, it includes tree block division information SP_TBLK that specifies the division pattern of the target tree block into CUs, and a quantization parameter difference Δqp (qp_delta) that specifies the size of the quantization step.
  • The tree block division information SP_TBLK is information representing the coding tree for dividing the tree block; specifically, it is information specifying the shape and size of each CU included in the target tree block and its position within the target tree block.
  • However, the tree block division information SP_TBLK need not explicitly include the shapes or sizes of the CUs.
  • For example, the tree block division information SP_TBLK may be a set of flags indicating whether the entire target tree block or a partial region of the tree block is to be divided into four, as sketched below. In that case, the shape and size of each CU can be specified by additionally using the shape and size of the tree block.
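  • As an illustration only (the flag order is an assumption), such a set of split flags can be interpreted with a short recursive Python routine:

        def parse_coding_tree(read_flag, x, y, size, min_size, cus):
            """read_flag() yields the next split flag; collects (x, y, size)
            CUs in depth-first, raster order."""
            if size > min_size and read_flag():
                half = size // 2
                for dy in (0, half):
                    for dx in (0, half):
                        parse_coding_tree(read_flag, x + dx, y + dy, half, min_size, cus)
            else:
                cus.append((x, y, size))

        flags = iter([1, 0, 0, 0, 1, 0, 0, 0, 0])   # split the root, then its 4th child
        cus = []
        parse_coding_tree(lambda: next(flags), 0, 0, 64, 8, cus)
        print(cus)  # three 32x32 CUs followed by four 16x16 CUs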
  • The quantization parameter difference Δqp is the difference qp − qp′ between the quantization parameter qp in the target tree block and the quantization parameter qp′ in the tree block encoded immediately before the target tree block.
  • (CU layer) In the CU layer, a set of data referred to by the video decoding device 1 for decoding a CU to be processed (hereinafter also referred to as a target CU) is defined.
  • Here, the coding node is the root node of a prediction tree (PT) and a transform tree (TT).
  • In the prediction tree, the coding node is divided into one or a plurality of prediction blocks, and the position and size of each prediction block are defined.
  • the prediction block is one or a plurality of non-overlapping areas constituting the encoding node.
  • the prediction tree includes one or a plurality of prediction blocks obtained by the above division.
  • Prediction processing is performed for each prediction block.
  • a prediction block that is a unit of prediction is also referred to as a prediction unit (PU).
  • There are roughly two types of division in the prediction tree: the case of intra prediction and the case of inter prediction.
  • In the case of inter prediction, the division methods are 2N × 2N (the same size as the coding node), 2N × N, 2N × nU, 2N × nD, N × 2N, nL × 2N, nR × 2N, and N × N.
  • 2N × nU indicates that the 2N × 2N coding node is divided into two regions of 2N × 0.5N and 2N × 1.5N in order from the top.
  • 2N × nD indicates that the 2N × 2N coding node is divided into two regions of 2N × 1.5N and 2N × 0.5N in order from the top.
  • nL × 2N indicates that the 2N × 2N coding node is divided into two regions of 0.5N × 2N and 1.5N × 2N in order from the left.
  • nR × 2N indicates that the 2N × 2N coding node is divided into two regions of 1.5N × 2N and 0.5N × 2N in order from the left. These modes are summarized in the sketch below.
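  • For illustration (the mode names follow the list above; the function itself is an assumption), the region sizes produced by each mode for a 2N × 2N node can be tabulated in Python:

        def pu_regions(mode, two_n):
            """Returns the (width, height) of each PU for the given mode."""
            n = two_n // 2
            q, tq = two_n // 4, 3 * two_n // 4      # 0.5N and 1.5N
            return {
                '2Nx2N': [(two_n, two_n)],
                '2NxN':  [(two_n, n)] * 2,
                'Nx2N':  [(n, two_n)] * 2,
                'NxN':   [(n, n)] * 4,
                '2NxnU': [(two_n, q), (two_n, tq)], # upper 0.5N, lower 1.5N
                '2NxnD': [(two_n, tq), (two_n, q)], # upper 1.5N, lower 0.5N
                'nLx2N': [(q, two_n), (tq, two_n)], # left 0.5N, right 1.5N
                'nRx2N': [(tq, two_n), (q, two_n)], # left 1.5N, right 0.5N
            }[mode]

        print(pu_regions('2NxnU', 32))  # [(32, 8), (32, 24)] for a 32x32 node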
  • In the transform tree, the coding node is divided into one or a plurality of transform blocks, and the position and size of each transform block are defined.
  • The transform block is one or a plurality of non-overlapping areas constituting the coding node.
  • The transform tree includes one or a plurality of transform blocks obtained by the above division.
  • Divisions in the transform tree include one in which an area of the same size as the coding node is assigned as a transform block, and one by recursive quadtree division, as in the division of the tree block described above.
  • Transform processing is performed for each transform block.
  • the transform block which is a unit of transform is also referred to as a transform unit (TU).
  • the CU information CU specifically includes a skip flag SKIP, PT information PTI, and TT information TTI.
  • The skip flag SKIP is a flag indicating whether or not the skip mode is applied to the target CU.
  • When the value of the skip flag SKIP is 1, that is, when the skip mode is applied to the target CU, part of the PT information PTI and the TT information TTI in the CU information CU is omitted. Note that the skip flag SKIP is omitted for I slices.
  • the PT information PTI is information regarding the PT included in the CU.
  • the PT information PTI is a set of information related to each of one or more PUs included in the PT, and is referred to when the moving image decoding apparatus 1 generates a predicted image.
  • the PT information PTI includes prediction type information PType and prediction information PInfo.
  • Prediction type information PType is information that specifies whether intra prediction or inter prediction is used as a prediction image generation method for the target PU.
  • the prediction information PInfo is composed of intra prediction information or inter prediction information depending on which prediction method is specified by the prediction type information PType.
  • a PU to which intra prediction is applied is also referred to as an intra PU
  • a PU to which inter prediction is applied is also referred to as an inter PU.
  • the prediction information PInfo includes information specifying the shape, size, and position of the target PU. As described above, the generation of the predicted image is performed in units of PU. Details of the prediction information PInfo will be described later.
  • TT information TTI is information related to TT included in the CU.
  • the TT information TTI is a set of information regarding each of one or a plurality of TUs included in the TT, and is referred to when the moving image decoding apparatus 1 decodes residual data.
  • a TU may be referred to as a block.
  • The TT information TTI includes TT division information SP_TT that specifies the division pattern of the target CU into transform blocks, and quantized prediction residuals QD1 to QDNT (NT is the total number of blocks included in the target CU).
  • TT division information SP_TT is information for determining the shape and size of each TU included in the target CU and the position in the target CU.
  • the TT division information SP_TT can be configured by a set of information (split_transform_unit_flag) indicating whether or not the target node is divided.
  • Each TU obtained by the division can have a size from 32 × 32 pixels down to 4 × 4 pixels.
  • Each quantized prediction residual QD is encoded data generated by the moving image encoding apparatus 2 performing the following Processes 1 to 3 on a target block, which is the block being processed; a toy sketch of these processes follows the list.
  • Process 1: apply a DCT (Discrete Cosine Transform) to the prediction residual obtained by subtracting the predicted image from the encoding target image;
  • Process 2: quantize the transform coefficients obtained in Process 1;
  • Process 3: variable-length encode the transform coefficients quantized in Process 2.
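  • The following toy Python sketch illustrates Processes 1 and 2 (not the actual encoder; the flat quantization step and the naive DCT are illustrative assumptions, and Process 3 is left as a stub):

        import numpy as np

        def dct2(block):
            """Naive orthonormal 2-D DCT-II of a square block."""
            n = block.shape[0]
            k = np.arange(n)
            c = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
            c[0] *= np.sqrt(0.5)
            c *= np.sqrt(2.0 / n)
            return c @ block @ c.T

        def encode_block(target, prediction, qstep=16):
            residual = target - prediction                  # difference signal
            coeffs = dct2(residual.astype(float))           # Process 1
            levels = np.round(coeffs / qstep).astype(int)   # Process 2
            return levels                                   # Process 3 would VLC-encode these

        target = np.random.randint(0, 256, (8, 8))
        pred = np.full((8, 8), int(target.mean()))
        print(encode_block(target, pred))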
  • (Prediction information PInfo) As described above, there are two types of prediction information PInfo: inter prediction information and intra prediction information.
  • the inter prediction information includes an encoding parameter that is referred to when the video decoding device 1 generates an inter predicted image by inter prediction. More specifically, the inter prediction information includes inter PU division information that specifies a division pattern of the target CU into each inter PU, and inter prediction parameters for each inter PU.
  • the inter prediction parameters include a reference image index, an estimated motion vector index, and a motion vector residual.
  • the intra prediction information includes an encoding parameter that is referred to when the video decoding device 1 generates an intra predicted image by intra prediction. More specifically, the intra prediction information includes intra PU division information that specifies a division pattern of the target CU into each intra PU, and intra prediction parameters for each intra PU.
  • the intra prediction parameter is a parameter for designating an intra prediction method (prediction mode) for each intra PU.
  • FIG. 4 is a diagram showing the data structure of the base layer of encoded data # 1 (encoded data # 1b).
  • The encoded data #1b includes, for example, a sequence and GOPs (Groups of Pictures), each composed of a plurality of pictures, constituting the sequence.
  • Fig. 4 shows the structure of the hierarchy below the picture layer.
  • 4A to 4D are diagrams showing the structures of the picture layer P, the slice layer S, the macroblock layer MB, and the block layer B, respectively.
  • the picture layer P is a set of data referred to by the video decoding device 1 in order to decode the corresponding picture. As shown in FIG. 4A, the picture layer P includes a picture header PH and slice layers S1 to SNs (Ns is the total number of slice layers included in the picture layer P).
  • The picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine the decoding method of the corresponding picture. For example, a number (temporal reference) indicating the display order of the image, used in encoding by the moving image encoding device 2, and a code (picture type) distinguishing I pictures, P pictures, and B pictures are examples of the encoding parameters included in the picture header PH.
  • Each slice layer S included in the picture layer P is a set of data referred to by the video decoding device 1 in order to decode the corresponding slice.
  • the slice layer S includes macroblock layers MB1 to MBNm (Nm is the total number of macroblocks included in the slice S).
  • Each macroblock layer MB included in the slice layer S is a set of data referred to by the video decoding device 1 in order to decode the corresponding macroblock.
  • The macroblock layer MB corresponds to a square pixel block of 16 pixels × 16 lines and is composed of luminance blocks Y1 to Y4, which are subdivided into 8-pixel × 8-line blocks serving as DCT processing units, and two corresponding color difference blocks Cb and Cr. The macroblock corresponds to two 8-pixel × 8-line color difference blocks when the color difference format of the encoded image is 4:2:0, to two 8-pixel × 16-line color difference blocks when it is 4:2:2, and to two 16-pixel × 16-line color difference blocks when it is 4:4:4, as summarized in the sketch below.
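  • The composition just described can be summarized in Python (illustrative only; the data layout is an assumption):

        CHROMA_BLOCK_DIMS = {      # (pixels, lines) of each of the two Cb/Cr blocks
            '4:2:0': (8, 8),
            '4:2:2': (8, 16),
            '4:4:4': (16, 16),
        }

        def macroblock_layout(chroma_format):
            luma = [('Y%d' % i, (8, 8)) for i in range(1, 5)]
            cw, cl = CHROMA_BLOCK_DIMS[chroma_format]
            return luma + [('Cb', (cw, cl)), ('Cr', (cw, cl))]

        print(macroblock_layout('4:2:2'))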
  • Each block layer B included in the macroblock layer MB is usually composed of data that has been DCT transformed and quantized.
  • (Moving picture decoding apparatus 1) The moving picture decoding apparatus 1 partly includes methods employed in MPEG-2 and H.264/MPEG-4 AVC, methods adopted in KTA software, which is a codec for joint development in VCEG (Video Coding Expert Group), methods adopted in its successor codec, the TMuC (Test Model under Consideration) software, and technology employed in the HM (HEVC TestModel) software.
  • FIG. 1 is a block diagram showing a configuration of the moving picture decoding apparatus 1.
  • the moving image decoding apparatus 1 includes an enhancement layer decoding unit (HEVC) 10 and a base layer decoding unit (MPEG-2) 11.
  • The enhancement layer decoding unit 10 includes a variable length decoding unit 13, an inverse orthogonal transform/inverse quantization unit 14, an in-loop filter 15, an intra prediction unit 16, an expansion/IP conversion unit 17, a frame memory 18, an inter prediction unit 19, a motion information conversion processing unit 20, a motion information storage unit 21, a motion information normalization unit 22, a selection unit 23, and an adder 24.
  • the enhancement layer decoding unit 10 performs a decoding process based on the decoded image and motion information decoded by the base layer decoding unit 11.
  • the base layer decoding unit 11 includes a variable length decoding unit 31, an inverse orthogonal transform / inverse quantization unit 32, a frame memory 33, a motion compensation unit 34, and an adder 35.
  • the base layer decoding unit 11 generates a decoded image based on motion information and intra prediction information decoded from the encoded data # 1b.
  • The video decoding device 1 decodes the base layer of the encoded data #1 with the base layer decoding unit 11 and the enhancement layer with the enhancement layer decoding unit 10, and the enhancement layer decoding unit 10 performs decoding using the encoded information (described later) decoded by the base layer decoding unit 11.
  • The variable length decoding unit 13 decodes, from the encoded data #1a, the quantized prediction residual QD for each block and the quantization parameter difference Δqp for the tree block including that block, and supplies them to the inverse orthogonal transform/inverse quantization unit 14. The variable length decoding unit 13 also decodes the prediction parameters PP related to each partition from the encoded data #1a; that is, for each inter prediction partition, it decodes motion information and mode information such as the reference image index RI, the estimated motion vector index PMVI, and the motion vector residual MVD from the encoded data #1a, and supplies them to the inter prediction unit 19.
  • The inverse orthogonal transform/inverse quantization unit 14 (1) inversely quantizes the quantized prediction residual QD, (2) applies an inverse DCT (Discrete Cosine Transform) to the DCT coefficients obtained by the inverse quantization, and (3) supplies the prediction residual D obtained by the inverse DCT to the adder 24.
  • When inversely quantizing the quantized prediction residual QD, the inverse orthogonal transform/inverse quantization unit 14 derives the quantization step QP from the quantization parameter difference Δqp supplied from the variable length decoding unit 13.
  • the generation of the prediction residual D by the inverse orthogonal transform / inverse quantization unit 14 is performed in units of blocks (transform units).
  • the in-loop filter 15 subjects the decoded image supplied from the adder 24 to deblocking processing and filtering processing using adaptive filter parameters.
  • The intra prediction unit 16 generates a predicted image for each intra prediction partition. Specifically, a predicted image is generated from the intra prediction information decoded from the encoded data #1 and the decoded image of the base layer supplied from the expansion/IP conversion unit 17.
  • the expansion / IP conversion unit 17 supplies the image obtained by expanding and IP-converting the base layer decoded image decoded by the base layer decoding unit 11 to the intra prediction unit 16.
  • the frame memory 18 stores the decoded image that has been filtered by the in-loop filter 15.
  • the inter prediction unit 19 generates a motion compensated image for each inter prediction partition. Specifically, a motion compensated image is generated using motion information, mode information supplied from the variable length decoding unit 13, or motion information in the corresponding base layer supplied from the motion information conversion processing unit. The detailed configuration of the inter prediction unit 19 will be described later.
  • The motion information normalization unit 22 converts, in units of macroblocks, the motion vector MV_BL in the corresponding macroblock of the base layer into an intermediate format MV_ITM (intermediate motion vector) normalized to the value the vector would take at a separation of one frame (field). The converted intermediate format MV_ITM is then supplied to the motion information storage unit 21.
  • The motion information storage unit 21 stores the intermediate format MV_ITM converted by the motion information normalization unit 22 and supplies it to the motion information conversion processing unit 20 in accordance with instructions from the inter prediction unit 19.
  • An example of the information stored in the motion information storage unit 21 will be described with reference to the figure.
  • As motion information, the field rate (v) in units of pictures, and, in units of macroblocks, the direction of each motion vector (forward prediction motion vector or backward prediction motion vector) together with information indicating whether each component is a horizontal component or a vertical component, are stored in association with the macroblock.
  • the motion information conversion processing unit 20 scales the intermediate motion vector MV_ITM stored in the motion information storage unit 21 based on the picture reference relationship in the enhancement layer, and derives a motion vector MV_EL used by the inter prediction unit 19.
  • the selection unit 23 selects which one of the prediction image generated by the intra prediction unit 16 and the prediction image generated by the inter prediction unit 19 to use, and supplies the selected prediction image to the adder 24.
  • the adder 24 generates a decoded image by adding the prediction image supplied from the selection unit 23 and the prediction residual D supplied from the inverse orthogonal transform / inverse quantization unit 14.
  • The variable length decoding unit 31, the inverse orthogonal transform/inverse quantization unit 32, the frame memory 33, and the adder 35 have the same functions as the variable length decoding unit 13, the inverse orthogonal transform/inverse quantization unit 14, the frame memory 18, and the adder 24, respectively, and therefore their descriptions are omitted.
  • the motion compensation unit 34 restores the motion vector related to each inter prediction partition from the motion vector residual MVD related to the partition and the restored motion vector related to another partition, and generates an inter prediction image.
  • the inter prediction unit 19 includes an enhancement layer motion vector buffer 901, a mode buffer 902, a motion compensation unit 903, and a base layer motion vector buffer 904.
  • the enhancement layer motion vector buffer 901 stores the motion vector residual MVD supplied from the variable length decoding unit 13 and supplies it to the motion compensation unit 903 as enhancement layer motion vector information.
  • the mode buffer 902 stores prediction mode information supplied from the variable length decoding unit 13.
  • the base layer motion vector buffer 904 stores the base layer motion vector information supplied from the motion information conversion processing unit 20 and supplies the base layer motion vector information to the motion compensation unit 903.
  • The motion compensation unit 903 generates an inter predicted image using either the enhancement layer motion vector information supplied from the enhancement layer motion vector buffer 901 or the base layer motion vector information supplied from the base layer motion vector buffer 904, and supplies it to the selection unit 23.
  • (Motion vector derivation method 1) First, prior to the description of the motion vector derivation method, field prediction and frame prediction will be described with reference to FIG. 5. As shown in FIG. 5A, when field prediction is performed, motion vectors are predicted from two fields for one picture; prediction data is generated by two motion vectors from two identical or different fields.
  • In frame prediction, as shown in FIG. 5B, prediction data is generated from one frame by one motion vector.
  • Next, the method of deriving the motion vector used by the inter prediction unit 19, in other words, the processing of the motion information normalization unit 22, the motion information storage unit 21, and the motion information conversion processing unit 20, will be described with reference to FIG. 7.
  • Here, the process of deriving the motion vector used in the enhancement layer when field prediction with a frame structure is performed in the base layer will be described.
  • FIG. 7 shows the arrangement of the frames in the display time order when the moving image decoding apparatus 1 decodes the moving image.
  • the base layer (BL) in FIG. 7 shows a field structure picture.
  • frame BB0 is a combination of fields B01 and B02 indicated by broken lines.
  • the base layer frame BI2 is processed as an I picture
  • the frame BP5 is processed as a P picture
  • the frames BB0, BB1, BB3, and BB4 are processed as B pictures.
  • One frame can be handled as two fields.
  • the frame BB0 can be handled as fields B01 and B02
  • the frame BB1 can be handled as fields B11 and B12.
  • each frame of the base layer is adaptively decoded in either a frame structure or a field structure.
  • a picture used for motion compensation in the base layer has a frame structure and field prediction is performed in units of fields.
  • Each frame of the enhancement layer corresponds to a field of the base layer.
  • the base layer field corresponding to the enhancement layer frame b0 is B01.
  • I4 of the enhancement layer is processed as an I picture
  • P10 is processed as a P picture
  • B1 and B7 are processed as reference B pictures
  • b0, b2, b3, b6, b8, b9, and b11 are processed as non-reference B pictures.
  • The current processing target is the frame B1.
  • The frame B1 uses the enhancement layer I4 as a backward reference picture and the base layer BB0 (fields B01 and B02) as inter-layer reference pictures. These reference pictures have already been decoded when the processing target frame B1 is decoded.
  • the frame (field) of the lower layer (base layer) corresponding to the current processing target frame of the enhancement layer is specified based on the time information included in the bit stream of each layer.
  • the identification of the corresponding frame is not limited to the configuration based on the time information, and may be performed by a configuration in which a common picture number is assigned to the base layer and the enhancement layer, for example.
  • A case will be described in which the motion vector MV_EL(1)_a used in the prediction block PU of the frame B1 is derived from the base layer.
  • In this case, the motion vector used when processing the macroblock a, which corresponds to the prediction block PU, in the base layer frame BB0 corresponding to the frame B1 is used.
  • The macroblock a is the base layer macroblock located at the coordinates corresponding to the center point of the prediction block PU of the frame B1.
  • In the present embodiment, the base layer macroblock is specified based on the coordinates of the center point of the prediction block PU, but it may instead be specified based on the coordinates of the upper left corner of the PU or of another end point; the present invention is not limited to this. A sketch of the center-point variant follows.
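  • For illustration only (the resolutions and function name are assumptions), locating the collocated base layer macroblock from the PU center point can be sketched as:

        def collocated_mb(pu_x, pu_y, pu_w, pu_h, el_size, bl_size, mb=16):
            """Returns the (x, y) macroblock address in the base layer."""
            el_w, el_h = el_size
            bl_w, bl_h = bl_size
            cx, cy = pu_x + pu_w // 2, pu_y + pu_h // 2  # PU center point
            bx = cx * bl_w // el_w                       # map into base layer coordinates
            by = cy * bl_h // el_h
            return bx // mb, by // mb

        # 4k enhancement layer over an assumed 1920x1080 base layer:
        print(collocated_mb(256, 128, 64, 64, (3840, 2160), (1920, 1080)))  # (9, 5)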
  • First, the frame BB0 is processed by field prediction with a frame structure.
  • It is assumed that the frame BB0 refers to the fields I21 and I22 of the backward reference picture, and that the macroblock a is processed with the motion vectors MV_BL(1-1)_a and MV_BL(1-2)_a referring to those respective reference pictures.
  • The reference source fields of MV_BL(1-1)_a and MV_BL(1-2)_a are B01 and B02, respectively. Since the base layer field corresponding to the processing target frame B1 is B02, the inter prediction unit 19 performs motion compensation using the motion vector MV_EL(1)_a converted from the intermediate motion vector MV_ITM(1)_a stored in the motion information storage unit 21.
  • Specifically, the inter prediction unit 19 specifies the frame BB0 of the base layer corresponding to the processing target frame B1.
  • Of the motion vector information (MV_BL(1-1)_a, MV_BL(1-2)_a) used when processing the corresponding macroblock of the frame BB0, the intermediate motion vector MV_ITM(1)_a, obtained by normalizing the motion vector MV_BL(1-2)_a used when processing the field B02 corresponding to the processing target frame B1, is supplied from the motion information storage unit 21 to the motion information conversion processing unit 20.
  • The intermediate motion vector MV_ITM(1)_a stored in the motion information storage unit 21 has already been normalized and stored by the motion information normalization unit 22 by the time the prediction block PU is decoded.
  • The intermediate motion vector MV_ITM(1)_a is calculated by the motion information normalization unit 22 from the motion vector MV_BL(1-2)_a using Equation (1):
  • MV_ITM(1)_a = MV_BL(1-2)_a × tb_a / td_a … Equation (1)
  • Here, tb_a is a time interval indicating a distance of one frame from the processing target frame, and td_a is a time interval indicating the distance between the field B02, for which the motion vector MV_BL(1-2)_a used for decoding the base layer was calculated, and the field I22 that it references.
  • Next, the motion information conversion processing unit 20 calculates the motion vector MV_EL(1)_a to be used, according to Equation (2):
  • MV_EL(1)_a = MV_ITM(1)_a × tde_a / tb_a × scaling … Equation (2)
  • Here, scaling is EL resolution / BL resolution, and tde_a is a time interval indicating a distance of one frame from the target frame B1.
  • In the above, the frame rate (field rate) is assumed to be the same in the base layer and the enhancement layer.
  • When the frame rates differ, MV_EL may be obtained by multiplying MV_ITM by (enhancement layer frame rate / base layer frame rate) to adjust the scale in the time direction, as in the sketch below.
  • The base layer frame rate can be obtained from the sequence header in the case of MPEG-2.
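  • Equations (1) and (2) can be sketched in Python as follows (illustrative only; scalar components are shown, applied per x/y component, and the placement of the frame-rate correction is an assumption based on the remark above):

        def to_intermediate(mv_bl, tb, td):
            """Equation (1): MV_ITM = MV_BL * tb / td."""
            return mv_bl * tb / td

        def to_enhancement(mv_itm, tde, tb, el_res, bl_res, el_rate=None, bl_rate=None):
            """Equation (2): MV_EL = MV_ITM * tde / tb * (EL res / BL res),
            optionally adjusted by (EL frame rate / BL frame rate)."""
            mv_el = mv_itm * tde / tb * (el_res / bl_res)
            if el_rate and bl_rate and el_rate != bl_rate:
                mv_el *= el_rate / bl_rate       # time-direction scale adjustment
            return mv_el

        mv_itm = to_intermediate(mv_bl=12.0, tb=1.0, td=3.0)  # 4.0 per frame
        print(to_enhancement(mv_itm, tde=1.0, tb=1.0, el_res=3840, bl_res=1920))  # 8.0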
  • (Motion vector derivation method 1-2) The inter prediction unit 19 specifies the frame BB1 of the base layer corresponding to the processing target b2. Then, the intermediate motion vector MV_ITM(1)_b, obtained by normalizing the motion vector MV_BL(1)_b used when processing the corresponding macroblock of the frame BB1, is supplied from the motion information storage unit 21 to the motion information conversion processing unit 20.
  • The intermediate motion vector MV_ITM(1)_b stored in the motion information storage unit 21 has already been normalized and stored by the motion information normalization unit 22 by the time the prediction block PU is decoded.
  • The intermediate motion vector MV_ITM(1)_b is calculated by the motion information normalization unit 22 from the motion vector MV_BL(1)_b using Equation (3):
  • MV_ITM(1)_b = MV_BL(1)_b × tb_b / td_b … Equation (3)
  • Next, the motion information conversion processing unit 20 calculates the motion vector MV_EL(1)_b to be used, according to Equation (4):
  • MV_EL(1)_b = MV_ITM(1)_b × tde_b / tb_b × scaling … Equation (4)
  • (Motion vector derivation method 1-3) The inter prediction unit 19 specifies the frame BB4 of the base layer corresponding to b9. Since no motion vector information in the same prediction direction is stored in the motion information storage unit 21 for the corresponding macroblock of the frame BB4, the motion vector information of the corresponding macroblock in the base layer frame BP5, which corresponds to P10, the reference frame nearest in encoding order to the decoding target frame b9, is referred to.
  • One of the intermediate motion vectors MV_ITM(1-1)_c and MV_ITM(1-2)_c calculated from the motion vectors of that macroblock is supplied from the motion information storage unit 21 to the motion information conversion processing unit 20.
  • Here, the intermediate motion vector MV_ITM(1-1)_c is supplied.
  • However, both of the intermediate motion vectors MV_ITM(1-1)_c and MV_ITM(1-2)_c may be supplied.
  • The intermediate motion vector MV_ITM(1)_c is calculated by the motion information normalization unit 22 from the motion vector MV_BL(1-1)_c using Equation (5):
  • MV_ITM(1)_c = MV_BL(1-1)_c × tb_c / td_c … Equation (5)
  • Next, the motion information conversion processing unit 20 calculates the motion vector MV_EL(1)_c to be used, according to Equation (6):
  • MV_EL(1)_c = MV_ITM(1)_c × tde_c / tb_c × scaling … Equation (6)
  • (Motion vector derivation method 1-4) The inter prediction unit 19 specifies the field B42 of the base layer corresponding to b9. Since no motion vector information in the same prediction direction is stored in the motion information storage unit 21 for the corresponding macroblock of the field B42, in the base layer frame BP5, which corresponds to P10, the reference frame nearest in encoding order to the decoding target frame b9, that is, in the fields P51 and P52, the intermediate motion vector MV_ITM(1)_d, calculated from MV_BL(1)_d, the motion vector in the same prediction direction among the motion vectors used when the anchor macroblock at the corresponding position was processed, is supplied from the motion information storage unit 21 to the motion information conversion processing unit 20.
• The intermediate motion vector MV_ITM(1)_d has already been normalized and accumulated by the motion information normalization unit 22 by the time the prediction block PU is decoded.
• The intermediate motion vector MV_ITM(1)_d is calculated by the motion information normalization unit 22 from the motion vector MV_BL(1-1)_d using Equation (7).
• MV_ITM(1)_d = MV_BL(1-1)_d × tb_d / td_d   … Equation (7)
• The motion information conversion processing unit 20 calculates the motion vector MV_EL(1)_d to be used according to Equation (8).
• MV_EL(1)_d = MV_ITM(2)_d × tde_d / tb_d × scaling   … Equation (8)
• In the above examples, the intermediate motion vector is derived based on the motion vector included in the reference frame nearest to the target frame in the encoding order. When the base layer contains only a motion vector whose prediction direction differs from the prediction direction required by the enhancement layer, an intermediate motion vector may likewise be derived based on that motion vector. A sketch of this derivation follows.
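• The two-step derivation above (normalization to an intermediate motion vector, then conversion for the enhancement layer) can be summarized in the following sketch, which follows the pattern of Equations (3) and (4); the function names, the use of exact fractions, and the sample values are illustrative assumptions:

```python
from fractions import Fraction

def normalize_to_intermediate(mv_bl, tb, td):
    # MV_ITM = MV_BL * tb / td  (cf. Equation (3)): scale the base layer
    # vector to a unit time interval tb within its span td.
    return tuple(Fraction(c) * tb / td for c in mv_bl)

def convert_to_enhancement(mv_itm, tde, tb, el_resolution, bl_resolution):
    # MV_EL = MV_ITM * tde / tb * scaling  (cf. Equation (4)), where
    # scaling = EL resolution / BL resolution.
    scaling = Fraction(el_resolution, bl_resolution)
    return tuple(c * tde / tb * scaling for c in mv_itm)

# Example: a base layer vector spanning td = 3 intervals, normalized to
# tb = 1, then converted for a target frame tde = 1 interval away with a
# 2x spatial resolution ratio.
mv_itm = normalize_to_intermediate(mv_bl=(6, -3), tb=1, td=3)
mv_el = convert_to_enhancement(mv_itm, tde=1, tb=1,
                               el_resolution=2, bl_resolution=1)
print(mv_itm)  # (Fraction(2, 1), Fraction(-1, 1))
print(mv_el)   # (Fraction(4, 1), Fraction(-2, 1))
```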
• First, the enhancement layer decoding unit 10 constructs a reference image list used for decoding the target frame (S101).
• The reference image list records the position in the frame buffer of the decoded image of each reference image and the output order of each reference image.
• Next, the base layer decoding unit 11 decodes the base layer picture (frame or field) necessary for decoding the target frame of the enhancement layer (S102). Then, the motion information normalization unit 22 executes an intermediate motion information derivation process for calculating intermediate motion vector information from the motion vectors used for decoding the corresponding base layer picture (intermediate motion derivation process: S103).
• The side information includes the CU prediction type (identification information for the intra prediction mode and the inter prediction mode), a skip flag, and PU partition information (S105).
• The variable length decoding unit 13 decodes the transform coefficients of each TU in the target CU, and the inverse orthogonal transform/inverse quantization unit 14 performs inverse orthogonal transform and inverse quantization to decode the prediction residual (S106).
• The intra prediction unit 16 generates a predicted image of each prediction unit PU in the target CU by intra prediction (S108).
• The inter prediction unit 19 generates a predicted image of each PU in the target CU by inter prediction (inter prediction process: S109).
• When a decoded image has been generated for the target CU (S110) and decoded images have been generated for all CUs (YES in S111), a deblocking filter, an adaptive offset filter (SAO: Sample Adaptive Offset), and an adaptive loop filter (ALF: Adaptive Loop Filter) are applied to the decoded image (S112), and the process ends.
• In the inter prediction process, the inter prediction unit 19 first sets a target PU (S201) and decodes the PU side information (S202).
• The PU side information includes a base layer prediction flag (base_mode_flag), a merge flag (merge_flag), and a merge index (merge_idx).
• If the target PU is not a merge PU (NO in S203), the motion compensation parameters of the PU are decoded (S205).
• If the target PU is a merge PU (YES in S203) and it is in the base mode, which refers to the motion vector of the base layer (YES in S204), the enhancement layer decoding unit 10 performs an inter-layer motion estimation process (inter-layer motion estimation process: S207). If the target PU is not in the base mode (NO in S204), the inter prediction unit 19 derives a merge candidate (S206). A sketch of this branch structure follows below.
• The inter prediction unit 19 performs motion compensation using the motion compensation parameters derived in step S205, S206, or S207 to generate a predicted image (S208). When the process has been completed for all PUs (YES in S209), the inter prediction process ends.
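• A runnable sketch of the S203–S207 branch structure; the flag names follow the PU side information above (merge_flag, base_mode_flag, merge_idx), while the dataclass and return strings are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class PUSideInfo:
    merge_flag: bool       # S203: is the target PU a merge PU?
    base_mode_flag: bool   # S204: refer to the base layer motion vector?
    merge_idx: int = 0

def derive_motion_compensation_params(pu):
    if not pu.merge_flag:
        return "decode motion compensation parameters (S205)"
    if pu.base_mode_flag:
        return "inter-layer motion estimation (S207)"
    return f"derive merge candidate {pu.merge_idx} (S206)"

print(derive_motion_compensation_params(PUSideInfo(True, True)))
print(derive_motion_compensation_params(PUSideInfo(True, False, 2)))
print(derive_motion_compensation_params(PUSideInfo(False, False)))
```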
  • FIG. 14 shows a table for deriving motion compensation parameters.
  • the motion compensation parameter derivation table shown in FIG. 14 associates CU side information and PU side information with a motion compensation parameter derivation method for the target PU.
• In the table, “0” and “1” indicate corresponding syntax values, and “−” indicates that decoding of the syntax element is unnecessary.
• In the intermediate motion derivation process, the motion information normalization unit 22 first sets the base layer picture used for deriving the intermediate motion information (S301).
• The motion information normalization unit 22 then divides the target frame into basic processing units (MPEG-2 macroblocks in this embodiment) (S302), specifies the position in the base layer picture corresponding to the target basic processing unit, and reads out the motion vector at that position (S303).
• Finally, an intermediate motion vector MV_ITM is derived from the read motion vector (S304). A sketch of this per-macroblock loop follows.
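• A minimal runnable sketch of the S301–S304 loop; representing the stored base layer motion field as a dictionary keyed by macroblock coordinates is an illustrative assumption:

```python
def derive_intermediate_motion_info(base_layer_mvs, tb, td):
    # S302-S304: for each basic processing unit (an MPEG-2 macroblock),
    # read the stored base layer vector and normalize it:
    # MV_ITM = MV_BL * tb / td.
    return {
        (mb_x, mb_y): (mv[0] * tb / td, mv[1] * tb / td)
        for (mb_x, mb_y), mv in base_layer_mvs.items()
    }

base_layer_mvs = {(0, 0): (6.0, -3.0), (1, 0): (2.0, 4.0)}
print(derive_intermediate_motion_info(base_layer_mvs, tb=1, td=3))
# {(0, 0): (2.0, -1.0), (1, 0): (0.666..., 1.333...)}
```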
• In the inter-layer motion estimation process, the enhancement layer decoding unit 10 first estimates the use flag of the reference image list.
• The use flag indicates that both L0 and L1 can be used when the intermediate motion information contains two motion vectors, and that only L0 can be used when it contains only one.
• Next, the enhancement layer decoding unit 10 sets the reference image list used for referring to the motion vector for decoding the target PU (S402).
• The reference image list is selected in the order L0 → L1. A sketch of this selection follows.
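• A small sketch of the use flag estimation and the L0 → L1 selection order; the function names and flag encoding are illustrative assumptions:

```python
def estimate_reference_list_use(intermediate_mvs):
    # Both L0 and L1 are usable with two intermediate motion vectors;
    # only L0 is usable with one.
    n = len(intermediate_mvs)
    return {"L0": n >= 1, "L1": n >= 2}

def select_reference_list(use_flags):
    # S402: the reference image list is selected in the order L0 -> L1.
    for name in ("L0", "L1"):
        if use_flags[name]:
            return name
    return None

flags = estimate_reference_list_use([(2, -1)])
print(flags, select_reference_list(flags))
# {'L0': True, 'L1': False} L0
```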
• The moving picture encoding apparatus 2 partially includes methods employed in MPEG-2, in H.264/MPEG-4 AVC, in the KTA software (a codec for joint development in VCEG (Video Coding Experts Group)), in the TMuC (Test Model under Consideration) software (its successor codec), and technology employed in the HM (HEVC Test Model) software.
  • FIG. 21 is a block diagram showing a configuration of the video encoding device 2 according to the present embodiment.
• The moving picture encoding apparatus 2 includes an enhancement layer encoding unit (HEVC) 81, a base layer encoding unit (MPEG-2) 82, and a reduction/interlacing processing unit 83.
• The enhancement layer encoding unit 81 includes an image rearrangement buffer 51, a motion information conversion processing unit 52, a motion information accumulation unit 53, a motion information normalization unit 54, a frame memory 55, a motion prediction/compensation unit 56, an enlargement/IP conversion unit 57, an in-loop filter 58, an intra prediction unit 59, a selection unit 60, an inverse orthogonal transform/inverse quantization unit 61, an orthogonal transform/quantization unit 62, a variable length coding unit 63, a subtractor 64, and an adder 65.
• The base layer encoding unit 82 includes an image rearrangement buffer 71, a motion estimation/motion compensation unit 72, a frame memory 73, an inverse orthogonal transform/inverse quantization unit 74, an orthogonal transform/quantization unit 75, a variable length encoding unit 76, an adder 77, and a subtractor 78.
• The moving image encoding device 2 generates encoded data #1 (a, b) including a base layer and an enhancement layer by performing scalable video coding (SVC) on moving image #10 (the encoding target image).
  • the image rearrangement buffer 51 and the image rearrangement buffer 71 are buffers for rearranging input images in the encoding order.
• The orthogonal transform/quantization unit 62 (1) performs a DCT (Discrete Cosine Transform) on each block of the prediction residual D obtained by subtracting the predicted image Pred from the encoding target image, (2) quantizes the DCT coefficients obtained by the DCT, and (3) supplies the quantized prediction residual QD obtained by the quantization to the variable length coding unit 63 and the inverse orthogonal transform/inverse quantization unit 61.
• The orthogonal transform/quantization unit 62 also (1) selects the quantization step QP used for the quantization for each tree block, (2) supplies the quantization parameter difference Δqp, which indicates the size of the selected quantization step QP, to the variable length coding unit 63, and (3) supplies the selected quantization step QP to the inverse orthogonal transform/inverse quantization unit 61.
• The difference Δqp is the difference value obtained by subtracting the value of the quantization parameter associated with the tree block quantized immediately before from the value of the quantization parameter of the current tree block. A sketch of this differential follows.
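• A hedged sketch of the Δqp differential: each tree block's quantization parameter is coded as the difference from the previously coded tree block's parameter. The coding-order convention and the initial value are assumptions for illustration:

```python
def qp_differences(qps, initial_qp=26):
    # dqp for each tree block = current qp minus the qp of the tree
    # block quantized immediately before; dqp is what the variable
    # length coding unit receives.
    prev, out = initial_qp, []
    for qp in qps:
        out.append(qp - prev)
        prev = qp
    return out

print(qp_differences([26, 28, 28, 24]))  # [0, 2, 0, -4]
```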
• The variable length coding unit 63 variable-length encodes (1) the quantized prediction residual QD and Δqp supplied from the orthogonal transform/quantization unit 62, (2) the intra prediction information supplied from the intra prediction unit 59, and (3) the motion information supplied from the motion prediction/compensation unit 56, to generate encoded data #1a.
• The motion prediction/compensation unit 56 detects a motion vector mv for each partition, and generates a motion compensated image mc using the detected motion vector mv and the reference image index RI that specifies the filtered decoded image used as the reference image. Details of the motion prediction/compensation unit 56 will be described later.
• Likewise, the motion estimation/motion compensation unit 72 detects a motion vector mv for each partition, and generates a motion compensated image mc using the detected motion vector mv and the reference image index RI that specifies the filtered decoded image used as the reference image. A sketch of such a motion search follows.
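• A compact sketch of such a motion search as exhaustive SAD block matching within a small window; the window size, block size, and use of NumPy are illustrative assumptions, not the embodiment's algorithm:

```python
import numpy as np

def motion_search(cur_block, ref, top, left, search_range=4):
    # Exhaustive search: try every displacement (dx, dy) in the window
    # and keep the one minimizing the sum of absolute differences.
    h, w = cur_block.shape
    best_mv, best_sad = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue
            cand = ref[y:y + h, x:x + w].astype(np.int32)
            sad = np.abs(cand - cur_block.astype(np.int32)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (32, 32), dtype=np.uint8)
cur = ref[10:18, 12:20].copy()  # the block actually sits at (dx=2, dy=0)
print(motion_search(cur, ref, top=10, left=10))  # ((2, 0), 0)
```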
• The orthogonal transform/quantization unit 75 similarly (1) performs a DCT (Discrete Cosine Transform) on each block of the prediction residual D obtained by subtracting the predicted image Pred from the encoding target image, (2) quantizes the DCT coefficients obtained by the DCT, and (3) supplies the quantized prediction residual QD obtained by the quantization to the variable length coding unit 76 and the inverse orthogonal transform/inverse quantization unit 74.
• The in-loop filter 15, the inverse orthogonal transform/inverse quantization unit 14, the adder 24, the frame memory 33, and the inverse orthogonal transform/inverse quantization unit 32 have the same functions as the corresponding units described above, and thus description thereof is omitted.
• The motion prediction/compensation unit 56 includes a motion search unit 911, a cost function calculation unit 912, a mode determination unit 913, and a motion compensation unit 914.
• The motion search unit 911 derives motion information from the input image information supplied from the image rearrangement buffer 51 and the reference image information supplied from the frame memory 55, and supplies it to the cost function calculation unit 912 as enhancement layer motion vector information.
• The cost function calculation unit 912 defines a cost function for the motion vector information and for mode determination.
• The mode determination unit 913 uses the cost function defined by the cost function calculation unit 912 to determine whether to perform prediction by referring to the motion vector of the base layer or to perform normal motion compensation processing.
• The motion compensation unit 914 selects, as the optimum parameters, the inter prediction process that minimizes the cost function, generates a predicted image and supplies it to the selection unit 60, and supplies the motion compensation parameters and the motion vector information to the variable length coding unit 63. A sketch of this cost-based decision follows.
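• A runnable sketch of the cost-based mode decision: the mode minimizing a rate-distortion cost J = D + λ·R is selected between normal motion compensation and base layer motion vector reference. The cost form, the λ value, and the sample numbers are illustrative assumptions:

```python
def choose_inter_mode(candidates, lam=10.0):
    # candidates: (mode name, distortion D, rate R in bits);
    # pick the minimum of J = D + lam * R.
    return min(candidates, key=lambda c: c[1] + lam * c[2])

modes = [
    ("normal motion compensation", 1200.0, 42),  # searched MV coded explicitly
    ("base layer MV reference",    1250.0, 3),   # only a flag is coded
]
print(choose_inter_mode(modes))
# ('base layer MV reference', 1250.0, 3): J = 1280 beats J = 1620
```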
• In the above description, the calculated intermediate motion vector MV_ITM is converted and used as the enhancement layer motion vector MV_EL. Alternatively, the intermediate motion vector MV_ITM may be used as one of the motion vector candidates in the motion compensation prediction processing of the enhancement layer, or as a prediction vector in motion vector prediction.
• In the present embodiment, the picture used for motion compensation in the base layer has a frame structure, but the processing may be performed in units of fields.
• Although frame prediction and field prediction have been described as the prediction methods, even in the case of dual prime prediction, in which the average of predictions from different fields is obtained for a macroblock, an intermediate motion vector can be calculated from the motion vector of the base layer based on the reference relationship of the pictures, as described above.
• The reduction/interlacing processing unit 83 downscales the input image and interlaces it.
• In the above, an example using interlaced video data as the base layer has been described. Even when video data that is not interlaced is used as the base layer, the motion vector of the enhancement layer can be calculated using an intermediate motion vector calculated from the motion vector of the base layer; in that case, the reduction/interlacing processing unit 83 performs only the process of downscaling the input image.
• The difference of the present embodiment from the first embodiment lies in how the enhancement layer is decoded based on the motion information of the base layer: in the first embodiment, the base layer motion information is converted into intermediate motion information and used for the enhancement layer decoding process, whereas in the present embodiment it is used for enhancement layer decoding without being converted into intermediate motion information.
  • FIG. 17 shows the configuration of the video decoding device 1 ′ according to the present embodiment.
• The moving image decoding apparatus 1′ differs from the moving image decoding apparatus 1 in that it does not include the motion information normalization unit 22, and in that it includes a motion information conversion processing unit 20′ and a motion information storage unit 21′ instead of the motion information conversion processing unit 20 and the motion information storage unit 21.
  • the motion information storage unit 21 ′ stores the motion information MV_BL of the base layer in units of macroblocks and supplies the motion information MV_BL to the motion information conversion processing unit 20 ′ according to an instruction from the inter prediction unit 19.
• An example of the information stored in the motion information storage unit 21′ will be described with reference to FIG. 20. As shown in FIG. 20, the motion information storage unit 21′ stores, as motion information associated with each macroblock, the picture number difference value and the field rate in units of pictures, and, in units of macroblocks, the direction of each motion vector (forward prediction motion vector or backward prediction motion vector) and information indicating whether each component is a horizontal component or a vertical component. A sketch of such a record follows below.
• The picture number difference value can be calculated by taking the difference between the current picture number acquired from the picture header and the picture number of the reference picture. When a common picture number is defined for the base layer and the enhancement layer, it can also be calculated by taking the difference from the current picture number acquired from the slice header. That is, the picture number difference value only needs to indicate the time interval between the picture from which the motion vector is referenced and the referenced picture.
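• A sketch of the per-macroblock record suggested by FIG. 20; the field names are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class StoredMotionInfo:
    picture_number_diff: int  # time interval between the motion vector's
                              # reference source and destination pictures
    field_rate: float         # field rate (per picture)
    direction: str            # "forward" or "backward" prediction vector
    mv_horizontal: int        # horizontal motion vector component
    mv_vertical: int          # vertical motion vector component

info = StoredMotionInfo(picture_number_diff=3, field_rate=59.94,
                        direction="forward", mv_horizontal=6, mv_vertical=-3)
print(info)
```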
• The frame rate is used when the display rates of the base layer and the enhancement layer differ; it can be obtained from the sequence header.
• Here, it is assumed that the frame rate (field rate) is the same between the base layer and the enhancement layer. When the frame rates differ, MV_EL may be obtained by multiplying MV_BL by (enhancement layer frame rate / base layer frame rate) to adjust the scale in the time direction.
• The motion information conversion processing unit 20′ scales the motion vector MV_BL stored in the motion information storage unit 21′ based on the picture reference relationship in the enhancement layer, and derives the motion vector MV_EL used by the inter prediction unit 19.
(Motion vector derivation method 2-1)
• Next, a method for deriving the motion vector used by the inter prediction unit 19 will be described with reference to FIG. Here, the motion vector derivation process used in the enhancement layer when field prediction in the frame structure is performed in the base layer will be described.
• The processing target is the frame b3.
• The frame b3 uses B1 as its forward reference picture, I4 as its backward reference picture, and BB1 of the base layer as its inter-layer reference picture. These reference pictures have already been decoded when the processing target frame b3 is decoded.
• The motion vector MV_EL(1)_e used in the prediction block PU of the frame b3 is derived from the base layer.
• The motion vector used when the macroblock a, corresponding to the prediction block PU, of the base layer frame BB1 corresponding to the frame b3 was processed is used.
• The macroblock a is the macroblock of the base layer located at the coordinates corresponding to the center point of the prediction block PU of the frame b3.
• The frame BB1 is processed by field prediction in the frame structure.
• The frame BB1 refers to the fields I21 and I22 as backward reference pictures, and it is assumed that the macroblock a refers to the respective reference pictures and was processed with the motion vectors MV_BL(1-1)_e and MV_BL(1-2)_e.
• The reference source fields of MV_BL(1-1)_e and MV_BL(1-2)_e are B01 and B02, respectively. Since the base layer field corresponding to the processing target frame b3 is B12, the inter prediction unit 19 performs motion compensation using the motion vector MV_BL(1-2)_e stored in the motion information storage unit 21′.
• First, the inter prediction unit 19 specifies the frame BB1 of the base layer corresponding to the processing target b3.
• Of the motion vector information (MV_BL(1-1)_e, MV_BL(1-2)_e) used when the corresponding macroblock of the frame BB1 was processed, the motion vector MV_BL(1-2)_e used when processing B12, the field of the base layer corresponding to the processing target b3, is supplied from the motion information storage unit 21′ to the motion information conversion processing unit 20′.
• The motion information conversion processing unit 20′ calculates the motion vector MV_EL(1)_e from the motion vector MV_BL(1-2)_e according to Equation (9).
• MV_EL(1)_e = MV_BL(1-2)_e × tb_e / td_e × scaling   … Equation (9)
• Here, scaling is the ratio of the EL resolution to the BL resolution, and tb_e / td_e is the ratio of the distances of the pictures in the time direction.
• The decoding target is b3.
• The inter prediction unit 19 specifies the frame B1 of the base layer corresponding to the processing target b3.
• The motion vector MV_BL(1)_f used when the corresponding macroblock of the frame B1 was processed is supplied from the motion information storage unit 21′ to the motion information conversion processing unit 20′.
• The motion information conversion processing unit 20′ calculates the motion vector MV_EL(1)_f from the motion vector MV_BL(1)_f using Equation (10). A sketch of this direct conversion follows.
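• A minimal sketch of the second embodiment's direct conversion in the spirit of Equation (9): the base layer vector is scaled by the temporal ratio tb/td and the spatial ratio in one step, with an optional frame rate adjustment as described above. Names and sample values are illustrative assumptions:

```python
def scale_base_layer_mv(mv_bl, tb, td, el_resolution, bl_resolution,
                        el_frame_rate=None, bl_frame_rate=None):
    # MV_EL = MV_BL * tb / td * scaling, with
    # scaling = EL resolution / BL resolution (cf. Equation (9)).
    scaling = el_resolution / bl_resolution
    if el_frame_rate and bl_frame_rate and el_frame_rate != bl_frame_rate:
        # Optional temporal adjustment when the layer frame rates differ.
        scaling *= el_frame_rate / bl_frame_rate
    return (mv_bl[0] * tb / td * scaling, mv_bl[1] * tb / td * scaling)

print(scale_base_layer_mv((6, -3), tb=1, td=3,
                          el_resolution=2, bl_resolution=1))
# (4.0, -2.0)
```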
• The moving picture encoding apparatus 2′ differs from the moving picture encoding apparatus 2 in that it does not include the motion information normalization unit 54, and in that it includes a motion information conversion processing unit 52′ and a motion information storage unit 53′ instead of the motion information conversion processing unit 52 and the motion information storage unit 53.
• The motion information conversion processing unit 52′ and the motion information storage unit 53′ have the same functions as the motion information conversion processing unit 20′ and the motion information storage unit 21′.
• In the above description, the motion compensation prediction of the enhancement layer is performed using the motion vector calculated from the motion information of the base layer. Alternatively, the calculated motion vector may be used as one of the motion vector candidates in the motion compensation prediction processing of the enhancement layer, or as a prediction vector in motion vector prediction.
• For example, a motion vector calculated from the motion information of the base layer may be used as one of the motion vector candidates used in merge mode or in median prediction.
• In the present embodiment, the picture used for motion compensation in the base layer has a frame structure, but the processing may be performed in units of fields.
• Although frame prediction and field prediction have been described as the prediction methods, a motion vector for the enhancement layer can be calculated from the base layer based on the reference relationship of the pictures in other prediction modes as well.
  • the above-described moving image encoding device 2 and moving image decoding device 1 can be used by being mounted on various devices that perform transmission, reception, recording, and reproduction of moving images.
  • the moving image may be a natural moving image captured by a camera or the like, or may be an artificial moving image (including CG and GUI) generated by a computer or the like.
  • the moving picture encoding apparatus 2 and the moving picture decoding apparatus 1 described above can be used for transmission and reception of moving pictures.
• FIG. 20(A) is a block diagram showing the configuration of a transmission apparatus PROD_A in which the moving picture encoding apparatus 2 is mounted.
• The transmission device PROD_A includes an encoding unit PROD_A1 that obtains encoded data by encoding a moving image, a modulation unit PROD_A2 that obtains a modulated signal by modulating a carrier wave with the encoded data obtained by the encoding unit PROD_A1, and a transmission unit PROD_A3 that transmits the modulated signal obtained by the modulation unit PROD_A2.
  • the moving image encoding apparatus 2 described above is used as the encoding unit PROD_A1.
• The transmission device PROD_A may further include, as supply sources of the moving image input to the encoding unit PROD_A1, a camera PROD_A4 that captures a moving image, a recording medium PROD_A5 on which the moving image is recorded, an input terminal PROD_A6 for inputting the moving image from the outside, and an image processing unit A7 that generates or processes an image. FIG. 20(A) illustrates a configuration in which the transmission device PROD_A includes all of these, but some may be omitted.
• The recording medium PROD_A5 may record a non-encoded moving image, or may record a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, a decoding unit (not shown) that decodes the encoded data read from the recording medium PROD_A5 according to the recording encoding scheme may be interposed between the recording medium PROD_A5 and the encoding unit PROD_A1.
• FIG. 20(B) is a block diagram illustrating the configuration of a receiving device PROD_B in which the moving image decoding device 1 is mounted.
• The receiving device PROD_B includes a receiving unit PROD_B1 that receives a modulated signal, a demodulation unit PROD_B2 that obtains encoded data by demodulating the modulated signal received by the receiving unit PROD_B1, and a decoding unit PROD_B3 that obtains a moving image by decoding the encoded data obtained by the demodulation unit PROD_B2.
  • the moving picture decoding apparatus 1 described above is used as the decoding unit PROD_B3.
• The receiving device PROD_B may further include, as supply destinations of the moving image output by the decoding unit PROD_B3, a display PROD_B4 that displays the moving image, a recording medium PROD_B5 for recording the moving image, and an output terminal PROD_B6 for outputting the moving image to the outside. FIG. 20(B) illustrates a configuration in which the receiving device PROD_B includes all of these, but some may be omitted.
• The recording medium PROD_B5 may record a non-encoded moving image, or may record a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image acquired from the decoding unit PROD_B3 according to the recording encoding scheme may be interposed between the decoding unit PROD_B3 and the recording medium PROD_B5.
• The transmission medium for transmitting the modulated signal may be wireless or wired.
• The transmission mode for transmitting the modulated signal may be broadcasting (here, a transmission mode in which the transmission destination is not specified in advance) or communication (here, a transmission mode in which the transmission destination is specified in advance). That is, the transmission of the modulated signal may be realized by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.
• A broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) of terrestrial digital broadcasting is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by wireless broadcasting.
  • a broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) of cable television broadcasting is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by cable broadcasting.
• A server (such as a workstation) / client (such as a television receiver, personal computer, or smartphone) for a VOD (Video On Demand) service or a video sharing service using the Internet is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives modulated signals by communication (usually, either a wireless or wired transmission medium is used in a LAN, and a wired transmission medium is used in a WAN).
  • the personal computer includes a desktop PC, a laptop PC, and a tablet PC.
  • the smartphone also includes a multi-function mobile phone terminal.
  • the video sharing service client has a function of encoding a moving image captured by the camera and uploading it to the server. That is, the client of the video sharing service functions as both the transmission device PROD_A and the reception device PROD_B.
• The moving image encoding device 2 and the moving image decoding device 1 described above can also be used for recording and reproduction of moving images.
  • FIG. 21A is a block diagram showing a configuration of a recording apparatus PROD_C in which the above-described moving picture encoding apparatus 2 is mounted.
• The recording device PROD_C includes an encoding unit PROD_C1 that obtains encoded data by encoding a moving image, and a writing unit PROD_C2 that writes the encoded data obtained by the encoding unit PROD_C1 on the recording medium PROD_M.
  • the moving image encoding apparatus 2 described above is used as the encoding unit PROD_C1.
• The recording medium PROD_M may be (1) of a type built into the recording device PROD_C, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), (2) of a type connected to the recording device PROD_C, such as an SD memory card or USB (Universal Serial Bus) flash memory, or (3) of a type loaded into a drive device (not shown) built into the recording device PROD_C, such as a DVD (Digital Versatile Disc) or BD (Blu-ray Disc: registered trademark).
• The recording device PROD_C may further include, as supply sources of the moving image input to the encoding unit PROD_C1, a camera PROD_C3 that captures a moving image, an input terminal PROD_C4 for inputting a moving image from the outside, a receiving unit PROD_C5 for receiving a moving image, and an image processing unit C6 that generates or processes an image. FIG. 21(A) illustrates a configuration in which the recording device PROD_C includes all of these, but some may be omitted.
• The receiving unit PROD_C5 may receive a non-encoded moving image, or may receive encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, a transmission decoding unit (not shown) that decodes the encoded data encoded by the transmission encoding scheme may be interposed between the receiving unit PROD_C5 and the encoding unit PROD_C1.
• Examples of such a recording device PROD_C include a DVD recorder, a BD recorder, and an HDD (Hard Disk Drive) recorder (in these cases, the input terminal PROD_C4 or the receiving unit PROD_C5 is the main supply source of moving images).
• A camcorder (in this case, the camera PROD_C3 is the main supply source of moving images), a personal computer (in this case, the receiving unit PROD_C5 or the image processing unit C6 is the main supply source of moving images), and a smartphone are also examples of such a recording device PROD_C.
• FIG. 21(B) is a block diagram showing the configuration of a playback device PROD_D in which the above-described moving image decoding device 1 is mounted.
• The playback device PROD_D includes a reading unit PROD_D1 that reads the encoded data written on the recording medium PROD_M, and a decoding unit PROD_D2 that obtains a moving image by decoding the encoded data read by the reading unit PROD_D1.
  • the moving picture decoding apparatus 1 described above is used as the decoding unit PROD_D2.
• The recording medium PROD_M may be (1) of a type built into the playback device PROD_D, such as an HDD or SSD, (2) of a type connected to the playback device PROD_D, such as an SD memory card or USB flash memory, or (3) of a type loaded into a drive device (not shown) built into the playback device PROD_D, such as a DVD or BD.
• The playback device PROD_D may further include, as supply destinations of the moving image output by the decoding unit PROD_D2, a display PROD_D3 that displays the moving image, an output terminal PROD_D4 for outputting the moving image to the outside, and a transmission unit PROD_D5 that transmits the moving image. FIG. 21(B) illustrates a configuration in which the playback device PROD_D includes all of these, but some may be omitted.
• The transmission unit PROD_D5 may transmit a non-encoded moving image, or may transmit encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image by the transmission encoding scheme is preferably interposed between the decoding unit PROD_D2 and the transmission unit PROD_D5.
• Examples of such a playback device PROD_D include a DVD player, a BD player, and an HDD player (in these cases, the output terminal PROD_D4 to which a television receiver or the like is connected is the main supply destination of moving images).
• A television receiver (in this case, the display PROD_D3 is the main supply destination of moving images), digital signage (also referred to as an electronic signboard or electronic bulletin board; in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), a desktop PC (in this case, the output terminal PROD_D4 or the transmission unit PROD_D5 is the main supply destination of moving images), a laptop or tablet PC (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), and a smartphone (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images) are also examples of such a playback device PROD_D.
• Each block of the moving picture decoding apparatus 1 (1′) and the moving picture encoding apparatus 2 (2′) may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or in software using a CPU (central processing unit).
• In the latter case, the moving picture decoding apparatus 1 (1′) and the moving picture encoding apparatus 2 (2′) include a CPU that executes the instructions of a control program realizing each function, a ROM (read only memory) that stores the program, a RAM (random access memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various data.
• An object of the present invention can also be achieved by supplying, to the moving picture decoding apparatus 1 (1′) and the moving picture encoding apparatus 2 (2′), a recording medium on which the program code (an executable program, an intermediate code program, or a source program) of the control programs of these apparatuses, which are software realizing the functions described above, is recorded in a computer-readable manner, and by having the computer (or a CPU or MPU (micro processing unit)) read and execute the program code recorded on the recording medium.
• Examples of the recording medium include tapes such as magnetic tapes and cassette tapes; disks including magnetic disks such as floppy (registered trademark) disks / hard disks and optical discs such as CD-ROM (compact disc read-only memory) / MO (magneto-optical) / MD (Mini Disc) / DVD (digital versatile disc) / CD-R (CD Recordable); cards such as IC cards (including memory cards) / optical cards; semiconductor memories such as mask ROM / EPROM (erasable programmable read-only memory) / EEPROM (electrically erasable and programmable read-only memory) / flash ROM; and logic circuits such as PLD (Programmable Logic Device) and FPGA (Field Programmable Gate Array).
  • the moving picture decoding apparatus 1 (1 ′) and the moving picture encoding apparatus 2 (2 ′) may be configured to be connectable to a communication network, and the program code may be supplied via the communication network.
  • the communication network is not particularly limited as long as it can transmit the program code.
• For example, the Internet, an intranet, an extranet, a LAN (local area network), an ISDN (integrated services digital network), a VAN (value-added network), a CATV (community antenna television) communication network, a virtual private network, a telephone line network, a mobile communication network, a satellite communication network, and the like can be used.
  • the transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type.
• For example, wired media such as IEEE (Institute of Electrical and Electronics Engineers) 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL (asymmetric digital subscriber line) lines, as well as wireless media such as infrared links such as IrDA (Infrared Data Association) and remote control, Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (high data rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance), mobile phone networks, satellite links, and terrestrial digital networks can be used.
• The present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.
  • the present invention can be suitably used for an image encoding device and an image decoding device that perform encoding by scalable encoding.
• 1, 1′ moving image decoding device; 13 variable length decoding unit (first layer motion vector decoding means); 20, 20′ motion information conversion processing unit (second layer motion vector deriving means); 22 motion information normalization unit (intermediate motion vector deriving means); 2, 2′ moving image encoding device; 52, 52′ motion information conversion processing unit (second layer motion vector deriving means); 54 motion information normalization unit (intermediate motion vector deriving means)

Abstract

To improve the coding efficiency of encoded data that is encoded with different coding schemes in each layer, a moving video decoding device is provided that comprises: a variable length decoding unit (13) that decodes a motion vector used in decoding a first layer; a motion information normalization unit (22) that derives an intermediate motion vector based on the decoded motion vector; and a motion information conversion processing unit (20) that derives a motion vector used in decoding a second layer by referring to the derived intermediate motion vector.
PCT/JP2013/061588 2012-04-27 2013-04-19 Dispositif de décodage de vidéo animée et dispositif de codage de vidéo animée WO2013161689A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012103716A JP2013232775A (ja) 2012-04-27 2012-04-27 動画像復号装置、および動画像符号化装置
JP2012-103716 2012-04-27

Publications (1)

Publication Number Publication Date
WO2013161689A1 true WO2013161689A1 (fr) 2013-10-31

Family

ID=49483010

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/061588 WO2013161689A1 (fr) 2012-04-27 2013-04-19 Dispositif de décodage de vidéo animée et dispositif de codage de vidéo animée

Country Status (2)

Country Link
JP (1) JP2013232775A (fr)
WO (1) WO2013161689A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150195549A1 (en) 2014-01-08 2015-07-09 Qualcomm Incorporated Support of non-hevc base layer in hevc multi-layer extensions
US20170201766A1 (en) * 2014-06-20 2017-07-13 Samsung Electronics Co., Ltd. Method and apparatus for coding and decoding scalable video data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1118085A (ja) * 1997-06-05 1999-01-22 General Instr Corp ビデオオブジェクト平面のための時間的及び空間的スケーラブル符号化
JP2006121701A (ja) * 2004-10-21 2006-05-11 Samsung Electronics Co Ltd 多階層基盤のビデオコーダでモーションベクトルを効率よく圧縮する方法及び装置
JP2009517941A (ja) * 2005-12-01 2009-04-30 トムソン ライセンシング 動き及びテクスチャデータを予測する方法

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1118085A (ja) * 1997-06-05 1999-01-22 General Instr Corp ビデオオブジェクト平面のための時間的及び空間的スケーラブル符号化
JP2006121701A (ja) * 2004-10-21 2006-05-11 Samsung Electronics Co Ltd 多階層基盤のビデオコーダでモーションベクトルを効率よく圧縮する方法及び装置
JP2009517941A (ja) * 2005-12-01 2009-04-30 トムソン ライセンシング 動き及びテクスチャデータを予測する方法

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DANNY HONG ET AL.: "Scalability Support in HEVC", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG16 WP3 AND ISO/IEC JTC1/SC29/WG11 6TH MEETING, 14 July 2011 (2011-07-14), TORINO, IT *
HISAO KUMAI ET AL.: "Sharp's proposals for HEVC scalability Extension", INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/ SC29/WG11 CODING OF MOVING PICTURES AND AUDIO ISO/IEC JTC1/SC29/WG11 MPEG2011/M23618, February 2012 (2012-02-01), SAN JOSE, USA *

Also Published As

Publication number Publication date
JP2013232775A (ja) 2013-11-14

Similar Documents

Publication Publication Date Title
JP6284661B2 (ja) 画像符号化装置、および画像符号化方法
US10136151B2 (en) Image decoding device and image decoding method
US20190014351A1 (en) Moving image coding device, a moving image coding method, and a moving image decoding device
JP6352248B2 (ja) 画像復号装置、および画像符号化装置
US10171823B2 (en) Image decoding device and image coding device
JP7368603B2 (ja) フィルタリングベースの映像コーディング装置及び方法
CN115244938A (zh) 基于预测加权表对图像/视频进行编译的方法和装置
CN115104317A (zh) 图像编码装置和用于控制环路滤波的方法
CN115023954A (zh) 用于控制环路滤波的图像编码装置和方法
WO2014104242A1 (fr) Dispositif de codage d'image et dispositif de décodage d'image
WO2014007131A1 (fr) Dispositif de décodage d'image et dispositif de codage d'image
WO2013161690A1 (fr) Dispositif de décodage d'image et dispositif de codage d'image
CN115244927A (zh) 图像/视频编码系统中的帧间预测方法和设备
CN115088263A (zh) 基于预测加权表的图像/视频编译方法和设备
WO2013161689A1 (fr) Dispositif de décodage de vidéo animée et dispositif de codage de vidéo animée
CN114762349B (zh) 用于图像/视频编译的高级别语法信令方法和装置
JP2015073213A (ja) 画像復号装置、画像符号化装置、符号化データ変換装置、および、注目領域表示システム
JP2014013975A (ja) 画像復号装置、符号化データのデータ構造、および画像符号化装置
CN114762350A (zh) 基于切片类型的图像/视频编译方法和设备
CN115104314A (zh) 基于加权预测的图像/视频编译方法及装置
CN114762351B (zh) 图像/视频编译方法和装置
WO2014050554A1 (fr) Dispositif de décodage d'image et dispositif de codage d'image
JP2014082729A (ja) 画像復号装置、および画像符号化装置
WO2012147947A1 (fr) Appareil de décodage d'image et appareil de codage d'image
JP2022085475A (ja) 動画像符号化装置、復号装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13780512

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13780512

Country of ref document: EP

Kind code of ref document: A1