WO2011013253A1 - Prediction-signal producing device using geometric transformation motion-compensation prediction, time-varying image encoding device, and time-varying image decoding device - Google Patents

Prediction-signal producing device using geometric transformation motion-compensation prediction, time-varying image encoding device, and time-varying image decoding device Download PDF

Info

Publication number
WO2011013253A1
WO2011013253A1 PCT/JP2009/063692 JP2009063692W
Authority
WO
WIPO (PCT)
Prior art keywords
geometric transformation
prediction
transformation parameter
unit
block
Prior art date
Application number
PCT/JP2009/063692
Other languages
French (fr)
Japanese (ja)
Inventor
Akiyuki Tanizawa
Taichiro Shiodera
Takeshi Chujoh
Original Assignee
Kabushiki Kaisha Toshiba (Toshiba Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kabushiki Kaisha Toshiba (Toshiba Corporation)
Priority to PCT/JP2009/063692 priority Critical patent/WO2011013253A1/en
Publication of WO2011013253A1 publication Critical patent/WO2011013253A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/02 Affine transformations
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/223 Analysis of motion using block-matching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 ... using adaptive coding
    • H04N19/102 ... characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/169 ... characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 ... the unit being an image region, e.g. an object
    • H04N19/174 ... the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N19/176 ... the region being a block, e.g. a macroblock
    • H04N19/189 ... characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 ... being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/50 ... using predictive coding
    • H04N19/503 ... involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/60 ... using transform coding
    • H04N19/61 ... using transform coding in combination with predictive coding
    • H04N19/70 ... characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present invention relates to a prediction signal generation apparatus that derives geometric transformation parameters using motion information of neighboring blocks and the prediction target block and performs geometric transformation prediction processing for the prediction target block based on the derived parameters, and to a video encoding apparatus and a video decoding apparatus using it.
  • A moving picture coding method that greatly improves coding efficiency, ITU-T Rec. H.264 / ISO/IEC 14496-10 (hereinafter referred to as "H.264"), has been jointly developed by ITU-T and ISO/IEC.
  • In H.264, prediction processing, transform processing, and entropy coding processing are performed in units of rectangular blocks (for example, 16×16 pixels or 8×8 pixels). For this reason, when an object that cannot be expressed by a rectangular block is to be predicted, H.264 increases prediction efficiency by selecting a smaller prediction block (for example, 4×4 pixels).
  • Methods for effectively predicting such objects include a method of preparing a plurality of prediction patterns in a rectangular block, and a method of applying motion compensation using affine transformation to a deformed object.
  • Japanese Patent Application Laid-Open No. 2007-312397 discloses a video frame transfer method that models object motion as an affine transformation and calculates the optimum affine transformation parameters for each prediction target block, thereby taking the enlargement, reduction, and rotation of the object into consideration.
  • The method described in Kordasiewicz et al. estimates affine transformation parameters using the motion vectors of eight adjacent blocks surrounding the prediction target pixel block (above, below, left, right, and the diagonals) together with a motion vector calculated from the prediction target block itself; obtaining the optimal motion vectors requires re-encoding each frame a plurality of times. On the other hand, when the motion vectors are calculated only once per frame from the original image, this conventional method is not optimal from the viewpoint of code amount and coding distortion, and there is a problem that coding efficiency decreases.
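  • As an illustrative sketch (not taken from the patent text), affine motion-compensated prediction of a single block can be written as follows; the 6-parameter layout (a, b, c, d, e, f), the nearest-neighbour sampling, and the function name are assumptions for illustration:

```python
import numpy as np

def affine_predict(ref, params, block_xy, block_size):
    """Generate a prediction block by warping the reference frame with a
    6-parameter affine model (a, b, c, d, e, f):
        x' = a*x + b*y + c,   y' = d*x + e*y + f
    Nearest-neighbour sampling; out-of-frame samples are clamped to the edge."""
    a, b, c, d, e, f = params
    bx, by = block_xy
    h, w = ref.shape
    pred = np.zeros((block_size, block_size), dtype=ref.dtype)
    for y in range(block_size):
        for x in range(block_size):
            sx = int(round(a * (bx + x) + b * (by + y) + c))
            sy = int(round(d * (bx + x) + e * (by + y) + f))
            sx = min(max(sx, 0), w - 1)   # clamp to frame
            sy = min(max(sy, 0), h - 1)
            pred[y, x] = ref[sy, sx]
    return pred
```

With the identity parameters (1, 0, 0, 0, 1, 0) this reproduces the co-located block; a pure translation corresponds to setting only c and f, which is how an ordinary motion vector embeds into the affine model.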
  • An object of the present invention is to provide a prediction signal generation apparatus, a video encoding apparatus, and a video decoding apparatus that reduce the motion detection processing needed to estimate the geometric transformation parameters used for geometric-transformation motion-compensated prediction, and that improve prediction efficiency without increasing the code amount.
  • A prediction signal generation apparatus according to one aspect comprises: a first setting unit that sets prediction selection information indicating whether to use a first geometric transformation parameter or a second geometric transformation parameter, each indicating information on the shape of an image under geometric transformation of a pixel block; an acquisition unit that acquires the motion information or the geometric transformation parameters of one or more second adjacent blocks, among a plurality of first adjacent blocks adjacent to one pixel block of the plurality of pixel blocks into which the image signal is divided, for which prediction signal generation has already been completed; a derivation unit that derives predicted geometric transformation parameters of the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; a second setting unit that derives and sets the first geometric transformation parameter by a predetermined method from a derived geometric transformation parameter value and the predicted geometric transformation parameters; a third setting unit that sets the second geometric transformation parameter based on the motion information of the one pixel block and of the one or more second adjacent blocks; and a generation unit that generates a prediction signal by applying, to the reference image signal, geometric transformation processing using whichever of the first and second geometric transformation parameters the prediction selection information indicates.
  • the prediction selection information indicating whether to use the first geometric transformation parameter or the second geometric transformation parameter is encoded using the prediction signal generation device described above.
  • a derived value of the geometric transformation parameter is derived from the first geometric transformation parameter and the predicted geometric transformation parameter by a predetermined method.
  • A video decoding apparatus according to another aspect decodes moving-image encoded data obtained by encoding an input image signal in units of a plurality of pixel blocks, performing the decoding process by a prescribed method. It comprises: a motion information acquisition unit that acquires, for one or more second adjacent blocks that have already been decoded among a plurality of first adjacent blocks adjacent to one pixel block of the plurality of pixel blocks into which the image signal is divided, the motion information or the geometric transformation parameters indicating information on the shape of an image under geometric transformation of the pixel block; a decoding unit that decodes selection information indicating whether to use the first geometric transformation parameter or the second geometric transformation parameter; a derivation unit that derives predicted geometric transformation parameters of the one pixel block from the geometric transformation parameters of the second adjacent blocks; and a setting unit that obtains the first geometric transformation parameter from the decoded derived value of the geometric transformation parameter and the predicted geometric transformation parameters.
  • the first and second embodiments relate to a video encoding device
  • the third and fourth embodiments relate to a video decoding device.
  • The video encoding apparatus described in the following embodiments divides each frame constituting an input image signal into a plurality of pixel blocks, performs encoding processing on the divided pixel blocks, and outputs a compression-encoded code string.
  • the moving picture coding apparatus 100 is connected to the coding control unit 114.
  • the subtractor 101 calculates a difference between the input image signal 115 and the prediction signal 206, and outputs a prediction error signal 116.
  • the output terminal of the subtractor 101 is connected to the transform / quantization unit 102.
  • The transform/quantization unit 102 includes, for example, an orthogonal transformer (discrete cosine transformer) and a quantizer; it performs an orthogonal transform (discrete cosine transform) on the prediction error signal 116 and quantizes it to produce transform coefficients 117.
  • the output terminal of the transform / quantization unit 102 is connected to the inverse quantization / inverse transform unit 103 and the entropy coding unit 112.
  • The inverse quantization/inverse transform unit 103 includes an inverse quantizer and an inverse orthogonal transformer (inverse discrete cosine transformer); it inversely quantizes the transform coefficients 117 and performs an inverse orthogonal transform to restore the decoded prediction error signal 118.
  • the output terminal of the inverse quantization / inverse transform unit 103 is connected to the adder 104.
  • the adder 104 adds the decoded prediction error signal 118 and the prediction signal 206 to generate a decoded image signal 119.
  • the output terminal of the adder 104 is connected to the reference image memory 105.
  • the reference image memory 105 stores the decoded image signal 119 as a reference image signal.
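  • The subtractor 101 / quantizer / inverse quantizer / adder 104 loop above can be sketched as follows; for brevity the orthogonal transform is omitted and the prediction error is quantized directly, so the step size QSTEP stands in for the real quantization parameter (an illustrative simplification, not the patent's method):

```python
import numpy as np

QSTEP = 8  # quantization step; illustrative stand-in for the quantization parameter

def encode_block(block, pred):
    """Subtractor 101 plus a toy quantizer of unit 102: the orthogonal transform
    is omitted, so the prediction error signal 116 is quantized directly."""
    err = block - pred                        # prediction error signal 116
    return np.round(err / QSTEP).astype(int)  # quantized coefficients 117

def decode_block(coeff, pred):
    """Inverse quantizer 103 plus adder 104: reconstruct decoded image signal 119."""
    return coeff * QSTEP + pred

block = np.array([[100, 104], [98, 101]])
pred = np.array([[96, 96], [96, 96]])
rec = decode_block(encode_block(block, pred), pred)
assert np.max(np.abs(rec - block)) <= QSTEP // 2
```

The reconstruction error is bounded by half the quantization step, which is why the decoder must run the same inverse quantization loop as the encoder's local decoder: both sides then hold an identical reference image signal 207.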
  • the output terminal of the reference image memory 105 is connected to the motion information search unit 106, the intra prediction signal generation device 107, and the inter prediction signal generation device 109.
  • the motion information search unit 106 uses the input image signal 115 and the reference image signal 207 to calculate motion information (motion vector) 210 suitable for the prediction target block.
  • the output terminal of the motion information search unit 106 is connected to the derived parameter derivation unit 108 and the inter prediction signal generation device 109.
  • the derivation parameter derivation unit 108 includes a predictive geometric transformation parameter derivation unit 601 and a converter 602.
  • The derivation parameter derivation unit 108 has a function of deriving a predicted geometric transformation parameter, described later, and a function of calculating derived parameters using the predicted geometric transformation parameter and the input motion information 210.
  • the output terminal of the derived parameter deriving unit 108 is connected to the inter prediction signal generation device 109 and the entropy coding unit 112.
  • The inter prediction signal generation device 109 has a function of generating a prediction signal 206 using the input motion information 210, derived parameter information 211, reference image signal 207, and prediction selection information 123.
  • the intra prediction signal generation device 107 has a function of performing intra prediction using the input reference image signal 207.
  • Output terminals of the intra prediction signal generation device 107 and the inter prediction signal generation device 109 are connected to terminals of the prediction separation switch 110, respectively.
  • the prediction selection unit 111 sets the prediction selection information 123 according to the prediction mode controlled by the encoding control unit 114.
  • the output terminal of the prediction selection unit 111 is connected to the inter prediction signal generation device 109, the prediction separation switch 110, and the entropy encoding unit 112.
  • the prediction separation switch 110 switches between the intra prediction signal generation device 107 and the inter prediction signal generation device 109 according to the prediction selection information 123 of the prediction selection unit 111.
  • the switching terminal of the prediction separation switch 110 is connected to the subtractor 101 and the adder 104, and introduces the prediction signal of the intra prediction signal generation device 107 or the inter prediction signal generation device 109 to the subtractor 101 and the adder 104.
  • the entropy encoding unit 112 includes an encoder and a multiplexer, and entropy-encodes and multiplexes the transform coefficient 117, the derived parameter information 211, and the prediction selection information 123.
  • the output terminal of the entropy encoding unit 112 is connected to the output buffer 113.
  • the output buffer 113 temporarily stores the multiplexed data and outputs it as encoded data 129 according to the output timing managed by the encoding control unit 114.
  • The moving image encoding apparatus 100 having the above configuration performs intra prediction (intraframe prediction) or inter prediction (interframe prediction) encoding processing on the input image signal 115 based on the encoding parameters input from the encoding control unit 114, generates the prediction signal 206, and outputs the encoded data 129. That is, an input image signal 115 of a moving image or a still image is divided into pixel blocks, for example macroblocks, and input to the moving image encoding apparatus 100.
  • One encoding processing unit of the input image signal may be either a frame or a field; in the present embodiment, an example in which a frame is used as one encoding processing unit will be described.
  • the moving picture encoding apparatus 100 performs encoding in a plurality of prediction modes with different block sizes and generation methods of the prediction signal 206.
  • The generation method of the prediction signal 206 is roughly divided into intra prediction (intraframe prediction), in which a prediction signal is generated only within the frame to be encoded, and inter prediction, which uses a plurality of temporally different reference frames. In the following, an example in which a prediction signal is generated using inter prediction will be described in detail.
  • The macroblock is set as the basic processing block size of the encoding process.
  • The macroblock is typically a 16×16-pixel block as shown in FIG. 3B, for example, but may be a 32×32-pixel or 8×8-pixel block unit.
  • the shape of the macroblock does not necessarily need to be a square lattice.
  • the encoding target block or macroblock of the input image signal 115 is simply referred to as a “prediction target block”.
  • the input image signal 115 is input to the subtractor 101.
  • the subtracter 101 further receives a prediction signal 206 corresponding to each prediction mode output from the prediction separation switch 110.
  • the subtractor 101 calculates a prediction error signal 116 obtained by subtracting the prediction signal 206 from the input image signal 115.
  • the prediction error signal 116 is input to the transform / quantization unit 102.
  • The transform/quantization unit 102 performs an orthogonal transform such as the discrete cosine transform (DCT) on the prediction error signal 116 to generate transform coefficients.
  • The transform in the transform/quantization unit 102 is not limited to the transform defined in H.264; a discrete sine transform, a wavelet transform, or component analysis may also be used.
  • the transform / quantization unit 102 quantizes the transform coefficient in accordance with quantization information represented by a quantization parameter, a quantization matrix, and the like given by the encoding control unit 114.
  • the transform / quantization unit 102 outputs the quantized transform coefficient 117 to the entropy coding unit 112 and also outputs it to the inverse quantization / inverse transform unit 103.
  • the entropy encoding unit 112 performs entropy encoding, for example, Huffman encoding or arithmetic encoding, on the quantized transform coefficient 117.
  • The entropy encoding unit 112 further entropy-encodes the various encoding parameters, including the prediction information output from the encoding control unit 114, that were used when the encoding target block was encoded. As a result, encoded data 129 is generated.
  • the encoding parameter is a parameter required for decoding prediction information, information on transform coefficients, information on quantization, and the like.
  • the encoding parameter of the prediction target block is held in an internal memory of the encoding control unit 114, and is used when the prediction target block is used as an adjacent block of another pixel block.
  • The encoded data 129 generated and multiplexed by the entropy encoding unit 112 is temporarily stored in the output buffer 113 and then output from the moving image encoding apparatus 100 according to the output timing managed by the encoding control unit 114.
  • the encoded data 129 is sent to, for example, a storage system (storage medium) or a transmission system (communication line) (not shown).
  • In the inverse quantization/inverse transform unit 103, quantization information corresponding to that used in the transform/quantization unit 102 is loaded from the internal memory of the encoding control unit 114, and inverse quantization processing is performed.
  • the quantization information is, for example, a parameter represented by a quantization parameter, a quantization matrix, or the like.
  • The inverse quantization/inverse transform unit 103 further reproduces the decoded prediction error signal 118 by performing an inverse orthogonal transform, such as the inverse discrete cosine transform (IDCT), on the inversely quantized transform coefficients.
  • the decoded prediction error signal 118 is input to the adder 104.
  • the adder 104 adds the decoded prediction error signal 118 and the prediction signal 206 output from the prediction separation switch 110 to generate a decoded image signal 119.
  • the decoded image signal 119 is a local decoded image signal.
  • the decoded image signal 119 is stored as the reference image signal 207 in the reference image memory 105.
  • the reference image signal 207 stored in the reference image memory 105 is output to the motion information search unit 106, the intra prediction signal generation device 107, the inter prediction signal generation device 109, etc., and is referred to when performing prediction.
  • the motion information search unit 106 uses the input image signal 115 and the reference image signal 207 to calculate motion information 210 suitable for the prediction target block.
  • the motion information may be represented by an affine transformation parameter, for example.
  • the motion information can be represented by a motion vector.
  • the motion information 210 may be, for example, a predicted value for predicting an affine transformation parameter with another affine transformation parameter or the like, or may be a predicted value for predicting a motion vector with another motion vector or the like.
  • In the present embodiment, motion information including geometric deformation between images is used as the motion information.
  • The motion information search unit 106 calculates the motion information 210 (affine transformation parameters and motion vectors) by performing a search, such as block matching, between the prediction target block of the input image signal 115 and an interpolated image of the reference image signal 207.
  • As an evaluation criterion for matching, for example, a value obtained by accumulating, for each pixel, the difference between the input image signal 115 and the interpolated image after matching, or a value to which the difference between the calculated affine transformation parameters and the search center is added, is used.
  • The motion information 210 may be determined using a value obtained by converting the difference between the predicted image and the original image while taking the magnitude of the motion vector or the affine transformation parameters into account, or by taking the code amount of the affine transformation parameters and the like into account. Costs such as Equation (1) and Equation (2), described later, may also be used. Further, the matching may be performed as a search within a matching range based on search range information provided from outside the moving image encoding apparatus 100, or may be performed hierarchically for each pixel accuracy.
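  • A minimal full-search block-matching routine of the kind performed by the motion information search unit 106, using the SAD as the matching criterion, might look like the following; the search is integer-pel only and the function name and interface are assumptions for illustration:

```python
import numpy as np

def block_match(cur, ref, bx, by, bsize, srange):
    """Full-search block matching: return the integer motion vector (dx, dy)
    minimising the sum of absolute differences (SAD) between the current
    block at (bx, by) and candidate windows in the reference frame."""
    target = cur[by:by + bsize, bx:bx + bsize].astype(int)
    h, w = ref.shape
    best, best_sad = (0, 0), None
    for dy in range(-srange, srange + 1):
        for dx in range(-srange, srange + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + bsize > w or y + bsize > h:
                continue  # candidate window falls outside the frame
            sad = np.abs(ref[y:y + bsize, x:x + bsize].astype(int) - target).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best
```

A hierarchical search (coarse grid first, then refinement) or a sub-pel search over an interpolated reference, as mentioned above, would wrap this same SAD kernel.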
  • the motion information 210 calculated for a plurality of reference image signals in this way is input to the inter prediction signal generation device 109 and used to generate the prediction signal 206.
  • the plurality of reference image signals are locally decoded images having different display times.
  • the calculated motion information 210 is output to the derived parameter derivation unit 108.
  • The derived parameter derivation unit 108 includes a predicted geometric transformation parameter derivation unit 601 and a converter 602; it derives a predicted geometric transformation parameter, described later, and calculates the derived parameters using the predicted geometric transformation parameter and the input motion information 210.
  • the converter 602 may be, for example, a subtracter.
  • The converter 602 may instead be an adder, a multiplier, a divider, a converter that performs conversion using a predetermined matrix, or a combination of these.
  • In the following, the converter 602 is described as a subtractor.
  • the derived parameter information 211 derived by the derived parameter deriving unit 108 is output to the entropy encoding unit 112, and after being subjected to entropy encoding, is multiplexed into encoded data. Furthermore, the motion information 210 obtained by encoding the target pixel block is stored in the internal memory of the encoding control unit 114, and is appropriately loaded from the inter prediction signal generation device 109 and used.
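  • Treating the converter 602 as a subtractor, the derived parameter information 211 can be sketched as the difference between the estimated parameters and a prediction formed from already-encoded neighbouring blocks; the component-wise averaging used for the prediction here is an illustrative assumption, not a rule stated in the patent:

```python
import numpy as np

def derive_parameter_info(estimated, neighbour_params):
    """Predict the geometric transformation parameters from neighbouring
    blocks (component-wise average: an illustrative assumption) and, with the
    converter 602 acting as a subtractor, emit only the difference, i.e. the
    derived parameter information 211 to be entropy-coded."""
    predicted = np.mean(neighbour_params, axis=0)
    return estimated - predicted

def reconstruct_parameter(derived, neighbour_params):
    """Decoder side: form the same prediction and add the decoded difference."""
    return np.mean(neighbour_params, axis=0) + derived
```

Because both sides compute the prediction from already-decoded neighbours, only the (typically small) difference needs to be transmitted, which is the code-amount saving the scheme aims at.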
  • the reference image signal 207 stored in the reference image memory 105 is output to the intra prediction signal generation device 107.
  • The intra prediction signal generation device 107 performs intra prediction using the input reference image signal 207.
  • A prediction signal is generated by performing pixel interpolation in a prediction direction, such as the vertical or horizontal direction, using encoded reference pixel values adjacent to the prediction target block.
  • the interpolated pixel value may be copied in a predetermined prediction direction.
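  • The directional pixel copying described above can be sketched as follows; only two prediction directions are shown, and the function interface is an illustrative assumption:

```python
import numpy as np

def intra_predict(top, left, mode, size=4):
    """Minimal directional intra prediction: replicate the reconstructed
    reference pixels above the block ('vertical') or to its left
    ('horizontal') across the prediction target block."""
    if mode == "vertical":
        return np.tile(top[:size], (size, 1))          # each row copies the top pixels
    if mode == "horizontal":
        return np.tile(left[:size].reshape(-1, 1), (1, size))  # each column copies the left pixels
    raise ValueError(f"unknown mode: {mode}")
```

A real codec offers many more directions (e.g. the nine 4×4 modes of H.264), but all of them follow this pattern of propagating adjacent reconstructed pixels along a fixed direction.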
  • the generated prediction signal 206 is output to the prediction separation switch 110.
  • The inter prediction signal generation device 109 generates the prediction signal 206 using the input motion information 210, derived parameter information 211, reference image signal 207, and prediction selection information 123.
  • the generated prediction signal 206 is output to the prediction separation switch 110.
  • the prediction separation switch 110 selects the output terminal of the intra prediction signal generation device 107 and the output terminal of the inter prediction signal generation device 109 according to the prediction selection information 123.
  • When the prediction selection information 123 indicates intra prediction, the switch is connected to the intra prediction signal generation device 107; when it indicates inter prediction, the switch is connected to the inter prediction signal generation device 109.
  • An example of the prediction selection information 123 is shown in FIG.
  • the prediction selection unit 111 sets the prediction selection information 123 according to the prediction mode controlled by the encoding control unit 114.
  • As the prediction mode, intra prediction or inter prediction can be selected, and a plurality of modes may exist for each.
  • the encoding control unit 114 controls which mode is selected.
  • the prediction signal 206 may be generated for all prediction modes, and one prediction mode may be selected from these, or the prediction mode may be limited according to the characteristics of the input image.
  • the prediction selection information 123 is determined using a cost such as the following equation.
  • Let OH be the code amount required for the prediction information when a prediction mode is selected (for example, the code amount of the derived parameter information 211 and of the prediction block size), and let SAD be the absolute cumulative sum of the prediction error signal 116, that is, of the difference between the input image signal 115 and the prediction signal 206. The following determination formula is then used:
  •     K = SAD + λ × OH     (1)
  • Here K is the cost and λ is a constant, a Lagrangian multiplier determined based on the quantization scale and the value of the quantization parameter. The mode giving the smallest cost K is selected as the optimum prediction mode.
  • instead of formula (1), the prediction selection information 123 may be determined using (a) only the prediction information or (b) only the SAD; alternatively, a value obtained by applying a Hadamard transform to (a) or (b), or an approximation of such a value, may be used.
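The mode decision based on formula (1) can be sketched as follows. The candidate modes, SAD and OH values, and the value of λ below are hypothetical examples for illustration, not values taken from the embodiment.

```python
# Sketch of prediction-mode selection with the cost K = SAD + lambda * OH
# (formula (1)).  Candidate modes and all numeric values are invented.

def select_mode(candidates, lam):
    """Return the candidate with the smallest cost K = SAD + lam * OH."""
    best_mode, best_cost = None, float("inf")
    for mode, sad, oh in candidates:
        k = sad + lam * oh          # formula (1)
        if k < best_cost:
            best_mode, best_cost = mode, k
    return best_mode, best_cost

# Hypothetical modes: (name, SAD, prediction-info code amount OH)
modes = [("intra_16x16", 1200, 8), ("inter_16x16", 900, 24), ("inter_8x8", 820, 64)]
mode, cost = select_mode(modes, lam=4.0)
print(mode, cost)   # inter_16x16 wins: 900 + 4*24 = 996 beats 1232 and 1076
```

The same loop structure applies whether the cost is computed from the SAD, a Hadamard-transformed residual, or an approximation of either.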
  • alternatively, a provisional encoding unit may be prepared, and the prediction selection information 123 may be determined using the code amount obtained when the prediction error signal 116 generated in each prediction mode is actually encoded by the provisional encoding unit, together with the square error between the input image signal 115 and the decoded image signal 119.
  • the judgment formula in this case is as follows:

        J = D + λ × R    (2)

  • here J is the encoding cost, D is the encoding distortion representing the square error between the input image signal 115 and the decoded image signal 119, and R represents the code amount estimated by provisional encoding.
  • when the encoding cost J of Equation (2) is used, provisional encoding and local decoding processing are required for each prediction mode, so the circuit scale or calculation amount increases. However, since a more accurate code amount and encoding distortion are used, high encoding efficiency can be maintained.
  • the cost may be calculated using only R or only D instead of the expression (2), or the cost function may be created using a value approximating R or D.
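The rate-distortion decision of Equation (2) has the same selection structure; the distortion and rate values below are invented stand-ins for the outputs of a provisional encoding pass.

```python
# Sketch of mode selection with the encoding cost J = D + lambda * R
# (Equation (2)).  D stands for the squared error against the decoded
# image and R for the code amount from provisional encoding; the numbers
# are hypothetical.

def select_mode_rd(candidates, lam):
    """Return the (mode, D, R) triple minimizing J = D + lam * R."""
    return min(candidates, key=lambda m: m[1] + lam * m[2])

modes = [("mode_a", 5000, 300), ("mode_b", 6500, 120)]
best = select_mode_rd(modes, lam=10.0)
print(best[0])  # mode_b: 6500 + 10*120 = 7700 < 5000 + 10*300 = 8000
```

As the text notes, the same selection can be run with only D, only R, or approximations of either in place of the full cost.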
  • Next, the inter prediction signal generation device 109 will be described with reference to FIG.
  • the inter prediction signal generation device 109 includes a prediction separation switch 201, a geometric transformation prediction unit 202, a second geometric transformation parameter derivation unit 203, a first geometric transformation parameter derivation unit 204, and a predicted geometric transformation parameter derivation unit 205.
  • the second geometric transformation parameter derivation unit 203 derives the second geometric transformation parameter 208 of the prediction target block using the motion information 210 of the prediction target block output from the motion information search unit 106 and the motion information of the encoded blocks stored in the encoding control unit 114.
  • a motion vector included in the motion information 210 of an adjacent block stored in the encoding control unit 114 is hereinafter referred to as an "adjacent motion vector".
  • the second geometric transformation parameter derivation unit 203 includes a motion information acquisition unit 501 and a second parameter derivation unit 502.
  • the motion information acquisition unit 501 determines an adjacent block from which motion information is acquired from among a plurality of adjacent blocks, and acquires motion information of the adjacent block, for example, a motion vector.
  • the second parameter derivation unit 502 derives a second geometric transformation parameter from the motion vector of the adjacent block.
  • FIG. 6A shows an example in which the sizes of prediction target blocks and adjacent blocks (for example, 16 ⁇ 16 pixel blocks) match.
  • a hatched pixel block p is a pixel block that has already been encoded or predicted (hereinafter referred to as “predicted pixel block”).
  • a block c with dot hatching indicates a prediction target block, and a pixel block n displayed in white is an uncoded pixel (unpredicted) block.
  • X represents an encoding (prediction) target pixel block.
  • the adjacent block A is the adjacent block to the left of the prediction target block X, the adjacent block B is the adjacent block above the prediction target block X, the adjacent block C is the adjacent block at the upper right of the prediction target block X, and the adjacent block D is the adjacent block at the upper left of the prediction target block X.
  • the adjacent motion vectors held in the internal memory of the encoding control unit 114 are only the motion vectors of the predicted pixel blocks. As shown in FIG. 3A, the pixel blocks are encoded and predicted from the upper left to the lower right, so when the pixel block X is predicted, the pixel blocks to its right and below it have not yet been encoded. Therefore, an adjacent motion vector cannot be derived from these blocks.
  • 6B to 6E are diagrams illustrating examples of adjacent blocks when the prediction target block is an 8 ⁇ 8 pixel block.
  • bold lines represent macroblock boundaries.
  • FIG. 6B shows an example in which the pixel block located at the upper left in the macroblock is the prediction target block, FIG. 6C the pixel block located at the upper right, FIG. 6D the pixel block located at the lower left, and FIG. 6E the pixel block located at the lower right.
  • the position of the adjacent block changes according to the encoding order of the 8 ⁇ 8 pixel block.
  • an encoded pixel block is used as an adjacent block of the pixel blocks processed after it, and the pixel block located at the upper right of the encoded pixel block is set as an adjacent block.
  • when a plurality of candidate blocks adjoin the prediction target block X, the blocks having the shortest Euclidean distance to the prediction target block X are set as the adjacent blocks A, B, C, and D, respectively; for example, the nearest block among the upper-right candidates is set as the adjacent block C.
  • the case where the block is 16 × 16 pixels or 8 × 8 pixels has been described as an example, but adjacent blocks may be determined in a similar framework for other square pixel blocks such as 32 × 32 pixels or 4 × 4 pixels, and for rectangular pixel blocks such as 16 × 8 pixels and 8 × 16 pixels.
  • adjacent blocks may also be defined more widely; for example, a pixel block to the left of the adjacent block A may be used, or a pixel block further above the adjacent block B may be used.
  • the second geometric transformation parameter 208 is derived by the second parameter deriving unit 502.
  • the adjacent motion vectors held by the adjacent blocks are defined by equations (3) to (6), respectively.
  • the motion information 210 provided from the motion information search unit 106 is defined by equation (7). Note that the motion information 210 indicates a motion vector of the prediction target block X.
  • the second geometric transformation parameter 208 is derived using the motion vector and the adjacent motion vector represented by the equations (3) to (7).
  • the transformation formula is expressed by the following formula (8).
  • in equation (8), the parameters (c, f) correspond to the motion vector, and the parameters (a, b, d, e) are the parameters associated with the geometric deformation; u and v indicate the coordinates of the encoding target block, and x and y indicate the coordinates of the reference image. If the parameters (a, b, d, e) are (1, 0, 0, 1), the transformation is the same as motion compensation with the parallel translation model (formula (19) described later).
  • affine transformation was shown here as the geometric transformation, but another geometric transformation such as a bilinear transformation, Helmert transformation, second-order conformal transformation, projective transformation, or three-dimensional projective transformation may be used.
  • the required number of parameters varies depending on the geometric transformation used; a suitable geometric transformation should be selected according to the nature of the image to which it is applied, the code amount required when the parameters are encoded, and the geometric deformation patterns the transformation can represent.
  • here, the affine transformation will be described. In equation (8), coordinates (x, y) are converted to coordinates (u, v) by the affine transformation, and the six parameters a, b, c, d, e, and f included in equation (8) represent the geometric transformation parameters.
  • since these six parameters are derived from adjacent motion vectors in the affine transformation, six or more input values are required.
  • the geometric transformation parameters are derived by the following equation (9), where the motion vectors have 1/4-pel precision and the parameters (a, b, d, e) have a precision of 1/64.
  • ax and ay are variables based on the size of the prediction target block, and are calculated by the following equation (10).
  • mb_size_x and mb_size_y indicate the horizontal and vertical sizes of the macroblock.
  • Equation (8) shows an example in which a, b, d, and e are obtained as real numbers, but by determining the calculation precision of these parameters in advance, integer values can easily be obtained as in Equation (9).
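The coordinate mapping of equation (8), with (a, b, d, e) held in 1/64 units and (c, f) in 1/4-pel units as described above, might be sketched as follows. The exact fixed-point layout is an assumption for illustration, not the embodiment's definition.

```python
# Affine mapping in the spirit of equation (8):
#   u = a*x + b*y + c,  v = d*x + e*y + f,
# with (a, b, d, e) stored in 1/64 units and (c, f) in 1/4-pel units.
# The fixed-point bookkeeping here is an illustrative assumption.

def affine_map(params, x, y):
    a, b, c, d, e, f = params
    u = (a * x + b * y) / 64.0 + c / 4.0
    v = (d * x + e * y) / 64.0 + f / 4.0
    return u, v

# The identity deformation (a, b, d, e) = (1, 0, 0, 1) is (64, 0, 0, 64)
# in 1/64 units, so the mapping reduces to the translation model:
# here a pure shift of (+0.5, -1.5) pel applied to the point (8, 8).
print(affine_map((64, 0, 2, 0, 64, -6), 8, 8))  # (8.5, 6.5)
```

Non-identity values of (a, b, d, e) then produce rotation, scaling, and shearing of the block, which is exactly the extra freedom the geometric transformation prediction adds over translation-only motion compensation.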
  • the predicted geometric transformation parameter derivation unit 205 includes a motion information acquisition unit 701 and a predicted geometric transformation parameter calculation unit 702 as shown in FIG.
  • the motion information acquisition unit 701 determines an adjacent block in the same procedure as the motion information acquisition unit 501 of the second geometric transformation parameter derivation unit 203. However, the motion information acquired from the adjacent block is a geometric transformation parameter.
  • the predicted geometric transformation parameter derivation unit 205 derives the predicted geometric transformation parameter 212 of the prediction target block using the motion information 210 of the encoded blocks stored in the encoding control unit 114.
  • the geometric transformation parameters of the adjacent encoded blocks stored in the encoding control unit 114 are hereinafter referred to as "adjacent geometric transformation parameters".
  • a method for deriving the predicted geometric transformation parameter 212 will be described with reference to FIG. 6A.
  • the adjacent geometric transformation parameters held by the adjacent blocks are defined by equations (11) to (14), respectively.
  • ap indicates an affine transformation parameter, which is a six-dimensional parameter as shown in Equation (8).
  • the predicted geometric transformation parameter calculation unit 702 calculates a predicted geometric transformation parameter by median processing using the spatial correlation between the prediction target block and the adjacent block.
  • pred_ap represents the predicted geometric transformation parameter.
  • the function affine_median() is a function that takes the median value of the six-dimensional affine transformation parameters. The predicted geometric transformation parameter may also be determined using the following equation.
  • in equation (16), the median is a scalar median, and T means transposition; a predicted geometric transformation parameter is derived by taking the median value for each geometric transformation parameter component.
  • alternatively, for the adjacent block A, a geometric transformation parameter may be re-derived using Equation (9) from the four already-encoded blocks adjacent to it (the blocks further to its left, above it, at its upper left, and at its upper right), and this geometric transformation parameter may be used as the adjacent geometric transformation parameter ap A. Predicted geometric transformation parameters can likewise be derived by re-deriving the geometric transformation parameters for the adjacent blocks B, C, and D in the same manner.
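The component-wise median of equation (16) can be sketched as follows. The parameter values are hypothetical, and since the embodiment does not specify here how many adjacent parameters enter the median, four (ap A through ap D) are used for illustration.

```python
# Component-wise median of adjacent 6-dimensional affine parameters,
# in the spirit of equation (16).  All parameter values are invented.

def scalar_median(values):
    """Scalar median; for an even count, the mean of the middle pair."""
    s = sorted(values)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def affine_median(*aps):
    """Take the median independently for each of the six components."""
    return tuple(scalar_median(comp) for comp in zip(*aps))

ap_A = (64, 0, 4, 0, 64, 0)
ap_B = (66, 1, 8, 0, 62, 4)
ap_C = (60, -2, 0, 2, 64, 8)
ap_D = (64, 0, 4, 0, 64, 4)
pred_ap = affine_median(ap_A, ap_B, ap_C, ap_D)
print(pred_ap)  # (64.0, 0.0, 4.0, 0.0, 64.0, 4.0)
```

The median exploits the spatial correlation between the prediction target block and its neighbors while rejecting a single outlying neighbor in each component.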
  • the predicted geometric transformation parameter 212 derived by the predicted geometric transformation parameter derivation unit 205 is output to the first geometric transformation parameter derivation unit 204.
  • the first geometric transformation parameter derivation unit 204 derives the first geometric transformation parameter 209 by adding the input derivation parameter information 211 of the geometric transformation parameter of the prediction target block to the predicted geometric transformation parameter 212 (pred_ap), and outputs it to the geometric transformation prediction unit 202.
  • the predicted geometric transformation parameter derivation unit 205 and the first geometric transformation parameter derivation unit 204 define the derivation formula so as to reduce, as much as possible, the amount of information needed to represent the first geometric transformation parameter 209 of the prediction target block.
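This prediction-plus-difference derivation can be sketched per component as follows. The name delta_ap for the derivation parameter information 211 and all numeric values are illustrative assumptions.

```python
# The first geometric transformation parameter 209 is the sum of the
# predicted parameter (pred_ap, 212) and the per-component difference
# carried by the derivation parameter information 211 (here delta_ap).
# Names and values are hypothetical.

def derive_first_param(pred_ap, delta_ap):
    """Add the decoded difference to the predicted affine parameters."""
    return tuple(p + d for p, d in zip(pred_ap, delta_ap))

pred_ap = (64, 0, 4, 0, 64, 4)      # predicted geometric transformation parameter 212
delta_ap = (1, 0, -2, 0, -1, 2)     # derivation parameter information 211
first_ap = derive_first_param(pred_ap, delta_ap)
print(first_ap)  # (65, 0, 2, 0, 63, 6)
```

Because only delta_ap is encoded, a good predictor pred_ap keeps the differences small and thus keeps the code amount of the derivation parameter low.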
  • the geometric transformation prediction unit 202 has a function of generating a prediction signal using the input geometric transformation parameters and the reference image signal 207.
  • the geometric transformation prediction unit 202 includes a geometric transformation unit 401 and an interpolation unit 402 as shown in FIG.
  • the geometric transformation unit 401 performs geometric transformation on the reference image signal 207 and calculates the position of the predicted pixel.
  • the interpolation unit 402 calculates the predicted pixel value corresponding to the fractional position of the predicted pixel obtained by the geometric transformation by interpolation or the like.
  • the prediction target block is a square pixel block CR; the pixels corresponding to motion compensation prediction form a square pixel block MER; and the pixels corresponding to geometric transformation prediction form a pixel block GTR, which is, for example, a parallelogram.
  • the region after motion compensation and the region after geometric transformation indicate the corresponding regions of the reference image signal relative to the coordinates of the frame to be encoded.
  • with geometric transformation prediction, it is possible to generate a prediction signal that follows deformations of the rectangular pixel block such as rotation, enlargement/reduction, shearing, and mirror transformation.
  • the geometric transformation unit 401 calculates coordinates (u, v) after the geometric transformation using the input geometric transformation parameters using the equation (8).
  • the calculated coordinates (u, v) after geometric transformation are real values. Therefore, the predicted value is generated by interpolating the luminance value corresponding to the coordinates (u, v) from the reference image signal.
  • the interpolation method used is expressed by the following equation (17), where R(x, y) denotes the integer-position pixel value of the reference image signal.
  • a new prediction signal is generated by applying interpolation for each coordinate in the prediction target block subjected to geometric transformation.
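The interpolation of a luminance value at a fractional position can be sketched with bilinear weighting in the spirit of equation (17). The reference array R and the sample position are hypothetical, and boundary handling is omitted.

```python
# Bilinear interpolation of the reference signal R(x, y) at a fractional
# position (u, v).  R is a tiny hypothetical luminance array; boundary
# clipping is omitted for brevity.

def bilinear(R, u, v):
    x0, y0 = int(u), int(v)          # integer part of the position
    du, dv = u - x0, v - y0          # fractional part of the position
    return ((1 - du) * (1 - dv) * R[y0][x0]
            + du * (1 - dv) * R[y0][x0 + 1]
            + (1 - du) * dv * R[y0 + 1][x0]
            + du * dv * R[y0 + 1][x0 + 1])

R = [[10, 20],
     [30, 40]]
print(bilinear(R, 0.5, 0.5))  # 25.0, the average of the four neighbors
```

Applying this per transformed coordinate (u, v) of the prediction target block yields the prediction signal described in the text.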
  • the geometric transformation prediction unit 202 can also generate a prediction signal using the conventional translation model, whose coordinate conversion formula is expressed by formula (19).
  • Equation (19) is equivalent to equation (8) with the parameters (a, b, d, e) set to (1, 0, 0, 1). Accordingly, regardless of the values of the first geometric transformation parameter 209 and the second geometric transformation parameter 208, the conventional motion compensated prediction can be realized by replacing equation (8) with equation (19) when deriving the coordinates.
  • in the present embodiment, bilinear interpolation is used as the interpolation method, but any interpolation method may be applied, such as nearest-neighbor interpolation, cubic convolution interpolation, linear filter interpolation, Lagrange interpolation, spline interpolation, or Lanczos interpolation.
  • the prediction separation switch 201 switches between the two prediction signals 206 output from the geometric transformation prediction unit 202. That is, the prediction separation switch 201 selects between the output terminal of the prediction signal 215 generated with the first geometric transformation parameter 209 and the output terminal of the prediction signal 214 generated with the second geometric transformation parameter 208, in accordance with the prediction selection information 213 (123 in FIG. 1).
  • Examples of the prediction selection information 213 and 123 are shown in FIG.
  • when the skip mode is selected, transform coefficients, motion vectors, and the like are not encoded. For this reason, the first geometric transformation prediction, which would require encoding the additional motion information 210, is not selected in the skip mode.
  • when the index of the prediction selection information 213 is 9, intra prediction is selected. In this case, since the output terminal of the prediction separation switch 110 is connected to the intra prediction signal generation device 107, the inter prediction signal generation device 109 does not need to perform the prediction signal generation processing.
  • index tables may be divided into a plurality of tables, or a plurality of index tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used. Furthermore, each element described in the index table may be changed so as to be described by an independent flag.
  • first, the prediction separation switch 201 determines whether the prediction selection information 213 indicates the first geometric transformation parameter 209 (S502). When this determination is YES, the prediction separation switch 201 switches the output terminal of the geometric transformation prediction unit 202 to the first prediction signal 215. On the other hand, when the determination is NO, the prediction separation switch 201 switches the output terminal of the geometric transformation prediction unit 202 to the second prediction signal 214.
  • when the determination is YES, the motion information acquisition unit 701 in the predicted geometric transformation parameter derivation unit 205 determines an adjacent block based on the motion information 210 input from the outside (S508). Using the motion information 210 of the determined adjacent block, adjacent geometric transformation parameters are derived (S509). Receiving the derived adjacent geometric transformation parameters, the predicted geometric transformation parameter calculation unit 702 derives the predicted geometric transformation parameter 212 using equation (15) or equation (16) (S510). The predicted geometric transformation parameter 212 is output to the first geometric transformation parameter derivation unit 204.
  • the first geometric transformation parameter derivation unit 204 derives the first geometric transformation parameter 209 using the derived parameter information 211 and the predicted geometric transformation parameter 212 input from the outside (S511).
  • the first geometric transformation parameter 209 is input to the geometric transformation prediction unit 202, and the geometric transformation unit 401 derives the coordinates after geometric transformation using the equation (8) (S512).
  • based on the calculated coordinates, the interpolation unit 402 performs interpolation processing on the reference image signal 207 input from the outside using Expression (18) to generate the first prediction signal 215 (S513).
  • the first prediction signal 215 is output to the outside via the prediction separation switch 201, to which the output terminal is connected (S515), and the first geometric transformation parameter 209 used for the prediction target block is stored in the memory (S514).
  • the first geometric transformation parameter 209 held in the memory is used as an adjacent geometric transformation parameter or an adjacent motion vector of the next block (S517).
  • on the other hand, when the determination in step S502 is NO, the motion information acquisition unit 501 in the second geometric transformation parameter derivation unit 203 determines an adjacent block based on the motion information 210 input from the outside (S503). An adjacent motion vector is derived using the motion information 210 of the determined adjacent block (S504). Receiving the derived adjacent motion vector, the second parameter derivation unit 502 derives the second geometric transformation parameter 208 using equations (9) and (10) (S505).
  • the second geometric transformation parameter 208 is input to the geometric transformation prediction unit 202, and the geometric transformation unit 401 derives the coordinates after the geometric transformation using equation (8) (S506). Based on the derived coordinates, the interpolation unit 402 performs interpolation processing on the reference image signal 207 input from the outside using Expression (18) to generate the second prediction signal 214 (S507).
  • the second prediction signal 214 is output to the outside via the prediction separation switch 201 (S515), and the second geometric transformation parameter 208 used in the prediction target block is stored in the memory (S514).
  • the second geometric transformation parameter 208 held in the memory is used as an adjacent geometric transformation parameter or an adjacent motion vector of the next block (S517).
  • the syntax 1600 mainly has three parts.
  • the high-level syntax 1601 has higher layer syntax information that is equal to or higher than a slice.
  • the slice level syntax 1602 has information necessary for decoding for each slice, and the macroblock level syntax 1603 has information necessary for decoding for each macroblock.
  • High level syntax 1601 includes sequence and picture level syntax, such as sequence parameter set syntax 1604 and picture parameter set syntax 1605.
  • the slice level syntax 1602 includes a slice header syntax 1606, a slice data syntax 1607, and the like.
  • the macroblock level syntax 1603 includes a macroblock layer syntax 1608, a macroblock prediction syntax 1609, and the like.
  • slice_affine_motion_prediction_flag is a syntax element indicating whether to apply geometric transformation prediction to a slice.
  • when this flag is 0, the geometric transformation prediction unit 202 does not use the parameters (a, b, d, e) in equation (8) but uses equation (19) for this slice.
  • equation (19) represents motion compensation prediction using the translation model employed in H.264 and the like, in which the parameters (c, f) correspond to the motion vector; that is, when this flag is 0, prediction is the same as the motion compensation prediction of the conventional translation model.
  • when slice_affine_motion_prediction_flag is 1, the prediction separation switch 201 dynamically switches the prediction signal within the slice as indicated by the prediction selection information 213.
  • mb_skip_flag is a flag indicating whether or not the macroblock is encoded in the skip mode. In the skip mode, transform coefficients, motion vectors, etc. are not encoded. For this reason, the first geometric transformation prediction is not applied to the skip mode.
  • AvailAffineMode is an internal parameter indicating whether or not the second geometric transformation prediction can be used in the macroblock. When AvailAffineMode is 0, it means that the prediction selection information 213 is set not to use the second geometric transformation prediction. When the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value, AvailAffineMode is 0. Otherwise, AvailAffineMode is 1.
  • the setting of AvailAffineMode can also be set using an adjacent geometric transformation parameter or an adjacent motion vector. For example, when the adjacent motion vector points in a completely different direction, there is a possibility that an object boundary exists in the adjacent block of the current prediction target block. Therefore, it is possible to set AvailAffineMode to 0.
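The AvailAffineMode decision described above can be sketched as follows. The divergence check is the optional heuristic mentioned in the text, and its threshold value is an assumption introduced here for illustration.

```python
# Sketch of the AvailAffineMode decision: the second geometric
# transformation prediction is marked unavailable (0) when all adjacent
# motion vectors equal the motion vector of the prediction target block,
# since only pure translation could then be derived.  The divergence
# check and its threshold are an assumed illustration of the optional
# object-boundary heuristic.

def avail_affine_mode(mv_x, adjacent_mvs, divergence_threshold=None):
    # All adjacent vectors identical to the target block's vector.
    if all(mv == mv_x for mv in adjacent_mvs):
        return 0
    # Optional: widely diverging adjacent vectors may indicate an object
    # boundary in the adjacent blocks, so affine prediction is disabled.
    if divergence_threshold is not None:
        spread = max(abs(a[0] - b[0]) + abs(a[1] - b[1])
                     for a in adjacent_mvs for b in adjacent_mvs)
        if spread > divergence_threshold:
            return 0
    return 1

print(avail_affine_mode((2, 0), [(2, 0), (2, 0), (2, 0)]))  # 0
print(avail_affine_mode((2, 0), [(2, 0), (3, 1), (2, 0)]))  # 1
```

When the function returns 0, the prediction selection information 213 is set so that the second geometric transformation prediction is not used for the macroblock.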
  • mb_affine_motion_skip_flag, which indicates whether to use the second geometric transformation prediction or motion compensation prediction, is encoded.
  • when mb_affine_motion_skip_flag is 1, the second geometric transformation prediction is applied to the skip mode; when it is 0, motion compensation prediction using equation (19) is applied.
  • mb_type indicates the macroblock type information; that is, it contains information such as whether the current macroblock is intra-coded or inter-coded, which block shape is used for prediction, and whether the prediction direction is unidirectional or bidirectional.
  • the mb_type is passed to the macroblock prediction syntax and the submacroblock prediction syntax indicating the syntax of the subblock in the macroblock.
  • mb_affine_pred_flag indicates whether the first geometric transformation prediction or the second geometric transformation prediction is used in the block.
  • when this flag is 0, the prediction selection information 213 is set to use the second geometric transformation parameter; when this flag is 1, the prediction selection information 213 is set to use the first geometric transformation parameter.
  • NumMbPart() is an internal function that returns the number of block partitions specified in mb_type: 1 for a 16 × 16 pixel block, 2 for a 16 × 8 or 8 × 16 pixel block, and 4 for an 8 × 8 pixel block.
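The partition counts returned by NumMbPart() can be sketched as a simple lookup. The string keys are illustrative stand-ins; actual mb_type values are codec-internal codes, and the H.264-style partition counts are an assumption.

```python
# Sketch of the internal function NumMbPart(): partitions per mb_type
# block shape.  String keys and the H.264-style counts are assumptions.

def num_mb_part(mb_type):
    return {"16x16": 1, "16x8": 2, "8x16": 2, "8x8": 4}[mb_type]

print(num_mb_part("8x16"))  # 2
```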
  • mv_l0 and mv_l1 indicate motion vector difference information in the macroblock.
  • the motion vector information is a value set by the motion information search unit 106 and obtained by taking a difference from a predicted motion vector not disclosed in the present embodiment.
  • mvd_l0_affine and mvd_l1_affine indicate derived parameters in the macroblock, and indicate difference information of components (a, b, d, e) excluding motion vectors of affine transformation parameters. This syntax element is encoded only when the first geometric transformation parameter is selected.
  • mb_affine_pred_flag indicates whether the first geometric transformation prediction or the second geometric transformation prediction is used in the sub-block.
  • when this flag is 0, the prediction selection information 213 is set to use the second geometric transformation parameter; when this flag is 1, the prediction selection information 213 is set to use the first geometric transformation parameter.
  • mv_l0 and mv_l1 indicate motion vector difference information in the sub macroblock.
  • the motion vector information is a value set by the motion information search unit 106 and obtained by taking a difference from a predicted motion vector not disclosed in the present embodiment.
  • Mvd_l0_affine and mvd_l1_affine in the figure indicate derived parameters in the sub-macroblock, and indicate difference information of components (a, b, d, e) excluding the motion vectors of the affine transformation parameters.
  • This syntax element is encoded only when the first geometric transformation parameter is selected.
  • as described above, in the skip mode either the conventional motion compensated prediction using the translation model or the second geometric transformation prediction can be selected, and in the other inter predictions either the first geometric transformation prediction or the second geometric transformation prediction can be selected.
  • syntax elements not defined in the present embodiment may be inserted between lines in the syntax tables shown in FIGS. 12 to 17, and descriptions regarding other conditional branches may be included.
  • the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.
  • as described above, two geometric transformation parameters indicating information related to the deformation of the image caused by the geometric transformation of the pixel block, that is, the first geometric transformation parameter and the second geometric transformation parameter, are derived, and a prediction signal is generated by performing motion compensated prediction using the geometric transformation parameter selected according to the prediction selection information indicating which of these geometric transformation parameters to use.
  • FIG. 18 is a diagram illustrating an example of the macroblock layer syntax 1608.
  • mb_type shown in the figure indicates the macroblock type information; that is, it contains information such as whether the current macroblock is intra-coded or inter-coded, which block shape is used for prediction, and whether the prediction direction is unidirectional or bidirectional.
  • the mb_type is passed to the macroblock prediction syntax and the submacroblock prediction syntax indicating the syntax of the subblock in the macroblock.
  • mb_additional_affine_motion_flag indicates flag information for selecting whether to use the first geometric transformation parameter or the second geometric transformation parameter for the prediction target block. When this flag is 0, the second geometric transformation parameter is used, and when this flag is 1, the first geometric transformation parameter is used.
  • mb_affine_pred_flag indicates whether geometric transformation prediction (the first geometric transformation prediction or the second geometric transformation prediction) is used in the block, or the motion compensation prediction of the translation model is used. When this flag is 0, the prediction selection information 213 is set to use the motion compensated prediction of the translation model regardless of mb_additional_affine_motion_flag. On the other hand, when this flag is 1, the prediction selection information 213 is set to use the first geometric transformation parameter or the second geometric transformation parameter according to the flag information of mb_additional_affine_motion_flag.
  • as described above, in the skip mode either the conventional motion compensation using the translation model or the second geometric transformation prediction can be selected. At the macroblock level it is determined whether the first geometric transformation prediction or the second geometric transformation prediction is used, and at the sub-macroblock level it is determined whether geometric transformation prediction including the first or the second geometric transformation prediction is used, or the motion compensated prediction of the translation model is used.
  • syntax elements not defined in the present embodiment may be inserted between lines in the syntax tables shown in FIGS. 18 to 20, and descriptions regarding other conditional branches may be included.
  • the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.
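The flag combinations described in the preceding paragraphs can be summarized as a small decision procedure. The sketch below is illustrative only: the function name and the return labels are hypothetical, and only the flag semantics follow the description above.

```python
def select_prediction_mode(mb_affine_pred_flag, mb_additional_affine_motion_flag):
    """Illustrative decision tree for the two flags described above.

    Returns a label for the prediction that the prediction selection
    information would indicate. Names are hypothetical sketches.
    """
    if mb_affine_pred_flag == 0:
        # Translation-model motion compensation is used regardless of
        # mb_additional_affine_motion_flag.
        return "translation"
    # Geometric transformation prediction: the additional flag chooses
    # between the first and second geometric transformation parameters.
    if mb_additional_affine_motion_flag == 1:
        return "first_geometric"
    return "second_geometric"
```

For example, a block with mb_affine_pred_flag = 1 and mb_additional_affine_motion_flag = 0 would use the second geometric transformation parameter.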
  • the video decoding device 300 decodes, for example, encoded data generated by the video encoding device according to the first embodiment.
  • the video decoding device 300 decodes the encoded data 311 stored in the input buffer 301 and outputs a decoded image signal 317 to the output buffer 309.
  • the encoded data 311 is, for example, multiplexed encoded data transmitted from the moving image encoding apparatus 100 through a storage system or a transmission system and temporarily stored in the input buffer 301.
  • the video decoding device 300 includes an entropy decoding unit 302, an inverse quantization / inverse conversion unit 303, an adder 304, a reference image memory 305, an intra prediction signal generation device 306, an inter prediction signal generation device 307, and a prediction separation switch 308.
  • the moving picture decoding apparatus 300 is also connected to the input buffer 301, the output buffer 309, and the decoding control unit 310.
  • the entropy decoding unit 302 decodes the encoded data 311 by syntax analysis based on the syntax for each frame or field.
  • the entropy decoding unit 302 sequentially entropy-decodes the code string of each syntax, and reproduces the motion information 315, the derived parameter information 316, the encoding parameters of the decoding target block, and the like.
  • the encoding parameter includes all parameters necessary for decoding such as prediction information, information on transform coefficients, information on quantization, and the like.
  • the transform coefficient decoded by the entropy decoding unit 302 is input to an inverse quantization / inverse transform unit 303 including an inverse quantizer and an inverse transformer.
  • Various pieces of information relating to quantization decoded by the entropy decoding unit 302, that is, the quantization parameter and the quantization matrix, are set in the internal memory of the decoding control unit 310 and loaded when used in the inverse quantization process.
  • the inverse quantization process is first performed by the inverse quantizer using the information on the loaded quantization.
  • the inverse quantized transform coefficient is then subjected to inverse transform processing, for example an inverse discrete cosine transform, by the inverse transformer.
  • Although the inverse orthogonal transform has been described here, when the encoding side uses a wavelet transform, the inverse quantization / inverse transform unit 303 may perform the corresponding inverse quantization and inverse wavelet transform.
  • the restored prediction error signal 312 is input to the adder 304.
  • the adder 304 adds the prediction error signal 312 and the prediction signal 416 output from the prediction separation switch 308 described later to generate a decoded image signal 317.
  • the generated decoded image signal 317 is output from the moving image decoding apparatus 300, temporarily stored in the output buffer 309, and then output according to the output timing managed by the decoding control unit 310.
  • the decoded image signal 317 is stored in the reference image memory 305 and becomes a reference image signal 313.
  • the reference image signal 313 is sequentially read from the reference image memory 305 for each frame or each field, and is input to the intra prediction signal generation device 306 or the inter prediction signal generation device 307.
  • the motion information 315 used in the decoding target pixel block is stored in the decoding control unit 310, and is appropriately loaded from the decoding control unit 310 and used in the inter prediction signal generation processing of the next block.
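The decoding data flow described above (inverse quantization, inverse transform, then addition of the prediction signal by the adder 304) can be sketched as follows. This is a simplified illustration, not the H.264 dequantization process: the uniform quantization step, the pluggable inverse-transform hook, and the function names are all assumptions.

```python
def dequantize(levels, qstep):
    # Simplified uniform inverse quantization: scale each decoded
    # coefficient level by the quantization step. (Real codecs apply
    # per-coefficient scaling matrices and rounding.)
    return [level * qstep for level in levels]

def reconstruct_block(levels, qstep, prediction, inverse_transform):
    # Restore the prediction error signal (312) by inverse quantization
    # followed by the inverse transform, then add the prediction signal
    # supplied via the prediction separation switch, as the adder (304)
    # does to produce the decoded image signal (317).
    error = inverse_transform(dequantize(levels, qstep))
    return [e + p for e, p in zip(error, prediction)]
```

With an identity stand-in for the inverse transform, reconstruct_block([2, 0, 1], 4, [10, 10, 10], lambda c: c) yields [18, 10, 14].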
  • the intra prediction signal generation device 306 has the same function and configuration as the intra prediction signal generation device 107 in the video encoding device 100 shown in FIG. That is, the intra prediction signal generation device 306 performs intra prediction using the input reference image signal 313.
  • a prediction signal is generated by performing pixel interpolation in the prediction direction such as the vertical direction and the horizontal direction using an encoded reference pixel value adjacent to the prediction target block.
  • the interpolated pixel value may be copied in a predetermined prediction direction.
  • the generated prediction signal 416 is output to the prediction separation switch 308.
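As a concrete illustration of the directional pixel copying described above, the following sketch implements the two simplest intra prediction directions. The function names are ours, and interpolation and filtering of the reference pixels are omitted.

```python
def intra_vertical_prediction(top_row, height):
    # Vertical prediction: every row of the predicted block copies the
    # reconstructed reference pixels directly above the block.
    return [list(top_row) for _ in range(height)]

def intra_horizontal_prediction(left_col, width):
    # Horizontal prediction: every column copies the reconstructed
    # reference pixels to the left of the block.
    return [[value] * width for value in left_col]
```

For a 4-wide block with top reference row [1, 2, 3, 4], vertical prediction simply repeats that row for each line of the block.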
  • the inter prediction signal generation device 307 has the same function and configuration as the inter prediction signal generation device 109 shown in FIGS. 1 and 2 and FIGS. That is, the inter prediction signal generation device 307 generates the prediction signal 416 using the input motion information 315, derivation parameter information 316, reference image signal 313, and prediction selection information 314.
  • the motion information 315, derived parameter information 316, reference image signal 313, and prediction selection information 314 correspond to the motion information 210, derived parameter information 211, reference image signal 207, and prediction selection information input to the inter prediction signal generation device 109 of the video encoding device 100.
  • the prediction signal 416 is therefore generated in the same manner as in the inter prediction signal generation device 109 shown in FIG.
  • the generated prediction signal 416 is output to the prediction separation switch 308.
  • the prediction separation switch 308 selects between the output terminal of the intra prediction signal generation device 306 and the output terminal of the inter prediction signal generation device 307 according to the prediction selection information 314. When the prediction selection information 314 indicates intra prediction, the switch is connected to the intra prediction signal generation device 306; when it indicates inter prediction, the switch is connected to the inter prediction signal generation device 307.
  • the prediction selection information 314 is the same as the prediction selection information 123 set by the prediction selection unit 111 of the video encoding device 100, and is shown in FIG.
  • the encoded data 311 decoded by the video decoding device 300 may have the same syntax structure as that of the video encoding device 100.
  • the same syntax as in FIGS. 12 to 17 is used.
  • the syntax 1600 has mainly three parts as shown in FIG.
  • the high-level syntax 1601 contains syntax information of layers at or above the slice level.
  • the slice level syntax 1602 has information necessary for decoding for each slice, and the macroblock level syntax 1603 has information necessary for decoding for each macroblock.
  • High level syntax 1601 includes sequence and picture level syntax, such as sequence parameter set syntax 1604 and picture parameter set syntax 1605.
  • the slice level syntax 1602 includes a slice header syntax 1606, a slice data syntax 1607, and the like.
  • the macroblock level syntax 1603 includes a macroblock layer syntax 1608, a macroblock prediction syntax 1609, and the like.
  • slice_affine_motion_prediction_flag is a syntax element indicating whether to apply geometric transformation prediction to a slice.
  • When slice_affine_motion_prediction_flag is 0, the geometric transformation prediction unit 202 does not use the parameters (a, b, d, e) in Equation (8) but uses Equation (19) for the slice.
  • Equation (19) represents motion compensated prediction using the translation model used in H.264 and the like, and its parameters (c, f) correspond to the motion vector.
  • Accordingly, when this flag is 0, the prediction is the same as the conventional motion compensated prediction of the translation model.
  • When slice_affine_motion_prediction_flag is 1, the prediction separation switch 201 dynamically switches the prediction signal within the slice as indicated by the prediction selection information 314.
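Since Equations (8) and (19) are not reproduced in this excerpt, the sketch below assumes a conventional six-parameter affine mapping. Under that assumption, fixing (a, b, d, e) to the identity leaves only (c, f), which is consistent with the statement above that (c, f) correspond to the motion vector of the translation model.

```python
def affine_warp(x, y, a, b, c, d, e, f):
    # Assumed conventional affine parameterization (the patent's exact
    # Equation (8) is not reproduced here):
    #   x' = a*x + b*y + c,  y' = d*x + e*y + f
    return (a * x + b * y + c, d * x + e * y + f)

def translation_warp(x, y, mv_x, mv_y):
    # Translation model: a = e = 1 and b = d = 0, leaving only (c, f),
    # which play the role of the motion vector components.
    return affine_warp(x, y, 1.0, 0.0, mv_x, 0.0, 1.0, mv_y)
```

The identity affine parameters map every position to itself, while the translation model shifts each position by the motion vector.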
  • mb_skip_flag is a flag indicating whether or not the macroblock is encoded in the skip mode. In the skip mode, transform coefficients, motion vectors, etc. are not encoded. For this reason, the first geometric transformation prediction is not applied to the skip mode.
  • AvailAffineMode is an internal parameter indicating whether or not the second geometric transformation prediction can be used in the macroblock. When AvailAffineMode is 0, it means that the prediction selection information 314 is set not to use the second geometric transformation prediction. When the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value, AvailAffineMode is 0. Otherwise, AvailAffineMode is 1.
  • AvailAffineMode can also be set using adjacent geometric transformation parameters or adjacent motion vectors. For example, when adjacent motion vectors point in completely different directions, an object boundary may exist in a block adjacent to the current prediction target block, so AvailAffineMode may be set to 0.
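A minimal sketch of the AvailAffineMode rule stated above (the function name is hypothetical): the second geometric transformation prediction is marked unavailable when the adjacent motion vectors and the prediction target block's motion vector all share the same value, since no deformation can then be inferred from them.

```python
def avail_affine_mode(neighbor_mvs, current_mv):
    """Return 0 when the second geometric transformation prediction is
    unavailable, 1 otherwise, following the rule described above."""
    # All adjacent motion vectors equal to the current block's motion
    # vector -> pure translation, so disable the second prediction.
    if all(mv == current_mv for mv in neighbor_mvs):
        return 0
    return 1
```

Extensions such as disabling the mode when neighbors diverge strongly (the object-boundary heuristic mentioned above) would add further conditions to this check.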
  • mb_affine_motion_skip_flag, which indicates whether to use the second geometric transformation prediction or motion compensated prediction, is encoded.
  • When mb_affine_motion_skip_flag is 1, the second geometric transformation prediction is applied in the skip mode.
  • When mb_affine_motion_skip_flag is 0, motion compensated prediction using Equation (19) is applied.
  • mb_type indicates macroblock type information: whether the current macroblock is intra-coded or inter-coded, which block shape is used for prediction, whether the prediction direction is unidirectional or bidirectional, and so on.
  • the mb_type is passed to the macroblock prediction syntax and the submacroblock prediction syntax indicating the syntax of the subblock in the macroblock.
  • mb_affine_pred_flag indicates whether the first geometric transformation prediction or the second geometric transformation prediction is used in the block; according to its value, the prediction selection information 314 is set to use either the first geometric transformation parameter or the second geometric transformation parameter.
  • NumMbPart() is an internal function that returns the number of block partitions specified by mb_type: it returns 1 for a 16×16 pixel block, 2 for a 16×8 or 8×16 pixel block, and 4 for an 8×8 pixel block.
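The NumMbPart() behavior described above can be illustrated for a 16×16 macroblock as follows; passing partition dimensions directly is our own simplification of what mb_type encodes.

```python
def num_mb_part(part_width, part_height):
    # Illustrative stand-in for NumMbPart(): the number of partitions
    # of a 16x16 macroblock for a given partition shape.
    return (16 // part_width) * (16 // part_height)
```

This reproduces the values listed above: one 16×16 partition, two 16×8 or 8×16 partitions, and four 8×8 partitions.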
  • mv_l0 and mv_l1 indicate motion vector difference information in the macroblock.
  • the motion vector information is set by the motion information search unit 106 of the video encoding device 100 and is obtained as the difference from a predicted motion vector, the derivation of which is not detailed in the present embodiment.
  • mvd_l0_affine and mvd_l1_affine indicate derived parameters in the macroblock, and indicate difference information of components (a, b, d, e) excluding motion vectors of affine transformation parameters. This syntax element is encoded only when the first geometric transformation parameter is selected.
  • mb_affine_pred_flag indicates whether the first geometric transformation prediction or the second geometric transformation prediction is used in the block; according to its value, the prediction selection information 314 is set to use either the first geometric transformation parameter or the second geometric transformation parameter.
  • mv_l0 and mv_l1 indicate motion vector difference information in the sub macroblock.
  • the motion vector information is set by the motion information search unit 106 of the video encoding device 100 and is obtained as the difference from a predicted motion vector, the derivation of which is not detailed in the present embodiment.
  • mvd_l0_affine and mvd_l1_affine in the figure indicate derived parameters in the sub-macroblock: the difference information of the components (a, b, d, e) of the affine transformation parameters, excluding the motion vector. These syntax elements are encoded only when the first geometric transformation parameter is selected.
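The differential coding of the affine components (a, b, d, e) described above can be sketched as a simple residual round trip. The function names are hypothetical, and how the predicted parameters themselves are derived is outside this sketch.

```python
def encode_affine_residual(actual, predicted):
    # mvd_l0_affine / mvd_l1_affine carry differences for the affine
    # components (a, b, d, e): the encoder transmits actual - predicted.
    return tuple(av - pv for av, pv in zip(actual, predicted))

def decode_affine_params(residual, predicted):
    # The decoder reverses the step: predicted + transmitted residual
    # restores the actual affine components.
    return tuple(rv + pv for rv, pv in zip(residual, predicted))
```

Encoding then decoding with the same predicted parameters recovers the original components exactly.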
  • With the syntax structure shown in FIGS. 12 to 17, when the prediction target pixel block is in the skip mode, either the conventional motion compensated prediction using the translation model or the second geometric transformation prediction can be selected, and for other inter predictions, either the first geometric transformation prediction or the second geometric transformation prediction can be selected.
  • syntax elements not defined in the present embodiment may be inserted between lines in the syntax tables shown in FIGS. 12 to 17, and descriptions regarding other conditional branches may be included.
  • the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.
  • mb_type indicates macroblock type information: whether the current macroblock is intra-coded or inter-coded, which block shape is used for prediction, whether the prediction direction is unidirectional or bidirectional, and so on.
  • the mb_type is passed to the macroblock prediction syntax and the submacroblock prediction syntax indicating the syntax of the subblock in the macroblock.
  • mb_additional_affine_motion_flag indicates flag information for selecting whether to use the first geometric transformation parameter or the second geometric transformation parameter for the prediction target block. When this flag is 0, the second geometric transformation parameter is used, and when this flag is 1, the first geometric transformation parameter is used.
  • mb_affine_pred_flag indicates whether the block uses geometric transformation prediction (including the first geometric transformation prediction or the second geometric transformation prediction) or motion compensated prediction of the translation model. When this flag is 0, the prediction selection information 314 is set to use the motion compensated prediction of the translation model regardless of mb_additional_affine_motion_flag. On the other hand, when this flag is 1, the prediction selection information 314 is set to use the first geometric transformation parameter or the second geometric transformation parameter according to the flag information of mb_additional_affine_motion_flag.
  • When the prediction target pixel block is in the skip mode, either the conventional motion compensation using the translation model or the second geometric transformation prediction can be selected. At the macroblock level, it is determined whether the first geometric transformation prediction or the second geometric transformation prediction is used, and at the sub-macroblock level, whether geometric transformation prediction (including the first or second geometric transformation prediction) or motion compensated prediction of the translation model is used.
  • syntax elements not defined in the present embodiment may be inserted between lines in the syntax tables shown in FIGS. 18 to 20, and descriptions regarding other conditional branches may be included.
  • the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.
  • In the above embodiments, the case has been described in which the processing target frame is divided into small blocks of 16×16 pixel size or the like and, as shown in FIG., encoded and decoded in sequence; however, the encoding order and decoding order are not limited to this.
  • Encoding and decoding may be performed sequentially from the lower right to the upper left, or spirally outward from the center of the screen.
  • encoding and decoding may be performed in order from the upper right to the lower left, or encoding and decoding may be performed in order from the peripheral part to the center part of the screen.
  • In the above embodiments, the block size has been described as a 4×4 pixel block or an 8×8 pixel block, but the prediction target block need not have a uniform block shape; any block size, such as a 16×8, 8×16, 8×4, or 4×8 pixel block, may be used.
  • In the above embodiments, the luminance signal and the color difference signal are not treated separately, and the description is limited to a single color signal component. When they are treated separately, different prediction methods may be used for the luminance and color difference signals, or the same prediction method may be used. When a different prediction method is used, the prediction method selected for the color difference signal is encoded or decoded in the same manner as for the luminance signal.
  • The present invention is not limited to the above-described embodiments as such; in the implementation stage, the constituent elements can be modified and embodied without departing from the scope of the invention.
  • various inventions can be formed by appropriately combining a plurality of components disclosed in the embodiment. For example, some components may be deleted from all the components shown in the embodiment.
  • Constituent elements across different embodiments may be combined as appropriate.
  • The methods of the present invention described in the above embodiments can be executed by a computer, and can also be stored and distributed, as a computer-executable program, on a recording medium such as a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory.
  • As described above, the prediction signal generation device, moving image encoding device, and moving image decoding device according to the embodiments of the present invention reduce the parameter error produced when deriving the geometric transformation parameters used for geometric transformation motion compensated prediction, suppress error propagation, and improve prediction efficiency without increasing the code amount.
  • the following prediction signal generation method, video encoding method, and video decoding method can be provided.
  • A prediction signal generation method for generating a prediction signal includes: setting prediction selection information indicating whether to use a first geometric transformation parameter or a second geometric transformation parameter, each indicating information on the shape of an image under geometric transformation of a pixel block; acquiring the motion information or geometric transformation parameters of one or more second adjacent blocks, among a plurality of first adjacent blocks adjacent to one pixel block of the plurality of pixel blocks into which the image signal is divided, for which prediction signal generation has already been completed; deriving a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; deriving and setting the first geometric transformation parameter by a predetermined method from an input derived value of the geometric transformation parameter and the predicted geometric transformation parameter; setting the second geometric transformation parameter based on the motion information of the one pixel block and the one or more second adjacent blocks; and generating the prediction signal by performing geometric transformation processing, using the first or second geometric transformation parameter indicated by the selection information, on a reference image signal when performing motion compensation for the one pixel block.
  • The moving image encoding method includes: setting prediction selection information indicating whether to use the first geometric transformation parameter or the second geometric transformation parameter, each indicating information on the shape of an image under geometric transformation of a pixel block; acquiring the motion information or geometric transformation parameters of one or more second adjacent blocks, among a plurality of first adjacent blocks adjacent to one pixel block of the plurality of pixel blocks into which the image signal is divided, for which prediction signal generation has already been completed; deriving a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; deriving and setting the first geometric transformation parameter by a predetermined method from a derived value of the geometric transformation parameter input from the outside and the predicted geometric transformation parameter; setting the second geometric transformation parameter based on the motion information of the one pixel block and the one or more second adjacent blocks; generating a prediction signal by performing geometric transformation processing, using the first or second geometric transformation parameter indicated in the selection information, on a reference image signal when performing motion compensation for the one pixel block; encoding the prediction selection information indicating whether the first or second geometric transformation parameter is used; encoding the derived value when the first geometric transformation parameter is selected; and encoding the difference signal between the input image signal and the prediction signal.
  • A moving image decoding method, which decodes moving image encoded data obtained by encoding an input image signal in units of a plurality of pixel blocks and performs decoding processing by a prescribed method, includes: acquiring the motion information of one or more second adjacent blocks that have already been decoded, among a plurality of first adjacent blocks adjacent to one pixel block of the plurality of pixel blocks into which the input image signal is divided, or the first and second geometric transformation parameters indicating information on the shape of an image under geometric transformation of a pixel block; decoding selection information indicating whether the first geometric transformation parameter or the second geometric transformation parameter is used; decoding a derived value of the geometric transformation parameter when the first geometric transformation parameter is selected; deriving a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; deriving and setting the first geometric transformation parameter by a prescribed method from the decoded derived value of the geometric transformation parameter and the predicted geometric transformation parameter; setting the second geometric transformation parameter based on the motion information of the one pixel block and the one or more second adjacent blocks; and generating a prediction signal by performing geometric transformation processing, using the first or second geometric transformation parameter indicated in the selection information, on a reference image signal when performing motion compensation for the one pixel block.
  • As described above, the moving image encoding device, the moving image decoding device, the moving image encoding method, and the moving image decoding method according to the present invention are useful for highly efficient moving image encoding, and are particularly suitable for moving image encoding that reduces the motion detection processing necessary for estimating the geometric transformation parameters used for geometric transformation motion compensated prediction.


Abstract

Disclosed is a prediction-signal producing device comprising: a setting unit for setting prediction selection information indicating which of a first and a second geometric transformation parameter, each indicating information relating to the shape of an image under geometric transformation of a pixel block, is to be used; an acquiring unit for acquiring the motion information or geometric transformation parameters of one or more second adjacent blocks, among the first adjacent blocks adjacent to one of the pixel blocks into which the image signal is divided, that have already undergone prediction-signal production; a deriving unit for deriving predicted geometric transformation parameters of the one pixel block from the geometric transformation parameters of the second adjacent blocks; a setting unit for deriving the first geometric transformation parameter by a predetermined method from derived values of the geometric transformation parameters and the predicted geometric transformation parameters and setting the derived parameter; a setting unit for setting the second geometric transformation parameters on the basis of the motion information of the one pixel block and the second adjacent blocks; and a producing unit for subjecting the reference image signal used when performing motion compensation for the one pixel block to geometric transformation processing using the first or second geometric transformation parameters indicated in the selection information, thereby producing a prediction signal.

Description

Prediction signal generation device using geometric transformation motion-compensated prediction, moving picture encoding device, and moving picture decoding device
The present invention relates to a prediction signal generation device, a moving picture encoding device, and a moving picture decoding device that derive geometric transformation parameters using the motion information of adjacent blocks and a prediction target block, and perform geometric transformation prediction processing for the prediction target block based on the derived geometric transformation parameters.
In recent years, a moving picture coding method with greatly improved coding efficiency has been recommended jointly by ITU-T and ISO/IEC as ITU-T Rec. H.264 and ISO/IEC 14496-10 (hereinafter "H.264"). In H.264, prediction processing, transform processing, and entropy coding are performed in units of rectangular blocks (for example, 16×16 or 8×8 pixels). For this reason, when predicting an object that cannot be represented by a rectangular block, H.264 raises prediction efficiency by selecting a smaller prediction block (for example, 4×4 pixels). Methods for effectively predicting such objects include preparing a plurality of prediction patterns for a rectangular block and applying motion compensation using an affine transformation to a deformed object.
For example, Japanese Patent Application Laid-Open No. 2007-312397 discloses a video frame transfer method that models object motion as an affine transformation and, by calculating optimum affine transformation parameters for each block to be predicted, uses prediction that takes enlargement, reduction, rotation, and the like of the object into account.
R. C. Kordasiewicz, M. D. Gallant, and S. Shirani, "Affine Motion Prediction Based on Translational Motion Vectors," IEEE Trans. on Circuits and Systems for Video Technology, Vol. 17, No. 10, October 2007, discloses a method that approximates motion-compensated prediction with an affine model: based on motion vector information calculated under a translation model, a block is divided into triangular patches and affine transformation parameters are estimated for each patch.
However, with the method described in Japanese Patent Application Laid-Open No. 2007-312397, information on six affine transformation parameters is transmitted for every pixel block, so the overhead increases.
The method described in Kordasiewicz et al. estimates affine transformation parameters using the motion vectors of the eight adjacent blocks (above, below, left, right, and diagonal) of the pixel block to be predicted together with the motion vector calculated for the prediction target pixel block, so a frame must be re-encoded several times to obtain optimal motion vectors. On the other hand, if motion vectors are calculated only once per frame from the original image, this conventional method is not optimal in terms of code amount and coding distortion, and coding efficiency decreases. Furthermore, when an edge or a different object exists between an adjacent block and the prediction target block, an error arises in deriving the affine transformation parameters and prediction efficiency falls; and because the erroneous affine transformation parameters are reused in deriving subsequent affine transformation parameters, the error accumulates and propagates, degrading the prediction efficiency of subsequent affine transformation prediction.
An object of the present invention is to provide a prediction signal generation device, a moving picture encoding device, and a moving picture decoding device that reduce the motion detection processing necessary for estimating the geometric transformation parameters used for geometric transformation motion-compensated prediction and improve prediction efficiency without increasing the code amount.
According to a first aspect of the present invention, there is provided a prediction signal generation device comprising: a first setting unit that sets prediction selection information indicating whether to use a first geometric transformation parameter or a second geometric transformation parameter, each indicating information on the shape of an image under geometric transformation of a pixel block; an acquisition unit that acquires the motion information or geometric transformation parameters of one or more second adjacent blocks, among a plurality of first adjacent blocks adjacent to one pixel block of the plurality of pixel blocks into which the image signal is divided, for which prediction signal generation has already been completed; a derivation unit that derives a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; a second setting unit that derives and sets the first geometric transformation parameter by a predetermined method from an input derived value of the geometric transformation parameter and the predicted geometric transformation parameter; a third setting unit that sets the second geometric transformation parameter based on the motion information of the one pixel block and the one or more second adjacent blocks; and a generation unit that generates a prediction signal by performing geometric transformation processing, using the first or second geometric transformation parameter indicated by the selection information, on a reference image signal when performing motion compensation for the one pixel block.
A second aspect of the present invention provides a moving image encoding device that uses the above prediction signal generation device and comprises: a first encoding unit that encodes the prediction selection information indicating which of the first and second geometric transformation parameters is used; a second encoding unit that, when the first geometric transformation parameter is selected, derives a derived value of the geometric transformation parameter from the first geometric transformation parameter and the predicted geometric transformation parameter by a predetermined method and encodes the derived value; and a third encoding unit that encodes information indicating the difference signal between the input image signal and the prediction signal.
A third aspect of the present invention provides a moving image decoding device that parses moving image encoded data, produced by encoding an input image signal in units of a plurality of pixel blocks, and decodes it by a prescribed method, the device comprising: a motion information acquisition unit that acquires, from among a plurality of first adjacent blocks adjacent to one pixel block of the plurality of pixel blocks into which the input image signal is divided, the motion information, or the geometric transformation parameters indicating information on the shape of an image under a geometric transformation of a pixel block, of one or more second adjacent blocks for which decoding has already been completed; a first decoding unit that decodes selection information indicating which of the first and second geometric transformation parameters is used; a second decoding unit that decodes a derived value of the geometric transformation parameter when the first geometric transformation parameter is selected; a derivation unit that derives a predicted geometric transformation parameter for the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; a first setting unit that derives and sets the first geometric transformation parameter by a prescribed method from the decoded derived value of the geometric transformation parameter and the predicted geometric transformation parameter; a second setting unit that, when the second geometric transformation parameter is selected, sets the second geometric transformation parameter based on the motion information of the one pixel block and of the one or more second adjacent blocks; and a generation unit that generates a prediction signal by applying, to a reference image signal used when performing motion compensation for the one pixel block, a geometric transformation process using whichever of the first and second geometric transformation parameters is indicated by the selection information.
Block diagram of a moving image encoding device according to the first and second embodiments.
Block diagram of an inter prediction signal generation device according to the first and second embodiments.
Diagram showing the flow of the encoding process.
Diagram showing a 16×16 pixel block.
Block diagram of a derived parameter derivation unit.
Block diagram of a second geometric transformation parameter derivation unit.
Diagram showing the positional relationship between a pixel block to be encoded or decoded and its adjacent blocks.
Diagram showing the positions of adjacent blocks when the pixel block to be encoded or decoded is at the upper left of a macroblock.
Diagram showing the positions of adjacent blocks when the pixel block to be encoded or decoded is at the upper right of a macroblock.
Diagram showing the positions of adjacent blocks when the pixel block to be encoded or decoded is at the lower left of a macroblock.
Diagram showing the positions of adjacent blocks when the pixel block to be encoded or decoded is at the lower right of a macroblock.
Block diagram of a predicted geometric transformation parameter derivation unit.
Block diagram of a geometric transformation prediction unit.
Diagram showing an example of generating, by interpolation, the pixel values at fractional positions after a geometric transformation.
Diagram showing an example of prediction selection information indices.
Flowchart showing the flow of geometric transformation prediction processing in the inter prediction signal generation device of the first embodiment.
Diagram showing a syntax structure.
Diagram showing the information contained in the slice header syntax.
Diagram showing the information contained in the slice data syntax in the first to fourth embodiments.
Diagram showing the information contained in the macroblock layer syntax in the first and third embodiments.
Diagram showing the information contained in the macroblock prediction syntax in the first and third embodiments.
Diagram showing the information contained in the sub-macroblock prediction syntax in the first and third embodiments.
Diagram showing the information contained in the macroblock layer syntax in the second and fourth embodiments.
Diagram showing the information contained in the macroblock prediction syntax in the second and fourth embodiments.
Diagram showing the information contained in the sub-macroblock prediction syntax in the second and fourth embodiments.
Block diagram showing a moving image decoding device according to the third and fourth embodiments.
Hereinafter, the first to fourth embodiments will be described with reference to the drawings. The first and second embodiments relate to a moving image encoding device, and the third and fourth embodiments relate to a moving image decoding device.
The moving image encoding device described in the following embodiments divides each frame of an input image signal into a plurality of pixel blocks, compresses and encodes the divided pixel blocks, and outputs a code string.
[First Embodiment]
The configuration of a moving image encoding device 100 that uses geometric transformation prediction according to the first embodiment will be described with reference to FIG. 1.
The moving image encoding device 100 is connected to an encoding control unit 114. In the device 100, a subtractor 101 calculates the difference between an input image signal 115 and a prediction signal 206 and outputs a prediction error signal 116. The output of the subtractor 101 is connected to a transform/quantization unit 102. The transform/quantization unit 102 includes, for example, an orthogonal transformer (discrete cosine transformer) and a quantizer; it applies an orthogonal transform (discrete cosine transform) to the prediction error signal 116 and quantizes the result into transform coefficients 117. The output of the transform/quantization unit 102 is connected to an inverse quantization/inverse transform unit 103 and an entropy encoding unit 112.
The inverse quantization/inverse transform unit 103 has an inverse quantizer and an inverse orthogonal transformer (inverse discrete cosine transformer); it inversely quantizes the transform coefficients 117 and applies an inverse orthogonal transform to restore a decoded prediction error signal 118. Its output is connected to an adder 104. The adder 104 adds the decoded prediction error signal 118 and the prediction signal 206 to generate a decoded image signal 119. The output of the adder 104 is connected to a reference image memory 105, which stores the decoded image signal 119 as a reference image signal. The output of the reference image memory 105 is connected to a motion information search unit 106, an intra prediction signal generation device 107, and an inter prediction signal generation device 109.
The motion information search unit 106 uses the input image signal 115 and the reference image signal 207 to calculate motion information (a motion vector) 210 suitable for the prediction target block. Its output is connected to a derived parameter derivation unit 108 and the inter prediction signal generation device 109. As shown in FIG. 4, the derived parameter derivation unit 108 has a predicted geometric transformation parameter derivation unit 601 and a converter 602; it derives a predicted geometric transformation parameter, described later, and calculates a derived parameter from that predicted parameter and the input motion information 210. Its output is connected to the inter prediction signal generation device 109 and the entropy encoding unit 112.
As shown in FIG. 2, the inter prediction signal generation device 109 generates the prediction signal 206 using the input motion information 210, the derived parameter information 211, the reference image signal 207, and the prediction selection information 123. The intra prediction signal generation device 107, in contrast, performs intra prediction using the input reference image signal 207. The outputs of the intra prediction signal generation device 107 and the inter prediction signal generation device 109 are connected to the terminals of a prediction separation switch 110.
A prediction selection unit 111 sets prediction selection information 123 according to the prediction mode controlled by the encoding control unit 114. Its output is connected to the inter prediction signal generation device 109, the prediction separation switch 110, and the entropy encoding unit 112. The prediction separation switch 110 switches between the intra prediction signal generation device 107 and the inter prediction signal generation device 109 according to the prediction selection information 123 from the prediction selection unit 111. The switching terminal of the prediction separation switch 110 is connected to the subtractor 101 and the adder 104, so that the prediction signal of whichever device is selected is fed to the subtractor 101 and the adder 104.
The entropy encoding unit 112 includes an encoder and a multiplexer; it entropy-encodes and multiplexes the transform coefficients 117, the derived parameter information 211, and the prediction selection information 123. Its output is connected to an output buffer 113, which temporarily stores the multiplexed data and outputs it as encoded data 129 according to the output timing managed by the encoding control unit 114.
Based on the encoding parameters supplied from the encoding control unit 114, the moving image encoding device 100 configured as above performs intra prediction (intra-frame prediction) or inter prediction (inter-frame prediction) encoding of the input image signal 115, generates the prediction signal 206, and outputs the encoded data 129. That is, the input image signal 115 of a moving image or a still image is divided into pixel blocks, for example macroblocks, before being input to the device 100. The input image signal is one encoding processing unit, which may be either a frame or a field. In this embodiment, an example in which a frame is the encoding processing unit is described.
The moving image encoding device 100 performs encoding in a plurality of prediction modes that differ in block size and in the method of generating the prediction signal 206. The generation methods fall broadly into intra prediction (intra-frame prediction), which generates the prediction signal only from within the frame being encoded, and inter prediction, which predicts from a plurality of temporally different reference frames. This embodiment describes in detail an example in which the prediction signal is generated using inter prediction.
In the first to fourth embodiments, the macroblock is the basic processing block size of the encoding process. A macroblock is typically a 16×16 pixel block as shown in FIG. 3B, but 32×32 or 8×8 pixel blocks may also be used, and a macroblock need not be a square lattice. Hereinafter, the block or macroblock of the input image signal 115 to be encoded is simply called the "prediction target block".
In the first to fourth embodiments, for simplicity of explanation, encoding is assumed to proceed from the upper left to the lower right as shown in FIG. 3A. In FIG. 3A, in the frame f being encoded, the blocks located to the left of and above the block c to be encoded are the already-encoded blocks p.
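The division of a frame into macroblocks processed from the upper left to the lower right can be sketched as follows (the function name and the toy 32×32 frame are illustrative assumptions, not part of the patent):

```python
def split_into_blocks(frame, bsize=16):
    """Split a frame (2D list, height x width) into bsize x bsize pixel
    blocks, ordered from the upper left to the lower right."""
    h, w = len(frame), len(frame[0])
    blocks = []
    for by in range(0, h, bsize):          # rows of macroblocks, top to bottom
        for bx in range(0, w, bsize):      # within a row, left to right
            blocks.append([row[bx:bx + bsize] for row in frame[by:by + bsize]])
    return blocks

frame = [[x + y for x in range(32)] for y in range(32)]  # toy 32x32 frame
blocks = split_into_blocks(frame, 16)                    # four 16x16 blocks
```

With a 32×32 frame and 16×16 macroblocks, the list holds four blocks in raster order: upper left, upper right, lower left, lower right.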
Next, the flow of encoding in the moving image encoding device 100 will be described. First, the input image signal 115 is input to the subtractor 101. The subtractor 101 also receives the prediction signal 206, corresponding to the selected prediction mode, output from the prediction separation switch 110. The subtractor 101 calculates the prediction error signal 116 by subtracting the prediction signal 206 from the input image signal 115, and the prediction error signal 116 is input to the transform/quantization unit 102.
In the transform/quantization unit 102, the prediction error signal 116 undergoes an orthogonal transform, such as the discrete cosine transform (DCT), to generate transform coefficients. Besides the discrete cosine transform used in H.264, the transform may be a discrete sine transform, a wavelet transform, or a component analysis.
The transform/quantization unit 102 quantizes the transform coefficients according to quantization information, typified by a quantization parameter and a quantization matrix, given by the encoding control unit 114. It outputs the quantized transform coefficients 117 to the entropy encoding unit 112 and also to the inverse quantization/inverse transform unit 103.
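As an illustration of the transform and quantization steps, the following sketch applies an orthonormal 2D DCT-II to a small prediction-error block, quantizes the coefficients with a uniform step, and reconstructs the block. The 4×4 size, the flat quantization step, and all names are assumptions of this sketch; H.264 itself uses an integer transform with scaling matrices rather than a floating-point DCT:

```python
import math

def dct2(block):
    """Separable, orthonormal 2D DCT-II of a square block (list of lists)."""
    n = len(block)
    def c(k):  # orthonormal scaling factor
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out

def idct2(coeff):
    """Inverse of dct2 (2D DCT-III with the same scaling)."""
    n = len(coeff)
    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            s = 0.0
            for u in range(n):
                for v in range(n):
                    s += (c(u) * c(v) * coeff[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[x][y] = s
    return out

def quantize(coeff, qstep):
    """Uniform quantization: map each coefficient to an integer level."""
    return [[round(c / qstep) for c in row] for row in coeff]

def dequantize(levels, qstep):
    """Inverse quantization: scale the levels back."""
    return [[lv * qstep for lv in row] for row in levels]

# A small prediction-error block: transform, quantize, then reconstruct
# as the inverse quantization/inverse transform unit 103 would.
err = [[4, 2, 0, 0], [2, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
rec = idct2(dequantize(quantize(dct2(err), 2.0), 2.0))
```

Without quantization the transform round-trip is lossless up to floating-point error; the quantization step is where the coding loss is introduced.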
The entropy encoding unit 112 applies entropy encoding, for example Huffman coding or arithmetic coding, to the quantized transform coefficients 117. It also entropy-encodes the various encoding parameters used to encode the target block, including the prediction information output from the encoding control unit 114. The encoded data 129 is thereby generated.
The encoding parameters are the parameters needed for decoding, such as the prediction information, information on the transform coefficients, and information on quantization. The encoding parameters of the prediction target block are held in the internal memory of the encoding control unit 114 and are used when the prediction target block serves as an adjacent block of another pixel block.
The encoded data 129 generated and multiplexed by the entropy encoding unit 112 is output from the moving image encoding device 100 after being temporarily stored in the output buffer 113, according to the output timing managed by the encoding control unit 114. The encoded data 129 is sent, for example, to a storage system (storage medium) or a transmission system (communication line), not shown.
The inverse quantization/inverse transform unit 103 inversely quantizes the quantized transform coefficients 117 output from the transform/quantization unit 102. Here, the quantization information corresponding to that used in the transform/quantization unit 102 is loaded from the internal memory of the encoding control unit 114 for the inverse quantization. The quantization information comprises parameters typified by the quantization parameter and the quantization matrix.
The inverse quantization/inverse transform unit 103 then applies an inverse orthogonal transform, such as the inverse discrete cosine transform (IDCT), to the inversely quantized transform coefficients, thereby reproducing the decoded prediction error signal 118.
The decoded prediction error signal 118 is input to the adder 104, where it is added to the prediction signal 206 output from the prediction separation switch 110 to generate the decoded image signal 119. The decoded image signal 119 is a locally decoded image signal and is stored in the reference image memory 105 as the reference image signal 207. The reference image signal 207 stored in the reference image memory 105 is output to the motion information search unit 106, the intra prediction signal generation device 107, the inter prediction signal generation device 109, and so on, and is referred to during prediction.
The motion information search unit 106 uses the input image signal 115 and the reference image signal 207 to calculate motion information 210 suitable for the prediction target block. The motion information may be represented, for example, by affine transformation parameters, or alternatively by a motion vector. The motion information 210 may also be a predicted value used when predicting affine transformation parameters from other affine transformation parameters, or when predicting a motion vector from other motion vectors. Here, motion information that includes geometric deformation between images is used as the motion information.
The motion information search unit 106 calculates the motion information 210 (affine transformation parameters or a motion vector) by performing a search, such as block matching, between the prediction target block of the input image signal 115 and an interpolated image of the reference image signal 207. As a matching criterion, for example, the per-pixel accumulated difference between the input image signal 115 and the matched interpolated image may be used, or that value plus the difference between the calculated affine transformation parameters and the center of the search.
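The block-matching search with the accumulated absolute difference (SAD) criterion described above can be sketched as a minimal full search over integer-pixel displacements. The function names, the toy frames, and the search range are assumptions of this sketch; the patent's search may also use interpolated (fractional-pixel) positions and hierarchical refinement:

```python
def sad(cur, ref, bx, by, dx, dy, bsize):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(bsize):
        for x in range(bsize):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def full_search(cur, ref, bx, by, bsize, srange):
    """Return the displacement (dx, dy) minimizing SAD within +/- srange."""
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, bsize)
    for dy in range(-srange, srange + 1):
        for dx in range(-srange, srange + 1):
            # skip displacements that leave the reference frame
            if not (0 <= by + dy and by + dy + bsize <= len(ref)
                    and 0 <= bx + dx and bx + dx + bsize <= len(ref[0])):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, bsize)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best, best_cost

# Reference frame with a bright 2x2 patch; in the current frame the
# patch has moved down-right by one pixel.
ref = [[0] * 8 for _ in range(8)]
ref[2][2] = ref[2][3] = ref[3][2] = ref[3][3] = 100
cur = [[0] * 8 for _ in range(8)]
cur[3][3] = cur[3][4] = cur[4][3] = cur[4][4] = 100

mv, cost = full_search(cur, ref, 2, 2, 4, 2)  # block at (2, 2), size 4
```

The search finds the displacement (-1, -1) into the reference that aligns the patch exactly, giving a SAD of zero.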
Besides the method described above, the motion information 210 may be determined using a value obtained by transforming the difference between the predicted image and the original image, or by additionally taking into account the magnitude of the motion vector or affine transformation parameters, or their code amount. Costs such as equations (1) and (2), described later, may also be used. The matching may be an exhaustive search of a range based on search range information provided from outside the moving image encoding device 100, or may be performed hierarchically for each pixel precision.
The motion information 210 thus calculated for a plurality of reference image signals is input to the inter prediction signal generation device 109 and used to generate the prediction signal 206. The plurality of reference image signals are locally decoded images with different display times.
The calculated motion information 210 is output to the derived parameter derivation unit 108. As shown in FIG. 4, the derived parameter derivation unit 108 has the predicted geometric transformation parameter derivation unit 601 and the converter 602; it derives the predicted geometric transformation parameter, described later, and calculates the derived parameter from that predicted parameter and the input motion information 210. The converter 602 may be, for example, a subtractor; besides a subtractor, it may be an adder, a multiplier, a divider, a converter that transforms using a predetermined matrix, or a converter realizing a combination of these. In the following, the converter 602 is treated as a subtractor.
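Treating the converter 602 as a subtractor, the derivation can be pictured as encoding only the residual between the parameters found by the motion search and the parameters predicted from already-processed adjacent blocks. In this sketch the 6-component parameter vectors and the component-wise median prediction are illustrative assumptions, not the patent's prescribed method:

```python
def predict_params(neighbors):
    """Component-wise median of the adjacent blocks' parameter vectors."""
    def median(vals):
        s = sorted(vals)
        return s[len(s) // 2]
    return [median([nb[i] for nb in neighbors]) for i in range(len(neighbors[0]))]

def derive(estimated, predicted):
    """Converter 602 as a subtractor: the residual to be entropy-coded."""
    return [e - p for e, p in zip(estimated, predicted)]

def reconstruct(residual, predicted):
    """Decoder side: add the residual back onto the prediction."""
    return [r + p for r, p in zip(residual, predicted)]

# Three already-processed adjacent blocks, each with an affine-like
# 6-component parameter vector (illustrative values).
neighbors = [[1, 0, 0, 1, 4, 2], [1, 0, 0, 1, 6, 2], [1, 0, 0, 1, 5, 3]]
pred = predict_params(neighbors)        # predicted geometric parameters
estimated = [1, 0, 0, 1, 5, 3]          # parameters from the motion search
residual = derive(estimated, pred)      # only this residual is encoded
assert reconstruct(residual, pred) == estimated
```

Because the adjacent blocks move similarly, the residual is mostly zeros, which is why sending the derived value rather than the raw parameters saves code amount.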
The derived parameter information 211 produced by the derived parameter derivation unit 108 is output to the entropy encoding unit 112 and, after entropy encoding, is multiplexed into the encoded data. Furthermore, the motion information 210 used to encode the target pixel block is stored in the internal memory of the encoding control unit 114 and is loaded and used by the inter prediction signal generation device 109 as needed.
The reference image signal 207 stored in the reference image memory 105 is output to the intra prediction signal generation device 107, which performs intra prediction using it. In H.264, for example, a prediction signal is generated by padding pixels along a prediction direction, such as the vertical or horizontal direction, using already-encoded reference pixel values adjacent to the prediction target block. Alternatively, pixel values may first be interpolated by a predetermined interpolation method and the interpolated values then copied along a predetermined prediction direction. The generated prediction signal 206 is output to the prediction separation switch 110.
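The directional pixel padding just described can be sketched as follows, with vertical prediction copying the row of decoded pixels above the block down every row, and horizontal prediction copying each decoded pixel on the left across its row. This is a simplified illustration, not the full H.264 intra mode set:

```python
def intra_vertical(top, n):
    """Copy the n reference pixels above the block down every row."""
    return [list(top) for _ in range(n)]

def intra_horizontal(left, n):
    """Copy each of the n reference pixels on the left across its row."""
    return [[left[y]] * n for y in range(n)]

top = [10, 20, 30, 40]    # decoded pixels just above the 4x4 block
left = [15, 25, 35, 45]   # decoded pixels just to the left of the block
v = intra_vertical(top, 4)
h = intra_horizontal(left, 4)
```

Each mode fills the 4×4 prediction block from the adjacent already-encoded pixels only, which is what lets the decoder reproduce the same prediction without extra side information.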
The inter prediction signal generation device 109 generates the prediction signal 206 using the input motion information 210, the derived parameter information 211, the reference image signal 207, and the prediction selection information 123. The generated prediction signal 206 is output to the prediction separation switch 110.
The prediction separation switch 110 selects between the output of the intra prediction signal generation device 107 and that of the inter prediction signal generation device 109 according to the prediction selection information 123. When the prediction selection information 123 indicates intra prediction, the switch connects to the intra prediction signal generation device 107; when it indicates inter prediction, the switch connects to the inter prediction signal generation device 109. Examples of the prediction selection information 123 are shown in FIG. 10, described later.
The prediction selection unit 111 sets the prediction selection information 123 according to the prediction mode controlled by the encoding control unit 114. Intra prediction or inter prediction can be selected as the prediction mode, and each may have a plurality of modes. Which of these modes is selected is controlled by the encoding control unit 114. For example, the prediction signal 206 may be generated for all prediction modes and one mode selected among them, or the candidate prediction modes may be limited according to the characteristics of the input image.
 より具体的に説明すると、次式のようなコストを用いて予測選択情報123を決定する。予測モードを選択した際に必要となる予測情報に関する符号量(例えば導出パラメータ211の符号量や予測ブロックサイズの符号量など)をOH、入力画像信号115と予測信号206の差分絶対和(予測誤差信号116の絶対累積和を意味する)をSADとすると、以下の判定式を用いる。
 K = SAD + λ × OH   …(1)
More specifically, the prediction selection information 123 is determined using a cost such as the following equation. Let OH be the code amount of the prediction information required when a prediction mode is selected (for example, the code amount of the derivation parameter 211 and the code amount of the prediction block size), and let SAD be the sum of absolute differences between the input image signal 115 and the prediction signal 206 (that is, the absolute cumulative sum of the prediction error signal 116); the following determination formula is then used.
 K = SAD + λ × OH   …(1)
 ここでKはコスト、λは定数をそれぞれ表す。λは量子化スケールや量子化パラメータの値に基づいて決められるラグランジュ未定乗数である。本判定式では、コストKが最も小さい値を与えるモードが最適な予測モードとして選択される。 Where K is the cost and λ is a constant. λ is a Lagrangian undetermined multiplier determined based on the quantization scale and the value of the quantization parameter. In this determination formula, the mode giving the value with the smallest cost K is selected as the optimum prediction mode.
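 The mode decision described by equation (1) can be sketched as follows (all numeric values and mode names are hypothetical; an actual encoder would compute SAD from the prediction error signal 116 and OH from the side-information code amount):

```python
# Sketch of the mode decision in equation (1): K = SAD + lambda * OH.
# SAD is the sum of absolute differences between the input image signal
# and the prediction signal; OH is the side-information code amount of
# the candidate mode; lam is the Lagrange multiplier determined from the
# quantization parameter. All values below are hypothetical.

def mode_cost(sad, overhead_bits, lam):
    """Cost K of equation (1)."""
    return sad + lam * overhead_bits

def select_mode(candidates, lam):
    """Return the candidate mode whose cost K is smallest."""
    return min(candidates, key=lambda m: mode_cost(m["sad"], m["oh"], lam))

candidates = [
    {"name": "intra_16x16", "sad": 1800, "oh": 12},
    {"name": "inter_16x16", "sad": 1500, "oh": 40},
    {"name": "inter_8x8",   "sad": 1400, "oh": 96},
]
best = select_mode(candidates, lam=10.0)
# With lam = 10.0 the costs are 1920, 1900 and 2360, so inter_16x16 wins:
# a mode with a slightly larger SAD can still be chosen when its overhead
# is smaller, which is exactly the trade-off equation (1) encodes.
```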
 式(1)に代えて(a)予測情報のみ、(b)SADのみ、を用いて予測選択情報123の決定を行ってもよいし、これら(a)、(b)にアダマール変換を施した値、又はそれに近似した値を利用してもよい。さらに別の例として、仮符号化ユニットを用意し、仮符号化ユニットによりある予測モードで生成された予測誤差信号116を実際に符号化した場合の符号量と、入力画像信号115と復号画像信号119との間の二乗誤差を用いて予測選択情報123を決定してもよい。この場合の判定式は、以下のようになる。
 J = D + λ × R   …(2)
The prediction selection information 123 may be determined using (a) only the prediction information or (b) only the SAD instead of equation (1), or using values obtained by applying a Hadamard transform to (a) or (b), or approximations thereof. As yet another example, a provisional encoding unit may be prepared, and the prediction selection information 123 may be determined using the code amount obtained when the prediction error signal 116 generated in a given prediction mode is actually encoded by the provisional encoding unit, together with the squared error between the input image signal 115 and the decoded image signal 119. The determination formula in this case is as follows.
 J = D + λ × R   …(2)
ここで、Jは符号化コスト、Dは入力画像信号115と復号画像信号119との間の二乗誤差を表す符号化歪みである。一方、Rは仮符号化によって見積もられた符号量を表している。 Here, J is the encoding cost, and D is the encoding distortion, representing the squared error between the input image signal 115 and the decoded image signal 119. R represents the code amount estimated by the provisional encoding.
 式(2)の符号化コストJを用いると、予測モード毎に仮符号化と局部復号処理が必要となるため、回路規模又は演算量は増大する。しかしながら、より正確な符号量と符号化歪みを用いるため、高い符号化効率を維持することができる。式(2)に代えてRのみ、又はDのみを用いてコストを算出してもよいし、R又はDを近似した値を用いてコスト関数を作成してもよい。 When the encoding cost J of Equation (2) is used, provisional encoding and local decoding processing are required for each prediction mode, so that the circuit scale or calculation amount increases. However, since a more accurate code amount and encoding distortion are used, high encoding efficiency can be maintained. The cost may be calculated using only R or only D instead of the expression (2), or the cost function may be created using a value approximating R or D.
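 The rate-distortion decision of equation (2) can be sketched in the same way (the distortion and rate figures below are hypothetical stand-ins for the results of the provisional encoding and local decoding described above):

```python
# Sketch of the rate-distortion decision in equation (2): J = D + lambda * R.
# D is the squared error between the input image signal and the locally
# decoded image signal, and R is the code amount measured by actually
# (provisionally) encoding the candidate mode. The numbers are hypothetical.

def rd_cost(distortion, rate_bits, lam):
    """Cost J of equation (2)."""
    return distortion + lam * rate_bits

trials = {
    "geom_pred":  {"D": 5200.0, "R": 310},   # hypothetical trial-encoding results
    "trans_pred": {"D": 6100.0, "R": 220},
}
lam = 4.0
best_mode = min(trials, key=lambda m: rd_cost(trials[m]["D"], trials[m]["R"], lam))
# geom_pred: 5200 + 4*310 = 6440; trans_pred: 6100 + 4*220 = 6980.
```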
 次に、図2を参照してインター予測信号生成装置109を説明する。 Next, the inter prediction signal generation device 109 will be described with reference to FIG.
 インター予測信号生成装置109は、予測分離スイッチ201、幾何変換予測部202、第二幾何変換パラメータ導出部203、第一幾何変換パラメータ導出部204、予測幾何変換パラメータ導出部205を有する。 The inter prediction signal generation device 109 includes a prediction separation switch 201, a geometric transformation prediction unit 202, a second geometric transformation parameter derivation unit 203, a first geometric transformation parameter derivation unit 204, and a predicted geometric transformation parameter derivation unit 205.
 先ず、第二幾何変換パラメータ導出部203の処理について具体的に説明する。第二幾何変換パラメータ導出部203では、動き情報探索部106から出力された予測対象ブロックの動き情報210と符号化制御部114に保存されている符号化済みのブロックの動き情報を用いて、予測対象ブロックの第二幾何変換パラメータ208を導出する。符号化制御部114に保存されている動き情報210は、隣接ブロックの動き情報210に含まれる動きベクトルであり、以下、「隣接動きベクトル」という。 First, the process of the second geometric transformation parameter derivation unit 203 will be specifically described. The second geometric transformation parameter derivation unit 203 uses the motion information 210 of the prediction target block output from the motion information search unit 106 and the motion information of the encoded block stored in the encoding control unit 114 to perform prediction. The second geometric transformation parameter 208 of the target block is derived. The motion information 210 stored in the encoding control unit 114 is a motion vector included in the motion information 210 of the adjacent block, and is hereinafter referred to as “adjacent motion vector”.
 図5に示すように、第二幾何変換パラメータ導出部203は、動き情報取得部501と第二パラメータ導出部502とを有する。動き情報取得部501は、複数の隣接ブロックのうち、動き情報を取得する隣接ブロックを決定し、その隣接ブロックの動き情報、例えば、動きベクトルを取得する。第二パラメータ導出部502は、隣接ブロックの動きベクトルから、第二幾何変換パラメータを導出する。 As shown in FIG. 5, the second geometric transformation parameter derivation unit 203 includes a motion information acquisition unit 501 and a second parameter derivation unit 502. The motion information acquisition unit 501 determines an adjacent block from which motion information is acquired from among a plurality of adjacent blocks, and acquires motion information of the adjacent block, for example, a motion vector. The second parameter derivation unit 502 derives a second geometric transformation parameter from the motion vector of the adjacent block.
 以下、図6を用いて、動き情報取得部501による隣接動きベクトルを導出する処理について説明する。 Hereinafter, the process of deriving the adjacent motion vector by the motion information acquisition unit 501 will be described with reference to FIG.
 ≪隣接ブロックと隣接動きベクトルの導出≫
 図6A乃至図6Eは、予測対象ブロックに対する隣接ブロックの関係を示す図である。図6Aでは、予測対象ブロックと隣接ブロックのサイズ(例えば16×16画素ブロック)が一致する場合の例を示す。
≪Derivation of adjacent block and adjacent motion vector≫
FIGS. 6A to 6E are diagrams illustrating the relationship of adjacent blocks to a prediction target block. FIG. 6A shows an example in which the prediction target block and the adjacent blocks have the same size (for example, 16 × 16 pixel blocks).
 図6A中、斜線のハッチングが付された画素ブロックpは既に符号化又は予測が完了している画素ブロック(以下「予測済画素ブロック」という。)である。ドットのハッチングが付されたブロックcは予測対象ブロックを示しており、白で表示されている画素ブロックnは未符号化画素(未予測)ブロックである。図中Xは符号化(予測)対象画素ブロックを表している。 In FIG. 6A, a hatched pixel block p is a pixel block that has already been encoded or predicted (hereinafter referred to as “predicted pixel block”). A block c with dot hatching indicates a prediction target block, and a pixel block n displayed in white is an uncoded pixel (unpredicted) block. In the figure, X represents an encoding (prediction) target pixel block.
 隣接ブロックAは、予測対象ブロックXの左の隣接ブロック、隣接ブロックBは、予測対象ブロックXの上の隣接ブロック、隣接ブロックCは、予測対象ブロックXの右上の隣接ブロック、隣接ブロックDは、予測対象ブロックXの左上の隣接ブロックである。 The adjacent block A is the adjacent block on the left of the prediction target block X, the adjacent block B is the adjacent block on the prediction target block X, the adjacent block C is the adjacent block on the upper right of the prediction target block X, and the adjacent block D is This is an adjacent block at the upper left of the prediction target block X.
 符号化制御部114の内部メモリに保持されている隣接動きベクトルは、予測済画素ブロックの動きベクトルのみである。図3Aで示したように画素ブロックは左上から右下に向かって符号化及び予測の処理がされていくため、画素ブロックXの予測を行う際には、右及び下方向の画素ブロックは未だ符号化が行われていない。そのため、これらの隣接ブロックから隣接動きベクトルを導出することができない。 The adjacent motion vectors held in the internal memory of the encoding control unit 114 are only the motion vectors of predicted pixel blocks. As shown in FIG. 3A, pixel blocks are encoded and predicted from the upper left toward the lower right, so when the pixel block X is predicted, the pixel blocks to its right and below have not yet been encoded. Therefore, adjacent motion vectors cannot be derived from those blocks.
 図6B乃至図6Eは、予測対象ブロックが8×8画素ブロックの場合の、隣接ブロックの例を示す図である。図6B乃至図6Eにおいて、太線はマクロブロックの境界を表す。図6Bは、マクロブロック内の左上に位置する画素ブロック、図6Cは、マクロブロック内の右上に位置する画素ブロック、図6Dは、マクロブロック内の左下に位置する画素ブロック、図6Eは、マクロブロック内の右下に位置する画素ブロックを、それぞれ、予測対象ブロックとする例を示す。 FIGS. 6B to 6E are diagrams illustrating examples of adjacent blocks when the prediction target block is an 8 × 8 pixel block. In FIGS. 6B to 6E, the bold lines represent macroblock boundaries. FIGS. 6B, 6C, 6D, and 6E show examples in which the prediction target block is the pixel block located at the upper left, the upper right, the lower left, and the lower right of the macroblock, respectively.
 マクロブロックの内部も同様に左上から右下に向かって符号化処理が行われるため、8×8画素ブロックの符号化順序に応じて隣接ブロックの位置が変化する。対応する8×8画素ブロックの符号化処理又は予測信号生成処理が完了すると、その画素ブロックは符号化済み画素ブロックとなり、後に処理される画素ブロックの隣接ブロックとして利用される。図6Eでは、隣接ブロックCに対応する右上の画素ブロックが未符号化画素ブロックであるため、符号化済み画素ブロックの右上に位置する画素ブロックを隣接ブロックとする。 Since the inside of the macro block is similarly encoded from the upper left to the lower right, the position of the adjacent block changes according to the encoding order of the 8 × 8 pixel block. When the encoding process or the prediction signal generation process of the corresponding 8 × 8 pixel block is completed, the pixel block becomes an encoded pixel block and is used as an adjacent block of the pixel block to be processed later. In FIG. 6E, since the upper right pixel block corresponding to the adjacent block C is an unencoded pixel block, the pixel block located at the upper right of the encoded pixel block is set as an adjacent block.
 なお、図6B乃至図6Eで説明したとおり、予測対象ブロックXに対するユークリッド距離が最も近い隣接ブロックをそれぞれ隣接ブロックA、B、C、Dとする。例えば、図6Eでは、右上に位置する隣接ブロックが未符号化ブロックであるため、右上の隣接ブロックでもっとも近いブロックが隣接ブロックCとなる。また、ブロックサイズの大きさが異なる画素ブロックが混在している場合にも、同様に予測対象ブロックXに対するユークリッド距離が最も近いブロックを隣接ブロックとする。 Note that, as described with reference to FIGS. 6B to 6E, adjacent blocks having the closest Euclidean distance to the prediction target block X are referred to as adjacent blocks A, B, C, and D, respectively. For example, in FIG. 6E, since the adjacent block located at the upper right is an uncoded block, the nearest block among the upper right adjacent blocks is the adjacent block C. Further, even when pixel blocks having different block sizes are mixed, a block having the shortest Euclidean distance with respect to the prediction target block X is set as an adjacent block.
 以上の説明では、ブロックが16×16画素及び8×8画素の場合を例に挙げて説明したが、同様の枠組みを用いて32×32画素、4×4画素などの正方画素ブロックや16×8画素、8×16画素などの矩形画素ブロックに対しても隣接ブロックを決定してよい。 In the above description, blocks of 16 × 16 pixels and 8 × 8 pixels were taken as examples, but adjacent blocks may be determined in the same framework for square pixel blocks such as 32 × 32 or 4 × 4 pixels and for rectangular pixel blocks such as 16 × 8 or 8 × 16 pixels.
 なお、隣接ブロックとしてA,B,C,Dの4つの画素ブロックを用いる他に、隣接ブロックを更に広く定義してもかまわない。例えば、隣接ブロックAの更に左の画素ブロックを用いてもよいし、隣接ブロックBの更に上の画素ブロックを用いても良い。 In addition to using the four pixel blocks A, B, C, and D as adjacent blocks, the adjacent blocks may be defined more widely. For example, the pixel block further to the left of the adjacent block A may be used, or the pixel block further above the adjacent block B may be used.
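 The neighbor selection described above (blocks A, B, C, D, with a fallback to the nearest already-encoded block when the nominal position is not yet encoded, as for block C in Fig. 6E) can be sketched as follows. The grid, positions, and the exact fallback scan are illustrative assumptions, not the patent's normative procedure:

```python
# Sketch of determining the adjacent blocks A (left), B (above), C (upper
# right) and D (upper left) of a prediction target block. When the nominal
# neighbor has not been encoded yet, fall back to the encoded block nearest
# to the target in Euclidean distance. Positions are (x, y) block indices
# on a hypothetical grid; `encoded` lists blocks already processed in
# raster order (left-to-right, top-to-bottom).

import math

def pick_neighbor(target, offset, encoded):
    """Return the nominal neighbor if already encoded, else the encoded
    block nearest to the target in Euclidean distance."""
    nominal = (target[0] + offset[0], target[1] + offset[1])
    if nominal in encoded:
        return nominal
    return min(encoded, key=lambda blk: math.dist(blk, target)) if encoded else None

encoded = [(0, 0), (1, 0), (2, 0), (0, 1)]
target = (1, 1)
A = pick_neighbor(target, (-1, 0), encoded)    # left
B = pick_neighbor(target, (0, -1), encoded)    # above
C = pick_neighbor(target, (1, -1), encoded)    # upper right
D = pick_neighbor(target, (-1, -1), encoded)   # upper left
# For a target whose nominal upper-right neighbor is not yet encoded,
# the fallback picks the nearest encoded block instead:
C2 = pick_neighbor((3, 1), (1, -1), encoded)
```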
 ≪第二幾何変換パラメータの導出≫
 次に、第二パラメータ導出部502における第二幾何変換パラメータ208の導出方法について説明する。第二幾何変換パラメータ208は、第二パラメータ導出部502により導出される。隣接ブロックが保持する隣接動きベクトルをそれぞれ式(3)乃至(6)により定義する。
 mv_A = (mv_Ax, mv_Ay)^T   …(3)
 mv_B = (mv_Bx, mv_By)^T   …(4)
 mv_C = (mv_Cx, mv_Cy)^T   …(5)
 mv_D = (mv_Dx, mv_Dy)^T   …(6)
≪Derivation of second geometric transformation parameter≫
Next, a method for deriving the second geometric transformation parameter 208 in the second parameter derivation unit 502 will be described. The second geometric transformation parameter 208 is derived by the second parameter derivation unit 502. The adjacent motion vectors held by the adjacent blocks are defined by equations (3) to (6), respectively.
 mv_A = (mv_Ax, mv_Ay)^T   …(3)
 mv_B = (mv_Bx, mv_By)^T   …(4)
 mv_C = (mv_Cx, mv_Cy)^T   …(5)
 mv_D = (mv_Dx, mv_Dy)^T   …(6)
 また、動き情報探索部106から提供される動き情報210を式(7)により定義する。なお、動き情報210は、予測対象ブロックXの動きベクトルを示す。
 mv_X = (mv_Xx, mv_Xy)^T   …(7)
Also, the motion information 210 provided from the motion information search unit 106 is defined by equation (7). Note that the motion information 210 indicates a motion vector of the prediction target block X.
 mv_X = (mv_Xx, mv_Xy)^T   …(7)
 式(3)乃至(7)で表される動きベクトル及び隣接動きベクトルを用いて、第二幾何変換パラメータ208を導出する。幾何変換としてアフィン変換を利用する場合には、変換式は次式(8)で表される。
 u = a·x + b·y + c,  v = d·x + e·y + f   …(8)
The second geometric transformation parameter 208 is derived using the motion vector and the adjacent motion vector represented by the equations (3) to (7). When affine transformation is used as geometric transformation, the transformation formula is expressed by the following formula (8).
 u = a·x + b·y + c,  v = d·x + e·y + f   …(8)
ここで、パラメータ(c,f)は動きベクトルに対応しており、パラメータ(a,b,d,e)は幾何変形に伴うパラメータを指している。x,yが符号化対象ブロックの座標を示し、u,vは参照画像上の座標を示している。仮にパラメータ(a,b,d,e)が(1,0,0,1)である場合、平行移動モデルの動き補償(後述する式(19))と同一であることを意味する。 Here, the parameters (c, f) correspond to the motion vector, and the parameters (a, b, d, e) are the parameters associated with the geometric deformation. x and y indicate the coordinates in the encoding target block, and u and v indicate the coordinates in the reference image. If the parameters (a, b, d, e) are (1, 0, 0, 1), this means that the prediction is identical to motion compensation with the translation model (equation (19) described later).
 なお、幾何変換として、ここではアフィン変換を用いた例を示したが、幾何変換に対応する、共一次変換、へルマート変換、二次等角変換、射影変換、及び3次元射影変換などの幾何変換を用いても良い。この場合、利用する幾何変換によって、必要となるパラメータ数が変動するが、パラメータを符号化するときの符号量と対応する幾何変換のパターンによって、適用する画像の性質に合わせて、好適な幾何変換を選択すれば良い。以降、本実施形態ではアフィン変換を用いた例を説明する。 Although an example using an affine transformation is shown here as the geometric transformation, other geometric transformations such as a bilinear transformation, a Helmert transformation, a second-order conformal transformation, a projective transformation, or a three-dimensional projective transformation may be used. In that case, the number of required parameters varies with the geometric transformation used, and a suitable geometric transformation may be selected according to the nature of the image to which it is applied, in view of the code amount needed to encode the parameters and the corresponding transformation pattern. Hereinafter, this embodiment will be described using an example of the affine transformation.
 式(8)では、座標(x、y)がアフィン変換によって座標(u,v)へ変換される。式(8)に含まれるa、b、c、d、e、fの6個のパラメータが幾何変換パラメータを表している。アフィン変換ではこの6種類のパラメータを隣接ベクトルから導出するため、6個以上の入力値が必要となる。隣接ブロックA、B及び予測対象ブロックXのそれぞれの動きベクトルを用いると、次式(9)により幾何変換パラメータが導出される。ここでは、動きベクトルが1/4精度であることを前提とし、(a,b,d,e)のパラメータの精度を1/64としている。
Figure JPOXMLDOC01-appb-M000009
In equation (8), coordinates (x, y) are converted to coordinates (u, v) by affine transformation. Six parameters a, b, c, d, e, and f included in Expression (8) represent geometric transformation parameters. In affine transformation, since these six types of parameters are derived from adjacent vectors, six or more input values are required. When the motion vectors of the adjacent blocks A and B and the prediction target block X are used, a geometric transformation parameter is derived by the following equation (9). Here, it is assumed that the motion vector is ¼ precision, and the precision of the parameters (a, b, d, e) is 1/64.
Figure JPOXMLDOC01-appb-M000009
但し、ax、ayは予測対象ブロックのサイズに基づく変数であり、次式(10)で算出される。
Figure JPOXMLDOC01-appb-M000010
However, ax and ay are variables based on the size of the prediction target block, and are calculated by the following equation (10).
Figure JPOXMLDOC01-appb-M000010
ここで、mb_size_x及びmb_size_yはマクロブロックの水平、垂直方向のサイズを示しており、16×16画素ブロックとすると、mb_size_x=16、mb_size_y=16となる。また、blk_size_x及びblk_size_yは予測対象画素ブロックの水平、垂直サイズを表しており、図6Bの場合は、blk_size_x=8、blk_size_y=8となる。 Here, mb_size_x and mb_size_y indicate the horizontal and vertical sizes of the macroblock. When a 16 × 16 pixel block is used, mb_size_x = 16 and mb_size_y = 16. Moreover, blk_size_x and blk_size_y represent the horizontal and vertical sizes of the prediction target pixel block. In the case of FIG. 6B, blk_size_x = 8 and blk_size_y = 8.
 ここでは、入力値として隣接画素ブロックA及びBの動きベクトルを用いて、幾何変換パラメータを導出する例を示したが、必ずしも、隣接画素ブロックA及びBの動きベクトルを用いる必要はなく、隣接画素ブロックC、D及びそれ以外の隣接画素ブロックから算出された動きベクトルを用いても良いし、これらの複数の隣接画素ブロックの動きベクトルからパラメータフィッティングを用いて、幾何変換パラメータを求めても良い。また、式(8)ではそれぞれa,b,d,eが実数で得られる例を示しているが、予めこれらのパラメータの演算精度を決めておくことで式(9)のように簡単に整数化が可能である。 Here, an example has been shown in which the geometric transformation parameters are derived using the motion vectors of the adjacent pixel blocks A and B as input values, but it is not always necessary to use the motion vectors of the adjacent pixel blocks A and B; motion vectors calculated from the adjacent pixel blocks C and D or from other adjacent pixel blocks may be used, or the geometric transformation parameters may be obtained from the motion vectors of a plurality of adjacent pixel blocks by parameter fitting. Although equation (8) gives a, b, d, and e as real numbers, they can easily be converted to integers, as in equation (9), by fixing the computation precision of these parameters in advance.
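 Since the equation images for (9) and (10) are not reproduced here, the following is only an illustration of the general idea of this step: determining the six affine parameters of equation (8) from motion vectors anchored at three known positions (here the prediction target block X and two adjacent blocks). The anchor positions, the real-valued arithmetic, and the function name are assumptions, not the patent's integerized formula:

```python
# Hypothetical sketch: fit the affine parameters (a, b, c, d, e, f) of
# equation (8) from three motion vectors. Each motion vector at anchor
# position (x, y) gives one point correspondence (x, y) -> (x+mvx, y+mvy),
# so three of them determine the 6 unknowns exactly. The two 3x3 linear
# systems (one for the u-row, one for the v-row) are solved by Cramer's rule.

def affine_from_three_mvs(points, mvs):
    """points: three (x, y) anchor positions; mvs: their (mvx, mvy) vectors.
    Returns (a, b, c, d, e, f) so that each (x, y) maps to (x+mvx, y+mvy)."""
    (x0, y0), (x1, y1), (x2, y2) = points
    # Determinant of [[x0, y0, 1], [x1, y1, 1], [x2, y2, 1]].
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)

    def solve_row(t0, t1, t2):
        # Cramer's rule for [x y 1] . (p, q, r)^T = t.
        p = (t0 * (y1 - y2) - y0 * (t1 - t2) + (t1 * y2 - t2 * y1)) / det
        q = (x0 * (t1 - t2) - t0 * (x1 - x2) + (x1 * t2 - x2 * t1)) / det
        r = (x0 * (y1 * t2 - y2 * t1) - y0 * (x1 * t2 - x2 * t1)
             + t0 * (x1 * y2 - x2 * y1)) / det
        return p, q, r

    us = [p[0] + m[0] for p, m in zip(points, mvs)]
    vs = [p[1] + m[1] for p, m in zip(points, mvs)]
    a, b, c = solve_row(*us)
    d, e, f = solve_row(*vs)
    return a, b, c, d, e, f

# Identical motion vectors at all three anchors must yield pure translation,
# i.e. (a, b, d, e) = (1, 0, 0, 1) with (c, f) equal to the motion vector.
anchors = [(0, 0), (16, 0), (0, 16)]   # hypothetical anchor positions
mvs = [(4, 2), (4, 2), (4, 2)]
params = affine_from_three_mvs(anchors, mvs)
```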
 ≪第一幾何変換パラメータの導出≫
 次に予測幾何変換パラメータ導出部205について説明する。
≪Derivation of first geometric transformation parameter≫
Next, the predicted geometric transformation parameter derivation unit 205 will be described.
 予測幾何変換パラメータ導出部205は、図7に示すように動き情報取得部701と予測幾何変換パラメータ算出部702とを有する。動き情報取得部701は、第二幾何変換パラメータ導出部203が有する動き情報取得部501と同様の手順で隣接ブロックを決定する。但し、隣接ブロックから算出される動き情報は幾何変換パラメータである。 The predicted geometric transformation parameter derivation unit 205 includes a motion information acquisition unit 701 and a predicted geometric transformation parameter calculation unit 702, as shown in FIG. 7. The motion information acquisition unit 701 determines the adjacent blocks in the same procedure as the motion information acquisition unit 501 of the second geometric transformation parameter derivation unit 203. However, the motion information obtained from the adjacent blocks here consists of geometric transformation parameters.
 動き情報取得部701では、符号化制御部114に保存されている符号化済みブロックの動き情報210を用いて、予測対象ブロックの予測幾何変換パラメータ212を導出する。符号化制御部114に保存されている動き情報210は、隣接符号化済みブロックの動き情報210であり、以下、「隣接幾何変換パラメータ」という。図6Aを用いて予測幾何変換パラメータ212の導出方法を説明する。隣接ブロックが保持する隣接幾何変換パラメータをそれぞれ式(11)乃至(14)により定義する。
 ap_A = (a_A, b_A, c_A, d_A, e_A, f_A)   …(11)
 ap_B = (a_B, b_B, c_B, d_B, e_B, f_B)   …(12)
 ap_C = (a_C, b_C, c_C, d_C, e_C, f_C)   …(13)
 ap_D = (a_D, b_D, c_D, d_D, e_D, f_D)   …(14)
The motion information acquisition unit 701 derives a prediction geometric transformation parameter 212 of the prediction target block using the motion information 210 of the encoded block stored in the encoding control unit 114. The motion information 210 stored in the encoding control unit 114 is motion information 210 of adjacent encoded blocks, and is hereinafter referred to as “adjacent geometric transformation parameter”. A method for deriving the predicted geometric transformation parameter 212 will be described with reference to FIG. 6A. The adjacent geometric transformation parameters held by the adjacent blocks are defined by equations (11) to (14), respectively.
 ap_A = (a_A, b_A, c_A, d_A, e_A, f_A)   …(11)
 ap_B = (a_B, b_B, c_B, d_B, e_B, f_B)   …(12)
 ap_C = (a_C, b_C, c_C, d_C, e_C, f_C)   …(13)
 ap_D = (a_D, b_D, c_D, d_D, e_D, f_D)   …(14)
apはアフィン変換パラメータを指し、式(8)に示すように6次元のパラメータである。このように導出された隣接幾何変換パラメータ(隣接アフィン変換パラメータ)を用いて、予測幾何変換パラメータ算出部702で予測幾何変換パラメータが算出される。 ap indicates an affine transformation parameter, which is a six-dimensional parameter as shown in Equation (8). Using the adjacent geometric transformation parameter (adjacent affine transformation parameter) derived in this way, the predicted geometric transformation parameter calculation unit 702 calculates the predicted geometric transformation parameter.
予測幾何変換パラメータ算出部702では、予測対象ブロックと隣接ブロック間の空間相関を利用して、メディアン処理によって予測幾何変換パラメータを算出する。
 pred_ap = affine_median(ap_A, ap_B, ap_C)   …(15)
The predicted geometric transformation parameter calculation unit 702 calculates a predicted geometric transformation parameter by median processing using the spatial correlation between the prediction target block and the adjacent block.
 pred_ap = affine_median(ap_A, ap_B, ap_C)   …(15)
ここでpred_apは予測幾何変換パラメータを表している。なお、関数affine_median()は、6次元のアフィン変換パラメータの中央値を取る関数である。また、次式を用いて予測幾何変換パラメータを決定してもよい。
 pred_ap = (median(a_A, a_B, a_C), median(b_A, b_B, b_C), median(c_A, c_B, c_C), median(d_A, d_B, d_C), median(e_A, e_B, e_C), median(f_A, f_B, f_C))^T   …(16)
Here, pred_ap represents the predicted geometric transformation parameter. The function affine_median() takes the median of six-dimensional affine transformation parameters. The predicted geometric transformation parameter may also be determined using the following equation.
 pred_ap = (median(a_A, a_B, a_C), median(b_A, b_B, b_C), median(c_A, c_B, c_C), median(d_A, d_B, d_C), median(e_A, e_B, e_C), median(f_A, f_B, f_C))^T   …(16)
ここで、関数median()はスカラーメディアンを示している。なお、式(16)のTは転置を意味する。式(16)では各幾何変換パラメータの成分毎に中央値を取ることによって予測幾何変換パラメータを導出する。 Here, the function median () indicates a scalar median. Note that T in equation (16) means transposition. In equation (16), a predicted geometric transformation parameter is derived by taking a median value for each geometric transformation parameter component.
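 The component-wise median described for equation (16) can be sketched as follows. Which adjacent blocks enter the median (A, B, and C here, mirroring conventional motion vector prediction) and all parameter values are assumptions for illustration:

```python
# Sketch of a component-wise median over 6-dimensional affine parameter
# sets: for each of the six components, take the scalar median across the
# adjacent blocks' parameters. Values below are hypothetical.

import statistics

def predict_affine_params(*neighbor_params):
    """Component-wise median over 6-dimensional affine parameter tuples."""
    return tuple(statistics.median(comp) for comp in zip(*neighbor_params))

ap_A = (1.0, 0.0, 2.0, 0.0, 1.0, 3.0)
ap_B = (1.1, 0.1, 4.0, 0.0, 0.9, 1.0)
ap_C = (0.9, 0.0, 1.0, 0.1, 1.0, 2.0)
pred_ap = predict_affine_params(ap_A, ap_B, ap_C)
# Each component of pred_ap is the middle value of the three inputs, so a
# single outlying neighbor (e.g. the large c of ap_B) cannot dominate.
```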
 ここでは、隣接幾何変換パラメータから一意に予測幾何変換パラメータを導出する例を示したが、どの隣接幾何変換パラメータを使うかに関する情報を付加することも可能である。この場合、第一幾何変換パラメータ209が選択された場合に、選択情報を付加する必要があるため、情報量は増加するが、隣接ブロックと予測対象ブロックにエッジが存在したり、異なるオブジェクト同士が隣り合っていたりする場合に、適切な予測幾何変換パラメータを選択することが可能となる。本実施形態では、隣接ブロックとして4つのブロックを利用しているため、4通りの組み合わせを示す情報を導出パラメータ情報211とともにエントロピー符号化部112へ出力し符号化する。 Here, an example has been shown in which the predicted geometric transformation parameter is derived uniquely from the adjacent geometric transformation parameters, but it is also possible to add information indicating which adjacent geometric transformation parameter is used. In this case, when the first geometric transformation parameter 209 is selected, the selection information must be added, so the amount of information increases; however, when an edge lies between an adjacent block and the prediction target block, or when different objects adjoin each other, an appropriate predicted geometric transformation parameter can be selected. In this embodiment, since four adjacent blocks are used, information indicating the four combinations is output to the entropy encoding unit 112 together with the derived parameter information 211 and encoded.
 ここで、隣接ブロックがスキップモードなどの場合、幾何変換パラメータが設定されない場合が存在する。このような場合は、ap=(1,0,c,0,1,f)に初期化しても良い。また、このような隣接ブロックの幾何変換パラメータを再度導出しなおしても良い。例えば図6Aで示される隣接ブロックAで、通常の動き補償予測が選択された場合、幾何変換パラメータは、ap=(1,0,c,0,1,f)となる。ここで、隣接ブロックAを基準として、既に符号化処理が完了している4つのブロック(更に左に位置するブロック、上に位置するブロック、左上に位置するブロック、右上に位置するブロック)を隣接ブロックとし、式(9)を用いて幾何変換パラメータを導出する。この幾何変換パラメータを隣接幾何変換パラメータap_Aとして用いることも可能である。隣接ブロックB、C、Dも同様に幾何変換パラメータを再導出することによって、予測幾何変換パラメータを導出することも可能である。 Here, when an adjacent block is in skip mode or the like, there are cases where no geometric transformation parameter is set. In such a case, the parameter may be initialized to ap = (1, 0, c, 0, 1, f). Alternatively, the geometric transformation parameters of such an adjacent block may be derived again. For example, when ordinary motion compensation prediction is selected for the adjacent block A shown in FIG. 6A, its geometric transformation parameter is ap = (1, 0, c, 0, 1, f). Taking the adjacent block A as the reference, the four blocks for which encoding has already been completed (the blocks to its left, above it, to its upper left, and to its upper right) are treated as its adjacent blocks, and a geometric transformation parameter is derived using equation (9). This geometric transformation parameter can then be used as the adjacent geometric transformation parameter ap_A. The predicted geometric transformation parameter can likewise be derived by re-deriving the geometric transformation parameters of the adjacent blocks B, C, and D in the same manner.
 予測幾何変換パラメータ導出部205で導出された予測幾何変換パラメータ212は、第一幾何変換パラメータ導出部204へと出力される。第一幾何変換パラメータ導出部204は、入力されてきた予測対象ブロックの幾何変換パラメータの導出パラメータ情報211と、予測幾何変換パラメータ212を加算する。加算されたパラメータは、第一幾何変換パラメータ209となり、幾何変換予測部202へと出力される。 The predicted geometric transformation parameter 212 derived by the predicted geometric transformation parameter derivation unit 205 is output to the first geometric transformation parameter derivation unit 204. The first geometric transformation parameter derivation unit 204 adds the inputted geometric transformation parameter derivation parameter information 211 of the prediction target block and the predicted geometric transformation parameter 212. The added parameter becomes the first geometric transformation parameter 209 and is output to the geometric transformation prediction unit 202.
 ここでは、第一幾何変換パラメータ導出部204は、予測幾何変換パラメータ212と予測対象ブロックの幾何変換パラメータの導出パラメータ情報211を加算して、第一幾何変換パラメータ209を導出する例を示したが、導出パラメータ情報211の生成方法によって、加算、減算、乗算、除算、又は予め定められた行列を用いた変換、及びこれらを組み合わせた式を用いて導出された値など、いずれを用いても良い。例えば、予測幾何変換パラメータが-pred_apであるとき、第一幾何変換パラメータ導出部204は、予測幾何変換パラメータ212(-pred_ap)と導出パラメータ情報211を加算することによって、第一幾何変換パラメータ209を導出する。 Here, an example has been shown in which the first geometric transformation parameter derivation unit 204 derives the first geometric transformation parameter 209 by adding the predicted geometric transformation parameter 212 and the derived parameter information 211 of the geometric transformation parameter of the prediction target block. However, depending on how the derived parameter information 211 is generated, any of addition, subtraction, multiplication, division, conversion using a predetermined matrix, or a value derived using a formula combining these may be used. For example, when the predicted geometric transformation parameter is -pred_ap, the first geometric transformation parameter derivation unit 204 derives the first geometric transformation parameter 209 by adding the predicted geometric transformation parameter 212 (-pred_ap) and the derived parameter information 211.
 いずれにしても、予測幾何変換パラメータ212と第一幾何変換パラメータ導出部204の導出式は、予測対象ブロックで算出された第一幾何変換パラメータ209の情報量をできるだけ削減するように規定される。 In any case, the derivation formula relating the predicted geometric transformation parameter 212 and the first geometric transformation parameter derivation unit 204 is defined so as to reduce, as much as possible, the amount of information of the first geometric transformation parameter 209 calculated for the prediction target block.
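 A minimal sketch of the addition-based variant described above, with component-wise subtraction on the encoder side and component-wise addition on the decoder side (all parameter values are hypothetical):

```python
# Sketch of the predictive coding of the geometric transformation parameter:
# the encoder transmits only the residual (derived parameter information 211)
# between the block's actual parameters and the predicted parameters 212,
# and the decoder recovers the first geometric transformation parameter 209
# by adding the residual back. Component-wise add/subtract is one of the
# combinations the text allows.

def encode_residual(actual_ap, pred_ap):
    """Encoder side: residual = actual - predicted, component-wise."""
    return tuple(a - p for a, p in zip(actual_ap, pred_ap))

def derive_first_param(residual, pred_ap):
    """Decoder side: first parameter = residual + predicted, component-wise."""
    return tuple(r + p for r, p in zip(residual, pred_ap))

actual = (1.05, 0.02, 3.0, -0.01, 0.98, 1.5)   # hypothetical block parameters
pred   = (1.00, 0.00, 2.0,  0.00, 1.00, 1.0)   # hypothetical prediction
res = encode_residual(actual, pred)
recovered = derive_first_param(res, pred)
# The residual components are small when the prediction is good, which is
# exactly why they cost fewer bits to encode than the raw parameters.
```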
≪幾何変換予測部の処理≫
 幾何変換予測部202は、入力された幾何変換パラメータと参照画像信号207を用いて、予測信号を生成する機能を有する。幾何変換予測部202は、図8に示すように幾何変換部401と内挿補間部402とを有する。幾何変換部401は、参照画像信号207に対する幾何変換を行い、予測画素の位置を算出する。内挿補間部402は、幾何変換により求められた予測画素の分数位置に対応する予測画素の値を、内挿補間等により算出する。
≪Processing of geometric transformation prediction part≫
The geometric transformation prediction unit 202 has a function of generating a prediction signal using the input geometric transformation parameters and the reference image signal 207. The geometric transformation prediction unit 202 includes a geometric transformation unit 401 and an interpolation unit 402, as shown in FIG. 8. The geometric transformation unit 401 performs a geometric transformation on the reference image signal 207 and calculates the positions of the prediction pixels. The interpolation unit 402 calculates, by interpolation or the like, the prediction pixel values corresponding to the fractional positions obtained by the geometric transformation.
 図9を参照して、16×16画素の予測対象ブロックに対する幾何変換予測と動き補償予測の例を説明する。 An example of geometric transformation prediction and motion compensation prediction for a 16 × 16 pixel prediction target block will be described with reference to FIG.
 図9において、予測対象ブロックは△で示される画素からなる正方形画素ブロックCRである。動き補償予測の対応する画素は●で示される。●で示される画素からなる画素ブロックMERは、正方形である。一方、幾何変換予測の対応する画素は×で示され、これらの画素からなる画素ブロックGTRは、例えば平行四辺形となる。 In FIG. 9, the prediction target block is a square pixel block CR composed of pixels indicated by Δ. The corresponding pixel of motion compensation prediction is indicated by ●. A pixel block MER composed of pixels indicated by ● is square. On the other hand, the pixel corresponding to the geometric transformation prediction is indicated by x, and the pixel block GTR composed of these pixels is, for example, a parallelogram.
 動き補償後の領域と幾何変換後の領域は、参照画像信号の対応する領域を符号化対象のフレームの座標に合わせて記述している。このように、幾何変換予測を用いることによって、矩形画素ブロックの回転、拡大・縮小、せん断、鏡面変換などの変形に合わせた予測信号の生成が可能となる。 The region after motion compensation and the region after geometric transformation describe the corresponding region of the reference image signal according to the coordinates of the frame to be encoded. As described above, by using the geometric transformation prediction, it is possible to generate a prediction signal in accordance with deformation such as rotation, enlargement / reduction, shearing, and mirror transformation of the rectangular pixel block.
 幾何変換部401は、入力された幾何変換パラメータを用い、式(8)を用いて幾何変換後の座標(u,v)を算出する。算出された幾何変換後の座標(u,v)は、実数値である。そこで、座標(u,v)に対応する輝度値を参照画像信号から内挿補間することによって予測値を生成する。 The geometric transformation unit 401 calculates coordinates (u, v) after the geometric transformation using the input geometric transformation parameters using the equation (8). The calculated coordinates (u, v) after geometric transformation are real values. Therefore, the predicted value is generated by interpolating the luminance value corresponding to the coordinates (u, v) from the reference image signal.
 本実施形態では、内挿補間法として共一次内挿法を用いる。共一次内挿法は次式(17)で表される。
Figure JPOXMLDOC01-appb-M000017
In this embodiment, bilinear interpolation is used as the interpolation method. It is expressed by the following equation (17).
Figure JPOXMLDOC01-appb-M000017
ここでP(u,v)は内挿補間処理後の予測画素値を示しており、R(x,y)は、利用した参照画像信号の整数画素値を表している。画素の補間精度を1/64とすると、(x-u)=U/64、(y-v)=V/64となり、式(17)は、式(18)に示す整数演算に変形できる。
Figure JPOXMLDOC01-appb-M000018
Here, P (u, v) represents the predicted pixel value after the interpolation process, and R (x, y) represents the integer pixel value of the used reference image signal. Assuming that the pixel interpolation accuracy is 1/64, (x−u) = U / 64 and (y−v) = V / 64, and Equation (17) can be transformed into an integer calculation shown in Equation (18).
Figure JPOXMLDOC01-appb-M000018
ここで、fは丸めのオフセットを表している。本実施形態ではf=0としている。 Here, f represents a rounding offset. In this embodiment, f = 0.
 以上のように、幾何変換を行った予測対象ブロック内の座標毎に内挿補間を適用することによって、新たな予測信号を生成する。 As described above, a new prediction signal is generated by applying interpolation for each coordinate in the prediction target block subjected to geometric transformation.
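 The flow just described (mapping each coordinate of the prediction target block through the affine transform of equation (8) and interpolating the reference image at the resulting fractional position) can be sketched as follows. A plain floating-point version of bilinear interpolation is used for clarity; equations (17) and (18) perform the equivalent computation in 1/64-pel integer arithmetic, and the reference data and parameters below are hypothetical:

```python
# Sketch of geometric-transformation prediction: warp each target-block
# coordinate with the affine transform, then bilinearly interpolate the
# reference image at the resulting fractional position.

def affine_map(x, y, a, b, c, d, e, f):
    """Equation (8): map a target-block coordinate to a reference position."""
    return a * x + b * y + c, d * x + e * y + f

def bilinear(ref, u, v):
    """ref: 2D list indexed ref[row][col]; (u, v) is a fractional (x, y)."""
    x0, y0 = int(u), int(v)          # integer pel position (u, v >= 0 here)
    fx, fy = u - x0, v - y0          # fractional offsets
    p00 = ref[y0][x0]
    p10 = ref[y0][x0 + 1]
    p01 = ref[y0 + 1][x0]
    p11 = ref[y0 + 1][x0 + 1]
    return ((1 - fx) * (1 - fy) * p00 + fx * (1 - fy) * p10
            + (1 - fx) * fy * p01 + fx * fy * p11)

def predict_block(ref, w, h, params):
    """Generate a w x h prediction block from the reference image."""
    return [[bilinear(ref, *affine_map(x, y, *params)) for x in range(w)]
            for y in range(h)]

# Hypothetical 4x4 reference ramp and a pure half-pel translation
# (a, b, d, e) = (1, 0, 0, 1), (c, f) = (0.5, 0.5).
ref = [[x + 10 * y for x in range(4)] for y in range(4)]
pred = predict_block(ref, 2, 2, (1.0, 0.0, 0.5, 0.0, 1.0, 0.5))
```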
 なお、幾何変換予測部202では、従来の平行移動モデルを用いた予測信号を生成することも可能である。平行移動モデルの座標変換式は式(19)で表される。
 u = x + c,  v = y + f   …(19)
The geometric transformation prediction unit 202 can also generate a prediction signal using a conventional translation model. The coordinate conversion formula of the translation model is expressed by formula (19).
 u = x + c,  v = y + f   …(19)
 式(19)は、式(8)のパラメータ(a,b,d,e)が(1,0,0,1)であることと同一である。予測選択情報213が、幾何変換予測を用いないように指定されている場合、第一幾何変換パラメータ209及び第二幾何変換パラメータ208がいかなる値であっても、幾何変換予測部202は、式(8)を式(19)に変更して、座標の導出を行うことによって、従来の動き補償予測を実現できる。 Equation (19) is identical to equation (8) with the parameters (a, b, d, e) set to (1, 0, 0, 1). When the prediction selection information 213 specifies that geometric transformation prediction is not to be used, the geometric transformation prediction unit 202 realizes conventional motion compensation prediction by deriving the coordinates with equation (19) in place of equation (8), regardless of the values of the first geometric transformation parameter 209 and the second geometric transformation parameter 208.
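 A quick self-contained check of this equivalence (function names are illustrative only):

```python
# With (a, b, d, e) = (1, 0, 0, 1), the affine transform of equation (8)
# degenerates to the translation model of equation (19): u = x + c, v = y + f.

def affine_map(x, y, a, b, c, d, e, f):
    """Equation (8): affine coordinate transform."""
    return a * x + b * y + c, d * x + e * y + f

def translation_map(x, y, c, f):
    """Equation (19): translation-only motion model."""
    return x + c, y + f

# With the identity linear part, both models agree on every coordinate.
same = all(affine_map(x, y, 1, 0, 3, 0, 1, -2) == translation_map(x, y, 3, -2)
           for x in range(16) for y in range(16))
```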
 なお、本実施形態では、内挿補間の方法として共一次内挿法を用いる例を示したが、最近接内挿法、3次畳み込み内挿法、線形フィルタ内挿法、ラグランジュ補間法、スプライン補間法、ランツォシュ補間法などのいかなる内挿補間法を適用しても構わない。 In this embodiment, an example using bilinear interpolation as the interpolation method has been shown, but any interpolation method may be applied, such as nearest-neighbor interpolation, cubic convolution interpolation, linear filter interpolation, Lagrange interpolation, spline interpolation, or Lanczos interpolation.
 また、本実施形態では、補間の精度として1/64画素精度の例を示したが、いずれの精度を用いても構わない。 In the present embodiment, an example of 1/64 pixel accuracy is shown as the interpolation accuracy, but any accuracy may be used.
 予測分離スイッチ201は、幾何変換予測部202から出力される2つの予測信号206の出力端を切り替える。即ち、予測分離スイッチ201は、第一幾何変換パラメータ209によって生成された予測信号215の出力端と第二幾何変換パラメータ208によって生成された予測信号214の出力端を、予測選択情報213(図1の123に相当)に従って切り替える。 The prediction separation switch 201 switches the output terminals of the two prediction signals 206 output from the geometric transformation prediction unit 202. That is, the prediction separation switch 201 uses the prediction selection information 213 (FIG. 1) as the output terminal of the prediction signal 215 generated by the first geometric transformation parameter 209 and the output terminal of the prediction signal 214 generated by the second geometric transformation parameter 208. In accordance with 123).
Examples of the prediction selection information 213 and 123 are shown in FIG. 10. When the index of the prediction selection information 213 is 0, the skip mode is selected. In the skip mode, transform coefficients, motion vectors, and the like are not encoded. This means that the first geometric transformation prediction, which requires the additional motion information 210 to be encoded, is not selected. When the index of the prediction selection information 213 is 9, intra prediction is selected. In this case, the output terminal of the prediction separation switch 110 is connected to the intra prediction signal generation device 107, which means that the inter prediction signal generation device 109 need not perform prediction signal generation processing.
Note that elements not defined in this embodiment may be inserted between the rows of the index table shown in FIG. 10, and descriptions carrying information on the prediction method, the prediction block size, the name of the prediction mode, or a combination of these may be included. The index table may be divided into a plurality of tables, or a plurality of index tables may be integrated. It is not necessary to use exactly the same terms; they may be changed arbitrarily according to the form of use. Furthermore, each element described in the index table may be changed so as to be described by an independent flag.
The above is the outline of the inter prediction signal generation device 200 in this embodiment of the present invention.
The prediction signal generation processing of the inter prediction signal generation device 109 will be described with reference to FIG. 11. When the inter prediction signal generation processing is started (S501), the prediction separation switch 201 determines, according to the prediction selection information 213 input from outside the inter prediction signal generation device 109, whether the prediction selection information 213 indicates the first geometric transformation parameter 209 (S502). If this determination is YES, the prediction separation switch 201 connects the output terminal of the geometric transformation prediction unit 202 to the first prediction signal 215. If the determination is NO, the prediction separation switch 201 connects the output terminal of the geometric transformation prediction unit 202 to the second prediction signal 214.
If the determination is YES, the motion information acquisition unit 701 in the predicted geometric transformation parameter derivation unit 205 determines the adjacent blocks based on the motion information 210 input from outside (S508). Using the motion information 210 of the determined adjacent blocks, adjacent geometric transformation parameters are derived (S509). Receiving the derived adjacent geometric transformation parameters, the predicted geometric transformation parameter derivation unit 702 derives the predicted geometric transformation parameter 212 using Equation (15) or Equation (16) (S510). The predicted geometric transformation parameter 212 is output to the first geometric transformation parameter derivation unit 204.
The first geometric transformation parameter derivation unit 204 derives the first geometric transformation parameter 209 using the derivation parameter information 211 input from outside and the predicted geometric transformation parameter 212 (S511). The first geometric transformation parameter 209 is input to the geometric transformation prediction unit 202, and the geometric transformation unit 401 derives the coordinates after geometric transformation using Equation (8) and the like (S512). Based on the calculated coordinates, the interpolation unit 402 performs interpolation processing on the reference image signal 207 input from outside using Equation (18), and generates the first prediction signal 215 (S513). The first prediction signal 215 is output to the outside via the prediction separation switch 201 to whose output terminal it is connected (S515), and the first geometric transformation parameter 209 used for the prediction target block is stored in memory (S514). The first geometric transformation parameter 209 held in memory is used as an adjacent geometric transformation parameter or an adjacent motion vector for the next block (S517).
If the determination in step S502 is NO, the motion information acquisition unit 501 in the second geometric transformation parameter derivation unit 203 determines the adjacent blocks based on the motion information 210 input from outside (S503). Using the motion information 210 of the determined adjacent blocks, adjacent motion vectors are derived (S504). Receiving the derived adjacent motion vectors, the second parameter derivation unit 502 derives the second geometric transformation parameter 208 using Equations (9) to (10) (S505).
The second geometric transformation parameter 208 is input to the geometric transformation prediction unit 202, and the geometric transformation unit 401 derives the coordinates after geometric transformation using Equation (8) and the like (S506). Based on the derived coordinates, the interpolation unit 402 performs interpolation processing on the reference image signal 207 input from outside using Equation (18), and generates the second prediction signal 214 (S507). The second prediction signal 214 is output to the outside via the prediction separation switch 201 (S515), and the second geometric transformation parameter 208 used for the prediction target block is stored in memory (S514). The second geometric transformation parameter 208 held in memory is used as an adjacent geometric transformation parameter or an adjacent motion vector for the next block (S517).
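The two branches of FIG. 11 share the same back end (geometric transformation, interpolation, parameter storage) and differ only in how the parameter is derived. The control flow can be sketched as follows; this is a hedged outline with hypothetical callables standing in for the derivation units and the geometric transformation prediction unit, not the disclosed implementation.

```python
def generate_inter_prediction(selection_is_first, derive_first, derive_second,
                              geometric_predict, parameter_memory):
    """Control-flow sketch of FIG. 11 (S502-S517): select the first or the
    second geometric transformation parameter, run the shared geometric
    transformation + interpolation step, store the used parameter so the
    next block can treat it as an adjacent parameter, then output."""
    if selection_is_first:                   # S502 == YES
        params = derive_first()              # S508-S511
    else:                                    # S502 == NO
        params = derive_second()             # S503-S505
    prediction = geometric_predict(params)   # S506/S512 and S507/S513
    parameter_memory.append(params)          # S514; reused at S517
    return prediction                        # S515

memory = []
pred = generate_inter_prediction(True, lambda: "P1", lambda: "P2",
                                 lambda p: "pred:" + p, memory)
assert pred == "pred:P1" and memory == ["P1"]
```

The shared tail (S512/S506 onward) is what allows both parameter types to feed a single geometric transformation prediction unit 202.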
The above is the processing flow of the inter prediction signal generation device 200 in this embodiment.
Next, the syntax structure in the moving picture encoding device 100 will be described. As shown in FIG. 12, the syntax 1600 mainly has three parts. The high-level syntax 1601 carries the syntax information of upper layers at or above the slice level. The slice-level syntax 1602 carries the information necessary for decoding each slice, and the macroblock-level syntax 1603 carries the information necessary for decoding each macroblock.
Each part is composed of more detailed syntax. The high-level syntax 1601 includes sequence- and picture-level syntax such as the sequence parameter set syntax 1604 and the picture parameter set syntax 1605. The slice-level syntax 1602 includes the slice header syntax 1606, the slice data syntax 1607, and the like. The macroblock-level syntax 1603 includes the macroblock layer syntax 1608, the macroblock prediction syntax 1609, and the like.
In the example of the slice header syntax 1606 shown in FIG. 13, slice_affine_motion_prediction_flag is a syntax element indicating whether geometric transformation prediction is applied to the slice. When slice_affine_motion_prediction_flag is 0, the geometric transformation prediction unit 202 does not use the parameters (a, b, d, e) of Equation (8) for this slice, but uses Equation (19).
Equation (19) represents motion-compensated prediction using the translation model employed in H.264 and the like, and the parameters (c, f) correspond to the motion vector. When this flag is 0, the behavior is identical to conventional translation-model motion-compensated prediction. On the other hand, when slice_affine_motion_prediction_flag is 1, the prediction separation switch 201 dynamically switches the prediction signal within the slice as indicated by the prediction selection information 213.
In the example of the slice data syntax 1607 shown in FIG. 14, mb_skip_flag is a flag indicating whether the macroblock is encoded in the skip mode. In the skip mode, transform coefficients, motion vectors, and the like are not encoded. Therefore, the first geometric transformation prediction is not applied in the skip mode.
AvailAffineMode is an internal parameter indicating whether the second geometric transformation prediction can be used for the macroblock. When AvailAffineMode is 0, the prediction selection information 213 is set so that the second geometric transformation prediction is not used. When the adjacent motion vectors of the adjacent blocks and the motion vector of the prediction target block have the same value, AvailAffineMode is 0; otherwise, AvailAffineMode is 1.
AvailAffineMode can also be set using the adjacent geometric transformation parameters or the adjacent motion vectors. For example, when the adjacent motion vectors point in completely different directions, an object boundary may exist in a block adjacent to the prediction target block, so AvailAffineMode may be set to 0.
On the other hand, when AvailAffineMode is 1, mb_affine_motion_skip_flag, which indicates whether the second geometric transformation prediction or motion-compensated prediction is used, is encoded. When mb_affine_motion_skip_flag is 1, the second geometric transformation prediction is applied to the skip mode. When mb_affine_motion_skip_flag is 0, motion-compensated prediction using Equation (19) is applied.
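The skip-mode branching just described can be summarized in a short sketch. The function name and the string return values are illustrative only; the sketch assumes, per the text above, that AvailAffineMode is 0 when the adjacent motion vector equals the current one, and that mb_affine_motion_skip_flag is present in the bitstream only when AvailAffineMode is 1.

```python
def skip_mode_prediction(adjacent_mv, current_mv, mb_affine_motion_skip_flag=None):
    """Skip-mode decision sketch: AvailAffineMode gates whether the second
    geometric transformation prediction is even selectable; when it is,
    mb_affine_motion_skip_flag chooses between it (flag == 1) and
    translation-model motion compensation per Equation (19) (flag == 0)."""
    avail_affine_mode = 0 if adjacent_mv == current_mv else 1
    if avail_affine_mode == 0:
        return "translation"          # Equation (19), flag not coded
    if mb_affine_motion_skip_flag == 1:
        return "second_geometric"     # second geometric transformation prediction
    return "translation"              # Equation (19)
```

Because the flag is only coded when AvailAffineMode is 1, no bits are spent on blocks where the affine prediction would coincide with plain translation.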
In the example of the macroblock layer syntax 1608 shown in FIG. 15, mb_type indicates the macroblock type information; that is, it includes information such as whether the current macroblock is intra-coded or inter-coded, in what block shape prediction is performed, and whether the prediction direction is unidirectional or bidirectional. mb_type is passed to the macroblock prediction syntax and further to the sub-macroblock prediction syntax, which describes the syntax of the sub-blocks within the macroblock.
In the example of the macroblock prediction syntax shown in FIG. 16, mb_affine_pred_flag indicates whether the first geometric transformation prediction or the second geometric transformation prediction is used for the block. When the flag is 0, the prediction selection information 213 is set so that the second geometric transformation parameter is used. When the flag is 1, the prediction selection information 213 is set so that the first geometric transformation parameter is used.
NumMbPart() is an internal function that returns the number of block partitions specified by mb_type: it outputs 1 for a 16×16-pixel block, 2 for a 16×8 or 8×16 pixel block, and 4 for an 8×8 pixel block.
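The NumMbPart() mapping can be written down directly. The sketch below is illustrative only: it takes the partition shape as a (width, height) tuple rather than an mb_type code, since the encoding of mb_type itself is not specified in this passage.

```python
def num_mb_part(partition_shape):
    """Sketch of the internal function NumMbPart(): number of partitions
    implied by the macroblock partition shape recorded in mb_type.
    The shape is given here as a (width, height) tuple for illustration."""
    return {
        (16, 16): 1,  # one 16x16 partition
        (16, 8): 2,   # two 16x8 partitions
        (8, 16): 2,   # two 8x16 partitions
        (8, 8): 4,    # four 8x8 partitions (sub-macroblocks)
    }[partition_shape]
```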
In the figure, mv_l0 and mv_l1 indicate the motion vector difference information for the macroblock. The motion vector information is set by the motion information search unit 106 and is the value obtained by taking the difference from a predicted motion vector, which is not disclosed in this embodiment.
In the figure, mvd_l0_affine and mvd_l1_affine indicate the derivation parameters for the macroblock, namely the difference information of the components (a, b, d, e) of the affine transformation parameters excluding the motion vector. These syntax elements are encoded only when the first geometric transformation parameter is selected.
In the example of the sub-macroblock prediction syntax shown in FIG. 17, mb_affine_pred_flag indicates whether the first geometric transformation prediction or the second geometric transformation prediction is used for the block. When the flag is 0, the prediction selection information 213 is set so that the second geometric transformation parameter is used. When the flag is 1, the prediction selection information 213 is set so that the first geometric transformation parameter is used.
In the figure, mv_l0 and mv_l1 indicate the motion vector difference information for the sub-macroblock. The motion vector information is set by the motion information search unit 106 and is the value obtained by taking the difference from a predicted motion vector, which is not disclosed in this embodiment.
In the figure, mvd_l0_affine and mvd_l1_affine indicate the derivation parameters for the sub-macroblock, namely the difference information of the components (a, b, d, e) of the affine transformation parameters excluding the motion vector. These syntax elements are encoded only when the first geometric transformation parameter is selected.
With the syntax structure according to the first embodiment, when the prediction target pixel block is in the skip mode, either conventional motion-compensated prediction using the translation model or the second geometric transformation prediction can be selected; for other inter prediction, either the first geometric transformation prediction or the second geometric transformation prediction can be selected.
Note that syntax elements not defined in this embodiment may be inserted between the rows of the syntax tables shown in FIGS. 12 to 17, and descriptions concerning other conditional branches may be included. The syntax tables may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. It is not necessary to use exactly the same terms; they may be changed arbitrarily according to the form of use.
According to the above embodiment, two geometric transformation parameters indicating information on the shape of the image under geometric transformation of the pixel block, namely the first geometric transformation parameter and the second geometric transformation parameter, are derived, and a prediction signal is generated by performing motion-compensated prediction using the geometric transformation parameter selected in accordance with the prediction selection information, which indicates which of these geometric transformation parameters is to be selected.
[Second Embodiment]
Next, a second embodiment will be described. The configuration of the moving picture encoding device according to the second embodiment is the same as that of the first embodiment. Blocks and syntax having the same functions as those in the first embodiment are given the same reference numerals, and their description is omitted here. The second embodiment differs from the first embodiment only in the syntax structure.
FIG. 18 shows an example of the macroblock layer syntax 1608. mb_type shown in the figure indicates the macroblock type information; that is, it includes information such as whether the current macroblock is intra-coded or inter-coded, in what block shape prediction is performed, and whether the prediction direction is unidirectional or bidirectional. mb_type is passed to the macroblock prediction syntax and further to the sub-macroblock prediction syntax, which describes the syntax of the sub-blocks within the macroblock. mb_additional_affine_motion_flag is flag information for selecting whether the first geometric transformation parameter or the second geometric transformation parameter is used for the prediction target block. When this flag is 0, the second geometric transformation parameter is used; when this flag is 1, the first geometric transformation parameter is used.
In the example of the macroblock prediction syntax shown in FIG. 19, mb_affine_pred_flag indicates whether geometric transformation prediction, comprising the first geometric transformation prediction and the second geometric transformation prediction, is used for the block, or motion-compensated prediction with the translation model is used. When the flag is 0, the prediction selection information 213 is set so that motion-compensated prediction with the translation model is used, regardless of mb_additional_affine_motion_flag. When the flag is 1, the prediction selection information 213 is set, according to the flag information of mb_additional_affine_motion_flag, to use either the first geometric transformation parameter or the second geometric transformation parameter.
In the example of the sub-macroblock prediction syntax shown in FIG. 20, mb_affine_pred_flag likewise indicates whether geometric transformation prediction, comprising the first geometric transformation prediction and the second geometric transformation prediction, is used for the block, or motion-compensated prediction with the translation model is used. When the flag is 0, the prediction selection information 213 is set so that motion-compensated prediction with the translation model is used, regardless of mb_additional_affine_motion_flag. When the flag is 1, the prediction selection information 213 is set, according to the flag information of mb_additional_affine_motion_flag, to use either the first geometric transformation parameter or the second geometric transformation parameter.
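The two-flag selection of the second embodiment reduces to a small decision table. The sketch below is illustrative (the string return values are stand-ins, not syntax element values), following the semantics stated above for mb_affine_pred_flag and mb_additional_affine_motion_flag.

```python
def select_prediction(mb_affine_pred_flag, mb_additional_affine_motion_flag):
    """Second-embodiment selection sketch: mb_affine_pred_flag == 0 forces
    translation-model motion compensation regardless of the other flag;
    mb_affine_pred_flag == 1 defers to mb_additional_affine_motion_flag,
    which picks the first (1) or second (0) geometric transformation
    parameter."""
    if mb_affine_pred_flag == 0:
        return "translation"
    if mb_additional_affine_motion_flag == 1:
        return "first_geometric"
    return "second_geometric"
```

Note the contrast with the first embodiment, where mb_affine_pred_flag alone chose between the first and second geometric transformation predictions.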
With the syntax structure according to the second embodiment, when the prediction target pixel block is in the skip mode, either conventional motion compensation using the translation model or the second geometric transformation prediction can be selected; for other inter prediction, whether the first geometric transformation prediction or the second geometric transformation prediction is used is determined at the macroblock level, and at the sub-macroblock level it is possible to select between geometric transformation prediction, comprising the first or second geometric transformation prediction, and motion-compensated prediction with the translation model.
Note that syntax elements not defined in this embodiment may be inserted between the rows of the syntax tables shown in FIGS. 18 to 20, and descriptions concerning other conditional branches may be included. The syntax tables may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. It is not necessary to use exactly the same terms; they may be changed arbitrarily according to the form of use.
As described above, the first embodiment prevents the increase in block partitioning information caused by excessive block partitioning when predicting a moving object that does not fit a rectangular block. By predicting motion accompanied by geometric deformation within a block without greatly increasing the amount of additional information, and by applying the optimal prediction method to each block, the coding efficiency is improved and, furthermore, the subjective image quality is improved.
Next, third and fourth embodiments relating to moving picture decoding will be described.
[Third Embodiment]
A moving picture decoding device according to the third embodiment will be described with reference to FIG. 21. The moving picture decoding device 300 decodes, for example, encoded data generated by the moving picture encoding device according to the first embodiment.
The moving picture decoding device 300 decodes the encoded data 311 stored in the input buffer 301 and outputs a decoded image signal 317 to the output buffer 309. The encoded data 311 is multiplexed encoded data that is sent from, for example, the moving picture encoding device 100, delivered via a storage system or a transmission system, and temporarily stored in the input buffer 301.
The moving picture decoding device 300 has an entropy decoding unit 302, an inverse quantization/inverse transform unit 303, an adder 304, a reference image memory 305, an intra prediction signal generation device 306, an inter prediction signal generation device 307, and a prediction separation switch 308. The moving picture decoding device 300 is also connected to the input buffer 301, the output buffer 309, and the decoding control unit 310.
The entropy decoding unit 302 parses and decodes the encoded data 311 frame by frame or field by field based on the syntax. The entropy decoding unit 302 sequentially entropy-decodes the code string of each syntax, and reproduces the motion information 315, the derivation parameter information 316, the encoding parameters of the decoding target block, and so on. The encoding parameters include all parameters necessary for decoding, such as prediction information, information on transform coefficients, and information on quantization.
The transform coefficients decoded by the entropy decoding unit 302 are input to the inverse quantization/inverse transform unit 303, which includes an inverse quantizer and an inverse transformer. The various information on quantization decoded by the entropy decoding unit 302, that is, the quantization parameter and the quantization matrix, is set in the internal memory of the decoding control unit 310 and is loaded when used for the inverse quantization processing.
Using the loaded information on quantization, the inverse quantization/inverse transform unit 303 first performs inverse quantization processing with the inverse quantizer. The inverse-quantized transform coefficients are then subjected to inverse transform processing, for example an inverse discrete cosine transform, by the inverse transformer. The inverse orthogonal transform has been described here, but when a wavelet transform or the like is performed in the encoding device, the inverse quantization/inverse transform unit 303 may instead perform the corresponding inverse quantization and inverse wavelet transform.
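The order of operations in the inverse quantization/inverse transform unit 303 can be sketched as follows. This is a simplified illustration only: real codecs use per-coefficient scaling with a quantization matrix rather than a single step size, and the inverse transform here is passed in as a callable because its concrete form (inverse DCT, inverse wavelet transform, etc.) depends on the encoder.

```python
def reconstruct_error(levels, quant_step, inverse_transform):
    """Sketch of unit 303: first scale the decoded coefficient levels back
    (inverse quantization), then apply the inverse transform to obtain the
    restored prediction error signal (312 in the text)."""
    dequantized = [level * quant_step for level in levels]  # inverse quantization
    return inverse_transform(dequantized)                   # e.g. inverse DCT

# With an identity stand-in for the inverse transform, only the scaling shows.
assert reconstruct_error([1, -2, 0], 8, lambda coeffs: coeffs) == [8, -16, 0]
```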
The prediction error signal 312 restored through the inverse quantization/inverse transform unit 303 is input to the adder 304. The adder 304 adds the prediction error signal 312 and the prediction signal 416 output from the prediction separation switch 308, which will be described later, to generate a decoded image signal 317.
The generated decoded image signal 317 is output from the moving picture decoding device 300, temporarily stored in the output buffer 309, and then output according to the output timing managed by the decoding control unit 310. The decoded image signal 317 is also stored in the reference image memory 305 and becomes a reference image signal 313.
The reference image signal 313 is sequentially read from the reference image memory 305 frame by frame or field by field, and is input to the intra prediction signal generation device 306 or the inter prediction signal generation device 307. The motion information 315 used for the decoding target pixel block is stored in the decoding control unit 310, from which it is loaded and used as appropriate in the inter prediction signal generation processing for the next block.
The intra prediction signal generation device 306 has the same function and configuration as the intra prediction signal generation device 107 in the moving picture encoding device 100 shown in FIG. 1. That is, the intra prediction signal generation device 306 performs intra prediction using the input reference image signal 313. For example, in H.264, a prediction signal is generated by padding pixels along a prediction direction, such as the vertical or horizontal direction, using already-encoded reference pixel values adjacent to the prediction target block. Alternatively, after interpolating pixel values with a predetermined interpolation method, the interpolated pixel values may be copied in a predetermined prediction direction. The generated prediction signal 416 is output to the prediction separation switch 308.
 インター予測信号生成装置307は、図1及び図2並びに図4乃至図10で示したインター予測信号生成装置109と同一の機能及び構成を有する。即ち、インター予測信号生成装置307では、入力された動き情報315、導出パラメータ情報316、参照画像信号313、予測選択情報314を利用して、予測信号416が生成される。動き情報315、導出パラメータ情報316、参照画像信号313及び予測選択情報314は動画像符号化装置100のインター予測信号生成装置109に入力される動き情報210、導出パラメータ情報211、参照画像信号207及び予測選択情報213にそれぞれ対応し、図2に示されるインター予測信号生成装置109において予測信号416が生成される。 The inter prediction signal generation device 307 has the same function and configuration as the inter prediction signal generation device 109 shown in FIG. 1, FIG. 2, and FIGS. 4 to 10. That is, the inter prediction signal generation device 307 generates the prediction signal 416 using the input motion information 315, derivation parameter information 316, reference image signal 313, and prediction selection information 314. These correspond respectively to the motion information 210, derivation parameter information 211, reference image signal 207, and prediction selection information 213 input to the inter prediction signal generation device 109 of the video encoding device 100, and the prediction signal 416 is generated as in the inter prediction signal generation device 109 shown in FIG. 2.
 生成された予測信号416は、予測分離スイッチ308へと出力される。予測分離スイッチ308は、イントラ予測信号生成装置306の出力端とインター予測信号生成装置307の出力端を、予測選択情報314に従って選択する。予測選択情報314に示される情報がイントラ予測である場合はスイッチをイントラ予測信号生成装置306へと接続する。一方、予測選択情報314がインター予測である場合はスイッチをインター予測信号生成装置307へと接続する。予測選択情報314は動画像符号化装置100の予測選択部111によって設定される予測選択情報123と同一であり、図10に示される。 The generated prediction signal 416 is output to the prediction separation switch 308. The prediction separation switch 308 selects between the output terminal of the intra prediction signal generation device 306 and the output terminal of the inter prediction signal generation device 307 according to the prediction selection information 314. When the prediction selection information 314 indicates intra prediction, the switch is connected to the intra prediction signal generation device 306; when it indicates inter prediction, the switch is connected to the inter prediction signal generation device 307. The prediction selection information 314 is the same as the prediction selection information 123 set by the prediction selection unit 111 of the video encoding device 100, and is shown in FIG. 10.
 以上が、第3の実施形態の動画像復号化装置300の処理の概要である。 The above is the outline of the processing of the video decoding device 300 of the third embodiment.
 次に、動画像復号化装置300が復号する符号化データのシンタクス構造について説明する。動画像復号化装置300が復号する符号化データ311は、動画像符号化装置100と同一のシンタクス構造を有するとよい。ここでは、図12乃至図17と同一のシンタクスを用いることとする。 Next, the syntax structure of the encoded data decoded by the video decoding device 300 will be described. The encoded data 311 decoded by the video decoding device 300 preferably has the same syntax structure as that of the video encoding device 100. Here, the same syntax as in FIGS. 12 to 17 is used.
 即ち、動画像復号化装置300におけるシンタクス構造では、図12に示すとおり、シンタクス1600は主に3つのパートを有する。ハイレベルシンタクス1601は、スライス以上の上位レイヤのシンタクス情報を有する。スライスレベルシンタクス1602は、スライス毎に復号に必要な情報を有し、マクロブロックレベルシンタクス1603は、マクロブロック毎に復号に必要とされる情報を有する。 That is, in the syntax structure used by the video decoding device 300, the syntax 1600 mainly has three parts as shown in FIG. 12. The high-level syntax 1601 carries syntax information of layers at or above the slice level. The slice level syntax 1602 carries information necessary for decoding each slice, and the macroblock level syntax 1603 carries information necessary for decoding each macroblock.
 各パートは、更に詳細なシンタクスで構成されている。ハイレベルシンタクス1601は、シーケンスパラメータセットシンタクス1604とピクチャパラメータセットシンタクス1605などの、シーケンス及びピクチャレベルのシンタクスを含む。スライスレベルシンタクス1602は、スライスヘッダーシンタクス1606、スライスデータシンタクス1607等を含む。マクロブロックレベルシンタクス1603は、マクロブロックレイヤーシンタクス1608、マクロブロックプレディクションシンタクス1609等を含む。 Each part has a more detailed syntax. High level syntax 1601 includes sequence and picture level syntax, such as sequence parameter set syntax 1604 and picture parameter set syntax 1605. The slice level syntax 1602 includes a slice header syntax 1606, a slice data syntax 1607, and the like. The macroblock level syntax 1603 includes a macroblock layer syntax 1608, a macroblock prediction syntax 1609, and the like.
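As a minimal illustration (not part of the specification; the nesting below simply restates Figure 12 and the identifier names are ours), the three-part hierarchy can be represented as a lookup table:

```python
# Hypothetical sketch of the three-level syntax hierarchy of Fig. 12.
# Names follow the reference numerals used in the text.
SYNTAX_HIERARCHY = {
    "high_level_syntax_1601": [
        "sequence_parameter_set_syntax_1604",
        "picture_parameter_set_syntax_1605",
    ],
    "slice_level_syntax_1602": [
        "slice_header_syntax_1606",
        "slice_data_syntax_1607",
    ],
    "macroblock_level_syntax_1603": [
        "macroblock_layer_syntax_1608",
        "macroblock_prediction_syntax_1609",
    ],
}

def parent_part(syntax_name):
    """Return the syntax part (1601/1602/1603) containing a detailed syntax."""
    for part, children in SYNTAX_HIERARCHY.items():
        if syntax_name in children:
            return part
    raise KeyError(syntax_name)
```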
 図13に示されるスライスヘッダーシンタクス1606の例では、slice_affine_motion_prediction_flagは、スライスに幾何変換予測を適用するかどうかを示すシンタクス要素である。slice_affine_motion_prediction_flagが0である場合、幾何変換予測部202は、本スライスに対しては、式(8)におけるパラメータ(a,b,d,e)のパラメータを利用せず、式(19)を用いる。 In the example of the slice header syntax 1606 shown in FIG. 13, slice_affine_motion_prediction_flag is a syntax element indicating whether geometric transformation prediction is applied to the slice. When slice_affine_motion_prediction_flag is 0, the geometric transformation prediction unit 202 does not use the parameters (a, b, d, e) of Equation (8) for this slice, but uses Equation (19).
 式(19)は、H.264などで用いられている平行移動モデルを用いた動き補償予測を表しており、パラメータ(c,f)は動きベクトルに相当する。本フラグが0の場合、従来の平行移動モデルの動き補償予測が行われることと同一である。一方、slice_affine_motion_prediction_flagが1である場合、スライスにおいて予測選択情報314に示すように、予測分離スイッチ201は予測信号を動的に切り替える。 Equation (19) represents motion compensation prediction using the translation model employed in H.264 and similar standards, and the parameters (c, f) correspond to a motion vector. When this flag is 0, the behavior is identical to conventional motion compensation prediction with the translation model. On the other hand, when slice_affine_motion_prediction_flag is 1, the prediction separation switch 201 dynamically switches the prediction signal within the slice as indicated by the prediction selection information 314.
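Equations (8) and (19) themselves are not reproduced in this excerpt. Assuming the common 6-parameter affine form for Equation (8), the relationship between the two models can be sketched as follows; this is a non-authoritative illustration, and the exact parameterization in the specification may differ:

```python
def affine_map(x, y, a, b, c, d, e, f):
    """Assumed form of Eq. (8): a 6-parameter affine mapping of pixel (x, y).
    x' = a*x + b*y + c,  y' = d*x + e*y + f
    """
    return (a * x + b * y + c, d * x + e * y + f)

def translational_map(x, y, c, f):
    """Eq. (19): pure translation; (c, f) plays the role of a motion vector.
    Equivalent to the affine map with (a, b, d, e) fixed to (1, 0, 0, 1),
    i.e. the non-translational components are not used.
    """
    return (x + c, y + f)
```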
 図14に示されるスライスデータシンタクス1607の例では、mb_skip_flagは、マクロブロックがスキップモードで符号化されているかどうかを示すフラグである。スキップモードである場合、変換係数や動きベクトルなどは符号化されない。このため、第一幾何変換予測はスキップモードには適用されない。 In the example of the slice data syntax 1607 shown in FIG. 14, mb_skip_flag is a flag indicating whether the macroblock is encoded in skip mode. In skip mode, transform coefficients, motion vectors, and the like are not encoded. For this reason, the first geometric transformation prediction is not applied in skip mode.
 AvailAffineModeはマクロブロックで第二幾何変換予測が利用できるかどうかを示す内部パラメータである。AvailAffineModeが0の場合、第二幾何変換予測を利用しないように予測選択情報314が設定されていることを意味する。隣接ブロックの隣接動きベクトルと予測対象ブロックの動きベクトルが同一の値を持つ場合、AvailAffineModeは0となり、それ以外の場合、AvailAffineModeは1となる。 AvailAffineMode is an internal parameter indicating whether the second geometric transformation prediction can be used for the macroblock. When AvailAffineMode is 0, it means that the prediction selection information 314 is set so that the second geometric transformation prediction is not used. When the adjacent motion vector of the adjacent block and the motion vector of the prediction target block have the same value, AvailAffineMode is 0; otherwise, AvailAffineMode is 1.
 AvailAffineModeの設定は、隣接幾何変換パラメータや隣接動きベクトルを用いて設定することも可能である。例えば、隣接動きベクトルがまったく異なる方向を指している場合、本予測対象ブロックの隣接ブロックにオブジェクトの境界が存在する可能性があるため、AvailAffineModeを0と設定することも可能である。 AvailAffineMode can also be set using neighboring geometric transformation parameters or neighboring motion vectors. For example, when the neighboring motion vectors point in completely different directions, an object boundary may exist in a block adjacent to the prediction target block, so AvailAffineMode can be set to 0.
 一方、AvailAffineModeが1の場合は、第二幾何変換予測と動き補償予測のどちらを利用するかを示すmb_affine_motion_skip_flagが符号化される。mb_affine_motion_skip_flagが1の場合、スキップモードに対して第二幾何変換予測が適用されることを意味する。mb_affine_motion_skip_flagが0の場合、式(19)を用いて、動き補償予測が適用されることを意味する。 On the other hand, when AvailAffineMode is 1, mb_affine_motion_skip_flag, which indicates whether the second geometric transformation prediction or motion compensation prediction is used, is encoded. When mb_affine_motion_skip_flag is 1, it means that the second geometric transformation prediction is applied in skip mode. When mb_affine_motion_skip_flag is 0, it means that motion compensation prediction using Equation (19) is applied.
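The skip-mode branching described above can be summarized as decoder-side pseudologic. This is a sketch under our own naming; the actual neighbor derivation and entropy decoding are more involved:

```python
def avail_affine_mode(neighbor_mv, current_mv):
    # AvailAffineMode is 0 when the neighboring motion vector equals the
    # motion vector of the prediction target block, and 1 otherwise.
    return 0 if neighbor_mv == current_mv else 1

def skip_mode_prediction(avail, read_mb_affine_motion_skip_flag):
    """Choose the prediction applied to a skipped macroblock.
    `read_mb_affine_motion_skip_flag` stands in for the entropy decoder."""
    if avail == 0:
        # Second geometric transformation prediction unavailable:
        # translational motion compensation of Eq. (19) is used.
        return "translation"
    # The flag is present in the bitstream only when AvailAffineMode == 1.
    if read_mb_affine_motion_skip_flag() == 1:
        return "second_geometric_transformation"
    return "translation"
```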
 図15に示すマクロブロックレイヤーシンタクス1608の例では、mb_typeは、マクロブロックタイプ情報を示している。即ち、現在のマクロブロックがイントラ符号化されているか、インター符号化されているか、又はどのようなブロック形状で予測が行われているか、予測の方向が単方向予測か双方向予測か、などの情報を含んでいる。mb_typeは、マクロブロックプレディクションシンタクスと更にマクロブロック内のサブブロックのシンタクスを示すサブマクロブロックプレディクションシンタクスなどに渡される。 In the example of the macroblock layer syntax 1608 shown in FIG. 15, mb_type indicates the macroblock type information. That is, it includes information such as whether the current macroblock is intra-coded or inter-coded, what block shape is used for prediction, and whether the prediction direction is unidirectional or bidirectional. mb_type is passed to the macroblock prediction syntax and to the sub-macroblock prediction syntax, which describes the syntax of the sub-blocks within the macroblock.
 図16に示すマクロブロックプレディクションシンタクスの例では、mb_affine_pred_flagは、ブロックで、第一幾何変換予測を用いるか、第二幾何変換予測を用いるかを示している。フラグが0の場合、予測選択情報314は、第二幾何変換パラメータを用いるように設定されている。一方、フラグが1の場合、予測選択情報314は、第一幾何変換パラメータを用いるように設定されている。 In the example of the macroblock prediction syntax shown in FIG. 16, mb_affine_pred_flag indicates whether the first geometric transformation prediction or the second geometric transformation prediction is used in the block. When the flag is 0, the prediction selection information 314 is set to use the second geometric transformation parameter. On the other hand, when the flag is 1, the prediction selection information 314 is set to use the first geometric transformation parameter.
 NumMbPart()は、mb_typeに規定されたブロック分割数を返す内部関数であり、16×16画素ブロックの場合は1、16×8、8×16画素ブロックの場合は2、8×8画素ブロックの場合は4を出力する。 NumMbPart() is an internal function that returns the number of block partitions specified by mb_type: it returns 1 for a 16×16 pixel block, 2 for 16×8 and 8×16 pixel blocks, and 4 for an 8×8 pixel block.
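A direct transcription of this mapping (the partition-name strings are ours, introduced only for illustration):

```python
def num_mb_part(mb_type):
    """Number of block partitions specified by mb_type, cf. NumMbPart()."""
    partitions = {"16x16": 1, "16x8": 2, "8x16": 2, "8x8": 4}
    return partitions[mb_type]
```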
 図中のmv_l0、mv_l1はマクロブロックにおける動きベクトルの差分情報を示している。動きベクトル情報は、動画像符号化装置100の動き情報探索部106によって設定され、本実施形態で開示しない予測動きベクトルとの差分を取られた値である。 In the figure, mv_l0 and mv_l1 indicate motion vector difference information for the macroblock. The motion vector information is set by the motion information search unit 106 of the video encoding device 100, and is the difference from a predicted motion vector whose derivation is not described in the present embodiment.
 図中のmvd_l0_affine、mvd_l1_affineはマクロブロックにおける導出パラメータを示しており、アフィン変換パラメータの動きベクトルを除いた成分(a,b,d,e)の差分情報を示している。本シンタクス要素は、第一幾何変換パラメータが選択されたときだけ、符号化されている。 In the figure, mvd_l0_affine and mvd_l1_affine indicate derived parameters in the macroblock, and indicate difference information of components (a, b, d, e) excluding motion vectors of affine transformation parameters. This syntax element is encoded only when the first geometric transformation parameter is selected.
 図17に示すサブマクロブロックプレディクションシンタクスの例では、mb_affine_pred_flagは、ブロックで、第一幾何変換予測を用いるか、第二幾何変換予測を用いるかを示している。フラグが0の場合、予測選択情報314は、第二幾何変換パラメータを用いるように設定されている。一方、フラグが1の場合、予測選択情報314は、第一幾何変換パラメータを用いるように設定されている。 In the example of the sub-macroblock prediction syntax shown in FIG. 17, mb_affine_pred_flag indicates whether to use the first geometric transformation prediction or the second geometric transformation prediction in the block. When the flag is 0, the prediction selection information 314 is set to use the second geometric transformation parameter. On the other hand, when the flag is 1, the prediction selection information 314 is set to use the first geometric transformation parameter.
 図中のmv_l0、mv_l1はサブマクロブロックにおける動きベクトルの差分情報を示している。動きベクトル情報は、動画像符号化装置100の動き情報探索部106によって設定され、本実施形態で開示しない予測動きベクトルとの差分を取られた値である。 In the figure, mv_l0 and mv_l1 indicate motion vector difference information for the sub-macroblock. The motion vector information is set by the motion information search unit 106 of the video encoding device 100, and is the difference from a predicted motion vector whose derivation is not described in the present embodiment.
 図中のmvd_l0_affine、mvd_l1_affineはサブマクロブロックにおける導出パラメータを示しており、アフィン変換パラメータの動きベクトルを除いた成分(a,b,d,e)の差分情報を示している。本シンタクス要素は、第一幾何変換パラメータが選択されたときだけ、符号化されている。 Mvd_l0_affine and mvd_l1_affine in the figure indicate derived parameters in the sub-macroblock, and indicate difference information of components (a, b, d, e) excluding the motion vectors of the affine transformation parameters. This syntax element is encoded only when the first geometric transformation parameter is selected.
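The conditional presence of the affine difference components can be sketched as a parsing routine. This is a sketch under assumed reader callbacks; whether the affine components are signaled once per block or per partition is our simplification for illustration:

```python
def parse_prediction(read_flag, read_mvd, read_mvd_affine, num_parts):
    """Sketch of the (sub-)macroblock prediction syntax of Figs. 16 and 17."""
    mb_affine_pred_flag = read_flag()
    parts = []
    for _ in range(num_parts):
        part = {"mvd": read_mvd()}  # mv_l0 / mv_l1 difference information
        if mb_affine_pred_flag == 1:
            # mvd_l0_affine / mvd_l1_affine: differences of the (a, b, d, e)
            # components, coded only when the first geometric transformation
            # parameter is selected.
            part["mvd_affine"] = read_mvd_affine()
        parts.append(part)
    return mb_affine_pred_flag, parts
```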
 図12乃至図17に示されるシンタクス構造は、予測対象画素ブロックがスキップモードのときには、平行移動モデルを用いた従来の動き補償予測又は第二幾何変換予測を選択可能であり、それ以外のインター予測の場合では、第一幾何変換予測又は第二幾何変換予測を選択可能である。 With the syntax structure shown in FIGS. 12 to 17, when the prediction target pixel block is in skip mode, either conventional motion compensation prediction using the translation model or the second geometric transformation prediction can be selected; in other inter prediction cases, either the first geometric transformation prediction or the second geometric transformation prediction can be selected.
 なお、図12乃至図17に示すシンタクスの表中の行間には、本実施形態において規定していないシンタクス要素が挿入されてもよく、その他の条件分岐に関する記述が含まれていてもよい。また、シンタクステーブルを複数のテーブルに分割し、又は複数のシンタクステーブルを統合してもよい。また、必ずしも同一の用語を用いる必要は無く、利用する形態によって任意に変更してもよい。 It should be noted that syntax elements not defined in the present embodiment may be inserted between lines in the syntax tables shown in FIGS. 12 to 17, and descriptions regarding other conditional branches may be included. Further, the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.
 [第4の実施形態]
 次に、第4の実施形態について説明する。第4の実施形態に係る動画像復号化装置の構成は、図21に示す第3の実施形態と同一である。なお、第3の実施形態と同じ機能を持つブロック、シンタクスには同一の符号を付し、ここでは説明を省略する。第4の実施形態では、シンタクス構造のみが第3の実施形態と異なるが、第2の実施形態のシンタクス構造と実質的に同じである。
[Fourth Embodiment]
Next, a fourth embodiment will be described. The configuration of the video decoding apparatus according to the fourth embodiment is the same as that of the third embodiment shown in FIG. Note that blocks and syntax having the same functions as those of the third embodiment are denoted by the same reference numerals, and description thereof is omitted here. In the fourth embodiment, only the syntax structure is different from the third embodiment, but is substantially the same as the syntax structure of the second embodiment.
 図18に示すマクロブロックレイヤーシンタクス1608の例のように、mb_typeは、マクロブロックタイプ情報を示している。即ち、現在のマクロブロックがイントラ符号化されているか、インター符号化されているか、又はどのようなブロック形状で予測が行われているか、予測の方向が単方向予測か双方向予測か、などの情報を含んでいる。mb_typeは、マクロブロックプレディクションシンタクスと更にマクロブロック内のサブブロックのシンタクスを示すサブマクロブロックプレディクションシンタクスなどに渡される。mb_additional_affine_motion_flagは、予測対象ブロックに対して、第一幾何変換パラメータを利用するか、第二幾何変換パラメータを利用するかを選択するフラグ情報を示している。本フラグが0の場合、第二幾何変換パラメータが利用され、本フラグが1の場合、第一幾何変換パラメータが利用される。 As in the example of the macroblock layer syntax 1608 shown in FIG. 18, mb_type indicates the macroblock type information. That is, it includes information such as whether the current macroblock is intra-coded or inter-coded, what block shape is used for prediction, and whether the prediction direction is unidirectional or bidirectional. mb_type is passed to the macroblock prediction syntax and to the sub-macroblock prediction syntax, which describes the syntax of the sub-blocks within the macroblock. mb_additional_affine_motion_flag is flag information that selects whether the first geometric transformation parameter or the second geometric transformation parameter is used for the prediction target block. When this flag is 0, the second geometric transformation parameter is used; when this flag is 1, the first geometric transformation parameter is used.
 図19に示すマクロブロックプレディクションシンタクスの例では、mb_affine_pred_flagは、ブロックで、第一幾何変換予測又は第二幾何変換予測を含む、幾何変換予測を用いるか、平行移動モデルの動き補償予測を用いるか、を示している。フラグが0の場合、mb_additional_affine_motion_flagに関わらず、予測選択情報314は、平行移動モデルの動き補償予測を用いるように設定されている。一方、フラグが1の場合、予測選択情報314は、mb_additional_affine_motion_flagのフラグ情報に従って、第一幾何変換パラメータを用いるか、第二幾何変換パラメータを用いるかが設定されている。 In the example of the macroblock prediction syntax shown in FIG. 19, mb_affine_pred_flag indicates whether the block uses geometric transformation prediction (including the first and second geometric transformation predictions) or motion compensation prediction based on the translation model. When the flag is 0, the prediction selection information 314 is set to use motion compensation prediction of the translation model, regardless of mb_additional_affine_motion_flag. On the other hand, when the flag is 1, whether the first geometric transformation parameter or the second geometric transformation parameter is used is set in the prediction selection information 314 according to the flag information of mb_additional_affine_motion_flag.
 図20に示すサブマクロブロックプレディクションシンタクスの例では、mb_affine_pred_flagは、ブロックで、第一幾何変換予測又は第二幾何変換予測を含む、幾何変換予測を用いるか、平行移動モデルの動き補償予測を用いるか、を示している。フラグが0の場合、mb_additional_affine_motion_flagに関わらず、予測選択情報314は、平行移動モデルの動き補償予測を用いるように設定されている。一方、フラグが1の場合、予測選択情報314は、mb_additional_affine_motion_flagのフラグ情報に従って、第一幾何変換パラメータを用いるか、第二幾何変換パラメータを用いるかが設定されている。 In the example of the sub-macroblock prediction syntax shown in FIG. 20, mb_affine_pred_flag indicates whether the block uses geometric transformation prediction (including the first and second geometric transformation predictions) or motion compensation prediction based on the translation model. When the flag is 0, the prediction selection information 314 is set to use motion compensation prediction of the translation model, regardless of mb_additional_affine_motion_flag. On the other hand, when the flag is 1, whether the first geometric transformation parameter or the second geometric transformation parameter is used is set in the prediction selection information 314 according to the flag information of mb_additional_affine_motion_flag.
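Combining the two flags, the mode selection described above can be sketched as follows (the return labels are ours, introduced only for illustration):

```python
def select_prediction(mb_affine_pred_flag, mb_additional_affine_motion_flag):
    """Prediction selection per the syntax of Figs. 18 to 20."""
    if mb_affine_pred_flag == 0:
        # Translational motion compensation, regardless of the other flag.
        return "translation"
    if mb_additional_affine_motion_flag == 1:
        return "first_geometric_transformation"
    return "second_geometric_transformation"
```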
 第4の実施形態に係るシンタクス構造は、予測対象画素ブロックがスキップモードのときには、平行移動モデルを用いた従来の動き補償又は第二幾何変換予測を選択可能であり、それ以外のインター予測の場合では、マクロブロックレベルで、第一幾何変換予測を用いるか、第二幾何変換予測を用いるかを判断し、サブマクロブロックレベルで第一幾何変換予測又は第二幾何変換予測を含む、幾何変換予測を用いるか、平行移動モデルの動き補償予測を用いるか、を選択可能である。 With the syntax structure according to the fourth embodiment, when the prediction target pixel block is in skip mode, either conventional motion compensation using the translation model or the second geometric transformation prediction can be selected. In other inter prediction cases, whether the first or the second geometric transformation prediction is used is determined at the macroblock level, and whether geometric transformation prediction (the first or second geometric transformation prediction) or motion compensation prediction of the translation model is used can be selected at the sub-macroblock level.
 なお、図18乃至図20に示すシンタクスの表中の行間には、本実施形態において規定していないシンタクス要素が挿入されてもよく、その他の条件分岐に関する記述が含まれていてもよい。また、シンタクステーブルを複数のテーブルに分割し、又は複数のシンタクステーブルを統合してもよい。また、必ずしも同一の用語を用いる必要は無く、利用する形態によって任意に変更してもよい。 It should be noted that syntax elements not defined in the present embodiment may be inserted between lines in the syntax tables shown in FIGS. 18 to 20, and descriptions regarding other conditional branches may be included. Further, the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.
 (第1乃至第4の実施形態の変形例)
 (1)第1乃至第4の実施形態においては、処理対象フレームを16×16画素サイズなどの矩形ブロックに分割し、図4に示したように画面左上のブロックから右下に向かって順に符号化/復号化する場合について説明しているが、符号化順序及び復号化順序はこれに限られない。例えば、右下から左上に向かって順に符号化及び復号化を行ってもよいし、画面中央から渦巻状に向かって順に符号化及び復号化を行ってもよい。さらに、右上から左下に向かって順に符号化及び復号化を行ってもよいし、画面の周辺部から中心部に向かって順に符号化及び復号化を行ってもよい。
(Modification of the first to fourth embodiments)
(1) In the first to fourth embodiments, the case has been described in which the frame to be processed is divided into rectangular blocks of, for example, 16×16 pixels and encoded/decoded in order from the upper-left block of the screen toward the lower right, as shown in FIG. 4; however, the encoding order and decoding order are not limited to this. For example, encoding and decoding may proceed in order from the lower right toward the upper left, or in a spiral outward from the center of the screen. Furthermore, they may proceed in order from the upper right toward the lower left, or from the periphery of the screen toward the center.
 (2)第1乃至第4の実施形態においては、ブロックサイズを4×4画素ブロック、8×8画素ブロックとして説明を行ったが、予測対象ブロックは均一なブロック形状にする必要なく、16×8画素ブロック、8×16画素ブロック、8×4画素ブロック、4×8画素ブロックなどの何れのブロックサイズであってもよい。また、1つのマクロブロック内でも全てのブロックを同一にする必要はなく、異なるサイズのブロックを混在させてもよい。この場合、分割数が増えると分割情報を符号化又は復号化するための符号量が増加する。そこで、変換係数の符号量と局部復号画像又は復号画像とのバランスを考慮して、ブロックサイズを選択すればよい。 (2) In the first to fourth embodiments, the description used 4×4 and 8×8 pixel blocks as the block sizes, but the prediction target blocks need not have a uniform block shape; any block size, such as 16×8, 8×16, 8×4, or 4×8 pixel blocks, may be used. It is also unnecessary for all blocks within one macroblock to be identical; blocks of different sizes may be mixed. In this case, as the number of partitions increases, the amount of code required to encode or decode the partition information increases. Therefore, the block size may be selected in consideration of the balance between the code amount of the transform coefficients and the locally decoded image or decoded image.
 (3)第1乃至第4の実施形態においては、輝度信号と色差信号を分割せず、一方の色信号成分に限定した例として記述した。しかし、予測処理が輝度信号と色差信号で異なる場合、それぞれ異なる予測方法を用いてもよいし、同一の予測方法を用いても良い。異なる予測方法を用いる場合は、色差信号に対して選択した予測方法を輝度信号と同様の方法で符号化又は復号化する。 (3) In the first to fourth embodiments, the description was given as an example limited to one color signal component, without separating the luminance signal and the color-difference signal. However, when the prediction processing differs between the luminance signal and the color-difference signal, different prediction methods may be used for each, or the same prediction method may be used. When different prediction methods are used, the prediction method selected for the color-difference signal is encoded or decoded in the same manner as for the luminance signal.
 (4)第1乃至第4の実施形態においては、予測幾何変換パラメータをどの隣接ブロックから利用したかの情報を符号化データに含ませない例を記述した。しかし、どの隣接ブロックから利用したかの情報を、符号化データに含ませてもよい。 (4) In the first to fourth embodiments, an example was described in which the information indicating from which neighboring block the predicted geometric transformation parameter was derived is not included in the encoded data. However, the information indicating which neighboring block was used may be included in the encoded data.
 上述した実施形態の手法を用いることで、平行移動モデルに適さない動オブジェクトを予測するために、過度のブロック分割が施されて、ブロック分割情報が増大することを防ぐ。つまり、付加的な情報を大幅に増加させずに、ブロック内のオブジェクトの幾何変形を予測し、それぞれに好適な幾何変換パラメータを適用することによって、符号化効率を向上させると共に主観画質も向上するという効果を奏する。 By using the methods of the above-described embodiments, excessive block partitioning for predicting moving objects that do not fit the translation model, and the resulting increase in block partition information, can be avoided. That is, by predicting the geometric deformation of the objects within a block and applying a suitable geometric transformation parameter to each, without significantly increasing the additional information, both coding efficiency and subjective image quality are improved.
 なお、本発明は上記実施形態そのままに限定されるものではなく、実施段階ではその要旨を逸脱しない範囲で構成要素を変形して具体化できる。また、上記実施形態に開示されている複数の構成要素の適宜な組み合わせにより、種々の発明を形成できる。例えば、実施形態に示される全構成要素から幾つかの構成要素を削除してもよい。さらに、異なる実施形態にわたる構成要素を適宜組み合わせてもよい。 Note that the present invention is not limited to the above-described embodiment as it is, and can be embodied by modifying constituent elements without departing from the scope of the invention in the implementation stage. In addition, various inventions can be formed by appropriately combining a plurality of components disclosed in the embodiment. For example, some components may be deleted from all the components shown in the embodiment. Furthermore, constituent elements over different embodiments may be appropriately combined.
 上記の実施形態に記載した本発明の手法は、コンピュータによって実行させることができ、また、コンピュータに実行させることのできるプログラムとして、磁気ディスク(フレキシブルディスク、ハードディスクなど)、光ディスク(CD-ROM、DVDなど)、半導体メモリなどの記録媒体に格納して頒布することもできる。 The methods of the present invention described in the above embodiments can be executed by a computer, and can also be stored and distributed, as a program executable by a computer, on a recording medium such as a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory.
 本発明の実施形態に従った予測信号生成装置、動画像符号化装置及び動画像復号化装置によれば、幾何変換動き補償予測に用いる幾何変換パラメータの導出によって発生したパラメータの誤差を低減するとともに、誤差伝播を抑制し、符号量を増加させることなく予測効率を向上する予測信号生成装置、動画像符号化装置及び動画像復号化装置を提供することが可能になる。 The prediction signal generation device, video encoding device, and video decoding device according to the embodiments of the present invention make it possible to reduce parameter errors arising from the derivation of the geometric transformation parameters used for geometric transformation motion compensation prediction, to suppress error propagation, and to improve prediction efficiency without increasing the code amount.
 また、本発明の他の態様として、以下のような予測信号生成方法、動画像符号化方法及び動画像復号化方法が提供できる。 Also, as another aspect of the present invention, the following prediction signal generation method, video encoding method, and video decoding method can be provided.
 予測信号を生成する予測信号生成方法は、画素ブロックの幾何変換による画像の形状に係る情報を示す第一の幾何変換パラメータと第二の幾何変換パラメータのどちらを用いるかを示す予測選択情報を設定するステップと、画像信号が分割された複数の画素ブロックの1つの画素ブロックに隣接する複数の第1の隣接ブロックのうちの、既に予測信号生成処理が完了した1つ以上の第2の隣接ブロックの動き情報又は前記第一の幾何変換パラメータ及び前記第二の幾何変換パラメータを取得するステップと、前記1つ以上の第2の隣接ブロックの幾何変換パラメータから、前記1つの画素ブロックの予測幾何変換パラメータを導出するステップと、外部から入力された幾何変換パラメータの導出値と前記予測幾何変換パラメータから、予め定められた方法によって前記第一の幾何変換パラメータを導出して、設定するステップと、前記1つの画素ブロック及び前記1つ以上の第2の隣接ブロックの動き情報に基づいて、前記第二の幾何変換パラメータを設定するステップと、前記1つの画素ブロックに対する動き補償を行う際の参照画像信号に対して前記選択情報に示される前記第一の幾何変換パラメータ又は第二の幾何変換パラメータを用いて幾何変換処理を行って予測信号を生成するステップと、を含む。 A prediction signal generation method for generating a prediction signal includes: setting prediction selection information indicating which of a first geometric transformation parameter and a second geometric transformation parameter, each indicating information on the shape of an image under geometric transformation of a pixel block, is used; acquiring motion information, or the first and second geometric transformation parameters, of one or more second adjacent blocks for which prediction signal generation processing has already been completed, among a plurality of first adjacent blocks adjacent to one pixel block of a plurality of pixel blocks into which an image signal is divided; deriving a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; deriving and setting the first geometric transformation parameter by a predetermined method from an externally input derived value of the geometric transformation parameter and the predicted geometric transformation parameter; setting the second geometric transformation parameter based on the motion information of the one pixel block and the one or more second adjacent blocks; and generating a prediction signal by performing geometric transformation processing, using the first geometric transformation parameter or the second geometric transformation parameter indicated in the selection information, on a reference image signal used when performing motion compensation for the one pixel block.
 動画像符号化方法は画素ブロックの幾何変換による画像の形状に係る情報を示す第一の幾何変換パラメータと第二の幾何変換パラメータのどちらを用いるかを示す予測選択情報を設定するステップと、画像信号が分割された複数の画素ブロックの1つの画素ブロックに隣接する複数の第1の隣接ブロックのうちの、既に予測信号生成処理が完了した1つ以上の第2の隣接ブロックの動き情報又は前記第一の幾何変換パラメータ及び前記第二の幾何変換パラメータを取得するステップと、前記1つ以上の第2の隣接ブロックの幾何変換パラメータから、前記1つの画素ブロックの予測幾何変換パラメータを導出するステップと、外部から入力された幾何変換パラメータの導出値と前記予測幾何変換パラメータから、予め定められた方法によって前記第一の幾何変換パラメータを導出して、設定するステップと、前記1つの画素ブロック及び前記1つ以上の第2の隣接ブロックの動き情報に基づいて、前記第二の幾何変換パラメータを設定するステップと、前記1つの画素ブロックに対する動き補償を行う際の参照画像信号に対して前記選択情報に示される前記第一の幾何変換パラメータ又は第二の幾何変換パラメータを用いて幾何変換処理を行って予測信号を生成するステップと、前記第一の幾何変換パラメータと前記第二の幾何変換パラメータのどちらを用いるかを示す前記予測選択情報を符号化するステップと、前記第一の幾何変換パラメータが選択された場合に、前記第一の幾何変換パラメータと前記予測幾何変換パラメータから、予め規定された方法で幾何変換パラメータの導出値を導出して、前記導出値を符号化するステップと、前記入力画像信号と前記予測信号の差分信号を符号化するステップと、を含む。 A video encoding method includes: setting prediction selection information indicating which of a first geometric transformation parameter and a second geometric transformation parameter, each indicating information on the shape of an image under geometric transformation of a pixel block, is used; acquiring motion information, or the first and second geometric transformation parameters, of one or more second adjacent blocks for which prediction signal generation processing has already been completed, among a plurality of first adjacent blocks adjacent to one pixel block of a plurality of pixel blocks into which an image signal is divided; deriving a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second adjacent blocks; deriving and setting the first geometric transformation parameter by a predetermined method from an externally input derived value of the geometric transformation parameter and the predicted geometric transformation parameter; setting the second geometric transformation parameter based on the motion information of the one pixel block and the one or more second adjacent blocks; generating a prediction signal by performing geometric transformation processing, using the first geometric transformation parameter or the second geometric transformation parameter indicated in the selection information, on a reference image signal used when performing motion compensation for the one pixel block; encoding the prediction selection information indicating which of the first geometric transformation parameter and the second geometric transformation parameter is used; when the first geometric transformation parameter is selected, deriving a derived value of the geometric transformation parameter from the first geometric transformation parameter and the predicted geometric transformation parameter by a predefined method, and encoding the derived value; and encoding a difference signal between the input image signal and the prediction signal.
A moving image decoding method that interprets moving image encoded data, in which an input image signal has been encoded in units of a plurality of pixel blocks, and decodes it by a prescribed method includes: a step of acquiring, for one or more second neighboring blocks whose decoding has already been completed among a plurality of first neighboring blocks adjacent to one pixel block of the plurality of pixel blocks into which the input image signal is divided, motion information or a first geometric transformation parameter and a second geometric transformation parameter indicating information on the shape of the image under a geometric transformation of the pixel block; a step of decoding selection information indicating which of the first geometric transformation parameter and the second geometric transformation parameter is used; a step of decoding a derived value of the geometric transformation parameter when the first geometric transformation parameter is selected; a step of deriving a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second neighboring blocks; a step of deriving and setting the first geometric transformation parameter from the decoded derived value of the geometric transformation parameter and the predicted geometric transformation parameter by a predetermined method; a step of setting the second geometric transformation parameter based on motion information of the one pixel block and the one or more second neighboring blocks; and a step of generating a prediction signal by performing a geometric transformation process on a reference image signal, using the first geometric transformation parameter or the second geometric transformation parameter indicated by the selection information, when performing motion compensation for the one pixel block.
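The step of deriving a predicted geometric transformation parameter from the neighboring blocks can be sketched as follows. The patent does not fix the derivation rule; a component-wise median over the available neighbors' parameter sets, analogous to the median motion-vector predictor of H.264/AVC, is assumed here purely for illustration.

```python
# Illustrative sketch of deriving the predicted geometric transformation
# parameter of the current block from one or more already-processed second
# neighboring blocks. The derivation rule is an assumption (component-wise
# median); the patent only requires a "predetermined method" shared by the
# encoder and decoder.

def predict_params(neighbor_params):
    """Component-wise median over the neighbors' parameter sets."""
    def median(values):
        s = sorted(values)
        return s[len(s) // 2]  # lower median for even counts
    # zip(*...) groups the i-th component of every neighbor together
    return [median(component) for component in zip(*neighbor_params)]

# Affine parameter sets [a, b, tx, c, d, ty] of three decoded neighbors
left        = [1.0, 0.0, 2.0,  0.0, 1.0, -1.0]
above       = [1.1, 0.0, 3.0,  0.0, 0.9, -2.0]
above_right = [0.9, 0.1, 4.0, -0.1, 1.0, -3.0]
predicted = predict_params([left, above, above_right])
```

Since the rule uses only decoded data, both sides derive the identical predictor without any extra signaling.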
The moving image encoding device, moving image decoding device, moving image encoding method, and moving image decoding method according to the present invention are useful for highly efficient moving image encoding, and are particularly suited to moving image encoding that reduces the motion estimation processing required to estimate the geometric transformation parameters used in geometric transformation motion-compensated prediction.
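The core operation throughout, generating a prediction signal by geometrically transforming the reference image during motion compensation, can be sketched minimally as below. Nearest-neighbor sampling and a plain 6-parameter affine map are simplifying assumptions; a real codec would use sub-pel interpolation filters.

```python
# Minimal sketch of geometric transformation motion-compensated prediction:
# each pixel (x, y) of the current block is predicted from the reference
# image at the affine-mapped position (a*x + b*y + tx, c*x + d*y + ty).
# Nearest-neighbor sampling is used here for brevity; actual codecs
# interpolate at sub-pel accuracy.

def affine_predict(ref, block_w, block_h, params):
    a, b, tx, c, d, ty = params
    h, w = len(ref), len(ref[0])
    pred = []
    for y in range(block_h):
        row = []
        for x in range(block_w):
            # map to reference coordinates and clamp to the picture bounds
            rx = min(max(int(round(a * x + b * y + tx)), 0), w - 1)
            ry = min(max(int(round(c * x + d * y + ty)), 0), h - 1)
            row.append(ref[ry][rx])
        pred.append(row)
    return pred

# 8x8 reference picture with value x + 10*y at position (x, y)
ref = [[x + 10 * y for x in range(8)] for y in range(8)]
# With a pure translation by (2, 1) the affine map reduces to
# conventional block-based motion compensation.
pred = affine_predict(ref, 4, 4, (1.0, 0.0, 2.0, 0.0, 1.0, 1.0))
```

With non-trivial `a, b, c, d` the same routine models rotation, scaling, and shear, which conventional translational motion compensation cannot represent.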

Claims (7)

  1.  A prediction signal generation device comprising:
     a first setting unit that sets prediction selection information indicating which of a first geometric transformation parameter and a second geometric transformation parameter, each indicating information on the shape of an image under a geometric transformation of a pixel block, is used;
     an acquisition unit that acquires motion information or the geometric transformation parameters of one or more second neighboring blocks, for which prediction signal generation processing has already been completed, among a plurality of first neighboring blocks adjacent to one pixel block of a plurality of pixel blocks into which an image signal is divided;
     a derivation unit that derives a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second neighboring blocks;
     a second setting unit that derives and sets the first geometric transformation parameter from an input derived value of the geometric transformation parameter and the predicted geometric transformation parameter by a predetermined method;
     a third setting unit that sets the second geometric transformation parameter based on motion information of the one pixel block and the one or more second neighboring blocks; and
     a generation unit that generates a prediction signal by performing a geometric transformation process on a reference image signal, using the first geometric transformation parameter or the second geometric transformation parameter indicated by the selection information, when performing motion compensation for the one pixel block.
  2.  The prediction signal generation device according to claim 1, wherein the third setting unit obtains the second geometric transformation parameter by transforming the motion information based on a relative position between the neighboring block and the pixel block.
  3.  The prediction signal generation device according to claim 1, wherein the motion information acquisition unit acquires one piece of motion information based on motion vectors obtained by performing, for each of the plurality of second neighboring blocks, motion prediction with respect to the reference image signal.
  4.  The prediction signal generation device according to claim 1, wherein, when geometric transformation prediction is not selected for the one or more second neighboring blocks, the motion information acquisition unit derives a geometric transformation parameter using motion vectors, for which prediction processing has already been completed, corresponding to the neighboring block with the neighboring block as a reference.
  5.  A moving image encoding device using the prediction signal generation device according to claim 1, further comprising:
     a first encoding unit that encodes the prediction selection information indicating which of the first geometric transformation parameter and the second geometric transformation parameter is used;
     a second encoding unit that, when the first geometric transformation parameter is selected, derives a derived value of the geometric transformation parameter from the first geometric transformation parameter and the predicted geometric transformation parameter by a predetermined method, and encodes the derived value; and
     a third encoding unit that encodes information indicating a differential signal between the input image signal and the prediction signal.
  6.  The moving image encoding device using the prediction signal generation device according to claim 1, wherein the first encoding unit further encodes information indicating which of the geometric transformation parameters of the one or more second neighboring blocks was used in the derivation unit.
  7.  A moving image decoding device that interprets moving image encoded data, in which an input image signal has been encoded in units of a plurality of pixel blocks, and decodes it by a prescribed method, the device comprising:
     a motion information acquisition unit that acquires motion information of one or more second neighboring blocks whose decoding has already been completed, among a plurality of first neighboring blocks adjacent to one pixel block of a plurality of pixel blocks into which the input image signal is divided, or geometric transformation parameters indicating information on the shape of the image under a geometric transformation of the pixel block;
     a first decoding unit that decodes selection information indicating which of the first geometric transformation parameter and the second geometric transformation parameter is used;
     a second decoding unit that decodes a derived value of the geometric transformation parameter when the first geometric transformation parameter is selected;
     a derivation unit that derives a predicted geometric transformation parameter of the one pixel block from the geometric transformation parameters of the one or more second neighboring blocks;
     a first setting unit that derives and sets the first geometric transformation parameter from the decoded derived value of the geometric transformation parameter and the predicted geometric transformation parameter by a predetermined method;
     a second setting unit that, when the second geometric transformation parameter is selected, sets the second geometric transformation parameter based on motion information of the one pixel block and the one or more second neighboring blocks; and
     a generation unit that generates a prediction signal by performing a geometric transformation process on a reference image signal, using the first geometric transformation parameter or the second geometric transformation parameter indicated by the selection information, when performing motion compensation for the one pixel block.
PCT/JP2009/063692 2009-07-31 2009-07-31 Prediction-signal producing device using geometric transformation motion-compensation prediction, time-varying image encoding device, and time-varying image decoding device WO2011013253A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2009/063692 WO2011013253A1 (en) 2009-07-31 2009-07-31 Prediction-signal producing device using geometric transformation motion-compensation prediction, time-varying image encoding device, and time-varying image decoding device


Publications (1)

Publication Number Publication Date
WO2011013253A1 true WO2011013253A1 (en) 2011-02-03

Family

ID=43528928

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/063692 WO2011013253A1 (en) 2009-07-31 2009-07-31 Prediction-signal producing device using geometric transformation motion-compensation prediction, time-varying image encoding device, and time-varying image decoding device

Country Status (1)

Country Link
WO (1) WO2011013253A1 (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09266574A (en) * 1996-01-22 1997-10-07 Matsushita Electric Ind Co Ltd Image encoding device and image decoding device
JP2000138935A (en) * 1998-10-29 2000-05-16 Fujitsu Ltd Motion vector encoding device and decoding device
JP2005244503A (en) * 2004-02-25 2005-09-08 Sony Corp Apparatus and method for coding image information
JP2007312397A (en) * 2007-05-25 2007-11-29 Nokia Corp Method and apparatus for video frame transfer in communication system


Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2504069B (en) * 2012-07-12 2015-09-16 Canon Kk Method and device for predicting an image portion for encoding or decoding of an image
US9779516B2 (en) 2012-07-12 2017-10-03 Canon Kabushiki Kaisha Method and device for predicting an image portion for encoding or decoding of an image
GB2504069A (en) * 2012-07-12 2014-01-22 Canon Kk Intra-prediction using a parametric displacement transformation
US10349079B2 (en) 2015-02-16 2019-07-09 Huawei Technologies Co., Ltd. Video image encoding method, video image decoding method, encoding device, and decoding device
JP2018509087A (en) * 2015-02-16 2018-03-29 華為技術有限公司Huawei Technologies Co.,Ltd. Video image encoding method, video image decoding method, encoding device, and decoding device
JP2018529255A (en) * 2015-08-29 2018-10-04 ホアウェイ・テクノロジーズ・カンパニー・リミテッド Image prediction method and apparatus
US11368678B2 (en) 2015-08-29 2022-06-21 Huawei Technologies Co., Ltd. Image prediction method and device
US11979559B2 (en) 2015-08-29 2024-05-07 Huawei Technologies Co., Ltd. Image prediction method and device
US10880543B2 (en) 2015-08-29 2020-12-29 Huawei Technologies Co., Ltd. Image prediction method and device
KR20180043830A (en) * 2015-09-29 2018-04-30 후아웨이 테크놀러지 컴퍼니 리미티드 Image prediction method and apparatus
US11323736B2 (en) 2015-09-29 2022-05-03 Huawei Technologies Co., Ltd. Image prediction method and apparatus
KR20200057120A (en) * 2015-09-29 2020-05-25 후아웨이 테크놀러지 컴퍼니 리미티드 Image prediction method and device
US10560712B2 (en) 2016-05-16 2020-02-11 Qualcomm Incorporated Affine motion prediction for video coding
US11503324B2 (en) 2016-05-16 2022-11-15 Qualcomm Incorporated Affine motion prediction for video coding
WO2018067823A1 (en) * 2016-10-05 2018-04-12 Qualcomm Incorporated Motion vector prediction for affine motion models in video coding
RU2718225C1 (en) * 2016-10-05 2020-03-31 Квэлкомм Инкорпорейтед Motion vector prediction for affine motion models in video coding
US10448010B2 (en) 2016-10-05 2019-10-15 Qualcomm Incorporated Motion vector prediction for affine motion models in video coding
US11082687B2 (en) 2016-10-05 2021-08-03 Qualcomm Incorporated Motion vector prediction for affine motion models in video coding
EP3758378A1 (en) * 2016-10-05 2020-12-30 QUALCOMM Incorporated Motion vector prediction for affine motion models in video coding
US11736707B2 (en) 2017-08-03 2023-08-22 Lg Electronics Inc. Method and apparatus for processing video signal using affine prediction
CN111066324B (en) * 2017-08-03 2024-05-28 Oppo广东移动通信有限公司 Method and apparatus for processing video signal using affine prediction
CN111066324A (en) * 2017-08-03 2020-04-24 Lg 电子株式会社 Method and apparatus for processing video signal using affine prediction
US11877001B2 (en) 2017-10-10 2024-01-16 Qualcomm Incorporated Affine prediction in video coding
WO2019245228A1 (en) * 2018-06-18 2019-12-26 엘지전자 주식회사 Method and device for processing video signal using affine motion prediction
US11140410B2 (en) 2018-06-18 2021-10-05 Lg Electronics Inc. Method and device for processing video signal using affine motion prediction
CN112567749B (en) * 2018-06-18 2024-03-26 Lg电子株式会社 Method and apparatus for processing video signal using affine motion prediction
CN112567749A (en) * 2018-06-18 2021-03-26 Lg电子株式会社 Method and apparatus for processing video signal using affine motion prediction
US11632567B2 (en) 2018-06-18 2023-04-18 Lg Electronics Inc. Method and device for processing video signal using affine motion prediction
GB2577318A (en) * 2018-09-21 2020-03-25 Canon Kk Video coding and decoding
GB2577318B (en) * 2018-09-21 2021-03-10 Canon Kk Video coding and decoding
US11909953B2 (en) 2018-09-23 2024-02-20 Beijing Bytedance Network Technology Co., Ltd Representation of affine model
US11870974B2 (en) 2018-09-23 2024-01-09 Beijing Bytedance Network Technology Co., Ltd Multiple-hypothesis affine mode
WO2020058954A1 (en) * 2018-09-23 2020-03-26 Beijing Bytedance Network Technology Co., Ltd. Representation of affine model
US11778226B2 (en) 2018-10-22 2023-10-03 Beijing Bytedance Network Technology Co., Ltd Storage of motion information for affine mode
US11785242B2 (en) 2018-11-30 2023-10-10 Hfi Innovation Inc. Video processing methods and apparatuses of determining motion vectors for storage in video coding systems
WO2020108560A1 (en) * 2018-11-30 2020-06-04 Mediatek Inc. Video processing methods and apparatuses of determining motion vectors for storage in video coding systems
US11290739B2 (en) 2018-11-30 2022-03-29 Mediatek Inc. Video processing methods and apparatuses of determining motion vectors for storage in video coding systems
CN113170174B (en) * 2018-11-30 2024-04-12 寰发股份有限公司 Video processing method and apparatus for determining motion vectors for storage in video coding system
TWI737055B (en) * 2018-11-30 2021-08-21 聯發科技股份有限公司 Video processing methods and apparatuses of determining motion vectors for storage in video coding systems
CN113170174A (en) * 2018-11-30 2021-07-23 联发科技股份有限公司 Video processing method and apparatus for determining motion vector for storage in video coding system
WO2024066332A1 (en) * 2022-09-27 2024-04-04 Beijing Xiaomi Mobile Software Co., Ltd. Encoding/decoding video picture data

Similar Documents

Publication Publication Date Title
WO2011013253A1 (en) Prediction-signal producing device using geometric transformation motion-compensation prediction, time-varying image encoding device, and time-varying image decoding device
KR101984764B1 (en) Video Coding and Decoding Method and Apparatus
JP6615287B2 (en) Image decoding device
JP6667609B2 (en) Image encoding device, image encoding method, image decoding device, and image decoding method
KR101362757B1 (en) Method and apparatus for image encoding and decoding using inter color compensation
JP7012809B2 (en) Image coding device, moving image decoding device, moving image coding data and recording medium
JP5061179B2 (en) Illumination change compensation motion prediction encoding and decoding method and apparatus
KR101670532B1 (en) Method for decoding a stream representative of a sequence of pictures, method for coding a sequence of pictures and coded data structure
JP4844449B2 (en) Moving picture encoding apparatus, method, program, moving picture decoding apparatus, method, and program
WO2012176381A1 (en) Moving image encoding apparatus, moving image decoding apparatus, moving image encoding method and moving image decoding method
JP5310614B2 (en) Moving picture coding apparatus, moving picture coding method, moving picture decoding apparatus, and moving picture decoding method
JP2010011075A (en) Method and apparatus for encoding and decoding moving image
WO2014163200A1 (en) Color image encoding apparatus, color image decoding apparatus, color image encoding method, and color image decoding method
WO2010090335A1 (en) Motion picture coding device and motion picture decoding device using geometric transformation motion compensating prediction
JP2005086834A (en) Method for encoding frame sequence, method for decoding frame sequence, apparatus for implementing the method, computer program for implementing the method and recording medium for storing the computer program
JP4360093B2 (en) Image processing apparatus and encoding apparatus and methods thereof
KR20150135457A (en) Method for encoding a plurality of input images and storage medium and device for storing program
JP2010183162A (en) Motion picture encoder
WO2012176387A1 (en) Video encoding device, video decoding device, video encoding method and video decoding method
JP2008193501A (en) Image encoding device and image encoding method
JP5533885B2 (en) Moving picture encoding apparatus and moving picture decoding apparatus
JP2023086397A (en) Intra prediction device, decoding device, and program
WO2013077305A1 (en) Image decoding device, image decoding method, image coding device
JP2009111733A (en) Method, device and program for encoding image
KR20150022952A (en) Reference Frame Creating Method and Apparatus and Video Encoding/Decoding Method and Apparatus Using Same

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09847844

Country of ref document: EP

Kind code of ref document: A1

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09847844

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP