WO2009128208A1 - Dynamic image encoder, dynamic image decoder, dynamic image encoding method, and dynamic image decoding method - Google Patents


Info

Publication number
WO2009128208A1
WO2009128208A1 (PCT/JP2009/001449)
Authority
WO
WIPO (PCT)
Prior art keywords
motion
unit
prediction
inter
encoding
Prior art date
Application number
PCT/JP2009/001449
Other languages
French (fr)
Japanese (ja)
Inventor
高橋昌史
山口宗明
伊藤浩朗
Original Assignee
株式会社日立製作所 (Hitachi, Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所 (Hitachi, Ltd.)
Priority to JP2010508099A
Publication of WO2009128208A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136 Incoming video signal characteristics or properties
    • H04N 19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H04N 19/517 Processing of motion vectors by encoding
    • H04N 19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N 19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present invention relates to a moving picture encoding technique for encoding a moving picture and a moving picture decoding technique for decoding a moving picture.
  • the encoding target image is predicted in block units using image information whose encoding has been completed, and the prediction difference from the original image is encoded, thereby removing the redundancy of the moving image and reducing the code amount.
  • high-precision prediction is enabled by searching the reference image for a block having a high correlation with the encoding target block.
  • the H.264/AVC (Advanced Video Coding) standard, one of the standards that perform such inter-screen prediction, introduces a prediction technique for motion vectors in order to reduce the amount of motion vector code.
  • the motion vector of the target block is predicted using encoded blocks located around the target block, and the difference between the prediction vector and the motion vector (the difference vector) is variable-length encoded.
  • an object of the present invention is to provide a moving image encoding device, a moving image decoding device, a moving image encoding method, and a moving image decoding method that improve compression efficiency by reducing the code amount of motion vectors through an improved difference vector calculation method.
  • a moving image coding apparatus has a motion description unit that specifies a start frame and an end frame within a plurality of frames and models the movement of a target region between the start frame and the end frame by a time function.
  • the moving image decoding apparatus includes a variable length decoding unit that decodes the motion vector of the target region based on a time function modeling the motion of the target region between the start frame and the end frame.
  • a typical effect obtained is that, by improving the calculation method of the difference vector, the code amount of motion vectors can be reduced and the compression efficiency improved, so that high-quality video can be provided with a small code amount.
  • FIG. 1 is an explanatory diagram for explaining an inter-screen prediction process that is a premise of a moving picture coding method and a moving picture decoding method according to an embodiment of the present invention.
  • it conceptually shows the operation of inter-screen prediction processing in H.264/AVC.
  • FIG. 2 is an explanatory diagram for explaining the calculation of a prediction vector in the inter-screen prediction processing that is a premise of the video encoding method and video decoding method according to an embodiment of the present invention.
  • the encoding target image is encoded in block units according to the raster scan order.
  • a decoded image of an already-encoded image included in the same video 501 as the encoding target image 503 is used as a reference image 502, and a block 505 having a high correlation with the target block 504 in the target image is searched for in the reference image.
  • the difference between the coordinate values of both blocks is encoded as a motion vector 506.
  • the reverse procedure described above may be performed at the time of decoding, and the decoded image can be acquired by adding the decoded prediction difference to the block 505 in the reference image.
  • a prediction technique for the motion vector is introduced. That is, when a motion vector is encoded, the motion vector of the target block is predicted using encoded blocks located around the target block, and the difference vector between the prediction vector and the motion vector is encoded. At this time, since the magnitude of the difference vector concentrates near zero, the code amount can be reduced by variable-length encoding the difference vector.
  • the prediction vector is calculated by setting the encoded blocks adjacent to the left side, the upper side, and the upper right side of the target block 601 as the block A 602, the block B 603, and the block C 604, respectively.
  • MVA, MVB, and MVC are the motion vectors of block A 602, block B 603, and block C 604, respectively.
  • the prediction vector PMV is calculated, as shown by PMV 605, using a function Median that returns the median of the plurality of values designated as its arguments.
  • the difference vector DMV is calculated as the difference vector 606 between the motion vector MV of the target block and the prediction vector PMV, and then the DMV is variable-length encoded.
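The median prediction and difference-vector computation described above can be sketched as follows. This is an illustrative sketch only, not the normative H.264/AVC procedure; the function names and the sample vectors for blocks A, B, and C are hypothetical.

```python
def median(a, b, c):
    """Component-wise median of three motion vectors given as (x, y) pairs."""
    return (sorted((a[0], b[0], c[0]))[1],
            sorted((a[1], b[1], c[1]))[1])

def predict_and_diff(mv, mv_a, mv_b, mv_c):
    """Return (PMV, DMV): the median prediction vector over the
    neighbouring blocks and the difference vector that would be
    variable-length encoded."""
    pmv = median(mv_a, mv_b, mv_c)
    dmv = (mv[0] - pmv[0], mv[1] - pmv[1])
    return pmv, dmv

# Hypothetical neighbour vectors for blocks A (left), B (above), C (above-right)
pmv, dmv = predict_and_diff(mv=(5, 3), mv_a=(4, 2), mv_b=(6, 3), mv_c=(5, 5))
```

With these sample vectors the prediction matches the true motion exactly, so the difference vector is (0, 0), which codes very cheaply.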
  • in H.264/AVC, introducing this prediction technique made it possible to significantly reduce the amount of code required for motion vectors.
  • however, in H.264/AVC only neighboring blocks in the spatial direction are considered when calculating the prediction vector, so the motion of objects is not necessarily reflected.
  • as a result, the motion vector prediction accuracy is not sufficient, particularly in images containing a plurality of moving objects, and a large amount of code is still required for motion vectors.
  • the prediction accuracy for a motion vector can be improved by modeling the motion of the encoding target region as a time function and using it for calculation of a prediction vector.
  • FIG. 3 is a diagram showing an example of motion modeling in the moving picture encoding method and moving picture decoding method according to an embodiment of the present invention.
  • FIG. 4 is a diagram showing an example of motion vector encoding in the moving picture encoding method and moving picture decoding method according to an embodiment of the present invention.
  • FIG. 5 is a diagram showing an example of selection of the start frame and end frame in the video encoding method and video decoding method according to an embodiment of the present invention.
  • FIG. 6 is a diagram showing another example of motion modeling in the moving picture encoding method and moving picture decoding method according to an embodiment of the present invention.
  • FIG. 7 is a diagram showing another example of selection of the start frame and end frame in the moving picture encoding method and moving picture decoding method according to an embodiment of the present invention.
  • a start frame 701 and an end frame 705 are prepared with n−1 frames 702 to 704 between them. Subsequently, an area 707 in the start frame corresponding to the specific area 708 in the end frame is searched for, and the movement of the target area is modeled as a time function MVMt 706 based on the difference between the coordinates of these corresponding areas.
  • the motion vectors in the frames 702 to 704 sandwiched between the start frame and the end frame are encoded. Further, the modeled motion information is separately encoded and stored in a stream.
  • here, the movement of the target area is modeled linearly and represented by a linear function of time t.
  • the coefficients A, B, C, and D of the function MVMt 706 are encoded as motion parameters.
  • information for specifying the range of frames to which motion modeling by the function MVMt can be applied, such as a start frame number and an end frame number, and region information to which the motion modeling can be applied, such as a block number, are also encoded.
  • the movement of the target region is approximated here by a straight line, but it may instead be approximated by, for example, an ellipse, a quadratic parabola, a Bezier curve, a clothoid curve, a cycloid, a reflection, or a pendulum motion.
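As a concrete illustration of the linear case, the time function MVMt with coefficients A, B, C, and D can be read as MVM(t) = (A·t + B, C·t + D) and fitted through the displacements observed at the start and end frames. The exact parameterization is not fixed by the text, so the helper names and sample values below are assumptions.

```python
def fit_linear_motion(t_start, d_start, t_end, d_end):
    """Fit MVM(t) = (A*t + B, C*t + D) through two (time, displacement)
    samples taken at the start and end frames. Returns (A, B, C, D),
    the motion parameters that would be encoded into the stream."""
    A = (d_end[0] - d_start[0]) / (t_end - t_start)
    B = d_start[0] - A * t_start
    C = (d_end[1] - d_start[1]) / (t_end - t_start)
    D = d_start[1] - C * t_start
    return A, B, C, D

def mvm(params, t):
    """Evaluate the modelled motion vector at frame time t."""
    A, B, C, D = params
    return (A * t + B, C * t + D)

# Hypothetical example: zero displacement at t=0, (8, 4) pixels at t=4
params = fit_linear_motion(0, (0.0, 0.0), 4, (8.0, 4.0))
mid = mvm(params, 2)  # modelled motion halfway between start and end frame
```

Frames between the start and end frame can then derive a prediction vector by evaluating `mvm` at their own time index.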
  • a motion vector is encoded using a motion model represented by the function MVMt.
  • the motion MVMt 805, modeled using the region 807 in the start frame and the corresponding region 809 in the end frame, is used.
  • as in H.264/AVC, the DMV is variable-length encoded.
  • the motion vector encoding method for frames sandwiched between the start frame and the end frame has been described above. The encoding method for motion vectors in the start frame and the end frame themselves is not particularly limited, but it is effective, for example, to predict the motion vector by referring to the blocks surrounding the target block, as in the H.264/AVC method described above.
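Combining the two prediction paths, one plausible sketch of how the prediction vector could be chosen per frame is given below: frames strictly between the start and end frame use the motion model, while the start and end frames fall back to median prediction. Function names and values are illustrative assumptions, not the patent's actual interfaces.

```python
def median3(xs):
    """Median of three scalar values."""
    return sorted(xs)[1]

def prediction_vector(t, t_start, t_end, params, mv_a, mv_b, mv_c):
    """Frames strictly between the start and end frames take their
    prediction vector PMV from the motion model MVM(t); the start and
    end frames themselves fall back to H.264/AVC-style median
    prediction over the neighbouring blocks A, B, and C."""
    if t_start < t < t_end:
        A, B, C, D = params
        return (A * t + B, C * t + D)
    return (median3([mv_a[0], mv_b[0], mv_c[0]]),
            median3([mv_a[1], mv_b[1], mv_c[1]]))

# Hypothetical model (A, B, C, D) = (2, 0, 1, 0) over frames 0..4
pmv_mid = prediction_vector(2, 0, 4, (2, 0, 1, 0), (0, 0), (0, 0), (0, 0))
pmv_end = prediction_vector(4, 0, 4, (2, 0, 1, 0), (3, 1), (5, 2), (4, 4))
```

In both branches the difference vector DMV = MV − PMV is what gets variable-length encoded.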
  • the selection method of the start frame and the end frame is not particularly limited.
  • the sequence consists of an I picture 901 that can use only intra-screen prediction, P pictures 904 and 907 that can be encoded using inter-screen prediction with one reference image, and B pictures 902, 903, 905, and 906 that can use inter-screen prediction with two reference images.
  • the P picture 904 is encoded with reference to the I picture 901.
  • the B pictures 902 and 903 are encoded with reference to the two encoded images 901 and 904.
  • the P picture 907 is encoded next, and then the B pictures 905 and 906 are encoded with reference to the two images 904 and 907.
  • motion modeling is performed using the I picture 901 and the P picture 904 as the start frame and the end frame, respectively, and the B pictures 902 and 903 between them are encoded using this model.
  • the next P picture 907 is encoded, and motion modeling is performed using the two P pictures 904 and 907 as a start frame and an end frame, respectively.
  • B pictures 905 and 906 in between are encoded using this model.
  • encoding using a motion model may be combined with encoding not using a motion model.
  • the first frame 1301 is set as the start frame, and the end frame is selected according to the nature of the image.
  • in this example, an end frame 1305 is specified with three frames 1302, 1303, and 1304 sandwiched between it and the start frame.
  • the method for determining the end frame is not particularly limited.
  • for example, it is effective to identify the range of frames over which the motion of the target region can be represented by the motion model MVMt 706 in FIG. 3, and to determine the end frame as the last frame of that range counted from the start frame.
  • by doing so, encoding can be performed efficiently.
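One plausible reading of this end-frame determination is sketched below: the candidate end frame is pushed forward as long as a linear model anchored at the start frame predicts every intermediate displacement within a tolerance. The tolerance and the exact acceptance criterion are assumptions, since the text deliberately leaves the selection method open.

```python
def choose_end_frame(displacements, tol=1.0, t_start=0):
    """Given measured per-frame displacements of the target region
    (list index = frame time), extend the end frame as far as a linear
    model through the start and candidate end frames still predicts
    each intermediate displacement within `tol` pixels."""
    best_end = t_start + 1
    for t_end in range(t_start + 2, len(displacements)):
        dx, dy = displacements[t_end]
        ok = True
        for t in range(t_start + 1, t_end):
            frac = (t - t_start) / (t_end - t_start)
            px, py = dx * frac, dy * frac  # linear-model prediction at t
            mx, my = displacements[t]
            if abs(px - mx) > tol or abs(py - my) > tol:
                ok = False
                break
        if not ok:
            break
        best_end = t_end
    return best_end

# Motion is linear for four frames, then jumps: the model range ends at t=3
end = choose_end_frame([(0, 0), (2, 1), (4, 2), (6, 3), (20, 20)])
```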
  • FIG. 7 shows another example of the method for selecting the start frame and the end frame.
  • the type of picture is not particularly limited, but for simplicity, only the case of using only an I picture and a P picture is shown.
  • encoding is performed in the same order as the video display order (1001 ⁇ 1002 ⁇ 1003 ⁇ ).
  • an I picture 1001 is encoded and used as the start frame, and the image 1004 located n−1 frames later is used as the end frame to model the motion.
  • the motion modeling is performed in units of blocks, but in addition, for example, the modeling may be performed in units of objects separated from the background of the image.
  • FIG. 8 is a block diagram showing the configuration of the moving picture coding apparatus according to the embodiment of the present invention.
  • the moving image coding apparatus includes an input image memory 102 that holds the input original image 101, a block dividing unit 103 that divides the input image into small regions, a motion search unit 104 that detects the amount of motion, an intra-screen prediction unit 105 that performs intra-screen prediction in units of blocks, an inter-screen prediction unit 106 that performs inter-screen prediction in units of blocks based on the amount of motion detected by the motion search unit 104, a mode selection unit 107 that determines the predictive encoding means (prediction method and block size) matching the nature of the image, and a reference image memory 117 that stores decoded images for use in later prediction.
  • the input image memory 102 holds one image from the original image 101 as the encoding target image; the block dividing unit 103 divides this image into fine blocks and passes them to the motion search unit 104, the intra-screen prediction unit 105, and the inter-screen prediction unit 106.
  • the motion search unit 104 calculates the motion amount of the corresponding block using the decoded image stored in the reference image memory 117, and passes the motion vector to the inter-screen prediction unit 106.
  • the intra-screen prediction unit 105 and the inter-screen prediction unit 106 execute the intra-screen prediction process and the inter-screen prediction process in units of blocks of several sizes, and the mode selection unit 107 selects an optimal prediction method.
  • the subtraction unit 108 generates a prediction difference by the optimal prediction encoding means (prediction method and block size) and passes it to the frequency conversion unit 109.
  • the frequency conversion unit 109 and the quantization processing unit 110 respectively apply a frequency transform such as DCT (Discrete Cosine Transform) and quantization processing to the transmitted prediction difference, in units of blocks of the specified size, and pass the result to the variable length coding unit 113 and the inverse quantization processing unit 114.
  • the motion description unit 111 models the motion of the target region by a time function based on information about the start frame and the end frame (image information, motion vectors, etc.), and sends information such as the start frame number, end frame number, and motion parameters to the motion information memory 112 for storage.
  • the variable length coding unit 113 performs variable-length coding, based on the probability of symbol occurrence, on the prediction difference information represented by the frequency transform coefficients and on the information necessary for decoding, such as the prediction direction used for intra-screen prediction, the motion vector used for inter-screen prediction, and the motion parameters used for motion modeling, to generate a coded stream.
  • a motion model stored in the motion information memory 112 is used to encode motion vectors in frames other than the start frame and the end frame.
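The text only states that the difference vector is variable-length coded according to symbol probability. As one concrete example of such a code, the signed Exp-Golomb code that H.264/AVC uses for motion vector differences assigns the shortest codewords to values near zero; this sketch is offered as an illustration, not as the patent's mandated code.

```python
def exp_golomb_signed(v):
    """Signed Exp-Golomb code as a bit string: values near zero get
    short codewords, matching the observation that difference vectors
    concentrate around zero."""
    u = 2 * v - 1 if v > 0 else -2 * v   # signed -> unsigned index mapping
    bits = bin(u + 1)[2:]                # binary representation of u + 1
    return '0' * (len(bits) - 1) + bits  # leading-zero prefix + value bits

# Small difference-vector components get short codes:
code = [exp_golomb_signed(v) for v in (0, 1, -1, 2)]
```

A two-component DMV would simply concatenate the codes of its x and y components.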
  • the inverse quantization processing unit 114 and the inverse frequency transform unit 115 respectively apply inverse quantization and an inverse frequency transform such as IDCT (Inverse DCT) to the quantized frequency transform coefficients to decode the prediction difference, which is sent to the adder 116.
  • the adder 116 generates a decoded image and stores it in the reference image memory 117.
  • FIG. 9 is a block diagram showing the configuration of the video decoding apparatus according to one embodiment of the present invention.
  • the moving picture decoding apparatus includes, for example, a variable length decoding unit 202 that performs the reverse procedure of variable-length coding on the coded stream 201 generated by the moving picture coding apparatus of FIG. 8, an inverse quantization processing unit 203 and an inverse frequency transform unit 204 that decode the prediction difference, a motion information memory 205 that stores motion modeling information, an inter-screen prediction unit 206 that performs inter-screen prediction, an intra-screen prediction unit 207 that performs intra-screen prediction, an adder unit 208 for generating the decoded image, and a reference image memory 209 for temporarily storing the decoded image.
  • the variable length decoding unit 202 performs variable length decoding on the encoded stream 201 and acquires information necessary for prediction processing such as a frequency transform coefficient component of a prediction difference, a block size, a motion vector, and a motion parameter.
  • the former prediction difference information is sent to the inverse quantization processing unit 203, while the information necessary for the latter prediction processing is sent to the motion information memory 205, the inter-screen prediction unit 206, or the intra-screen prediction unit 207.
  • the inverse quantization processing unit 203 and the inverse frequency transform unit 204 perform decoding by performing inverse quantization and inverse frequency transform on the prediction difference information, respectively. Further, the motion information memory 205 stores information necessary for motion modeling such as motion parameters.
  • the inter-screen prediction unit 206 or the intra-screen prediction unit 207 executes prediction processing with reference to the reference image memory 209 based on the information sent from the variable length decoding unit 202, the addition unit 208 generates the decoded image, and the decoded image is stored in the reference image memory 209.
  • FIG. 10 is a block diagram showing an example of the configuration of the motion description unit of the moving picture encoding apparatus according to an embodiment of the present invention.
  • FIG. 11 is a block diagram showing another example of the configuration of the motion description unit of the moving picture encoding apparatus according to an embodiment of the present invention.
  • the motion description unit 111 receives the original or target images 301 of the start frame and end frame, and includes a start frame memory 302 for storing the start frame, a motion search unit 303 that searches for corresponding areas between the start frame and the end frame, and a motion information modeling unit 304 that performs motion modeling based on the search result; it outputs motion information such as the motion parameter 305.
  • the original image or decoded image of the start frame is input to the motion description unit 111 and stored in the start frame memory 302. Subsequently, when the original image or decoded image of the end frame is input, the motion search unit 303 searches for the corresponding area between the start frame stored in the start frame memory 302 and the input end frame. The search result is passed to the motion information modeling unit 304.
  • the motion information modeling unit 304 performs motion modeling based on the search result.
  • in the motion information modeling unit 304, for example, the motion of the target region is modeled by the function MVMt 306.
  • in another configuration, the motion description unit 111 receives the motion vectors 401 calculated when the start frame and the end frame are encoded, and includes a start frame memory 402 for storing the motion vectors of the start frame and a motion information modeling unit 403 that models the motion of the target region from the motion vectors of the start frame and the end frame; it outputs motion information such as the motion parameter 404.
  • the motion vectors calculated when the start frame is encoded are input to the motion description unit 111 and stored in the start frame memory 402. Subsequently, when the motion vectors calculated when the end frame is encoded are input, the motion information modeling unit 403 performs motion modeling based on both sets of vectors; for example, the motion of the target region is modeled by the function MVMt 405.
  • FIG. 12 is a flowchart showing a one-frame encoding process procedure of the video encoding apparatus according to the embodiment of the present invention.
  • the following processing is performed as loop 1 for all blocks present in the frame to be encoded (step 1101). That is, for each block, prediction is executed as loop 2 over all combinations of encoding modes (prediction methods and block sizes) (step 1102).
  • it is determined whether the mode is the intra-screen prediction mode (step 1103), and in accordance with this determination, the intra-screen prediction process (step 1104) or the inter-screen prediction process (step 1105) is performed to calculate the prediction difference.
  • a motion vector is encoded in addition to the prediction difference.
  • next, it is determined whether the frame is the start frame (step 1106). If the target frame is the start frame, the motion vectors used for inter-screen prediction are stored (step 1107).
  • it is then determined whether the frame is the end frame (step 1108). If the target frame is the end frame, the motion is modeled using the motion vectors of the corresponding areas and the stored motion vectors of the start frame, and the motion parameters are calculated (step 1109).
  • if the target frame is neither the start frame nor the end frame in steps 1106 and 1108, the difference vector DMV is calculated using the motion model (step 1111).
  • for the start frame and the end frame, the DMV is calculated by the conventional H.264/AVC method (step 1110).
  • subsequently, frequency conversion processing (step 1112), quantization processing (step 1113), and variable length encoding processing (step 1114) are performed.
  • the mode with the highest coding efficiency is selected based on the above results (step 1115).
  • for example, the RD-Optimization method, which determines the optimum coding mode from the relationship between image quality distortion and code amount, is used. By doing so, encoding can be performed efficiently.
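The RD-Optimization rule mentioned above is commonly expressed as choosing the mode that minimizes the Lagrangian cost J = D + λ·R, where D is the distortion and R the code amount. The sketch below illustrates this selection; the distortion and rate values are hypothetical.

```python
def select_mode(candidates, lam):
    """RD-Optimization sketch: pick the candidate minimizing
    J = D + lambda * R, where D is distortion (e.g. SSD against the
    original block) and R is the code amount in bits."""
    return min(candidates, key=lambda m: m['D'] + lam * m['R'])

# Hypothetical per-mode measurements for one block
modes = [{'name': 'intra', 'D': 120.0, 'R': 40},
         {'name': 'inter', 'D': 90.0,  'R': 55}]
best = select_mode(modes, lam=1.0)  # inter: 90 + 55 = 145 beats 120 + 40 = 160
```

Raising λ shifts the preference toward modes with fewer bits, so the same two candidates can flip winner at a higher λ.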
  • the quantized frequency transform coefficients are subjected to inverse quantization processing (step 1116) and inverse frequency transform processing (step 1117) to decode the prediction difference, and the decoded image is stored in the reference image memory (step 1118).
  • FIG. 13 is a flowchart showing a one-frame decoding process procedure of the moving picture decoding apparatus according to the embodiment of the present invention.
  • variable length decoding processing is performed on the input stream (step 1202), and inverse quantization processing (step 1203) and inverse frequency conversion processing (step 1204) are performed to decode the prediction difference.
  • step 1205 it is determined whether the mode is an intra-screen prediction mode, and an intra-screen prediction process (step 1206) or an inter-screen prediction process (step 1210) is performed according to the determination in step 1205.
  • when performing inter-screen prediction, it is necessary to decode the motion vector MV prior to the prediction.
  • it is determined whether the current frame is a start frame or an end frame (step 1207).
  • if the target frame is the start frame or the end frame, the MV is decoded by the conventional H.264/AVC-based method (step 1208).
  • if it is determined in step 1207 that the target frame is neither the start frame nor the end frame, the MV is decoded using the motion model (step 1209).
  • when the above processing has been performed for all blocks, decoding of one frame of the image is completed (step 1211).
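The decoder-side motion vector reconstruction of steps 1207 to 1209 can be sketched as follows. For simplicity the conventional H.264/AVC prediction used at the start and end frames is supplied as a precomputed argument, and all names and values are illustrative assumptions.

```python
def decode_mv(dmv, t, t_start, t_end, params, pmv_h264):
    """Reconstruct MV = PMV + DMV. Frames strictly between the start
    and end frame derive PMV from the motion model MVM(t); the start
    and end frames use the conventional H.264/AVC prediction, passed
    in here as pmv_h264."""
    if t_start < t < t_end:
        A, B, C, D = params
        pmv = (A * t + B, C * t + D)
    else:
        pmv = pmv_h264
    return (pmv[0] + dmv[0], pmv[1] + dmv[1])

# Mirror of the encoder: with params (2, 0, 1, 0) and t = 2, PMV = (4, 2)
mv = decode_mv(dmv=(1, 0), t=2, t_start=0, t_end=4,
               params=(2, 0, 1, 0), pmv_h264=(0, 0))
```

Because encoder and decoder evaluate the same model from the same transmitted parameters, only the small DMV needs to travel in the stream.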
  • in the above description, DCT is cited as an example of the frequency transform, but any orthogonal transform that removes the correlation between pixels may be used, such as DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), or KLT (Karhunen-Loeve Transform); the prediction difference itself may even be encoded without any frequency transform.
  • the prediction vector is calculated using the function MVMt, but the motion vector itself may be expressed using this function.
  • the motion vector MV is equal to MVMt, and there is no need to encode the difference vector DMV.
  • the present invention relates to a moving picture coding technique for coding a moving picture and a moving picture decoding technique for decoding a moving picture, and can be widely applied to apparatuses that perform coding and decoding of a moving picture.
  • … inverse frequency transform unit, 205 … motion information memory, 206 … inter-screen prediction unit, 207 … intra-screen prediction unit, 208 … adder unit, 209 … reference image memory, 301 … target image, 302 … start frame memory, 303 … motion search unit, 304 … motion information modeling unit, 305 … motion parameters, 306 … motion modeling function, 401 … motion vectors, 402 … start frame memory, 403 … motion information modeling unit, 404 … motion parameters, 405 … motion modeling function, 501 … video, 502 … reference image, 503 … encoding target image, 504, 505 … blocks, 601–606 … blocks, 701–705 … frames, 706 … motion modeling function, 707, 708 … regions, 801–803 … frames, 804 … motion vector MV, 805 … modeled motion MVMt, 806 … difference vector DMV, 807–809 … regions, 901 … I picture, 904, 907 … P pictures, 902, 903, 905, 906 … B pictures, 1001 … I picture, 1002–1004 … I pictures, 1301–1307 … frames

Abstract

Provided is a dynamic image encoder comprising an inter-frame prediction unit (106) that performs inter-frame prediction and calculates the prediction error, a motion description unit (111) that models the motion information of a target region across a plurality of frames, a frequency converter (109) and a quantization processor (110) that encode the prediction error, and a variable-length encoder (113) that performs variable-length coding according to symbol probability based on the information modeled by the motion description unit (111). The motion description unit (111) specifies the start frame and the end frame within a plurality of frames, and models the motion of the target region between the start frame and the end frame by a temporal function.

Description

動画像符号化装置、動画像復号化装置、動画像符号化方法、および動画像復号化方法Moving picture encoding apparatus, moving picture decoding apparatus, moving picture encoding method, and moving picture decoding method
 本発明は動画像を符号化する動画像符号化技術および動画像を復号化する動画像復号化技術に関する。 The present invention relates to a moving picture encoding technique for encoding a moving picture and a moving picture decoding technique for decoding a moving picture.
 大容量の動画像情報をデジタルデータ化して記録、伝達する手法として、さまざまな規格が国際標準の符号化方式として策定されている。 さ ま ざ ま Various standards have been established as international standard encoding methods for recording and transmitting large volumes of moving image information as digital data.
 このような符号化方式のいくつかは、デジタル衛星放送やDVD、携帯電話やデジタルカメラなどにおける符号化方式として採用され、現在ますます利用の範囲が広がり、身近なものとなってきている。 Some of these encoding methods have been adopted as encoding methods for digital satellite broadcasting, DVDs, mobile phones, digital cameras, and the like, and the range of use is now expanding and becoming familiar.
 これらの規格では、符号化処理が完了した画像情報を利用して符号化対象画像をブロック単位で予測し、原画像との予測差分を符号化することによって、動画像の持つ冗長性を除いて符号量を減らしている。 In these standards, the encoding target image is predicted in block units using the image information that has been encoded, and the prediction difference from the original image is encoded, thereby eliminating the redundancy of the moving image. The code amount is reduced.
In particular, in inter-frame prediction, which refers to an image other than the target image, high-precision prediction is achieved by searching the reference image for a block having a high correlation with the block to be encoded.
However, in conventional inter-frame prediction, the result of the block search must be encoded as a motion vector in addition to the prediction difference, which causes a code-amount overhead.
The H.264/AVC (Advanced Video Coding) standard, one of the standards that perform such inter-frame prediction, introduces a prediction technique for motion vectors in order to reduce their code amount.
That is, when encoding a motion vector, the motion vector of the target block is predicted using already-encoded blocks located around the target block, and the difference between the prediction vector and the motion vector (the difference vector) is variable-length encoded.
This has greatly reduced the code amount of motion vectors. However, the motion vector prediction accuracy of H.264/AVC is not sufficient, and a large code amount is still required for motion vectors, particularly for images with complex motion such as those containing multiple moving objects.
Accordingly, an object of the present invention is to provide a moving picture encoding apparatus, a moving picture decoding apparatus, a moving picture encoding method, and a moving picture decoding method that can improve compression efficiency by reducing the code amount of motion vectors through an improved method of calculating the difference vector.
The above and other objects and novel features of the present invention will become apparent from the description of this specification and the accompanying drawings.
Among the inventions disclosed in this application, representative ones are briefly outlined as follows.
That is, a representative moving picture encoding apparatus includes a motion description unit that specifies a start frame and an end frame within a plurality of frames and models the motion of a target region between the start frame and the end frame by a temporal function.
A representative moving picture decoding apparatus includes a variable-length decoding unit that decodes the motion vector of a target region based on a temporal function in which the motion of the target region between a start frame and an end frame is modeled.
Among the inventions disclosed in this application, the effects obtained by representative ones are briefly described as follows.
That is, by improving the method of calculating the difference vector, the code amount of motion vectors can be reduced and compression efficiency improved, so that high-quality video can be provided with a small code amount.
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In all the drawings for describing the embodiments, the same members are in principle denoted by the same reference numerals, and repeated description thereof is omitted.
First, the inter-frame prediction processing on which the moving picture encoding method and moving picture decoding method according to an embodiment of the present invention are premised will be described with reference to FIGS. 1 and 2. FIG. 1 is an explanatory diagram conceptually showing the operation of inter-frame prediction processing in H.264/AVC. FIG. 2 is an explanatory diagram conceptually showing the method of calculating the prediction vector in that inter-frame prediction processing.
In H.264/AVC, the image to be encoded is encoded in block units in raster-scan order.
When performing inter-frame prediction, as shown in FIG. 1, a decoded image of an already-encoded picture belonging to the same video 501 as the encoding target image 503 is used as a reference image 502, and a block 505 having a high correlation with the target block 504 in the target image is searched for in the reference image.
At this time, in addition to the prediction difference calculated as the difference between the two blocks, the difference between the coordinate values of the two blocks is encoded as a motion vector 506. Decoding follows the reverse procedure: the decoded image is obtained by adding the decoded prediction difference to the block 505 in the reference image.
To reduce the code-amount overhead of these motion vectors, H.264/AVC introduces a prediction technique for motion vectors. That is, when encoding a motion vector, the motion vector of the target block is predicted using already-encoded blocks located around the target block, and the difference vector between the prediction vector and the motion vector is encoded. Since the magnitude of the difference vector concentrates near zero, variable-length encoding it reduces the code amount.
As shown in FIG. 2, the prediction vector is calculated as follows. The already-encoded blocks adjacent to the left, top, and top-right of the target block 601 are denoted block A 602, block B 603, and block C 604, respectively, and their motion vectors are denoted MVA, MVB, and MVC.
The prediction vector PMV 605 is then calculated using the function Median, which returns the median of the values given as its arguments. Further, the difference vector DMV is calculated as the difference 606 between the motion vector MV of the target block and the prediction vector PMV, and the DMV is then variable-length encoded.
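As a hedged illustration of this neighborhood-based prediction (the function names and the tuple representation of vectors below are assumptions for the sketch, not taken from the patent text), the componentwise median predictor and difference vector can be written as:

```python
def median_predictor(mva, mvb, mvc):
    """Componentwise median of the three neighboring motion vectors
    (block A: left, block B: top, block C: top-right)."""
    return tuple(sorted((a, b, c))[1] for a, b, c in zip(mva, mvb, mvc))

def difference_vector(mv, pmv):
    """DMV = MV - PMV; its components cluster near zero for smooth motion."""
    return tuple(m - p for m, p in zip(mv, pmv))

# Example with hypothetical neighboring vectors.
pmv = median_predictor((4, -2), (6, 0), (5, -1))   # -> (5, -1)
dmv = difference_vector((5, 0), pmv)               # -> (0, 1)
```

Because the median suppresses a single outlier among the three neighbors, the resulting DMV tends to be small whenever at least two neighbors move like the target block.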
As described above, introducing a prediction technique for motion vectors in H.264/AVC made it possible to greatly reduce the code amount required for motion vectors. In H.264/AVC, however, only spatially neighboring blocks are considered when calculating the prediction vector, so the motion of objects is not necessarily reflected.
For this reason, the prediction accuracy of motion vectors is insufficient, particularly for images containing multiple moving objects, and a large code amount is still required for motion vectors.
In the present embodiment, as described later, the motion of the region to be encoded is modeled as a temporal function and used in calculating the prediction vector, which improves the prediction accuracy for motion vectors.
Next, the motion modeling of the moving picture encoding method and moving picture decoding method according to an embodiment of the present invention will be described with reference to FIGS. 3 to 7. FIG. 3 shows an example of motion modeling; FIG. 4 shows an example of motion vector encoding; FIG. 5 shows an example of selecting the start frame and end frame; FIG. 6 shows another example of motion modeling; and FIG. 7 shows another example of selecting the start frame and end frame.
As shown in FIG. 3, in the present embodiment, a start frame 701 and an end frame 705 are first prepared with n−1 frames 702 to 704 between them. Next, a region 707 corresponding to a specific region 708 in the end frame is searched for in the start frame, and based on the difference between the coordinates of these corresponding regions, the motion of the target region is modeled as a temporal function MVMt 706.
Using this motion model, the motion vectors in the frames 702 to 704 between the start frame and the end frame are encoded. In addition, the modeled motion information is separately encoded and stored in the stream.
For example, in the example shown in FIG. 3, the motion of the target region is modeled linearly and expressed as a linear function of time t. In this case, the coefficients A, B, C, and D of the function MVMt 706 are encoded as motion parameters.
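As a hedged sketch of this linear case (the concrete form MVM(t) = (A·t + B, C·t + D) and the boundary conditions are assumptions made for illustration; the patent only states that the motion is a linear function of t with coefficients A, B, C, D), the coefficients can be derived from the corresponding-region search result as follows:

```python
def fit_linear_motion(disp_end, n):
    """Fit MVM(t) = (A*t + B, C*t + D), assuming the modeled displacement
    is (0, 0) at the start frame (t = 0) and equals disp_end, the searched
    correspondence offset, at the end frame (t = n)."""
    dx, dy = disp_end
    a, b = dx / n, 0.0   # horizontal component: A*t + B
    c, d = dy / n, 0.0   # vertical component:   C*t + D
    return a, b, c, d

def eval_motion(params, t):
    """Evaluate the modeled motion at frame time t."""
    a, b, c, d = params
    return (a * t + b, c * t + d)

# A region that moved (8, -4) pixels between the start frame and the
# end frame n = 4 frames later.
params = fit_linear_motion((8, -4), 4)
mid = eval_motion(params, 2)   # predicted motion at t = 2: (4.0, -2.0)
```

Only the four coefficients (plus the frame range and region information described next) need to be placed in the stream; the intermediate frames then obtain their prediction vectors by evaluating the function.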
Further, information for specifying the range of frames to which motion modeling by the function MVMt can be applied, such as the start frame number and end frame number, and information on the regions to which the motion modeling can be applied, such as block numbers, are encoded.
How the motion of the target region is modeled is not particularly limited. That is, in the example shown in FIG. 3, the motion of the target region is approximated by a straight line, but it may instead be approximated by, for example, an ellipse, a quadratic parabola, a Bezier curve, a clothoid curve, a cycloid, a reflection, or a pendulum motion.
FIG. 4 shows motion vectors being encoded using the motion model expressed by the function MVMt. Here, a frame 802 at frame time t=m, located between a start frame 801 at time t=0 and an end frame 803 at time t=n, is encoded.
As already described, when the motion vector MV 804 of the target block 808 is encoded, the motion MVMt 805, modeled using the region 807 in the start frame and the corresponding region 809 in the end frame, is used.
That is, the motion MVMm at time t=m is taken as the predicted value of the motion vector MV, and the difference vector DMV 806 is calculated as the difference between the motion vector MV and the prediction vector MVMm. The DMV is then variable-length encoded, as in H.264/AVC.
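Because the difference vector components cluster near zero, a variable-length code assigns them short codewords. The patent does not fix a particular code; as a hedged sketch, the signed Exp-Golomb code that H.264/AVC uses for motion vector differences could serve this role:

```python
def signed_exp_golomb(v):
    """Signed Exp-Golomb codeword (as in H.264/AVC se(v)): values of
    small magnitude receive the shortest codewords."""
    code_num = 2 * v - 1 if v > 0 else -2 * v   # 0, 1, -1, 2, ... -> 0, 1, 2, 3, ...
    info = bin(code_num + 1)[2:]                # binary of code_num + 1
    return "0" * (len(info) - 1) + info         # leading-zero prefix + info bits

# Each DMV component near zero costs very few bits.
assert signed_exp_golomb(0) == "1"      # 1 bit
assert signed_exp_golomb(1) == "010"    # 3 bits
assert signed_exp_golomb(-1) == "011"   # 3 bits
```

The better the temporal model MVMt tracks the true motion, the smaller the DMV components and thus the shorter these codewords, which is exactly the compression gain the embodiment targets.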
The above describes how motion vectors are encoded in the frames between the start frame and the end frame; the method for encoding motion vectors in the start frame and the end frame themselves is not particularly limited. However, it is effective to use, for example, a method that predicts the motion vector from the neighboring blocks of the target block, as in the H.264/AVC method shown in FIG. 2.
The method for selecting the start frame and the end frame is also not particularly limited. The example shown in FIG. 5 uses an I picture 901, for which only intra-frame prediction is available; P pictures 904 and 907, which allow inter-frame prediction from one reference image; and B pictures 902, 903, 905, and 906, which allow inter-frame prediction from two reference images.
In this case, after the I picture 901 is encoded, the P picture 904 is encoded with reference to it. Next, the B pictures 902 and 903 are encoded with reference to the two already-encoded pictures 901 and 904. Similarly, the P picture 907 is encoded next, and then the B pictures 905 and 906 are encoded with reference to the two pictures 904 and 907.
With such a picture structure, it is effective to apply the present embodiment as follows, for example. First, the I picture 901 and the P picture 904 are encoded, and motion is modeled using these two pictures as the start frame and the end frame, respectively.
The B pictures 902 and 903 between them are then encoded using this model. Next, the following P picture 907 is encoded, and motion is modeled using the two P pictures 904 and 907 as the start frame and the end frame, respectively. The B pictures 905 and 906 between them are then encoded using this model.
By selecting the start frame and the end frame in this way, encoding can be performed without any particularly large delay compared with H.264/AVC.
As shown in FIG. 6, encoding that uses the motion model may also be combined with encoding that does not. Here, the first frame 1301 is taken as the start frame, and the end frame is selected according to the nature of the image.
In this example, the end frame 1305 is specified with three frames 1302, 1303, and 1304 between it and the start frame. The method for determining the end frame is not particularly limited; for example, it is effective to identify the range of frames over which the motion of the target region can be represented by the motion model MVMt 706 of FIG. 3, and to choose the end frame so that those frames lie between it and the start frame.
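One hedged way to realize this end-frame decision (the tolerance value, the linear model, and the per-frame measured displacements below are illustrative assumptions, not part of the patent text) is to extend the end frame for as long as the observed motion stays close to the model:

```python
def select_end_frame(displacements, tol=1.0):
    """Given per-frame displacements of the target region relative to the
    start frame (index 0 = one frame after the start), return the largest
    frame offset n such that frames 1..n fit a linear model through the
    origin and displacements[n-1], within `tol` pixels per component."""
    best = 1
    for n in range(1, len(displacements) + 1):
        ex, ey = displacements[n - 1]
        ok = all(
            abs(displacements[t - 1][0] - ex * t / n) <= tol
            and abs(displacements[t - 1][1] - ey * t / n) <= tol
            for t in range(1, n)
        )
        if ok:
            best = n
    return best

# Roughly linear motion for four frames, then an abrupt change.
d = [(2, 0), (4, 1), (6, 1), (8, 2), (3, 9)]
end = select_end_frame(d)   # -> 4: the fifth frame breaks the model
```

Frames beyond the selected end frame would then either start a new model or fall back to the conventional encoding described next.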
When the motion of the target region cannot be modeled by a motion model such as MVMt 706 of FIG. 3, the motion vectors may be encoded without motion modeling, as in frames 1306 and 1307, by a conventional method such as that shown in FIG. 2. This allows encoding to be performed efficiently.
FIG. 7 shows another example of the method for selecting the start frame and the end frame. The picture types are not particularly limited here, but for simplicity the case of using only I pictures and P pictures is shown.
In this case, encoding is performed in the same order as the display order of the video (1001 → 1002 → 1003 → ...). In this example, the I picture 1001 is encoded first, and motion is then modeled using it as the start frame and the picture 1004, n−1 pictures later, as the end frame.
At this time, only the corresponding-region search for motion modeling is performed on the picture 1004; no encoding is performed on it. The intervening pictures (1002, 1003, ...) are then encoded using this model.
In this case, the picture 1004 must be read ahead after the picture 1001 is encoded, so a large delay occurs during encoding; however, many pictures can be placed between the start frame and the end frame, which has the advantage of increasing the efficiency of motion modeling.
In each of the above examples, motion modeling is performed in block units, but it may instead be performed in other units, for example per object separated from the background of the image.
Next, the configuration and operation of the moving picture encoding apparatus according to an embodiment of the present invention will be described with reference to FIG. 8. FIG. 8 is a block diagram showing the configuration of the moving picture encoding apparatus.
In FIG. 8, the moving picture encoding apparatus comprises: an input image memory 102 that holds the input original image 101; a block division unit 103 that divides the input image into small regions; an intra-frame prediction unit 105 that performs intra-frame prediction in block units; an inter-frame prediction unit 106 that performs inter-frame prediction in block units based on the amount of motion detected by a motion search unit 104; a mode selection unit 107 that determines the predictive encoding means (prediction method and block size) suited to the nature of the image; a subtraction unit 108 for generating the prediction difference; a frequency transform unit 109 and a quantization processing unit 110 that encode the prediction difference; a motion description unit 111 that models the motion in the target region; a motion information memory 112 that holds the modeled motion information; a variable-length encoding unit 113 for encoding according to the occurrence probability of symbols; an inverse quantization processing unit 114 and an inverse frequency transform unit 115 for decoding the once-encoded prediction difference; an addition unit 116 for generating a decoded image using the decoded prediction difference; and a reference image memory 117 that holds the decoded image for use in later prediction.
The input image memory 102 holds one picture from the original image 101 as the image to be encoded; the block division unit 103 divides it into small blocks and passes them to the motion search unit 104, the intra-frame prediction unit 105, and the inter-frame prediction unit 106.
The motion search unit 104 calculates the amount of motion of the relevant block using the decoded pictures stored in the reference image memory 117, and passes the motion vector to the inter-frame prediction unit 106. The intra-frame prediction unit 105 and the inter-frame prediction unit 106 execute intra-frame prediction and inter-frame prediction in block units of several sizes, and the mode selection unit 107 selects whichever prediction method is optimal.
The subtraction unit 108 then generates the prediction difference using the optimal predictive encoding means (prediction method and block size) and passes it to the frequency transform unit 109. The frequency transform unit 109 and the quantization processing unit 110 apply frequency transformation such as DCT (Discrete Cosine Transform) and quantization, respectively, to the received prediction difference in block units of the specified size, and pass the result to the variable-length encoding unit 113 and the inverse quantization processing unit 114.
The motion description unit 111 models the motion of the target region by a temporal function based on information about the start frame and end frame (image information, motion vectors, and the like), and sends information such as the start frame number, end frame number, and motion parameters to the motion information memory 112 for storage.
The variable-length encoding unit 113 then variable-length encodes, based on the occurrence probability of symbols, the information necessary for decoding, such as the prediction difference information represented by the frequency transform coefficients, the prediction directions used in intra-frame prediction, the motion vectors used in inter-frame prediction, and the motion parameters used in motion modeling, to generate the encoded stream.
At this time, the motion model stored in the motion information memory 112 is used to encode the motion vectors in frames other than the start frame and the end frame. The inverse quantization processing unit 114 and the inverse frequency transform unit 115 apply inverse quantization and an inverse frequency transform such as IDCT (Inverse DCT) to the quantized frequency transform coefficients to obtain the prediction difference, which is sent to the addition unit 116. The addition unit 116 then generates the decoded image and stores it in the reference image memory 117.
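This local decoding loop (transform, quantize, inverse quantize, inverse transform, add back to the prediction) keeps the encoder's reference pictures identical to those the decoder will reconstruct. A minimal sketch of that idea, with a plain scalar quantizer standing in for the DCT pipeline (the step size and sample values are illustrative assumptions), is:

```python
def quantize(diff, step=4):
    """Coarsely represent the prediction difference (stand-in for DCT + quantization)."""
    return [round(d / step) for d in diff]

def dequantize(levels, step=4):
    """Inverse of quantize up to quantization error (stand-in for inverse DCT)."""
    return [q * step for q in levels]

def reconstruct(prediction, levels, step=4):
    """Both encoder (units 114-116) and decoder run this same computation,
    so their reference pictures stay in sync."""
    return [p + r for p, r in zip(prediction, dequantize(levels, step))]

prediction = [100, 102, 98, 101]   # predicted pixel values
original   = [103, 101, 99, 107]   # source pixel values
levels = quantize([o - p for o, p in zip(original, prediction)])
decoded = reconstruct(prediction, levels)   # close to original, not exact
```

The key design point is that prediction for subsequent blocks uses `decoded`, not `original`, so quantization error never accumulates between encoder and decoder.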
Next, the configuration and operation of the moving picture decoding apparatus according to an embodiment of the present invention will be described with reference to FIG. 9. FIG. 9 is a block diagram showing the configuration of the moving picture decoding apparatus.
In FIG. 9, the moving picture decoding apparatus comprises: a variable-length decoding unit 202 that applies the reverse procedure of variable-length encoding to the encoded stream 201 generated by, for example, the moving picture encoding apparatus shown in FIG. 8; an inverse quantization processing unit 203 and an inverse frequency transform unit 204 for decoding the prediction difference; a motion information memory 205 for storing the information necessary for motion modeling, such as motion parameters, start frame numbers, and end frame numbers; an inter-frame prediction unit 206 that performs inter-frame prediction; an intra-frame prediction unit 207 that performs intra-frame prediction; an addition unit 208 for obtaining the decoded image; and a reference image memory 209 for temporarily storing the decoded image.
The variable-length decoding unit 202 variable-length decodes the encoded stream 201 and obtains the frequency transform coefficient components of the prediction difference and the information necessary for prediction processing, such as block sizes, motion vectors, and motion parameters.
The former prediction difference information is sent to the inverse quantization processing unit 203, and the latter information necessary for prediction processing is sent to the motion information memory 205, the inter-frame prediction unit 206, or the intra-frame prediction unit 207.
The inverse quantization processing unit 203 and the inverse frequency transform unit 204 then decode the prediction difference information by applying inverse quantization and an inverse frequency transform, respectively. The motion information memory 205 stores the information necessary for motion modeling, such as motion parameters.
The inter-frame prediction unit 206 or the intra-frame prediction unit 207 then executes prediction processing with reference to the reference image memory 209 based on the information sent from the variable-length decoding unit 202, and the addition unit 208 generates the decoded image, which is also stored in the reference image memory 209.
 次に、図10および図11により、本発明の一実施の形態に係る動画像符号化装置の動き記述部の構成および動作について説明する。図10は本発明の一実施の形態に係る動画像符号化装置動き記述部の構成の一例を示す構成図、図11は本発明の一実施の形態に係る動画像符号化装置動き記述部の構成の他の例を示す構成図である。 Next, the configuration and operation of the motion description unit of the video encoding device according to the embodiment of the present invention will be described with reference to FIGS. FIG. 10 is a block diagram showing an example of the configuration of a motion picture encoding apparatus motion description section according to an embodiment of the present invention. FIG. 11 is a diagram of the motion picture encoding apparatus motion description section according to an embodiment of the present invention. It is a block diagram which shows the other example of a structure.
 図10において、動き記述部111は、開始フレームおよび終了フレーム原画像もしくは対象画像301を入力し、開始フレームを記憶するための開始フレームメモリ302、開始フレームと終了フレームの間で対応領域の探索を行う動き探索部303、探索結果に基づいて動きのモデル化を行う動き情報モデル化部304から構成され、動きパラメータ305などの動き情報を出力する。 In FIG. 10, the motion description unit 111 receives the start frame and the end frame original image or the target image 301, and searches for a corresponding area between the start frame and the end frame, and a start frame memory 302 for storing the start frame. It includes a motion search unit 303 that performs the motion information modeling unit 304 that performs motion modeling based on the search result, and outputs motion information such as the motion parameter 305.
 動き記述部111には、まず開始フレームの原画像もしくは復号化画像が入力され、開始フレームメモリ302に記憶される。続いて、終了フレームの原画像もしくは復号化画像が入力されると、動き探索部303では、開始フレームメモリ302に記憶されている開始フレームと、入力された終了フレームの間で対応領域の探索が行われ、探索結果が動き情報モデル化部304に渡される。 First, the original image or decoded image of the start frame is input to the motion description unit 111 and stored in the start frame memory 302. Subsequently, when the original image or decoded image of the end frame is input, the motion search unit 303 searches for the corresponding area between the start frame stored in the start frame memory 302 and the input end frame. The search result is passed to the motion information modeling unit 304.
 動き情報モデル化部304では、探索結果に基づいて動きのモデル化が行われる。動き情報モデル化部304では、例えば、図10の関数MVMt306によって対象領域の動きをモデル化する。 The motion information modeling unit 304 performs motion modeling based on the search result. In the motion information modeling unit 304, for example, the motion of the target region is modeled by the function MVMt 306 in FIG.
 図11において、動き記述部111は、開始フレームおよび終了フレームを符号化する際に算出した動きベクトル401を入力し、開始フレームの動きベクトルを記憶するための開始フレームメモリ402、開始フレームの動きベクトルと終了フレームの動きベクトルから対象領域の動きのモデル化を行う動き情報モデル化部403から構成され、動きパラメータ404などの動き情報を出力する。 11, a motion description unit 111 receives a motion vector 401 calculated when encoding a start frame and an end frame, a start frame memory 402 for storing the motion vector of the start frame, and a motion vector of the start frame. And a motion information modeling unit 403 that models the motion of the target region from the motion vector of the end frame, and outputs motion information such as the motion parameter 404.
 First, the motion vectors calculated when the start frame was encoded are input to the motion description unit 111 and stored in the start frame memory 402. Then, when the motion vectors calculated when the end frame was encoded are input, the motion information modeling unit 403 models the motion based on both sets of vectors; for example, it models the motion of the target region by the function MVMt 405 shown in FIG. 11.
 Next, the procedure for encoding one frame in the video encoding apparatus according to an embodiment of the present invention will be described with reference to FIG. 12. FIG. 12 is a flowchart showing the one-frame encoding procedure of the video encoding apparatus according to an embodiment of the present invention.
 First, the following processing is performed as loop 1 for every block in the frame to be encoded (step 1101). That is, prediction is executed as loop 2 for every combination of coding mode, i.e., prediction method and block size, for the current block (step 1102).
 Here, it is determined whether the mode is an intra prediction mode (step 1103), and depending on the determination in step 1103, intra prediction (step 1104) or inter prediction (step 1105) is performed and the prediction difference is calculated.
 Furthermore, when inter prediction is performed, a motion vector is encoded in addition to the prediction difference. Here, it is determined whether the current frame is the start frame (step 1106); if it is, the motion vector used for inter prediction is stored, for example by the method shown in FIG. 10 (step 1107).
 On the other hand, if the current frame is not the start frame in step 1106, it is determined whether it is the end frame (step 1108); if it is, the motion is modeled using the motion vectors of the corresponding region together with the stored motion vectors of the start frame, and the motion parameters are calculated (step 1109).
 If, in steps 1106 and 1108, the current frame is neither the start frame nor the end frame, the difference vector DMV is calculated using the motion model (step 1111).
 Note that the DMV for the start frame and the end frame is calculated by the conventional H.264/AVC method (step 1110).
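As a rough illustration of steps 1110 and 1111, the sketch below computes the difference vector DMV from the modeled vector for intermediate frames, and a simplified component-wise median predictor in the spirit of the conventional H.264/AVC method for the start and end frames. The real H.264/AVC derivation has additional special cases, and the function names are illustrative.

```python
def difference_vector(mv, mvm_t):
    """DMV for an intermediate frame: the actual motion vector MV minus
    the vector MVM(t) predicted by the motion model (step 1111)."""
    return (mv[0] - mvm_t[0], mv[1] - mvm_t[1])


def h264_predicted_mv(mv_left, mv_top, mv_topright):
    """Simplified H.264/AVC-style prediction for start/end frames
    (step 1110): component-wise median of the neighboring blocks' MVs."""
    def median3(a, b, c):
        return sorted((a, b, c))[1]
    return (median3(mv_left[0], mv_top[0], mv_topright[0]),
            median3(mv_left[1], mv_top[1], mv_topright[1]))
```

Only the (typically small) DMV then needs to be entropy-coded, which is what makes the model-based prediction attractive.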
 Subsequently, frequency transformation (step 1112), quantization (step 1113), and variable-length coding (step 1114) are applied to the prediction difference, and the image quality distortion and code amount of each coding mode are calculated.
 When the above processing has been completed for all coding modes in loop 2, the mode with the highest coding efficiency is selected based on these results (step 1115).
 When selecting the most efficient of the many coding modes, encoding can be performed efficiently by using, for example, the RD-Optimization method, which determines the optimal coding mode from the relationship between image quality distortion and code amount.
 For details of the RD-Optimization method, see: G. Sullivan and T. Wiegand, "Rate-Distortion Optimization for Video Compression", IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, 1998.
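As a minimal sketch of the RD-Optimization idea, assuming the common Lagrangian cost J = D + λ·R (the exact cost function and the choice of λ are not specified in this publication):

```python
def select_best_mode(candidates, lam):
    """Pick the coding mode minimizing the Lagrangian cost J = D + lam * R.

    candidates: list of (mode_name, distortion, rate) tuples, one per
    candidate mode, measured during the loop-2 trial encodings.
    lam: Lagrange multiplier trading distortion against code amount.
    """
    best = min(candidates, key=lambda m: m[1] + lam * m[2])
    return best[0]
```

A large λ favors modes with a small code amount; a small λ favors modes with low distortion, which is the trade-off step 1115 resolves.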
 Subsequently, for the selected coding mode, the quantized frequency transform coefficients are subjected to inverse quantization (step 1116) and inverse frequency transformation (step 1117) to decode the prediction difference, and a decoded image is generated and stored in the reference image memory (step 1118).
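Steps 1116 to 1118 can be illustrated with a scalar-quantization sketch; the frequency transform is omitted for brevity, and the uniform quantizer here is a simplified stand-in, not the H.264/AVC design:

```python
def quantize(coeffs, qstep):
    """Uniform scalar quantization of a list of coefficients."""
    return [int(round(c / qstep)) for c in coeffs]


def dequantize(levels, qstep):
    """Inverse quantization (step 1116): map levels back to coefficients."""
    return [l * qstep for l in levels]


def reconstruct_block(prediction, residual, qstep):
    """Encoder-side local decoding: quantize the residual, dequantize it,
    and add it back to the prediction, so that the encoder's reference
    frames match exactly what the decoder will later produce."""
    levels = quantize(residual, qstep)
    recon_residual = dequantize(levels, qstep)
    return [p + r for p, r in zip(prediction, recon_residual)]
```

Keeping this reconstructed (rather than original) image in the reference memory is what prevents encoder/decoder drift.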
 When the above processing has been completed for all blocks in loop 1, the encoding of one frame of the image is finished (step 1119).
 Next, the procedure for decoding one frame in the video decoding apparatus according to an embodiment of the present invention will be described with reference to FIG. 13. FIG. 13 is a flowchart showing the one-frame decoding procedure of the video decoding apparatus according to an embodiment of the present invention.
 First, the following processing is performed as loop 1 for every block in one frame (step 1201). That is, variable-length decoding (step 1202), inverse quantization (step 1203), and inverse frequency transformation (step 1204) are applied to the input stream to decode the prediction difference.
 Subsequently, it is determined whether the mode is an intra prediction mode (step 1205), and depending on the determination in step 1205, intra prediction (step 1206) or inter prediction (step 1210) is performed.
 Note that when inter prediction is performed, the motion vector MV must be decoded before the prediction. Here, it is determined whether the current frame is the start frame or the end frame (step 1207); if it is, the MV is decoded by the conventional H.264/AVC method (step 1208).
 On the other hand, if, in step 1207, the current frame is neither the start frame nor the end frame, the MV is decoded using the motion model (step 1209).
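A minimal sketch of step 1209, assuming a linear motion model of the form MVM(t) = a·t + b per component (the publication leaves the model's exact form open): the decoder evaluates the transmitted model at the current frame time and adds the decoded difference vector.

```python
def decode_motion_vector(dmv, motion_params, t):
    """Decoder-side MV recovery for an intermediate frame (step 1209):
    MV = MVM(t) + DMV, where motion_params is a list of per-component
    (a, b) pairs of the transmitted time function and dmv is the decoded
    difference vector."""
    mvm_t = tuple(a * t + b for (a, b) in motion_params)
    return (mvm_t[0] + dmv[0], mvm_t[1] + dmv[1])
```

Because the model parameters are shared by all intermediate frames between the start and end frames, only the small per-block DMV has to be parsed from the stream here.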
 When the above processing has been completed for all blocks in the frame in loop 1, the decoding of one frame of the image is finished (step 1211).
 In this embodiment, the DCT is cited as an example of the frequency transformation, but any orthogonal transform used for removing inter-pixel correlation may be employed, such as the DST (Discrete Sine Transform), WT (Wavelet Transform), DFT (Discrete Fourier Transform), or KLT (Karhunen-Loève Transform), and the prediction difference itself may be encoded without applying any frequency transformation at all.
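As a concrete instance of one of the transforms named above, a naive 1-D DCT-II can be written as follows; this is shown only for illustration, since a practical codec would use a fast, integer-friendly 2-D variant:

```python
import math

def dct_ii(x):
    """Naive (unnormalized) 1-D DCT-II of a list of samples: one example
    of the orthogonal transforms (DCT, DST, DFT, KLT, ...) used to
    decorrelate pixels before quantization."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k)
                for n in range(N))
            for k in range(N)]
```

For a constant input, all energy lands in the DC coefficient (k = 0) and the AC coefficients vanish, which is exactly the decorrelation property that makes the subsequent quantization and entropy coding effective.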
 Furthermore, variable-length coding need not be performed either. Also, although in this embodiment the prediction vector was calculated using the function MVMt, the motion vector itself may be expressed by this function. In that case, the motion vector MV equals MVMt, and there is no need to encode the difference vector DMV.
 Although the invention made by the present inventors has been described above concretely based on an embodiment, the present invention is not limited to that embodiment, and it goes without saying that various modifications are possible without departing from its gist.
 The present invention relates to a video encoding technique for encoding moving images and a video decoding technique for decoding moving images, and is widely applicable to apparatuses that encode and decode moving images.
FIG. 1 is an explanatory diagram of the inter prediction processing on which the video encoding method and the video decoding method according to an embodiment of the present invention are premised.
FIG. 2 is an explanatory diagram of the calculation of the prediction vector in the inter prediction processing on which the video encoding method and the video decoding method according to an embodiment of the present invention are premised.
FIG. 3 is a diagram showing an example of the motion modeling of the video encoding method and the video decoding method according to an embodiment of the present invention.
FIG. 4 is a diagram showing an example of the motion vector encoding of the video encoding method and the video decoding method according to an embodiment of the present invention.
FIG. 5 is a diagram showing an example of the selection of the start frame and the end frame in the video encoding method and the video decoding method according to an embodiment of the present invention.
FIG. 6 is a diagram showing another example of the motion modeling of the video encoding method and the video decoding method according to an embodiment of the present invention.
FIG. 7 is a diagram showing another example of the selection of the start frame and the end frame in the video encoding method and the video decoding method according to an embodiment of the present invention.
FIG. 8 is a block diagram showing the configuration of the video encoding apparatus according to an embodiment of the present invention.
FIG. 9 is a block diagram showing the configuration of the video decoding apparatus according to an embodiment of the present invention.
FIG. 10 is a block diagram showing an example of the configuration of the motion description unit of the video encoding apparatus according to an embodiment of the present invention.
FIG. 11 is a block diagram showing another example of the configuration of the motion description unit of the video encoding apparatus according to an embodiment of the present invention.
FIG. 12 is a flowchart showing the one-frame encoding procedure of the video encoding apparatus according to an embodiment of the present invention.
FIG. 13 is a flowchart showing the one-frame decoding procedure of the video decoding apparatus according to an embodiment of the present invention.
Explanation of reference symbols
 101 ... original image, 102 ... input image memory, 103 ... block division unit, 104 ... motion search unit, 105 ... intra prediction unit, 106 ... inter prediction unit, 107 ... mode selection unit, 108 ... subtraction unit, 109 ... frequency transform unit, 110 ... quantization processing unit, 111 ... motion description unit, 112 ... motion information memory, 113 ... variable-length coding unit, 114 ... inverse quantization processing unit, 115 ... inverse frequency transform unit, 116 ... addition unit, 117 ... reference image memory, 201 ... encoded stream, 202 ... variable-length decoding unit, 203 ... inverse quantization processing unit, 204 ... inverse frequency transform unit, 205 ... motion information memory, 206 ... inter prediction unit, 207 ... intra prediction unit, 208 ... addition unit, 209 ... reference image memory, 301 ... target image, 302 ... start frame memory, 303 ... motion search unit, 304 ... motion information modeling unit, 305 ... motion parameters, 306 ... motion modeling function, 401 ... motion vectors, 402 ... start frame memory, 403 ... motion information modeling unit, 404 ... motion parameters, 405 ... motion modeling function, 501 ... image, 502 ... reference image, 503 ... image to be encoded, 504, 505 ... blocks, 601-606 ... blocks, 701-705 ... frames, 706 ... motion modeling function, 707, 708 ... regions, 801-803 ... frames, 804 ... motion vector MV, 805 ... motion MVMt, 806 ... difference vector DMV, 807-809 ... regions, 901 ... I picture, 904, 907 ... P pictures, 902, 903, 905, 906 ... B pictures, 1001 ... I picture, 1002-1004 ... I pictures, 1301-1307 ... frames.

Claims (12)

  1.  A video encoding apparatus comprising:
     an inter prediction unit that performs inter prediction and calculates a prediction difference;
     a motion description unit that models motion information of a target region over a plurality of frames;
     a frequency transform unit and a quantization processing unit that encode the prediction difference; and
     a variable-length coding unit that performs variable-length coding according to symbol occurrence probabilities based on the modeling information from the motion description unit,
     wherein the motion description unit designates a start frame and an end frame within the plurality of frames, and models the motion of the target region between the start frame and the end frame by a time function.
  2.  The video encoding apparatus according to claim 1,
     wherein the variable-length coding unit performs variable-length coding on the difference between the motion vector used when performing the inter prediction on the target region and the vector calculated by the time function with which the motion description unit models the motion of the target region.
  3.  The video encoding apparatus according to claim 1,
     wherein the inter prediction unit performs motion compensation using the vector calculated by the time function with which the motion description unit models the motion of the target region.
  4.  The video encoding apparatus according to claim 1,
     wherein the motion description unit designates the start frame and the end frame within a range that can be modeled by the time function.
  5.  A video encoding method in a video encoding apparatus having an inter prediction unit that performs inter prediction and calculates a prediction difference, a motion description unit that models motion information of a target region over a plurality of frames, a frequency transform unit and a quantization processing unit that encode the prediction difference, and a variable-length coding unit that performs variable-length coding according to symbol occurrence probabilities based on the modeling information from the motion description unit,
     wherein the motion description unit designates a start frame and an end frame within the plurality of frames, and the motion of the target region between the start frame and the end frame is modeled by a time function.
  6.  The video encoding method according to claim 5,
     wherein the variable-length coding unit performs variable-length coding on the difference between the motion vector used when performing the inter prediction on the target region and the vector calculated by the time function with which the motion description unit models the motion of the target region.
  7.  The video encoding method according to claim 5,
     wherein the inter prediction unit performs motion compensation using the vector calculated by the time function with which the motion description unit models the motion of the target region.
  8.  The video encoding method according to claim 5,
     wherein the motion description unit designates the start frame and the end frame within a range that can be modeled by the time function.
  9.  A video decoding apparatus comprising:
     a variable-length decoding unit that decodes variable-length-coded data by the inverse procedure of the variable-length coding;
     an inverse quantization processing unit and an inverse frequency transform unit that decode a prediction difference; and
     an inter prediction unit that performs inter prediction to obtain a decoded image,
     wherein the variable-length decoding unit decodes the motion vector of a target region based on a time function with which the motion of the target region between a start frame and an end frame is modeled.
  10.  The video decoding apparatus according to claim 9,
     wherein the inter prediction unit performs motion compensation using the motion vector calculated by the time function that models the motion of the target region.
  11.  A video decoding method in a video decoding apparatus having a variable-length decoding unit that decodes variable-length-coded data by the inverse procedure of the variable-length coding, an inverse quantization processing unit and an inverse frequency transform unit that decode a prediction difference, and an inter prediction unit that performs inter prediction to obtain a decoded image,
     wherein the variable-length decoding unit decodes the motion vector of a target region based on a time function with which the motion of the target region between a start frame and an end frame is modeled.
  12.  The video decoding method according to claim 11,
     wherein the inter prediction unit performs motion compensation using the motion vector calculated by the time function that models the motion of the target region.
PCT/JP2009/001449 2008-04-16 2009-03-30 Dynamic image encoder, dynamic image decoder, dynamic image encoding method, and dynamic image decoding method WO2009128208A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2010508099A JPWO2009128208A1 (en) 2008-04-16 2009-03-30 Moving picture encoding apparatus, moving picture decoding apparatus, moving picture encoding method, and moving picture decoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008106980 2008-04-16
JP2008-106980 2008-04-16

Publications (1)

Publication Number Publication Date
WO2009128208A1 true WO2009128208A1 (en) 2009-10-22

Family

ID=41198917

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/001449 WO2009128208A1 (en) 2008-04-16 2009-03-30 Dynamic image encoder, dynamic image decoder, dynamic image encoding method, and dynamic image decoding method

Country Status (2)

Country Link
JP (1) JPWO2009128208A1 (en)
WO (1) WO2009128208A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103222265A (en) * 2010-09-30 2013-07-24 三菱电机株式会社 Dynamic image encoding device, dynamic image decoding device, dynamic image encoding method, and dynamic image decoding method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08107558A (en) * 1994-10-07 1996-04-23 Nec Corp Method and device for encoding and encoding motion picture
JP2003319403A (en) * 2002-04-09 2003-11-07 Lg Electronics Inc Method of predicting block in improved direct mode
JP2005513929A (en) * 2001-12-19 2005-05-12 トムソン ライセンシング ソシエテ アノニム Method for estimating the main motion in a sequence of images
JP2005533465A (en) * 2002-07-15 2005-11-04 アップル コンピュータ、インコーポレイテッド Variable precision inter-picture timing designation method and apparatus in digital video encoding processing
JP2007110672A (en) * 2005-09-14 2007-04-26 Sanyo Electric Co Ltd Encoding method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4401336B2 (en) * 2005-08-31 2010-01-20 三洋電機株式会社 Encoding method



Also Published As

Publication number Publication date
JPWO2009128208A1 (en) 2011-08-04


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09731624

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2010508099

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09731624

Country of ref document: EP

Kind code of ref document: A1