WO2012172667A1 - Video encoding method, video decoding method, and apparatus

Info

Publication number
WO2012172667A1
Authority
WO
WIPO (PCT)
Application number
PCT/JP2011/063737
Other languages
English (en)
Japanese (ja)
Inventor
Akiyuki Tanizawa (谷沢 昭行)
Jun Yamaguchi (山口 潤)
Taichiro Shiodera (太一郎 塩寺)
Tomoo Yamakage (山影 朋夫)
Original Assignee
Toshiba Corporation (株式会社 東芝)
Application filed by Toshiba Corporation
Priority to PCT/JP2011/063737
Publication of WO2012172667A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • Embodiments described herein relate generally to an intra-screen prediction method, a moving image encoding method, a moving image decoding method, and an apparatus for encoding and decoding moving images.
  • H.264 (ITU-T Rec. H.264 and ISO/IEC 14496-10) adopts directional prediction in the spatial domain (pixel domain) and applies an orthogonal transform based on the discrete cosine transform (DCT) to the prediction error signal generated as the difference between the input image signal and the predicted image signal.
  • DCT discrete cosine transform
  • Non-Patent Document 1 describes applying a separable transform, defined by a combination of two one-dimensional transforms, to the prediction error generated by weighted averaging of predicted images that use two directional predictions.
  • JCT-VC Joint Collaborative Team on Video Coding
  • Non-Patent Document 1 holds a look-up table (LUT) of one-dimensional transforms for each prediction mode corresponding to one directional prediction, exploiting the property that the prediction error tends to differ for each prediction direction.
  • the prediction direction is mapped to four types of separable two-dimensional transforms composed of the discrete cosine transform (DCT) and predetermined orthogonal transforms (for example, the discrete sine transform (DST) and the Karhunen-Loeve transform (KLT)).
  • DCT discrete cosine transformation
  • DST discrete sine transformation
  • KLT Karhunen-Loeve transformation
  • a separable two-dimensional transform corresponding to the prediction mode with the smaller prediction mode number is selected from the two prediction modes; however, when the two prediction modes of bi-directional prediction use different reference pixel lines, the prediction residual of bi-directional prediction differs in tendency from the prediction error of either of these two unidirectional predictions, so the coding efficiency may be reduced.
  • an object of the present embodiment is to provide a moving picture coding method, a moving picture decoding method, and an apparatus that can improve coding efficiency.
  • the moving image encoding method of the embodiment selects a combination of one-dimensional transforms consisting only of the first orthogonal transform when the two or more prediction modes perform intra-screen prediction using two or more different reference pixel lines, and selects a combination of the first orthogonal transform and the second orthogonal transform when each of the two or more prediction modes performs intra-screen prediction using one and the same reference pixel line.
  • a predicted image signal is generated using the two or more prediction modes.
  • a prediction error signal derived from the predicted image signal is subjected to a two-dimensional transform process using the selected combination of one-dimensional transforms to generate transform coefficients.
  • prediction information indicating the combination of the two or more prediction modes and the transform coefficients are encoded.
  • FIG. 3 is a block diagram illustrating an intra bi-predictive image generation unit 109 according to the first embodiment.
  • FIG. 3 is a block diagram illustrating an orthogonal transform unit 102 according to the first embodiment.
  • Explanatory drawings illustrating the relationship between unidirectional intra prediction, error distribution, and vertical and horizontal transforms according to the first embodiment.
  • Explanatory drawing illustrating the relationship between bidirectional intra prediction, error distribution, and vertical and horizontal transforms according to the first embodiment.
  • FIG. 3 is a block diagram illustrating an inverse orthogonal transform unit 105 according to the first embodiment.
  • A table diagram illustrating the relationship between the transform index, the vertical transform index, and the horizontal transform index according to the first embodiment.
  • A table diagram illustrating the transform matrix names for the vertical transform index and the horizontal transform index according to the first embodiment.
  • Explanatory drawing illustrating reference pixel lines and prediction direction derivation according to the second embodiment.
  • A table diagram illustrating the relationship between the prediction mode, the bidirectional prediction flag, the prediction mode type, and the prediction angle according to the second embodiment.
  • Explanatory drawing of the slice header syntax according to the second embodiment.
  • Explanatory drawing showing an example of the prediction unit syntax according to the second embodiment.
  • FIG. 26B is a table diagram continuing from FIG. 26A.
  • the first embodiment relates to an image encoding device.
  • a moving picture decoding apparatus corresponding to the picture encoding apparatus according to the present embodiment will be described in a third embodiment.
  • This image encoding device can be realized by hardware such as an LSI (Large-Scale Integration) chip, a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array).
  • the image encoding apparatus can also be realized by causing a computer to execute an image encoding program.
  • the image encoding device 100 includes a subtraction unit 101, an orthogonal transform unit 102, a quantization unit 103, an inverse quantization unit 104, an inverse orthogonal transform unit 105, an addition unit 106, a reference image memory 107, an intra unidirectional prediction image generation unit 108, an intra bidirectional prediction image generation unit 109, an inter prediction image generation unit 110, a prediction selection switch 111, a transform information setting unit 112, a prediction selection unit 113, an entropy encoding unit 114, and an output buffer 115.
  • the image coding apparatus in FIG. 1 divides each frame or field constituting the input image 117 into a plurality of pixel blocks, performs predictive coding on the divided pixel blocks, and outputs encoded data 128.
  • pixel blocks are predictively encoded from the upper left to the lower right as shown in FIG. 2A.
  • the encoded pixel block p is located on the left side and the upper side of the encoding target pixel block c in the encoding processing target frame f.
  • the pixel block refers to a unit for processing an image, such as an M × N block (M and N are natural numbers), a coding tree block, a macroblock, a sub-block, or a single pixel.
  • in the following description, the pixel block is basically used in the sense of a coding tree block, but it can be interpreted in any of the above senses by appropriately substituting the description.
  • for example, the pixel block in the description of the prediction unit is interpreted as the pixel block of the prediction unit.
  • the coding tree block is typically, for example, the 16 × 16 pixel block shown in FIG. 2B, but may be the 32 × 32 pixel block shown in FIG. 2C or the 64 × 64 pixel block shown in FIG. 2D.
  • the coding unit is not limited to a pixel block such as a coding tree block, and a frame, a field, a slice, or a combination thereof can be used.
  • FIG. 3A to 3D are diagrams showing specific examples of coding tree blocks.
  • N represents the size of the reference coding tree block.
  • the size when divided is defined as N, and the size when not divided is defined as 2N.
  • the coding tree block has a quadtree structure, and when divided, the four pixel blocks are indexed in the Z-scan order as shown in FIG. 3B.
  • FIG. 3B shows an example in which the 64 × 64 pixel block of FIG. 3A is divided into a quadtree. One partition of the quadtree of the coding tree block can itself be further divided into a quadtree.
  • the unit having the largest coding tree block is called a large coding tree block, and the input image signal is encoded in this unit in raster scan order.
  • FIG. 3D shows an example in which the 32 ⁇ 32 pixel block of FIG. 3C is divided into quadtrees.
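  • To make the quadtree division and Z-scan indexing described above concrete, the following is a minimal sketch (not taken from the patent); the split-decision callback and the example block sizes are illustrative assumptions.

```python
# Minimal sketch of quadtree splitting with Z-scan indexing, as described
# above for coding tree blocks. The split rule is a placeholder assumption.
def z_scan_blocks(x, y, size, min_size, should_split):
    """Yield (x, y, size) leaf blocks of a quadtree in Z-scan order."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        # Z-scan order: top-left, top-right, bottom-left, bottom-right.
        for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
            yield from z_scan_blocks(x + dx, y + dy, half, min_size, should_split)
    else:
        yield (x, y, size)

# Example: split a 64x64 block down to 32x32 everywhere (cf. FIG. 3B).
for idx, blk in enumerate(z_scan_blocks(0, 0, 64, 32, lambda x, y, s: True)):
    print(idx, blk)  # indices 0..3 in Z-scan order
```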
  • the image encoding apparatus in FIG. 1 performs intra prediction (also referred to as intra-screen prediction, intra-frame prediction, etc.) or inter prediction (also referred to as inter-screen prediction, inter-frame prediction, motion-compensated prediction, etc.) on a pixel block based on the encoding parameters input from the encoding control unit 116, and generates a predicted image 125.
  • this image coding apparatus orthogonally transforms and quantizes the prediction error (also called a prediction difference signal) 118 generated by subtracting the predicted image 125 from the input image 117 divided into pixel blocks, performs entropy coding, and outputs encoded data 128.
  • the image encoding device in FIG. 1 performs encoding by selectively applying a plurality of prediction modes having different block sizes and generation methods of the predicted image 125.
  • the generation method of the predicted image 125 is roughly classified into two types: intra prediction, in which prediction is performed within the encoding target frame, and inter prediction, in which prediction is performed using one or more temporally different reference frames.
  • the subtractor 101 subtracts the corresponding prediction image 125 from the input image 117 divided into pixel blocks to obtain a prediction error 118.
  • the prediction error 118 output from the subtractor 101 is input to the orthogonal transform unit 102.
  • the orthogonal transform unit 102 performs, for example, a discrete cosine transform (DCT) or a discrete sine transform (DST) on the prediction error 118 output from the subtractor 101, based on the transform information 127 output from the transform information setting unit 112 described later, to obtain transform coefficients 119.
  • the transform coefficient 119 output from the orthogonal transform unit 102 is input to the quantization unit 103.
  • the quantization unit 103 quantizes the transform coefficients 119 output from the orthogonal transform unit 102 to obtain quantized transform coefficients 120. Specifically, the quantization unit 103 divides the transform coefficients by the quantization step size derived from quantization information, such as the quantization parameter and the quantization matrix, specified by the encoding control unit 116. The quantization parameter indicates the fineness of quantization. The quantization matrix is used to weight the fineness of quantization for each component of the transform coefficients. The quantization unit 103 inputs the quantized transform coefficients 120 to the entropy encoding unit 114 and the inverse quantization unit 104.
  • the entropy encoding unit 114 performs entropy encoding (for example, Huffman coding or arithmetic coding) on various encoding parameters, such as the quantized transform coefficients 120 from the quantization unit 103, the prediction information 126 from the prediction selection unit 113, and the quantization information specified by the encoding control unit 116, and generates encoded data.
  • the encoding parameters are parameters necessary for decoding, such as the prediction information 126, information on the quantized transform coefficients 120, and information on quantization.
  • the encoding control unit 116 may have an internal memory (not shown) in which the encoding parameters are held, so that the encoding parameters of adjacent, already encoded pixel blocks can be used when encoding a pixel block. For example, in H.264 intra prediction, the prediction value of the prediction mode of a pixel block can be derived from the prediction mode information of encoded adjacent blocks.
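  • As one concrete illustration of reusing neighboring encoding parameters, the sketch below assumes the H.264 rule that the predicted intra mode is the minimum of the left and upper neighbors' modes, with unavailable neighbors treated as DC prediction (mode 2).

```python
# Sketch of the H.264-style prediction-mode predictor mentioned above.
# Unavailable neighbors default to DC prediction (mode 2), as in H.264.
DC_MODE = 2

def predicted_intra_mode(left_mode, above_mode):
    """Return the predicted prediction mode from already-encoded neighbors."""
    left = left_mode if left_mode is not None else DC_MODE
    above = above_mode if above_mode is not None else DC_MODE
    return min(left, above)

assert predicted_intra_mode(0, 1) == 0   # vertical vs. horizontal -> vertical
assert predicted_intra_mode(None, 5) == 2  # missing left neighbor -> DC
```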
  • the encoded data generated by the entropy encoding unit 114 is temporarily accumulated in the output buffer 115 through multiplexing, for example, and is output as encoded data 128 at an appropriate output timing managed by the encoding control unit 116.
  • the encoded data 128 is output to, for example, a storage system (storage medium) or a transmission system (communication line) not shown.
  • the inverse quantization unit 104 performs an inverse quantization process on the quantized transform coefficients 120 output from the quantization unit 103 to obtain restored transform coefficients 121. Specifically, the inverse quantization unit 104 performs inverse quantization according to the quantization information used in the quantization unit 103 (multiplies the quantized transform coefficients 120 by the quantization step size derived from the quantization information). At this time, the quantization information used in the quantization unit 103 is loaded from the internal memory (not shown) of the encoding control unit 116. The inverse quantization unit 104 inputs the restored transform coefficients 121 to the inverse orthogonal transform unit 105.
  • the inverse orthogonal transform unit 105 performs, on the restored transform coefficients 121 from the inverse quantization unit 104, an inverse orthogonal transform corresponding to the orthogonal transform performed in the orthogonal transform unit 102, such as an inverse discrete cosine transform (IDCT) or an inverse discrete sine transform (IDST), based on the transform information 127 output from the transform information setting unit 112 described later, to obtain a restored prediction error 122.
  • the reconstruction prediction error 122 output from the inverse orthogonal transform unit 105 is input to the addition unit 106.
  • the addition unit 106 adds the restored prediction error 122 and the corresponding prediction image 125 to generate a local decoded image 123.
  • the locally decoded image 123 is input to the reference image memory 107.
  • the reference image memory 107 stores the locally decoded image 123, which is referred to as the reference image 124 whenever the intra unidirectional prediction image generation unit 108, the intra bidirectional prediction image generation unit 109, or the inter prediction image generation unit 110 generates a predicted image.
  • the intra unidirectional prediction image generation unit 108 performs unidirectional intra prediction using the reference image 124 stored in the reference image memory 107, and generates an intra prediction image by copying reference pixels along the prediction direction or by applying interpolation processing (filter processing or the like) to them.
  • FIG. 4A shows a prediction direction of intra prediction in H.264 / MPEG-4 AVC.
  • FIG. 4B shows an arrangement relationship between reference pixel lines and encoding target pixels in H.264 / MPEG-4 AVC.
  • FIG. 4C shows a predicted image generation method in mode 1 (horizontal prediction), in which pixels I to L are copied in the prediction direction from the left reference pixel line.
  • FIG. 4D shows a predicted image generation method in mode 4 (diagonal lower right prediction).
  • the predicted value of the pixel position below the reference pixel B is derived by performing (1, 2, 1) 3-tap linear filter processing on the three reference pixels A, B, and C. The derived pixel values are also copied in the prediction direction shown in the figure.
  • FIG. 5 shows an example of the prediction angle and the prediction mode when the prediction mode is expanded up to 34 prediction modes.
  • in FIG. 5, there are 33 prediction directions relative to the vertical and horizontal coordinate axes indicated by the bold lines.
  • the direction of a typical prediction angle indicated by H.264 / MPEG-4 AVC is indicated by an arrow.
  • 33 types of prediction directions are prepared in a direction in which a line is drawn from the origin to a mark indicated by a circle.
  • DC prediction for prediction based on the average value of available reference pixels is added, and there are a total of 34 prediction modes.
  • for example, when IntraPredMode is 4, IntraPredAngleIdL0 in FIG. 7A described later is -4.
  • an arrow indicated by a dotted line in FIG. 5 indicates a prediction mode whose prediction type is Intra_Vertical, and an arrow indicated by a solid line indicates a prediction mode whose prediction type is Intra_Horizontal.
  • FIG. 6 shows the relationship between IntraPredAngleIdLX and intraPredAngle used for predictive image value generation.
  • 7A and 7B show the relationship between the prediction mode (PredMode), a bidirectional prediction flag (BipredFlag) described later, a prediction mode type (PredTypeL0, PredTypeL1), and a prediction angle (PredAngleIdL0, PredAngleIdL1).
  • intraPredAngle indicates the prediction angle that is actually used when the predicted value is generated.
  • a prediction value generation method in the case where the prediction type is Intra_Vertical and the intraPredAngle shown in FIGS. 7A and 7B is a positive value is expressed by the following equation (1).
  • BLK_SIZE indicates the size of the pixel block
  • ref [] indicates an array in which a reference image (also referred to as a reference pixel line) is stored.
  • pred (k, m) indicates the generated predicted image 125.
  • a predicted value can be generated by a similar method according to the tables of FIGS. 7A and 7B.
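  • Since equation (1) itself is not reproduced in this text, the following is a hedged reconstruction of Intra_Vertical prediction with a positive intraPredAngle, in the HEVC style (1/32-sample angle precision with two-tap linear interpolation); the function and array names are assumptions, not the patent's exact formula.

```python
# Hedged sketch of Intra_Vertical angular prediction with positive
# intraPredAngle. Illustrative reconstruction only: 1/32-sample angle
# precision and two-tap interpolation, as commonly used in HEVC-era designs.
def intra_vertical_predict(ref, blk_size, intra_pred_angle):
    """ref: top reference pixel line, ref[0] at the block's top-left corner."""
    pred = [[0] * blk_size for _ in range(blk_size)]
    for m in range(blk_size):          # row index (vertical distance)
        offset = (m + 1) * intra_pred_angle
        idx, frac = offset >> 5, offset & 31
        for k in range(blk_size):      # column index
            a = ref[k + idx + 1]
            b = ref[k + idx + 2]
            pred[m][k] = ((32 - frac) * a + frac * b + 16) >> 5
    return pred

# Example: angle 0 degenerates to a plain vertical copy of the top line.
ref_line = list(range(2 * 8 + 1))  # dummy reference samples
p = intra_vertical_predict(ref_line, 4, 0)
assert p[0] == p[3]  # every row equals the top reference line
```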
  • the intra bidirectional prediction image generation unit 109 performs bidirectional intra prediction using the reference image 124 stored in the reference image memory 107. For example, in Non-Patent Document 1 described above, two prediction modes are selected from the nine prediction modes defined in H.264/MPEG-4 AVC, the respective predicted image signals are generated, and the final predicted image signal is then generated by performing filter processing on them for each pixel.
  • Bidirectional prediction when the number of unidirectional prediction modes is expanded to 34 types will be described more specifically with reference to FIG.
  • the maximum number of modes is not limited, and bidirectional prediction can easily be extended to any number of unidirectional predictions.
  • the intra bidirectional prediction image generation unit 109 in FIG. 8 includes a weighted average unit 801, a first unidirectional intra predicted image generation unit 802, and a second unidirectional intra predicted image generation unit 803.
  • the functions of the first unidirectional intra predicted image generation unit 802 and the second unidirectional intra predicted image generation unit 803 are the same. These may be the same as the intra unidirectional predicted image generation unit 108. In this case, since the three processing units can have the same hardware configuration, the circuit scale can be reduced. Each of these generates a prediction image corresponding to the prediction mode given according to the prediction mode information controlled by the encoding control unit 116.
  • a first predicted image 851 is output from the first unidirectional intra predicted image generation unit 802, and a second predicted image 852 is output from the second unidirectional intra predicted image generation unit 803.
  • each predicted image is input to the weighted average unit 801, and a weighted average process is performed.
  • calculation based on the following formula (2) is performed.
  • the bidirectionally predicted image P [x, y] is expressed by the following equation.
  • W [x, y] represents a weighted table, and is set by a combination of two prediction modes used in bidirectional intra prediction.
  • Norm is a fixed number for normalization introduced in order to make Equation (2) an integer arithmetic process
  • Offset indicates an offset in rounding
  • Shift indicates a shift amount for division.
  • as an example, the Norm value is 1024, the Offset value is 512, and the Shift value is 10.
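  • Equation (2) is not reproduced above; the sketch below assumes the common integer-arithmetic form P = (W × P1 + (Norm - W) × P2 + Offset) >> Shift, using the Norm, Offset, and Shift values given in the text. The weight assignment W[x, y] itself is set per mode combination and is not shown here.

```python
# Sketch of the weighted-average bidirectional combination described above,
# assuming the form P = (W*P1 + (Norm - W)*P2 + Offset) >> Shift.
# Equation (2) itself is not reproduced in the text, so this is illustrative.
NORM, OFFSET, SHIFT = 1024, 512, 10  # values given in the text

def bipred_pixel(p1, p2, w):
    """Combine two unidirectional predictions with weight w in [0, NORM]."""
    return (w * p1 + (NORM - w) * p2 + OFFSET) >> SHIFT

assert bipred_pixel(100, 200, NORM) == 100       # full weight on P1
assert bipred_pixel(100, 200, NORM // 2) == 150  # equal weighting
```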
  • bidirectional intra prediction corresponds to the case where the BipredFlag shown in FIGS. 7A and 7B is 1.
  • two prediction mode types are defined: the prediction mode type corresponding to the first unidirectional intra prediction image generation unit 802 is denoted PredTypeL0, and that corresponding to the second unidirectional intra prediction image generation unit 803 is denoted PredTypeL1. The same applies to PredAngleIdLX; the type and angle of each prediction mode are distinguished by whether the X portion is 0 or 1.
  • in FIGS. 7A and 7B, the combinations of two prediction modes corresponding to bidirectional intra prediction are illustrated as fixed combinations. However, a special prediction mode that uses a prediction mode held by an encoded adjacent pixel block may be added. In this case, since a combination of two prediction modes that are spatially close can be selected, a combination of bidirectional intra prediction that matches the characteristics of the image can be realized without depending on the number of fixed prediction modes, and the number of fixed prediction modes can also be reduced. Further, although the present embodiment shows an example in which the number of bidirectional intra prediction modes is 16, the number of prediction modes can easily be increased or decreased.
  • the optimum number of prediction modes may be set in consideration of the balance between the number of prediction modes and coding efficiency. The above is the description of the intra bidirectional prediction image generation unit 109.
  • the inter prediction unit 110 uses the reference image 124 stored in the reference image memory 107 to perform inter prediction.
  • the inter prediction unit 110 performs an interpolation process (motion compensation) based on a motion shift amount (motion vector) between the prediction target block and the reference image 124 to generate an inter prediction image.
  • interpolation processing up to 1 ⁇ 4 pixel accuracy is possible.
  • the derived motion vector is entropy encoded as part of the prediction information 126.
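  • As a generic illustration of the quarter-pel interpolation mentioned above (not the patent's specific filter), the sketch below uses the H.264-style 6-tap half-sample filter (1, -5, 20, 20, -5, 1)/32 followed by rounding averages for quarter-sample positions.

```python
# Illustrative 1D sub-pel interpolation in the H.264 style: a 6-tap filter
# for half-sample positions, then rounding averages for quarter samples.
def clip255(v):
    return max(0, min(255, v))

def half_sample(s, i):
    """Half-pel value between s[i] and s[i+1] (needs s[i-2..i+3])."""
    acc = s[i-2] - 5*s[i-1] + 20*s[i] + 20*s[i+1] - 5*s[i+2] + s[i+3]
    return clip255((acc + 16) >> 5)

def quarter_sample(s, i):
    """Quarter-pel value between s[i] and the half-pel at i + 1/2."""
    return (s[i] + half_sample(s, i) + 1) >> 1

row = [10, 12, 14, 16, 18, 20, 22, 24]
print(half_sample(row, 3), quarter_sample(row, 3))  # values between 16 and 18
```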
  • the prediction selection switch 111 selects the output terminal of the intra unidirectional prediction image generation unit 108, the output terminal of the intra bidirectional prediction image generation unit 109, or the output terminal of the inter prediction image generation unit 110 according to the prediction information 126 from the prediction selection unit 113, and the selected intra predicted image or inter predicted image is input to the subtraction unit 101 and the addition unit 106 as the predicted image 125.
  • when intra prediction is selected, the prediction selection switch 111 connects the switch to the output terminal of the intra unidirectional prediction image generation unit 108 or the intra bidirectional prediction image generation unit 109 according to the prediction mode shown in FIGS. 7A and 7B.
  • when inter prediction is selected, the prediction selection switch 111 connects the switch to the output terminal of the inter prediction unit 110.
  • the prediction selection unit 113 has a function of setting the prediction information 126 according to the prediction mode controlled by the encoding control unit 116. As described above, intra prediction or inter prediction can be selected to generate the predicted image 125, but a plurality of modes can be further selected for each of intra prediction and inter prediction.
  • the encoding control unit 116 selects one of the plurality of prediction modes of intra prediction and inter prediction as the optimal prediction mode, and the prediction selection unit 113 sets the prediction information 126 according to the determined optimal prediction mode.
  • when intra prediction is selected, the intra unidirectional prediction image generation unit 108 or the intra bidirectional prediction image generation unit 109 is selected accordingly.
  • the encoding control unit 116 may specify the prediction mode information in order from the smallest prediction mode number, or may specify the prediction mode information in order from the largest. Further, the prediction mode may be limited according to the characteristics of the input image, or a predetermined prediction mode may be selected. It is not always necessary to specify all the prediction modes, and at least one prediction mode information may be specified for the encoding target block.
  • the encoding control unit 116 determines the optimal prediction mode using the cost function of the following equation (3): K = SAD + λ × OH.
  • OH represents an estimated amount of code amount (for example, the number of bits of a value representing a symbol in binary) related to the prediction information 126 (for example, prediction mode information, motion vector information, and prediction block size information), and SAD Indicates the sum of absolute differences between the prediction target block and the predicted image 125 (that is, the cumulative sum of the absolute values of the prediction errors 118).
  • λ represents a Lagrange multiplier determined based on the value of the quantization information (quantization parameter)
  • K represents an encoding cost.
  • the prediction mode that minimizes the coding cost K is determined as the optimum prediction mode from the viewpoint of the generated code amount and the prediction error.
  • the encoding cost may be estimated from OH alone or SAD alone, or the encoding cost may be estimated using a value obtained by subjecting SAD to Hadamard transform or an approximate value thereof.
  • alternatively, the encoding control unit 116 may determine the optimal prediction mode using the cost function of the following equation (4): J = D + λ × R.
  • in equation (4), D indicates the sum of squared errors (i.e., encoding distortion) between the prediction target block and the locally decoded image, and R indicates the code amount estimated by provisionally encoding the prediction error between the prediction target block and the predicted image 125 in that prediction mode.
  • in this case, provisional encoding processing and local decoding processing are required for each prediction mode, so the circuit scale or the amount of computation increases.
  • on the other hand, since the encoding cost J is derived from more accurate encoding distortion and code amount, it is easy to determine the optimal prediction mode with high accuracy and to maintain high encoding efficiency.
  • the encoding cost may be estimated from only R or D, or the encoding cost may be estimated using an approximate value of R or D. These costs may be used hierarchically.
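  • A minimal sketch of mode decision with the two cost functions, assuming the standard forms K = SAD + λ × OH for equation (3) and J = D + λ × R for equation (4), since the equation bodies are not reproduced in this text:

```python
# Sketch of mode decision with the two cost functions described above,
# assuming K = SAD + lambda*OH (eq. 3) and J = D + lambda*R (eq. 4).
def cost_k(sad, oh_bits, lam):
    """Low-complexity cost: prediction SAD plus estimated side-info bits."""
    return sad + lam * oh_bits

def cost_j(distortion, rate_bits, lam):
    """RD cost from provisional encoding: squared distortion plus real bits."""
    return distortion + lam * rate_bits

def best_mode(candidates, lam, cost=cost_k):
    """candidates: iterable of (mode, metric, bits); returns the cheapest mode."""
    return min(candidates, key=lambda c: cost(c[1], c[2], lam))[0]

modes = [("vertical", 320, 3), ("horizontal", 290, 5), ("dc", 400, 1)]
print(best_mode(modes, lam=10.0))  # -> "horizontal" (290 + 50 = 340)
```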
  • the encoding control unit 116 performs determination using Expression (3) or Expression (4) based on information obtained in advance regarding the prediction target block (prediction mode of surrounding pixel blocks, results of image analysis, and the like). The number of prediction mode candidates may be narrowed down in advance.
  • the conversion information setting unit 112 has a function of generating conversion information 127 used in orthogonal transformation based on the prediction information 126 output from the prediction selection unit 113 and input via the prediction selection switch 111.
  • the orthogonal transform unit 102 includes a selection switch A 901, a vertical conversion unit 906, a transposition unit 904, a selection switch B 905, and a horizontal conversion unit 907.
  • the vertical transform unit 906 includes a 1D discrete cosine transform unit 902 and a 1D discrete sine transform unit 903.
  • the horizontal transform unit 907 includes a 1D discrete cosine transform unit 902 and a 1D discrete sine transform unit 903.
  • the order of the vertical conversion unit 906 and the horizontal conversion unit 907 is an example, and these may be reversed.
  • the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 have a common function in that the input matrix is multiplied by a discrete cosine transform or a 1D transform matrix for discrete sine transform, respectively.
  • the selection switch A 901 guides the prediction error 118 to one of the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 according to the vertical transform index (Vertical_transform_idx) included in the 1D transform index 952.
  • the 1D discrete cosine transform unit 902 performs 1D discrete cosine transform on the input prediction error (matrix) 118 and outputs a temporary transform coefficient 951.
  • the 1D discrete sine transform unit 903 performs 1D discrete sine transform on the input prediction error (matrix) 118 and outputs a temporary transform coefficient 951.
  • the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 perform the one-dimensional orthogonal transform of the following equation (5), Y = V X, to remove the vertical correlation of the prediction error (matrix) 118.
  • in equation (5), X represents the matrix (N × N) of the prediction error 118, V comprehensively represents the 1D discrete cosine transform matrix and the 1D discrete sine transform matrix (both N × N), and Y indicates the output matrix (N × N) of the 1D discrete cosine transform unit 902 or the 1D discrete sine transform unit 903.
  • the transform matrix V is, for example, an N × N transform matrix in which discrete cosine transform bases or discrete sine transform bases prepared for removing the vertical correlation of the matrix X are arranged.
  • the transposition unit 904 transposes the output matrix (Y) of the vertical conversion unit 906, and gives the result to the selection switch B905.
  • the transposition unit 904 is an example, and corresponding hardware need not necessarily be prepared.
  • for example, if the result of the 1D orthogonal transform (one-dimensional orthogonal transform) executed by the vertical transform unit 906 (each element of its output matrix) is held and read in an appropriate order when the horizontal transform unit 907 performs its 1D orthogonal transform, the transposition of the output matrix (Y) can be executed without preparing hardware corresponding to the transposition unit 904.
  • the selection switch B 905 guides the input matrix from the transposing unit 904 to either the 1D discrete cosine transform unit 902 or the 1D discrete sine transform unit 903 according to the horizontal transform index (Horizontal_transform_idx) included in the 1D transform index 952.
  • the 1D discrete cosine transform unit 902 performs a discrete cosine transform on the input matrix and outputs a transform coefficient 119.
  • the 1D discrete sine transform unit 903 performs discrete sine transform on the input matrix and outputs a transform coefficient 119.
  • the 1D discrete cosine transform unit 902 and the 1D discrete sine transform unit 903 then perform the one-dimensional orthogonal transform of the following equation (8), Z = H Y^T, to remove the horizontal correlation of the prediction error.
  • in equation (8), H comprehensively represents the 1D discrete cosine transform matrix and the 1D discrete sine transform matrix (both N × N), and Z represents the output matrix (N × N) of the 1D discrete cosine transform unit 902 or the 1D discrete sine transform unit 903, which corresponds to the transform coefficients 119.
  • the transform matrix H is an N × N transform matrix in which discrete cosine transform bases or discrete sine transform bases for removing the correlation remaining in the transposed matrix Y^T (that is, the horizontal correlation of the prediction error) are arranged. That is, the discrete cosine transform matrix of equation (6) or the discrete sine transform matrix of equation (7) corresponds to each.
  • as described above, the orthogonal transform unit 102 performs a separable 2D orthogonal transform (two-dimensional orthogonal transform) on the prediction error (matrix) 118 according to the 1D transform index 952 output from the 1D transform setting unit 908, and generates transform coefficients (matrix) 119.
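  • The separable 2D transform and its inverse can be sketched as follows. The orthonormal DCT-II and DST-VII matrix definitions are common choices and an assumption here, since equations (5) to (8) specify only the structure of the two 1D stages; the inverse applies the two orthogonal stages in reverse order (the text notes the stage order is itself an example).

```python
# Sketch of the separable 2D transform built from the 1D DCT/DST stages
# described above. Matrix definitions (orthonormal DCT-II, DST-VII) are
# illustrative assumptions, not reproduced from equations (6) and (7).
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II: rows are 1D basis vectors."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m

def dst_matrix(n):
    """Orthonormal DST-VII, the sine transform commonly paired with DCT."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return 2.0 / np.sqrt(2 * n + 1) * np.sin(np.pi * (2 * i + 1) * (k + 1) / (2 * n + 1))

def forward_2d(x, v_mat, h_mat):
    """Vertical 1D transform (V @ x), then horizontal 1D transform on Y^T."""
    return h_mat @ (v_mat @ x).T   # transposition unit between the two stages

def inverse_2d(z, v_mat, h_mat):
    """Invert the two orthogonal 1D stages in reverse order."""
    return v_mat.T @ (h_mat.T @ z).T

n = 4
V, H = dst_matrix(n), dct_matrix(n)   # e.g. a TransformIdx selecting DST/DCT
err = np.arange(n * n, dtype=float).reshape(n, n)
coeff = forward_2d(err, V, H)
assert np.allclose(inverse_2d(coeff, V, H), err)  # perfect reconstruction
```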
  • the 1D discrete cosine transform unit 902 may be replaced with the discrete cosine transform of H.264/MPEG-4 AVC, reusing an existing orthogonal transform.
  • the orthogonal transform unit 102 may implement various orthogonal transforms such as Hadamard transform and Karhunen-Loeve transform in addition to DCT.
  • some of the intra prediction modes supported by H.264/MPEG-4 AVC and the like generate a predicted image by copying interpolated (for example, filter-processed) pixel values from a reference pixel group (reference pixel line) on a line adjacent to the left side, the upper side, or both of the prediction target block, along the prediction direction. Since intra prediction modes use the spatial correlation of the image, the prediction accuracy tends to decrease as the distance from the reference pixel to the prediction target pixel increases.
  • in an intra prediction mode that predicts from the left reference pixel line (for example, mode 1 and mode 8 in FIG. 4A), after interpolation processing (for example, filter processing), the prediction error shows a tendency in the horizontal direction.
  • in an intra prediction mode that predicts from the upper reference pixel line (for example, mode 0, mode 3, and mode 7 in FIG. 4A), the prediction error shows a tendency in the vertical direction.
  • in a prediction mode that uses both the left and upper reference pixel lines (for example, mode 4 in FIG. 4A), the prediction error shows a tendency in both the horizontal and vertical directions.
  • in each case, the tendency relates to the direction orthogonal to the line of the reference pixel group used for generating the predicted image.
  • for a prediction error with such a tendency, the 1D discrete sine transform unit 903 achieves higher coefficient concentration than the 1D discrete cosine transform unit 902 when performing the 1D orthogonal transform along the direction of the tendency (vertical or horizontal); that is, the proportion of non-zero coefficients in the quantized transform coefficients 120 is reduced.
  • the 1D discrete cosine transform unit 902, by contrast, is a general-purpose transform that does not rely on such properties. If the 1D discrete sine transform is used along the direction of the tendency, the transform efficiency for the intra prediction error improves, and consequently the coding efficiency improves.
  • for example, a prediction error signal in mode 0 shows the above tendency in the vertical direction but not in the horizontal direction. Therefore, an efficient orthogonal transform can be realized by performing the 1D orthogonal transform with the 1D discrete sine transform unit 903 in the vertical transform unit 906 and with the 1D discrete cosine transform unit 902 in the horizontal transform unit 907.
  • the predicted image 125 generated by the intra unidirectional predicted image generation unit 108 has an effect of removing the spatial direction correlation of the input image 117.
  • the spatial correlation of an image is higher at shorter distances and decreases at longer distances. For this reason, the prediction error is smaller when the distance along the prediction direction between the reference pixel and the prediction target pixel is shorter, and larger when it is longer.
  • FIG. 10A shows an example of a general prediction error tendency statistically seen as a prediction direction corresponding to mode 0 in FIG. 4A.
  • it can be seen that the prediction error in the vertical direction increases as the distance from the reference pixel increases, while the prediction error in the horizontal direction remains about the same magnitude.
  • the spatial correlation can be efficiently removed by selecting DST as 1D vertical conversion and selecting DCT as 1D horizontal conversion.
  • FIG. 10B shows an example of a general prediction pixel tendency statistically seen as a prediction direction corresponding to mode 1 in FIG. 4A. It can be seen that while the vertical prediction error is the same size, the horizontal prediction error increases as the distance from the reference pixel increases. In this case, spatial correlation can be efficiently removed by selecting DCT as 1D vertical conversion and selecting DST as 1D horizontal conversion.
  • FIG. 10C shows an example corresponding to the prediction direction of mode 4 in FIG. 4A. It can be seen that the prediction errors in the vertical and horizontal directions both increase as the distance from the reference pixel that is the starting point of the prediction direction increases. In this case, spatial correlation can be efficiently removed by selecting DST as the 1D vertical transform and DST as the 1D horizontal transform.
  • the intra bidirectional prediction image generation unit 109 generates a predicted image by weighted averaging over two prediction directions. For this reason, it has the effect of removing correlation that varies smoothly in space while retaining the characteristics of the two prediction directions.
  • FIG. 10D shows an example of a general tendency of prediction error seen statistically when mode 1 and mode 8 in FIG. 4A are used. In any mode, the prediction is performed only from the line positioned to the left of the prediction target block as the reference pixel line. Therefore, it can be seen that the prediction error in the vertical direction is the same size, but the prediction error in the horizontal direction increases as the distance from the reference pixel increases. In this case, spatial correlation can be efficiently removed by selecting DCT as 1D vertical conversion and selecting DST as 1D horizontal conversion.
  • FIG. 10E shows an example of the general tendency of the prediction error seen statistically when mode 0 and mode 1 in FIG. 4A are used.
  • Mode 0 uses a reference line located above the prediction target block
  • mode 1 uses a reference line located to the left of the prediction target block.
  • the prediction error in this case has a tendency similar to that of the intra unidirectional prediction described with reference to FIG. 10C: the prediction errors in the vertical and horizontal directions both increase with the distance from the left and upper reference pixels that are the starting points of the prediction directions.
  • in Non-Patent Document 1, when the 1D vertical transform and the 1D horizontal transform are selected according to the two selected prediction modes, either the transform of FIG. 10A or that of FIG. 10B is selected, which does not match the error tendency of FIG. 10E.
  • Intra_DC indicates DC prediction that is predicted by the average value of available reference pixels.
  • in the prediction error of DC prediction, the directional tendency described above does not occur statistically, so the DCT used in H.264/MPEG-4 AVC is selected.
  • when one of the two prediction modes is DC prediction, the tendency of the prediction error changes according to the prediction direction of the other prediction mode. Therefore, in such a case, the redundancy of the prediction error can be reduced efficiently by setting the TransformIdx determined for the other prediction mode, as illustrated in FIGS. 11A and 11B.
  • the inverse orthogonal transform unit 105 includes a selection switch A 1201, a vertical inverse transform unit 1206, a transposition unit 1204, a selection switch B 1205, and a horizontal inverse transform unit 1207.
  • the vertical inverse transform unit 1206 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203.
  • the horizontal inverse transform unit 1207 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203. Note that the order of the vertical inverse transform unit 1206 and the horizontal inverse transform unit 1207 is an example, and these may be reversed.
  • the two 1D inverse discrete cosine transform units 1202 shown in the figure can also be realized by using physically identical hardware in a time division manner. The same applies to the 1D inverse discrete sine transform unit 1203.
  • the selection switch A 1201 guides the restored transform coefficients 121 to either the 1D inverse discrete cosine transform unit 1202 or the 1D inverse discrete sine transform unit 1203 according to the vertical transform index (Vertical_transform_idx) included in the 1D transform index 1252 output from the 1D transform setting unit 1208.
  • the 1D inverse discrete cosine transform unit 1202 multiplies the input restoration transform coefficient 121 (matrix format) by a transposed matrix of the discrete cosine transform matrix and outputs the result.
  • the 1D inverse discrete sine transform unit 1203 multiplies the input restoration transform coefficient 121 by the transposed matrix of the discrete sine transform matrix and outputs the result.
  • the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 perform a one-dimensional inverse orthogonal transform represented by the following equation (9).
  • in equation (9), Y′ = V^T Z′: Z′ represents the matrix (N × N) of the restored transform coefficients 121, V^T comprehensively represents the transposed 1D discrete cosine transform matrix and 1D discrete sine transform matrix (both N × N), and Y′ indicates the output matrix (N × N) of the 1D inverse discrete cosine transform unit 1202 or the 1D inverse discrete sine transform unit 1203. That is, V^T corresponds to the transpose of the discrete cosine transform matrix of equation (6) or of the discrete sine transform matrix of equation (7), respectively.
  • the transposition unit 1204 transposes the output matrix (Y ′) of the vertical inverse transform unit 1206 and supplies the transposition to the selection switch B 1205.
  • the transposition unit 1204 is an example, and corresponding hardware may not necessarily be prepared.
  • for example, if the result of the 1D inverse orthogonal transform (one-dimensional inverse orthogonal transform) executed by the vertical inverse transform unit 1206 (each element of its output matrix) is held and read in an appropriate order when the horizontal inverse transform unit 1207 performs its 1D inverse orthogonal transform, the transposition of the output matrix (Y′) can be performed without preparing hardware corresponding to the transposition unit 1204.
  • the selection switch B 1205 converts the input matrix from the transposition unit 1204 into the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform according to the horizontal transformation index (Horizontal_transform_idx) included in the 1D transformation index 1252 output from the 1D transformation setting unit 1208. Lead to one of the sections 1203.
  • the 1D inverse discrete cosine transform unit 1202 performs 1D inverse discrete cosine transform on the input matrix and outputs the result.
  • the 1D inverse discrete sine transform unit 1203 performs 1D inverse discrete sine transform on the input matrix and outputs the result.
  • the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 perform a one-dimensional inverse orthogonal transform represented by the following formula (10).
  • in equation (10), H^T comprehensively represents the transposed 1D discrete cosine transform matrix and 1D discrete sine transform matrix (both N × N), and X′ indicates the output matrix (N × N) of the 1D inverse discrete cosine transform unit 1202 or the 1D inverse discrete sine transform unit 1203, which corresponds to the restored prediction error 122. That is, H^T corresponds to the transpose of the discrete cosine transform matrix of equation (6) or of the discrete sine transform matrix of equation (7), respectively.
  • the inverse orthogonal transform unit 105 performs inverse orthogonal transform on the reconstructed transform coefficient (matrix) 121 according to the input orthogonal transform information 127 to generate a reconstructed prediction error (matrix) 122.
  • the 1D inverse discrete cosine transform unit 1202 may be replaced with an inverse discrete cosine transform of H.264 / MPEG-4 AVC by reusing the existing inverse orthogonal transform.
  • the inverse orthogonal transform unit 105 may realize various inverse orthogonal transforms such as Hadamard transform and Karhunen-Loeve transform. In any case, an inverse orthogonal transform corresponding to the orthogonal transform unit 102 may be selected.
  • the 1D transform setting units 908 and 1208 have the function of setting the 1D transform index that selects the transform matrix used for the vertical orthogonal transform and the vertical inverse orthogonal transform, and the transform matrix used for the horizontal orthogonal transform and the horizontal inverse orthogonal transform.
  • the 1D transform index 1252 directly or indirectly indicates the orthogonal transform selected by the vertical orthogonal transform and the horizontal orthogonal transform, respectively.
  • the 1D transform index 1252 can be expressed by a transform index (TransformIdx) shown in FIG. 13A and a 1D orthogonal transform in the vertical or horizontal direction (Vertical_transform_idx and Horizontal_transform_idx, respectively).
  • a 1D transformation index (Vertical_transform_idx) for a vertical transformation unit and a 1D transformation index (Horizontal_transform_idx) for a horizontal transformation unit can be derived from the transformation index.
  • FIG. 13B shows whether each idx indicates discrete cosine transform or discrete sine transform.
  • when idx is "1", it indicates the discrete sine transform matrix (DST), and when idx is "0", it indicates the discrete cosine transform matrix (DCT).
  • the corresponding 1D transform index 1252 is referenced based on TransformIdx included in the orthogonal transform information 127, and Vertical_transform_idx is output to the selection switch A and Horizontal_transform_idx is output to the selection switch B.
  • when Vertical_transform_idx or Horizontal_transform_idx indicates DCT with reference to FIG. 13B, the selection switch connects the output end to the 1D inverse discrete cosine transform unit 1202 (or the 1D discrete cosine transform unit 902).
  • when Vertical_transform_idx or Horizontal_transform_idx indicates DST with reference to FIG. 13B, the selection switch connects the output end to the 1D inverse discrete sine transform unit 1203 (or the 1D discrete sine transform unit 903).
  • the orthogonal transform information 127 directly or indirectly indicates the transform index corresponding to the selected prediction mode, with reference to a predetermined mapping between transform indices and prediction modes.
  • FIGS. 11A and 11B show the relationship between the prediction mode and the transform index.
  • FIGS. 11A and 11B are obtained by adding TransformIdx to the intra prediction modes shown in FIGS. 7A and 7B. From these tables, TransformIdx can be derived according to the selected prediction mode.
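  • The exact contents of FIGS. 11A, 11B, 13A, and 13B are not reproduced in this text; the sketch below assumes the four-entry TransformIdx mapping implied by the description (the all-DCT fallback at TransformIdx 3 follows the slice-flag behavior described later; the remaining assignments are assumptions).

```python
# Hedged reconstruction of the FIG. 13A/13B lookup: TransformIdx maps to a
# (Vertical_transform_idx, Horizontal_transform_idx) pair, each selecting
# DCT or DST. Symbolic labels are used instead of the numeric bit values.
DCT, DST = "DCT", "DST"

TRANSFORM_IDX = {
    0: (DST, DST),  # assumed: used when the reference pixel lines differ (S1410)
    1: (DST, DCT),  # assumed mixed combination
    2: (DCT, DST),  # assumed mixed combination
    3: (DCT, DCT),  # fallback when the directional transform is disabled
}

def derive_1d_transforms(transform_idx):
    """Return the vertical and horizontal 1D transforms for a TransformIdx."""
    return TRANSFORM_IDX[transform_idx]

print(derive_1d_transforms(3))  # -> ('DCT', 'DCT')
```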
  • the prediction selection unit 113 determines whether the prediction mode is intra prediction based on the selection of the encoding control unit 116 (S1402). When the determination is No, inter prediction processing is performed as usual. If the determination is Yes, the prediction selection unit 113 next determines whether the prediction mode is bidirectional intra prediction (S1403). When the determination is No, the unidirectional prediction mode is set (S1404) and input to the prediction selection switch 111.
  • the prediction selection switch 111 connects the input end of the switch to the intra unidirectional prediction image generation unit 108, and the prediction image 125 generated by the intra unidirectional prediction image generation unit 108 is output (S1405).
  • Prediction information 126 indicating the prediction mode of unidirectional prediction set by the prediction selection unit 113 is input to the conversion information setting unit 112.
  • the transformation information setting unit 112 refers to the LUT shown in FIGS. 11A and 11B and outputs the transformation information 127 (TransformIdx) of the selected prediction mode (S1406).
  • the prediction selection unit 113 sets two prediction modes (S1407) and outputs the prediction mode to the prediction selection switch 111.
  • in bidirectional intra prediction, two prediction modes are included in one PredMode value indicating the prediction mode.
  • for example, when PredMode is 34, bidirectional intra prediction is performed with the two prediction modes whose PredMode values are 2 and 0. That is, one PredMode may contain two prediction modes; here, each of the two modes is specified by "PredTypeLX" (prediction type) and "PredAngleIdLX" (prediction angle).
  • the prediction selection switch 111 connects the input end of the switch to the intra bidirectional prediction image generation unit 109, and the predicted image 125 generated by the intra bidirectional prediction image generation unit 109 is output (S1408).
  • Prediction information 126 indicating the prediction mode of bidirectional prediction set by the prediction selection unit 113 is input to the conversion information setting unit 112.
  • the conversion information setting unit 112 determines whether the reference pixel lines in the two set prediction modes are different (S1409).
  • if they are different, the transform information 127 (TransformIdx) is output such that Vertical_transform_idx and Horizontal_transform_idx each select DST (S1410).
  • otherwise, the transform information setting unit 112 outputs the transform information 127 (TransformIdx) associated with the smaller of the two set prediction mode numbers (S1411).
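  • Condensing steps S1409 to S1411, the sketch below shows the transform selection for bidirectional intra prediction; mode_to_transform_idx stands in for the FIG. 11A/11B table, and the TransformIdx value for the all-DST case follows the mapping assumed above.

```python
# Sketch of steps S1409-S1411: if the two prediction modes use different
# reference pixel lines, force DST for both 1D transforms; otherwise reuse
# the TransformIdx of the smaller prediction mode number.
DST_BOTH = 0  # assumed index for the DST/DST combination

def select_transform_idx(mode0, mode1, ref_line, mode_to_transform_idx):
    """mode0/mode1: prediction mode numbers; ref_line(m): reference line id."""
    if ref_line(mode0) != ref_line(mode1):   # S1409: different reference lines
        return DST_BOTH                      # S1410: DST vertically and horizontally
    return mode_to_transform_idx[min(mode0, mode1)]  # S1411: smaller mode wins

# Toy example: mode 0 predicts from the upper line, mode 1 from the left line.
lut = {0: 2, 1: 1}
line_of = lambda m: "upper" if m == 0 else "left"
print(select_transform_idx(0, 1, line_of, lut))  # -> 0 (DST/DST)
```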
  • the 1D transformation setting unit 908 in the orthogonal transformation unit 102 and the 1D transformation setting unit 1208 in the inverse orthogonal transformation unit 105 derive Vertical_transform_idx and Horizontal_transform_idx by referring to FIGS. 13A and 13B according to the inputted transformation information 127.
  • Vertical_transform_idx is input to the selection switches A 901 and 1201, and the output terminal of the switch is input to the 1D discrete cosine transform unit 902 (or 1D inverse discrete cosine transform unit 1202) or 1D discrete sine transform unit 903 (or 1D inverse discrete sine transform unit 1203). Connecting.
  • Horizontal_transform_idx is input to the selection switches B 905 and 1205, and the output end of the switch is connected to the 1D discrete cosine transform unit 902 (or 1D inverse discrete cosine transform unit 1202) or the 1D discrete sine transform unit 903 (or 1D inverse discrete sine transform unit 1203) (S1412).
  • the prediction error 118 is input to the orthogonal transformation unit 102, and orthogonal transformation processing is performed using the set transformation matrix (S1413).
  • the transform coefficient 119 after the orthogonal transform is output to the quantization unit 103 (S1414).
  • the 1D transformation setting unit 1208 also sets Vertical_transform_idx and Horizontal_transform_idx in the same processing procedure for the inverse orthogonal transformation unit 105.
  • the restored transform coefficient 121 is input to the inverse orthogonal transform unit 105, and an inverse orthogonal transform process is performed using the set transform matrix. Thereafter, the restoration prediction error 122 is output to the adding unit 106.
  • the above is the description of the process flow of the predicted image generation process, the orthogonal transform process, and the inverse orthogonal transform process according to the present embodiment.
  • FIG. 15 illustrates a syntax 1500 used by the image encoding device in FIG.
  • the syntax 1500 includes three parts: a high level syntax 1501, a slice level syntax 1502, and a coding tree level syntax 1503.
  • the high level syntax 1501 includes syntax information of a layer higher than the slice.
  • a slice refers to a rectangular area or a continuous area included in a frame or a field.
  • the slice level syntax 1502 includes information necessary for decoding each slice.
  • Coding tree level syntax 1503 includes information necessary to decode each coding tree (ie, each coding tree block). Each of these parts includes more detailed syntax.
  • the high level syntax 1501 includes sequence and picture level syntaxes such as a sequence parameter set syntax 1504 and a picture parameter set syntax 1505.
  • the slice level syntax 1502 includes a slice header syntax 1506, a slice data syntax 1507, and the like.
  • the coding tree level syntax 1503 includes a coding tree block syntax 1508, a prediction unit syntax 1509, and the like.
• The coding tree block syntax 1508 can have a quadtree structure. Specifically, the coding tree block syntax 1508 can be recursively called as a syntax element of the coding tree block syntax 1508; that is, one coding tree block can be subdivided by a quadtree. The coding tree block syntax 1508 also includes a transform unit syntax 1510, which is invoked at each coding tree block syntax 1508 at a leaf of the quadtree. The transform unit syntax 1510 describes information related to inverse orthogonal transform and quantization.
• Similarly, the transform unit syntax 1510 can have a quadtree structure: the transform unit syntax 1510 can be further recursively called as a syntax element of the transform unit syntax 1510, so that one transform unit can be subdivided by a quadtree (see the parsing sketch below).
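• The following is an illustrative sketch of this recursive quadtree invocation; the split-flag names and the reader interface are hypothetical stand-ins, not the patent's actual syntax elements.

```python
# Sketch of the recursive quadtree of the coding tree block syntax 1508
# (which invokes the transform unit syntax 1510 at each leaf) and of the
# transform unit syntax 1510 itself.
def parse_coding_tree_block(reader, x, y, size, min_size=8):
    if size > min_size and reader.read_flag("split_flag"):
        half = size // 2
        for dy in (0, half):              # recurse into the four sub-blocks
            for dx in (0, half):
                parse_coding_tree_block(reader, x + dx, y + dy, half, min_size)
        return
    # Leaf of the quadtree: the transform unit syntax is invoked here.
    parse_transform_unit(reader, x, y, size)

def parse_transform_unit(reader, x, y, size, min_size=4):
    # The transform unit may itself be subdivided with a quadtree.
    if size > min_size and reader.read_flag("tu_split_flag"):
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                parse_transform_unit(reader, x + dx, y + dy, half, min_size)
        return
    reader.read_coefficients(x, y, size)  # inverse transform / quantization info
```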
  • FIG. 16 illustrates a slice header syntax 1506 according to the present embodiment.
• The slice_bipred_intra_flag illustrated in FIG. 16 is a syntax element indicating, for example, the validity/invalidity of bidirectional intra prediction according to the present embodiment for the slice.
• When slice_bipred_intra_flag is 0, the prediction selection unit 113 does not set a prediction mode including bidirectional intra prediction, and the prediction selection switch 111 does not connect the output terminal of the switch to the intra bidirectional prediction image generation unit 109. In this case, only unidirectional intra prediction, for which BipredFlag[] in FIGS. 7A, 7B, 11A and 11B is 0, or the intra prediction defined in H.264/MPEG-4 AVC, is performed.
• When slice_bipred_intra_flag is 1, the bidirectional intra prediction according to the present embodiment is valid in the entire area of the slice.
• Alternatively, when slice_bipred_intra_flag is 1, the validity/invalidity of the bidirectional intra prediction may be specified for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.).
• The slice_directional_transform_intra_flag shown in FIG. 16 is a syntax element indicating, for example, the validity/invalidity of the discrete sine transform and the inverse discrete sine transform according to this embodiment for the slice.
• When slice_directional_transform_intra_flag is 0, the discrete sine transform and the inverse discrete sine transform according to the present embodiment are invalid in the slice. Therefore, the transformation information setting unit 112 always sets TransformIdx to 3 and outputs it. Alternatively, the 1D conversion setting units 908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx. When slice_directional_transform_intra_flag is 1, the discrete sine transform and the inverse discrete sine transform according to the present embodiment are valid over the entire area of the slice.
• Alternatively, when slice_directional_transform_intra_flag is 1, the validity/invalidity of the discrete sine transform and the inverse discrete sine transform according to the present embodiment may be defined for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.).
  • FIG. 17 illustrates the coding tree block syntax 1508 according to the present embodiment.
  • Ctb_directional_transform_flag shown in FIG. 17 is a syntax element indicating whether the discrete sine transform and the inverse discrete sine transform according to the present embodiment are valid / invalid for the coding tree block.
  • pred_mode shown in FIG. 17 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction.
• ctb_directional_transform_flag is encoded only when the above-described slice_directional_transform_intra_flag is 1 and the encoding type of the coding tree block is intra prediction.
• CBP indicates Coded_Block_Pattern information, which indicates whether or not there is a transform coefficient in the coding tree block. If it is 0, there is no transform coefficient and the decoder does not need to perform inverse transform processing, so ctb_directional_transform_flag is not encoded. This signalling condition is sketched below.
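• The following is a hedged sketch of this signalling condition; the writer interface and the MODE_INTRA value are stand-ins for the real entropy coder, not the patent's implementation.

```python
# Sketch: the encoder writes ctb_directional_transform_flag only when the
# slice-level flag is on, the block is intra coded, and the block actually
# has transform coefficients (CBP != 0).
MODE_INTRA = 0  # placeholder value for the intra coding type

def encode_ctb_directional_transform_flag(writer, slice_flag, pred_mode,
                                          cbp, ctb_flag):
    if slice_flag == 1 and pred_mode == MODE_INTRA and cbp != 0:
        writer.write_flag("ctb_directional_transform_flag", ctb_flag)
    # Otherwise the flag is not coded; the decoder infers the default
    # behaviour (TransformIdx fixed to 3, i.e. the DST-disabled case).
```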
• When ctb_directional_transform_flag is 0, the transformation information setting unit 112 always sets TransformIdx to 3 and outputs it. Alternatively, the 1D conversion setting units 908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.
• When ctb_directional_transform_flag is 1, the discrete sine transform and the inverse discrete sine transform according to the present embodiment are valid in the coding tree block.
• When the flag that defines the validity/invalidity of the discrete sine transform and the inverse discrete sine transform according to the present embodiment is encoded, the information amount (code amount) increases compared to the case where this flag is not encoded. However, by encoding this flag, an optimal orthogonal transform can be performed for each local region (that is, each coding tree block).
  • FIG. 18 illustrates a transform unit syntax 1510 according to this embodiment.
  • a tu_directional_transform_flag shown in FIG. 18 is a syntax element indicating validity / invalidity of the discrete sine transform and the inverse discrete sine transform according to this embodiment with respect to the transform unit.
  • pred_mode shown in FIG. 18 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction.
• tu_directional_transform_flag is encoded only when slice_directional_transform_intra_flag is 1 and the encoding type of the coding tree block is intra prediction.
• coded_block_flag is 1-bit information indicating whether or not there is a transform coefficient in the transform unit. If it is 0, there is no transform coefficient and the decoder does not need to perform inverse transform processing, so tu_directional_transform_flag is not encoded.
• When tu_directional_transform_flag is 0, the transformation information setting unit 112 always sets TransformIdx to 3 and outputs it. Alternatively, the 1D conversion setting units 908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx. When tu_directional_transform_flag is 1, the discrete sine transform and the inverse discrete sine transform according to the present embodiment are valid in the transform unit.
• In the transform unit syntax 1510, when the flag that defines the validity/invalidity of the discrete sine transform and the inverse discrete sine transform according to the present embodiment is encoded, the information amount (code amount) increases compared to the case where this flag is not encoded. However, by encoding this flag, an optimal orthogonal transform can be performed for each local region (that is, each transform unit).
  • FIG. 19 shows an example of the prediction unit syntax.
  • Pred_mode in FIG. 19 indicates the prediction type of the prediction unit.
  • MODE_INTRA indicates that the prediction type is intra prediction.
  • intra_split_flag is a flag indicating whether or not the prediction unit is further divided into four prediction units.
• When intra_split_flag is 1, the prediction unit is divided into four prediction units, each half the vertical and horizontal size of the original.
• When intra_split_flag is 0, the prediction unit is not divided.
• intra_luma_bipred_flag[i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit is a unidirectional or a bidirectional intra prediction mode. i indicates the position of the divided prediction unit: i is 0 when intra_split_flag is 0, and takes 0 to 3 when intra_split_flag is 1. The flag is set to the value of IntraBipredFlag of the prediction unit shown in FIGS. 9, 12, 13A, and 13B.
• When intra_luma_bipred_flag[i] is 1, the prediction unit uses bidirectional intra prediction, and intra_luma_bipred_mode[i], information that identifies the bidirectional intra prediction mode used from among the prepared bidirectional intra prediction modes, is encoded.
• intra_luma_bipred_mode[i] may be encoded with a fixed length according to the bidirectional intra prediction mode number IntraBipredNum shown in FIGS. 7A, 7B, 11A and 11B, or may be encoded using a predetermined code table.
• When intra_luma_bipred_flag[i] is 0, the prediction unit uses unidirectional intra prediction, and predictive encoding from adjacent blocks is performed.
• prev_intra_luma_unipred_idx[i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent blocks is the same as the intra prediction mode of the prediction unit. Details of the method of calculating MostProbableMode will be described later. When prev_intra_luma_unipred_idx[i] is not 0, MostProbableMode and the intra prediction mode IntraPredMode are equal.
• When prev_intra_luma_unipred_idx[i] is 0, MostProbableMode and the intra prediction mode IntraPredMode differ, and the information rem_intra_luma_unipred_mode[i], which specifies which mode other than MostProbableMode is the intra prediction mode IntraPredMode, is encoded. rem_intra_luma_unipred_mode[i] may be encoded with a fixed length according to the intra prediction mode number IntraPredModeNum shown in FIGS. 7A, 7B, 11A and 11B, or may be encoded using a predetermined code table. rem_intra_luma_unipred_mode[i] is calculated from the intra prediction mode IntraPredMode using the following equation.
• numCand indicates the number of candidates for MostProbableMode, and candModeList[cIdx] indicates the actual candidate MostProbableModes.
• In this embodiment, numCand is set to 2, and the candidate MostProbableModes are set to the IntraPredMode of the already-predicted pixel blocks positioned above and to the left of and adjacent to the prediction target block.
• candModeList[0] is denoted MPM_L0 and candModeList[1] is denoted MPM_L1.
• When prev_intra_luma_unipred_idx[i] is 1, the prediction mode MPM_L0 is derived, and when prev_intra_luma_unipred_idx[i] is 2, the prediction mode MPM_L1 is derived.
• The entries of candModeList[cIdx] may be the same prediction mode. In this case, the information to be encoded would be redundant, so the redundant prediction mode is omitted, as expressed in Expression (11). The code table used for actual encoding may be an optimum code table created in consideration of the maximum number of these prediction modes.
• Min(x, y) is a function that outputs the smaller of the inputs x and y.
• IntraPredModeA and IntraPredModeB indicate the intra prediction modes of the prediction units adjacent to the left of and above the encoding target prediction unit; a sketch of the MostProbableMode handling follows below.
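• The following is a hedged sketch of the MostProbableMode construction and the remaining-mode mapping. The duplicate-candidate removal and the index shift follow the common H.264/HEVC-style approach as one plausible reading of Expressions (11)/(12); the function names are illustrative, not the patent's.

```python
# Sketch: build the candidate list from the left/above modes, drop the
# redundant duplicate, and map IntraPredMode to the remaining-mode index.
def build_cand_mode_list(intra_pred_mode_a, intra_pred_mode_b):
    mpm_l0 = min(intra_pred_mode_a, intra_pred_mode_b)   # Min(x, y) above
    mpm_l1 = max(intra_pred_mode_a, intra_pred_mode_b)
    if mpm_l0 == mpm_l1:
        return [mpm_l0]            # redundant prediction mode omitted
    return [mpm_l0, mpm_l1]

def rem_mode(intra_pred_mode, cand_mode_list):
    """Map IntraPredMode (not in the MPM list) to rem_intra_luma_unipred_mode."""
    rem = intra_pred_mode
    for cand in cand_mode_list:
        if intra_pred_mode > cand:
            rem -= 1               # skip over each candidate below this mode
    return rem
```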
• Syntax elements not defined in this embodiment may be inserted between the lines of the syntax tables illustrated in FIGS. 16, 17, 18 and 19, and descriptions of other conditional branches may be included. Further, a syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, the term for each illustrated syntax element can be changed arbitrarily.
• In the image coding apparatus according to the present embodiment, when two reference pixel lines having different prediction directions are used, the prediction error has a tendency different from that of either of the two prediction directions alone. If an orthogonal transform that follows only one prediction mode is selected, the tendency of intra prediction, in which prediction accuracy is lowered as the distance between the reference pixel and the prediction pixel increases, cannot be exploited and the coding efficiency is lowered. The image coding apparatus solves this problem.
• This image encoding apparatus classifies the vertical direction and horizontal direction of each prediction mode into two classes according to the presence or absence of the above-described tendency, and adaptively applies the 1D discrete cosine transform or the 1D discrete sine transform to each of the vertical and horizontal directions.
• When the 1D discrete sine transform is applied to a direction having this tendency, the coefficient density becomes higher than with the 1D discrete cosine transform (that is, the ratio of non-zero coefficients in the transform coefficient 121 after quantization is reduced). Therefore, according to the image coding apparatus of the present embodiment, high transform efficiency is stably achieved compared to uniformly applying a fixed orthogonal transform such as the DCT to every prediction mode.
• The orthogonal transform unit 102 and the inverse orthogonal transform unit 105 according to the present embodiment are suitable for both hardware implementation and software implementation.
  • the above is the description of the image encoding device according to the first embodiment.
  • the image encoding device according to the second embodiment differs from the image encoding device according to the first embodiment described above in the details of the intra unidirectional prediction image generation unit 108 and the intra bidirectional prediction image generation unit 109.
  • the same parts as those in the first embodiment are denoted by the same reference numerals in the present embodiment, and different parts will be mainly described.
  • a moving picture decoding apparatus corresponding to the picture encoding apparatus according to the present embodiment will be described in a fourth embodiment.
  • FIG. 20 is a block diagram of an image encoding apparatus 2000 according to the second embodiment of the present invention.
• The differences are that a prediction direction deriving unit 2001 is newly added and that prediction direction derivation information 2051 is output from the prediction direction deriving unit 2001 to the prediction selection unit 113.
• In addition, the intra unidirectional prediction image generation unit 108 and the intra bidirectional prediction image generation unit 109 are extended to 128 directions. Specifically, the 180-degree angular range used for direction prediction is divided into 128, and a prediction direction is assigned about every 1.4 degrees.
  • the prediction modes of the unidirectional prediction described in the first embodiment are the same as those in FIGS. 7A and 7B and FIGS. 11A and 11B. Since the other configuration is the same as that of the first embodiment, description thereof is omitted.
  • intra prediction performed using the prediction direction deriving unit 2001 is referred to as a prediction direction derivation mode.
  • the reference image 124 output from the reference image memory 107 is input to the prediction direction deriving unit 2001.
  • the prediction direction deriving unit 2001 has a function of analyzing the input reference image 124 and generating prediction direction derivation information 2051.
  • the prediction direction deriving unit 2001 will be described with reference to FIG.
  • the prediction direction deriving unit 2001 includes a left reference pixel line edge deriving unit 2101, an upper reference pixel line edge deriving unit 2102, and a prediction direction deriving information generating unit 2103.
  • the left reference pixel line edge deriving unit 2101 has a function of performing edge detection processing on a reference pixel line located to the left of the prediction target pixel block and deriving an edge direction.
  • the upper reference pixel line edge deriving unit 2102 has a function of performing edge detection processing on a reference pixel line located above the prediction target pixel block and deriving an edge direction.
  • FIG. 22 shows an example of pixels used for the derivation of the left reference pixel line edge deriving unit 2101 and the upper reference pixel line edge deriving unit 2102.
  • the left reference pixel line edge deriving unit 2101 uses two lines indicated by diagonal lines from the upper right to the lower left located on the left side of the prediction target pixel block.
• The upper reference pixel line edge deriving unit 2102 uses two lines, indicated by diagonal lines from the upper left to the lower right, located above the prediction target pixel block. In the present embodiment two lines are described, but one line, three lines, or more lines may be used. In the drawing, an example in which the edge direction is derived using the left reference pixel line is shown as A, and an example in which the edge direction is derived using the upper reference pixel line is shown as B.
  • both deriving units 2101 and 2102 perform edge intensity detection using an operator as shown in the following equation (13).
  • Gx represents the edge strength in the horizontal direction (x coordinate system)
  • Gy represents the edge strength in the vertical direction (y coordinate system).
• Any operator such as a Sobel operator, a Prewitt operator, or a Kirsch operator may be used as the edge detection operator; a sketch using the Sobel operator follows below.
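• The following is a minimal numpy sketch of the per-pixel edge-strength computation in the spirit of Expression (13), assuming a Sobel operator and that `lines` holds at least three rows of decoded pixels around the reference lines; the patent's exact operator and window handling may differ.

```python
# Sketch: compute per-pixel (Gx, Gy) edge vectors over the reference region.
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])  # horizontal gradient
SOBEL_Y = SOBEL_X.T                                        # vertical gradient

def edge_vectors(lines):
    """Return per-pixel (Gx, Gy) for the interior pixels of `lines`."""
    h, w = lines.shape
    grads = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = lines[y - 1:y + 2, x - 1:x + 2]
            gx = float(np.sum(SOBEL_X * win))   # edge strength in x
            gy = float(np.sum(SOBEL_Y * win))   # edge strength in y
            grads.append((gx, gy))
    return grads
```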
• When the operator of Equation (13) is applied to the reference pixel line, an edge direction vector is derived for each pixel.
• The following Equation (14) is used to derive the optimum edge direction from these edge vectors.
  • ⁇ a, b> represents the inner product of two vectors.
  • the unit vector and edge strength (direction vector) are respectively expressed by the following equations.
• The edge strengths of each pixel line derived by the left reference pixel line edge deriving unit 2101 and the upper reference pixel line edge deriving unit 2102 are input to the prediction direction derivation information generating unit 2103.
  • the prediction direction derivation information generation unit 2103 calculates Equations (14) and (15) using all the input edge strengths, and derives a unidirectional representative edge angle.
• Expressions (14) and (15) are also calculated separately for the left reference pixel line and the upper reference pixel line, and a bidirectional representative edge angle, which consists of the two representative edge angles, is derived; a sketch of this selection follows below.
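• The following is a sketch of the representative-edge-angle selection in the spirit of Expressions (14)/(15): among quantized candidate directions, it picks the one whose unit vector best matches the per-pixel edge vectors under the inner product. The 128-direction quantization is from this embodiment, but the exact weighting of the patent's equations is an assumption here.

```python
# Sketch: pick the candidate direction maximizing the summed squared inner
# product <u_d, g_i> between its unit vector and the edge vectors.
import math

def representative_angle(grads, num_directions=128):
    best_d, best_score = 0, -1.0
    for d in range(num_directions):
        theta = math.pi * d / num_directions       # 180 degrees / 128 steps
        u = (math.cos(theta), math.sin(theta))     # candidate unit vector
        score = sum((u[0] * gx + u[1] * gy) ** 2 for gx, gy in grads)
        if score > best_score:
            best_d, best_score = d, score
    return best_d                                  # index RDM of the angle
```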
  • the prediction direction derivation information generation unit 2103 derives a plurality of peripheral unidirectional representative edge angles that are angularly adjacent, starting from the unidirectional representative edge angle. For example, it is assumed that there are 128 types of edge angles and they are arranged in the order of angles.
• When the unidirectional representative edge angle is RDM, the peripheral unidirectional representative edge angles are expressed as RDM-1, RDM+1, RDM-2, RDM+2, and so on.
• In the present embodiment, the number of peripheral unidirectional representative edge angles is 10 (that is, up to ±5), as sketched below.
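• As a tiny illustration, the following derives the 10 peripheral angles around RDM; the wraparound over the 128 directions is an assumption for angles near the ends of the range.

```python
# Sketch: enumerate RDM +/- 1 ... +/- 5, wrapping around the 128 directions.
def peripheral_angles(rdm, count=10, num_directions=128):
    out = []
    for k in range(1, count // 2 + 1):
        out.append((rdm - k) % num_directions)
        out.append((rdm + k) % num_directions)
    return out
```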
  • FIG. 23 shows an example of the prediction direction derivation information 2051.
• The unidirectional representative edge angle is indicated by RDM, the representative edge angle of the left reference pixel line of the bidirectional representative edge angle is indicated by RDM_L0, and the representative edge angle of the upper reference pixel line of the bidirectional representative edge angle is indicated by RDM_L1.
  • RDMPredMode indicates the prediction mode derived by the prediction direction deriving unit 2001 in the present embodiment of the present invention.
  • RDMBipredFlag indicates whether the prediction mode derived by the prediction direction deriving unit 2001 in the present embodiment of the present invention is bidirectional prediction.
  • RDMPredAngleIdL0 and RDMPredAngleIdL1 indicate which prediction angle the prediction mode derived by the prediction direction deriving unit 2001 in the present embodiment of the present invention indicates.
• In addition to the items shown in FIG. 23, the prediction direction derivation information 2051 includes the unidirectional representative edge angle and the two representative edge angles included in the bidirectional representative edge angle. The relationship between these prediction modes and TransformIdx will be described later.
  • the prediction direction derivation information 2051 generated by the prediction direction derivation unit 2001 is input to the prediction selection unit 113.
• A predicted image 125 is generated by the intra unidirectional prediction image generation unit 108 or the intra bidirectional prediction image generation unit 109 according to the prediction mode selected here. These predicted image generation units are the same as those in the first embodiment except that the number of prediction angles is expanded to 128. For example, when RDMPredMode is 1, the first predicted image signal 851 and the second predicted image signal 852 in FIG. 8 are generated at the two unidirectional representative edge angles included in the bidirectional representative edge angle, an averaging process is performed in the weighted average unit 801, and the predicted image 125 is output.
• As described above, by adding the prediction direction deriving unit 2001, the prediction direction, conventionally 33 directions, is expanded to 128 directions, and the prediction direction can further be selected from the representative angle calculated by Expression (15) based on the reference pixel lines. For example, by dividing 180 degrees into 128, quantizing the representative angle calculated by Expression (15), and mapping it to a maximum of 128 prediction directions, a more accurate prediction direction can be used.
• The additional mode information required in the present embodiment is only the 12 types shown in FIG. 23, so the overhead of encoding the prediction mode can be significantly reduced compared to simply adding 128 types of prediction modes.
• In the present embodiment, TransformIdx is selected in units of coding tree blocks (or of the first prediction unit included in the coding tree block). For example, assume that the N×N pixel block 0 shown in FIG. 2C is a prediction unit and that the 2N×2N pixel block is a coding tree block.
• TransformIdx is selected using the information of the reference pixel lines derived here.
• TransformIdx[0], which follows the TransformIdx of the head prediction unit, is specified regardless of each prediction mode.
• Alternatively, the TransformIdx used when a prediction mode described in FIG. 23 is selected may be determined in advance. For example, TransformIdx may simply be set to 0 in order to reduce the processing of deriving the prediction angle.
  • FIG. 24 illustrates a slice header syntax 1506 according to this embodiment.
  • the slice_derived_direction_intra_flag illustrated in FIG. 24 is a syntax element indicating, for example, the validity / invalidity of the prediction direction deriving method according to the present embodiment for the slice.
• When slice_derived_direction_intra_flag is 0, the prediction selection unit 113 does not set a prediction mode including the prediction direction derivation information 2051, and the prediction selection switch 111 connects the output terminals of the switch as in the first embodiment.
• When slice_derived_direction_intra_flag is 1, the prediction direction derivation mode according to the present embodiment is valid over the entire area of the slice.
• Alternatively, when slice_derived_direction_intra_flag is 1, the validity/invalidity of the prediction direction derivation mode according to the present embodiment may be defined for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.).
  • FIG. 25A shows an example of the prediction unit syntax.
• intra_derived_direction_flag[i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit is that of the first embodiment shown in FIGS. 11A and 11B or that of the second embodiment shown in FIG. 23. i indicates the position of the divided prediction unit.
• When intra_split_flag is 0, i is 0; when intra_split_flag is 1, i takes 0 to 3.
• intra_direction_mode[i], which is information identifying the intra prediction mode used from among the prepared prediction modes, is encoded. As shown in FIG. 23, this mode expresses bidirectional intra prediction modes and unidirectional intra prediction in a mixed manner. These prediction modes can also be expressed as separate syntax elements and encoded.
• Here, intra_direction_mode[i] is described as an example corresponding to RDMPredMode, but the expression method of the prediction mode may be changed based on RDMPredAngleIdL0.
• In that case, the prediction mode can also be expressed by dividing it into two types of syntax elements: a 1-bit flag representing the sign and an index representing the change. In this case, a new flag indicating whether or not bidirectional intra prediction is used may be prepared.
• intra_direction_mode[i] may be encoded with a fixed length according to the number of prediction modes, or may be encoded using a predetermined code table. When intra_direction_mode[i] is 0, the prediction unit does not use the prediction direction derivation method according to the present embodiment, and encoding is performed according to the method described in the first embodiment.
  • FIG. 25B shows an example of the prediction unit syntax as another embodiment of the present invention.
• intra_direction_mode[i] is expressed by being divided into prev_intra_direction_mode[i] and rem_intra_direction_mode[i]. These syntax elements introduce prediction between prediction modes in the same manner as Expression (11) or (12).
• prev_intra_direction_mode[i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent blocks is the same as the intra prediction mode of the prediction unit. When prev_intra_direction_mode[i] is 1, MostProbableMode and the intra prediction mode IntraPredMode are equal.
• When prev_intra_direction_mode[i] is 0, MostProbableMode and the intra prediction mode IntraPredMode differ, and the information rem_intra_direction_mode[i], which specifies which mode other than MostProbableMode is the intra prediction mode IntraPredMode, is encoded.
• rem_intra_direction_mode[i] may be encoded with a fixed length according to the number of prediction modes, or may be encoded using a predetermined code table.
  • FIG. 25C shows an example of the prediction unit syntax as another embodiment of the present invention.
  • PredMode described in the first embodiment and PredMode described in the second embodiment are integrated and expressed as one PredMode table.
  • PredMode in this case is shown in FIGS. 26A, 26B, and 26C.
• These syntax elements introduce prediction between prediction modes in the same manner as Expression (11) or (12).
• prev_intra_luma_unipred_idx[i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent blocks is the same as the intra prediction mode of the prediction unit.
• When prev_intra_luma_unipred_idx[i] is 1, MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_idx[i] is 0, MostProbableMode and the intra prediction mode IntraPredMode differ, and the information rem_intra_luma_unipred_mode[i], which specifies which mode other than MostProbableMode is the intra prediction mode IntraPredMode, is encoded.
• rem_intra_luma_unipred_mode[i] may be encoded with a fixed length according to the number of prediction modes, or may be encoded using a predetermined code table.
  • the above is the detailed description of the image coding apparatus 2000 according to the second embodiment of the present invention.
• The third embodiment relates to a moving picture decoding apparatus for decoding encoded data produced by a moving picture encoding apparatus. That is, the decoding apparatus according to the present embodiment decodes, for example, encoded data generated by the image encoding apparatus according to the first embodiment.
  • the moving picture decoding apparatus 2700 includes an input buffer 2701, an entropy decoding unit 2702, an inverse quantization unit 2703, an inverse orthogonal transform unit 2704, an addition unit 2705, and a reference image memory 2706. , An intra unidirectional prediction image generation unit 2707, an intra bidirectional prediction image generation unit 2708, an inter prediction image generation unit 2709, a prediction selection switch 2710, a conversion information setting unit 2711, an output buffer 2712, and a prediction selection unit 2714.
  • the encoded data 2725 is output from, for example, the image encoding device of FIG. 1 and the like, and is temporarily stored in the input buffer 2701 through a storage system or a transmission system (not shown).
  • the entropy decoding unit 2702 performs decoding based on the syntax for each frame or field for decoding the encoded data 2725.
  • the entropy decoding unit 2702 sequentially entropy-decodes the code string of each syntax, and reproduces the encoding parameters of the encoding target block such as prediction information 2721 including the prediction mode information and the quantized transform coefficient (sequence) 2715.
  • the coding parameters are all parameters necessary for decoding such as prediction information 2721, information on transform coefficients, information on quantization, and the like.
• The inverse quantization unit 2703 performs inverse quantization on the quantized transform coefficient 2715 from the entropy decoding unit 2702 to obtain a restored transform coefficient 2716. Specifically, the inverse quantization unit 2703 performs inverse quantization according to the quantization information decoded by the entropy decoding unit 2702 (the quantized transform coefficient 2715 is multiplied by the quantization step width derived from the quantization information). The inverse quantization unit 2703 inputs the restored transform coefficient 2716 to the inverse orthogonal transform unit 2704.
• The inverse orthogonal transform unit 2704 performs, on the restored transform coefficient 2716 from the inverse quantization unit 2703, an inverse orthogonal transform corresponding to the orthogonal transform performed on the encoding side, and a restored prediction error (also referred to as a prediction difference signal) 2717 is obtained.
  • the inverse orthogonal transform unit 2704 inputs the restoration prediction error 2717 to the addition unit 2705.
  • the addition unit 2705 adds the restored prediction error 2717 and the corresponding predicted image 2722 to generate a decoded image 2718.
  • the decoded image 2718 is input to the reference image memory 2706.
  • the decoded image 2718 is then temporarily stored in the output buffer 2712 for the output image.
  • the decoded image 2718 temporarily stored in the output buffer 2712 is output to a display device system such as a display or a monitor (not shown) or a video device system according to the output timing managed by the decoding control unit 2713.
• The decoded image signal 2718 stored in the reference image memory 2706 is referenced as a reference image 2719 by the intra unidirectional prediction image generation unit 2707, the intra bidirectional prediction image generation unit 2708, and the inter prediction image generation unit 2709, in units of frames or, as necessary, in units of fields.
• The inverse quantization unit 2703, inverse orthogonal transform unit 2704, addition unit 2705, reference image memory 2706, intra unidirectional prediction image generation unit 2707, intra bidirectional prediction image generation unit 2708, inter prediction image generation unit 2709, conversion information setting unit 2711, and selection switch 2710 are substantially the same as or similar to the inverse quantization unit 104, inverse orthogonal transform unit 105, addition unit 106, reference image memory 107, intra unidirectional prediction image generation unit 108, intra bidirectional prediction image generation unit 109, inter prediction image generation unit 110, conversion information setting unit 112, and selection switch 111, respectively.
• The intra unidirectional prediction image generation unit 2707 (108) performs unidirectional intra prediction using the reference image 2719 (124) stored in the reference image memory 2706 (107). For example, in H.264/MPEG-4 AVC, an intra prediction image is generated by performing pixel interpolation processing (copying, interpolation filter processing, or the like) along a prediction direction such as the vertical or horizontal direction, using a decoded reference pixel line spatially adjacent to the prediction target block.
  • FIG. 4A shows a prediction direction of intra prediction in H.264 / MPEG-4 AVC.
  • FIG. 4B shows an arrangement relationship between reference pixel lines and encoding target pixels in H.264 / MPEG-4 AVC.
  • FIG. 4C shows a predicted image generation method in mode 1 (horizontal prediction), in which pixels I to L are copied in the prediction direction from the left reference pixel line.
• FIG. 4D shows a predicted image generation method in mode 4 (diagonal lower right prediction). It is also possible to easily expand the prediction directions of H.264/MPEG-4 AVC and increase the number of prediction modes. For example, the prediction modes may be expanded to 34, the pixel position accuracy at fractional positions set to 1/32-pixel accuracy, and the prediction pixel values created by linear interpolation (such as 3-tap filter processing).
  • FIG. 5 shows an example of the prediction angle and the prediction mode when the prediction mode is expanded up to 34 prediction modes.
• In FIG. 5, there are 33 different prediction directions with respect to the vertical and horizontal coordinates indicated by the bold lines.
  • the direction of a typical prediction angle indicated by H.264 / MPEG-4 AVC is indicated by an arrow.
  • 33 types of prediction directions are prepared in a direction in which a line is drawn from the origin to a mark indicated by a circle.
  • DC prediction for prediction based on the average value of available reference pixels is added, and there are a total of 34 prediction modes.
• For example, the prediction mode whose IntraPredMode is 4 corresponds to IntraPredAngleIdL0 of 4 in FIGS. 7A and 7B.
• An arrow indicated by a dotted line in FIG. 5 indicates a prediction mode whose prediction type is Intra_Vertical, and an arrow indicated by a solid line indicates a prediction mode whose prediction type is Intra_Horizontal.
  • FIG. 6 shows the relationship between IntraPredAngleIdLX and intraPredAngle used for predictive image value generation.
  • 7A and 7B show the relationship between the prediction mode (PredMode), the bidirectional prediction flag (BipredFlag), the prediction mode type (PredTypeL0, PredTypeL1), and the prediction angle (PredAngleIdL0, PredAngleIdL1).
• intraPredAngle indicates the prediction angle actually used when the predicted value is generated. For example, the prediction value generation method for the case where the prediction type is Intra_Vertical and the intraPredAngle shown in FIGS. 7A and 7B is a positive value is expressed by Expression (1).
• In Expression (1), BLK_SIZE indicates the size of the pixel block, ref[] indicates the array in which the reference image (also referred to as a reference pixel line) is stored, and pred(k, m) indicates the generated predicted image 125.
• In the other prediction modes, a predicted value can be generated by a similar method according to the tables of FIGS. 7A and 7B; a hedged reconstruction of the Expression (1)-style generation follows below.
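• The following is a hedged reconstruction of an Expression (1)-style generation for Intra_Vertical with positive intraPredAngle, using the 1/32-pixel accuracy mentioned above; the offsets into ref[] follow the common HEVC-style layout (reference pixels starting just past the top-left corner) and are assumptions, not the patent's exact text, and the caller is assumed to provide a sufficiently long ref[].

```python
# Sketch: vertical directional prediction with 1/32-pel fractional accuracy.
def predict_vertical(ref, intra_pred_angle, blk_size):
    pred = [[0] * blk_size for _ in range(blk_size)]
    for m in range(blk_size):                      # row: distance from ref line
        pos = (m + 1) * intra_pred_angle
        idx, frac = pos >> 5, pos & 31             # integer / 1/32-pel parts
        for k in range(blk_size):                  # column
            a = ref[k + idx + 1]                   # left neighbor sample
            b = ref[k + idx + 2]                   # right neighbor sample
            pred[m][k] = ((32 - frac) * a + frac * b + 16) >> 5  # 2-tap interp.
    return pred
```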
  • the intra bidirectional prediction image generation unit 2708 (109) performs bidirectional intra prediction using the reference image 2719 (124) stored in the reference image memory 2706 (107). For example, in Non-Patent Document 1 described above, after selecting two types of prediction modes from nine types of prediction modes defined in H.264 / MPEG-4 AVC and generating respective prediction image signals, The predicted image signal is generated by performing the filtering process every time.
  • Bidirectional prediction when the number of unidirectional prediction modes is expanded to 34 types will be described more specifically with reference to FIG.
  • the maximum number of modes is not limited, and bi-directional prediction can be easily expanded in any number of uni-directional predictions.
• The intra bidirectional prediction image generation unit 109 (2708) illustrated in FIG. 8 includes a weighted average unit 801, a first unidirectional intra prediction image generation unit 802, and a second unidirectional intra prediction image generation unit 803.
  • the functions of the first unidirectional intra predicted image generation unit 802 and the second unidirectional intra predicted image generation unit 803 are the same. These may be the same as the intra unidirectional predicted image generation unit 108. In this case, since the three processing units can have the same hardware configuration, the circuit scale can be reduced.
  • the first prediction image 851 is output from the first unidirectional intra prediction image generation unit 802, and the second prediction image 852 is output from the second unidirectional intra prediction image generation unit 803.
• Each predicted image is input to the weighted average unit 801, where a weighted average process is performed by a calculation based on Expression (2).
• The bidirectionally predicted image P[x, y] is expressed by Expression (2); a sketch of this averaging follows below.
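• The following is a sketch of the averaging of Expression (2), assuming equal weights with rounding; the patent's actual weights may depend on the prediction modes or pixel position.

```python
# Sketch: combine the two unidirectional predictions into P[x, y] with an
# equal-weight rounded average.
def bipred_average(p0, p1):
    n = len(p0)
    return [[(p0[y][x] + p1[y][x] + 1) >> 1 for x in range(n)]
            for y in range(n)]
```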
• Bidirectional intra prediction corresponds to the modes for which the BipredFlag shown in FIGS. 7A and 7B is 1.
• In this case, two prediction mode types are defined: the prediction mode type corresponding to the first unidirectional intra prediction image generation unit 802 is denoted PredTypeL0, and the prediction mode type corresponding to the second unidirectional intra prediction image generation unit 803 is denoted PredTypeL1. The same applies to PredAngleIdLX: the type and angle of each prediction mode are expressed by whether the X portion is 0 or 1.
• The inter prediction image generation unit 2709 (110) in FIG. 27 performs inter prediction using the reference image 2719 (124) stored in the reference image memory 2706 (107). Specifically, the inter prediction image generation unit 2709 (110) performs interpolation processing (motion compensation) based on the amount of motion shift (motion vector) between the prediction target block and the reference image 2719 (124) to generate an inter prediction image. In H.264/MPEG-4 AVC, interpolation processing up to 1/4-pixel accuracy is possible.
  • the derived motion vector is decoded by the entropy decoding unit 2702 as a part of the prediction information 2721 (126).
• The prediction selection switch 2710 (111) selects the output terminal of the intra unidirectional prediction image generation unit 2707 (108), of the intra bidirectional prediction image generation unit 2708 (109), or of the inter prediction image generation unit 2709 (110) according to the prediction information 2721 (126) output from the entropy decoding unit 2702, and inputs the intra prediction image or the inter prediction image as the prediction image 2722 (125) to the addition unit 2705 (106).
• When intra prediction is selected, the prediction selection switch 2710 (111) connects the switch to the output terminal of the intra unidirectional prediction image generation unit 2707 (108) or of the intra bidirectional prediction image generation unit 2708 (109) according to the prediction mode shown in FIGS. 7A and 7B.
• When inter prediction is selected, the prediction selection switch 2710 (111) connects the switch to the output terminal of the inter prediction image generation unit 2709 (110).
  • the prediction selection unit 2714 controls the output terminal of the switch to the prediction selection switch 2710 based on the prediction information 2721 sent from the entropy decoding unit 2702.
• Either intra prediction or inter prediction can be selected to generate the predicted image 2722, and a plurality of modes are defined for each of intra prediction and inter prediction.
• One prediction mode among these is input as the prediction information 2721.
• In the case of bidirectional intra prediction, substantially two prediction modes are selected, as shown in FIGS. 11A and 11B.
  • the inverse orthogonal transform unit 2704 (105) includes a selection switch A 1201, a vertical inverse transform unit 1206, a transposition unit 1204, a selection switch B 1205, and a horizontal inverse transform unit 1207.
  • the vertical inverse transform unit 1206 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203.
  • the horizontal inverse transform unit 1207 includes a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform unit 1203. Note that the order of the vertical inverse transform unit 1206 and the horizontal inverse transform unit 1207 is an example, and these may be reversed.
  • the two 1D inverse discrete cosine transform units 1202 shown in FIG. 12 can also be realized by using physically identical hardware in a time division manner. The same applies to the 1D inverse discrete sine transform unit 1203.
• The selection switch A 1201 guides the restored transform coefficient 121 to either the 1D inverse discrete cosine transform unit 1202 or the 1D inverse discrete sine transform unit 1203 according to the vertical transform index (Vertical_transform_idx) included in the 1D transform index 1252 output from the 1D transform setting unit 1208.
  • the 1D inverse discrete cosine transform unit 1202 multiplies the input restoration transform coefficient 121 (matrix format) by a transposed matrix of the discrete cosine transform matrix and outputs the result.
  • the 1D inverse discrete sine transform unit 1203 multiplies the input restoration transform coefficient 121 by the transposed matrix of the discrete sine transform matrix and outputs the result.
  • the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 perform one-dimensional inverse orthogonal transform represented by Expression (9).
• In Equation (9), Z′ represents the matrix (N×N) of the restored transform coefficient 121, V^T generically represents the transposed matrix of the 1D discrete cosine transform matrix or the 1D discrete sine transform matrix (both N×N), and Y′ indicates the output matrix (N×N) of the 1D inverse discrete cosine transform unit 1202 or the 1D inverse discrete sine transform unit 1203, so that Equation (9) takes the form Y′ = V^T Z′. That is, V^T corresponds to the discrete cosine transform matrix of Equation (6) or the discrete sine transform matrix of Equation (7), respectively.
  • the transposition unit 1204 transposes the output matrix (Y ′) of the vertical inverse transform unit 1206 and supplies the transposition to the selection switch B 1205.
  • the transposition unit 1204 is an example, and corresponding hardware may not necessarily be prepared.
• That is, if the result of the 1D inverse orthogonal transform executed by the vertical inverse transform unit 1206 (each element of its output matrix) is held in a buffer and read out in an appropriate order when the 1D inverse orthogonal transform is performed by the horizontal inverse transform unit 1207, the transposition of the output matrix (Y′) can be executed without preparing hardware corresponding to the transposition unit 1204.
  • the selection switch B 1205 converts the input matrix from the transposition unit 1204 into a 1D inverse discrete cosine transform unit 1202 and a 1D inverse discrete sine transform according to a horizontal transformation index (Horizontal_transform_idx) included in the 1D transformation index 1252 output from the 1D transformation setting unit 1208. Lead to one of the sections 1203.
  • the 1D inverse discrete cosine transform unit 1202 performs 1D inverse discrete cosine transform on the input matrix and outputs the result.
  • the 1D inverse discrete sine transform unit 1203 performs 1D inverse discrete sine transform on the input matrix and outputs the result.
  • the 1D inverse discrete cosine transform unit 1202 and the 1D inverse discrete sine transform unit 1203 perform one-dimensional inverse orthogonal transform represented by Expression (10).
• In Equation (10), H^T generically indicates the transposed matrix of the 1D discrete cosine transform matrix or the 1D discrete sine transform matrix (both N×N), and X′ indicates the output matrix (N×N) of the 1D inverse discrete cosine transform unit 1202 or the 1D inverse discrete sine transform unit 1203, which is the restored prediction error 122. That is, H^T corresponds to the discrete cosine transform matrix of Equation (6) or the discrete sine transform matrix of Equation (7), respectively.
• As described above, the inverse orthogonal transform unit 2704 (105) performs an inverse orthogonal transform on the restored transform coefficient (matrix) 2716 (121) according to the input orthogonal transform information 2723 (127), and generates the restored prediction error (matrix) 2717 (122).
  • the 1D inverse discrete cosine transform unit 1202 may be replaced with an inverse discrete cosine transform of H.264 / MPEG-4 AVC by reusing the existing inverse orthogonal transform.
  • the inverse orthogonal transform unit 2704 (105) may realize various inverse orthogonal transforms such as Hadamard transform and Karhunen-Loeve transform. In any case, an inverse orthogonal transform corresponding to the orthogonal transform unit 102 of the image encoding device 100 shown in FIG. 1 may be selected.
• The 1D transform setting unit 1208 has a function of setting, based on the orthogonal transform information 2723 (127), a 1D transform index for selecting the transform matrix used for the vertical orthogonal transform and the vertical inverse orthogonal transform, and a 1D transform index for selecting the transform matrix used for the horizontal orthogonal transform and the horizontal inverse orthogonal transform.
  • the 1D transform index 1252 directly or indirectly indicates the orthogonal transform selected by the vertical orthogonal transform and the horizontal orthogonal transform, respectively.
  • the 1D transform index 1252 can be expressed by a transform index (TransformIdx) shown in FIG. 13A and a 1D orthogonal transform in the vertical or horizontal direction (Vertical_transform_idx and Horizontal_transform_idx, respectively).
  • a 1D transformation index (Vertical_transform_idx) for a vertical transformation unit and a 1D transformation index (Horizontal_transform_idx) for a horizontal transformation unit can be derived from the transformation index.
  • FIG. 13B shows whether each idx indicates discrete cosine transform or discrete sine transform.
• When idx is "1", it indicates the discrete sine transform matrix (DST), and when idx is "0", it indicates the discrete cosine transform matrix (DCT).
  • the corresponding 1D transform index 1252 is referenced based on TransformIdx included in the orthogonal transform information 127, and Vertical_transform_idx is output to the selection switch A and Horizontal_transform_idx is output to the selection switch B.
• When Vertical_transform_idx or Horizontal_transform_idx indicates DCT with reference to FIG. 13B, the corresponding selection switch connects the output end to the 1D inverse discrete cosine transform unit 1202 (or 1D discrete cosine transform unit 902).
• When Vertical_transform_idx or Horizontal_transform_idx indicates DST with reference to FIG. 13B, the corresponding selection switch connects the output end to the 1D inverse discrete sine transform unit 1203 (or 1D discrete sine transform unit 903). This index derivation and switching is sketched below.
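• The following is a hedged sketch of the 1D transform setting unit; the TRANSFORM_IDX_TABLE values are illustrative placeholders (the actual assignments are those of FIG. 13A), while the convention that idx 1 selects DST and idx 0 selects DCT is taken from the description of FIG. 13B above.

```python
# Sketch: map TransformIdx to (Vertical_transform_idx, Horizontal_transform_idx)
# and route each direction to the DCT or DST unit.
TRANSFORM_IDX_TABLE = {
    0: (1, 1),   # placeholder: e.g. DST vertical, DST horizontal
    1: (1, 0),
    2: (0, 1),
    3: (0, 0),   # placeholder: e.g. DCT vertical, DCT horizontal
}

def select_1d_transforms(transform_idx, idct_1d, idst_1d):
    v_idx, h_idx = TRANSFORM_IDX_TABLE[transform_idx]
    vertical = idst_1d if v_idx == 1 else idct_1d    # FIG. 13B: 1 = DST
    horizontal = idst_1d if h_idx == 1 else idct_1d  # FIG. 13B: 0 = DCT
    return vertical, horizontal
```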
• The orthogonal transform information directly or indirectly indicates the transform index corresponding to the selected prediction mode, with reference to a predetermined map between transform indices and prediction modes.
• FIGS. 11A and 11B show the relationship between the prediction mode and the transform index.
• FIGS. 11A and 11B are obtained by adding TransformIdx to the intra prediction modes shown in FIGS. 7A and 7B; from this table, TransformIdx can be derived according to the selected prediction mode.
  • Intra_DC indicates DC prediction that is predicted by the average value of available reference pixels.
• In the prediction error of DC prediction, the directional tendency described above does not occur statistically, so the DCT used in H.264/MPEG-4 AVC is selected.
• For the other prediction modes, the tendency of the prediction error changes according to the prediction direction. Therefore, in such cases, the redundancy of the prediction error can be efficiently reduced by setting the TransformIdx determined for each prediction mode, as illustrated in FIGS. 11A and 11B.
• FIG. 15 illustrates the syntax 1500 used by the moving picture decoding apparatus in FIG. 27.
  • the syntax 1500 includes three parts: a high level syntax 1501, a slice level syntax 1502, and a coding tree level syntax 1503.
  • the high level syntax 1501 includes syntax information of a layer higher than the slice.
  • a slice refers to a rectangular area or a continuous area included in a frame or a field.
  • the slice level syntax 1502 includes information necessary for decoding each slice.
  • Coding tree level syntax 1503 includes information necessary to decode each coding tree (ie, each coding tree block). Each of these parts includes more detailed syntax.
  • the high level syntax 1501 includes sequence and picture level syntaxes such as a sequence parameter set syntax 1504 and a picture parameter set syntax 1505.
  • the slice level syntax 1502 includes a slice header syntax 1506, a slice data syntax 1507, and the like.
  • the coding tree level syntax 1503 includes a coding tree block syntax 1508, a prediction unit syntax 1509, and the like.
• The coding tree block syntax 1508 can have a quadtree structure. Specifically, the coding tree block syntax 1508 can be recursively called as a syntax element of the coding tree block syntax 1508; that is, one coding tree block can be subdivided by a quadtree. The coding tree block syntax 1508 also includes a transform unit syntax 1510, which is invoked at each coding tree block syntax 1508 at a leaf of the quadtree. The transform unit syntax 1510 describes information related to inverse orthogonal transform and quantization.
  • the transform unit syntax 1510 can have a quadtree structure. Specifically, the transform unit syntax 1510 can be further recursively called as a syntax element of the transform unit syntax 1510. That is, one transform unit can be subdivided with a quadtree.
  • FIG. 16 illustrates a slice header syntax 1506 according to the present embodiment.
  • the slice_bipred_intra_flag illustrated in FIG. 16 is a syntax element indicating, for example, the validity / invalidity of bidirectional intra prediction according to the present embodiment for the slice.
• When slice_bipred_intra_flag is 0, the prediction selection switch 2710 does not connect the output terminal of the switch to the intra bidirectional prediction image generation unit 2708 (109).
• In this case, only unidirectional intra prediction, for which BipredFlag[] in FIGS. 7A, 7B, 11A and 11B is 0, or the intra prediction defined in H.264/MPEG-4 AVC, is performed.
• When slice_bipred_intra_flag is 1, the bidirectional intra prediction according to the present embodiment is valid in the entire area of the slice.
• Alternatively, when slice_bipred_intra_flag is 1, the validity/invalidity of the bidirectional intra prediction may be specified for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.).
  • the slice_directional_transform_intra_flag shown in FIG. 16 is a syntax element indicating the validity / invalidity of the inverse discrete sine transform according to the present embodiment with respect to the slice, for example.
• When slice_directional_transform_intra_flag is 0, the transformation information setting unit 2711 (112) always sets TransformIdx to 3 and outputs it. Alternatively, the 1D conversion setting units 908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.
• When slice_directional_transform_intra_flag is 1, the inverse discrete sine transform according to the present embodiment is valid over the entire area of the slice.
• Alternatively, when slice_directional_transform_intra_flag is 1, the validity/invalidity of the inverse discrete sine transform according to the present embodiment may be defined for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.).
  • FIG. 17 illustrates the coding tree block syntax 1508 according to the present embodiment.
  • Ctb_directional_transform_flag shown in FIG. 17 is a syntax element indicating validity / invalidity of the inverse discrete sine transform according to this embodiment with respect to the coding tree block.
  • pred_mode shown in FIG. 17 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction.
• ctb_directional_transform_flag is decoded only when the above-described slice_directional_transform_intra_flag is 1 and the encoding type of the coding tree block is intra prediction.
  • CBP indicates Coded_Block_Pattern information, and is information indicating whether or not there is a transform coefficient in the coding tree block. If the information is 0, it indicates that there is no transform coefficient, and the decoder does not need to perform inverse transform processing, so ctb_directional_transform_flag is not decoded.
• When ctb_directional_transform_flag is 0, the transformation information setting unit 2711 (112) always sets TransformIdx to 3 and outputs it. Alternatively, the 1D conversion setting units 908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.
• When ctb_directional_transform_flag is 1, the inverse discrete sine transform according to the present embodiment is valid in the coding tree block.
• In the coding tree block syntax 1508, by encoding the flag that defines the validity/invalidity of the inverse discrete sine transform according to the present embodiment, an optimal inverse orthogonal transform can be performed for each local region (that is, each coding tree block).
  • FIG. 18 illustrates a transform unit syntax 1510 according to this embodiment.
  • the tu_directional_transform_flag shown in FIG. 18 is a syntax element indicating validity / invalidity of the inverse discrete sine transform according to this embodiment with respect to the transform unit.
  • pred_mode shown in FIG. 18 is one of syntax elements included in the prediction unit syntax 1509, and indicates the coding type in the coding tree block or macroblock. MODE_INTRA indicates that the encoding type is intra prediction.
• tu_directional_transform_flag is decoded only when slice_directional_transform_intra_flag is 1 and the encoding type of the coding tree block is intra prediction.
  • coded_block_flag is 1-bit information indicating whether or not there is a transform coefficient in the transform unit. If the information is 0, it indicates that there is no transform coefficient, and the decoder does not need to perform inverse transform processing, so tu_directional_transform_flag is not decoded.
• When tu_directional_transform_flag is 0, the transformation information setting unit 2711 (112) always sets TransformIdx to 3 and outputs it. Alternatively, the 1D conversion setting units 908 and 1208 may set Vertical_transform_idx and Horizontal_transform_idx to 1 regardless of the value of TransformIdx.
• When tu_directional_transform_flag is 1, the inverse discrete sine transform according to the present embodiment is valid in the transform unit. By encoding this flag, an optimal inverse orthogonal transform can be performed for each local region (that is, each transform unit).
  • FIG. 19 shows an example of the prediction unit syntax.
  • pred_mode in FIG. 19 indicates the prediction type of the prediction unit.
  • MODE_INTRA indicates that the prediction type is intra prediction.
  • intra_split_flag is a flag indicating whether or not the prediction unit is further divided into four prediction units.
  • When intra_split_flag is 1, the prediction unit is divided into four prediction units, each of half the vertical and horizontal size.
  • When intra_split_flag is 0, the prediction unit is not divided.
  • intra_luma_bipred_flag[i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit is a unidirectional intra prediction mode or a bidirectional intra prediction mode. i indicates the position of the divided prediction unit: when intra_split_flag is 0, i takes only 0, and when intra_split_flag is 1, i takes 0 to 3. The flag is set to the value of IntraBipredFlag of the prediction unit shown in FIGS. 9, 12, 13A, and 13B.
  • When intra_luma_bipred_flag[i] is 1, the prediction unit uses bidirectional intra prediction, and intra_luma_bipred_mode[i], information identifying the bidirectional intra prediction mode used from among the prepared bidirectional intra prediction modes, is decoded.
  • intra_luma_bipred_mode[i] may be decoded with a fixed length according to the number of bidirectional intra prediction modes IntraBipredNum shown in FIGS. 7A, 7B, 11A, and 11B, or may be decoded using a predetermined code table.
  • When intra_luma_bipred_flag[i] is 0, the prediction unit uses unidirectional intra prediction, and predictive decoding is performed from the adjacent blocks.
  • prev_intra_luma_unipred_idx[i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent blocks and the intra prediction mode of the prediction unit are the same. Details of the calculation of MostProbableMode will be described later. When prev_intra_luma_unipred_idx[i] is not 0, MostProbableMode and the intra prediction mode IntraPredMode are equal.
  • When prev_intra_luma_unipred_idx[i] is 0, MostProbableMode and the intra prediction mode IntraPredMode differ, and rem_intra_luma_unipred_mode[i], information specifying which intra prediction mode other than MostProbableMode is IntraPredMode, is decoded.
  • rem_intra_luma_unipred_mode[i] may be decoded with a fixed length according to the number of intra prediction modes IntraPredModeNum shown in FIGS. 7A, 7B, 11A, and 11B, or may be decoded using a predetermined code table. From the intra prediction mode IntraPredMode, rem_intra_luma_unipred_mode[i] is calculated using Equation (11); a sketch of this remapping is given below.
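  • Equation (11) itself is not reproduced in this text; the following is a hedged sketch of an Equation-(11)-style remapping, modeled on the common practice of removing the MostProbableMode candidates from the code space, and offered only as an assumption of its general shape.

```c
/* Hedged sketch of an Equation-(11)-style remapping: the MostProbableMode
 * candidates are removed from the code space so that rem_intra_luma_unipred_mode
 * indexes only the remaining modes. Assumes the candidates are distinct;
 * per the text, a redundant (duplicate) candidate would be omitted first. */
int rem_mode_from_intra_pred_mode(int intra_pred_mode,
                                  const int cand_mode_list[], int num_cand)
{
    int rem = intra_pred_mode;
    for (int c = 0; c < num_cand; c++)
        if (cand_mode_list[c] < intra_pred_mode)
            rem--; /* each candidate below the coded mode shifts the index down */
    return rem;
}
```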
  • numCand indicates the number of MostProbableMode candidates, and candModeList[cIdx] holds the candidate MostProbableModes themselves.
  • In the present embodiment, numCand is set to 2, and the candidate MostProbableModes are set to the IntraPredModes of the already-predicted pixel blocks adjacent to the prediction target block on the upper and left sides.
  • candModeList[0] is denoted MPM_L0 and candModeList[1] is denoted MPM_L1.
  • When prev_intra_luma_unipred_idx[i] is 1, the prediction mode MPM_L0 is derived; when prev_intra_luma_unipred_idx[i] is 2, the prediction mode MPM_L1 is derived.
  • The entries of candModeList[cIdx] may be the same prediction mode. In that case, the redundant prediction mode is omitted and expressed as shown in Equation (11).
  • Further, an optimum code table may be created in consideration of the maximum number of these prediction modes.
  • When numCand is 1, MostProbableMode is calculated according to Equation (12); a sketch is given below.
  • Min(x, y) is a function that returns the smaller of the inputs x and y.
  • IntraPredModeA and IntraPredModeB indicate the intra prediction modes of the prediction units adjacent to the left of and above the target prediction unit.
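  • Given these definitions, one natural reading of the numCand == 1 case is MostProbableMode = Min(IntraPredModeA, IntraPredModeB); since Equation (12) is not reproduced in this text, the following sketch is only an assumption based on that reading.

```c
/* Hedged sketch of the Equation-(12) case (numCand == 1): MostProbableMode
 * is taken as the smaller of the intra prediction modes of the left
 * (IntraPredModeA) and upper (IntraPredModeB) prediction units. */
static int min_int(int x, int y) { return x < y ? x : y; }

int most_probable_mode(int intra_pred_mode_a, int intra_pred_mode_b)
{
    return min_int(intra_pred_mode_a, intra_pred_mode_b);
}
```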
  • Note that syntax elements not defined in this embodiment may be inserted between the rows of the syntax tables illustrated in FIGS. 16, 17, 18, and 19, and descriptions of other conditional branches may be included. The syntax table may also be divided into a plurality of tables, or a plurality of syntax tables may be integrated. Moreover, the term for each illustrated syntax element can be changed arbitrarily.
  • As described above, the image decoding apparatus according to the third embodiment solves the following problem: when two reference pixel lines having different prediction directions are used, the prediction error has a tendency different from that of either of the two prediction directions, so if an orthogonal transform that follows a single prediction mode is selected, the coding efficiency is lowered because the tendency of intra prediction, in which the prediction accuracy decreases as the distance between the reference pixel and the predicted pixel increases, cannot be exploited.
  • This apparatus classifies the vertical direction and the horizontal direction of each prediction mode into two classes according to the presence or absence of the above-described tendency, and adaptively applies a 1D discrete cosine transform or a 1D discrete sine transform in each of the vertical and horizontal directions.
  • For a prediction error having this tendency, the 1D discrete sine transform yields a higher coefficient density than the 1D discrete cosine transform. Therefore, according to the image decoding apparatus according to the present embodiment, high transform efficiency is stably achieved compared with the case where a fixed orthogonal transform such as the DCT is uniformly applied to every prediction mode.
  • In addition, the inverse orthogonal transform unit 105 is suitable for both hardware implementation and software implementation. The above is the description of the image decoding apparatus according to the third embodiment.
  • The fourth embodiment relates to a moving picture decoding apparatus for decoding encoded data encoded by the moving picture encoding apparatus according to the second embodiment. That is, the moving picture decoding apparatus according to the present embodiment decodes, for example, encoded data generated by the image encoding apparatus according to the second embodiment.
  • The moving picture decoding apparatus according to the present embodiment differs from the moving picture decoding apparatus according to the third embodiment described above in the details of the intra unidirectional prediction image generation unit 2707 (108) and the intra bidirectional prediction image generation unit 2708 (109).
  • In the following, the same parts as those in the third embodiment are denoted by the same reference numerals, and the differing parts are mainly described.
  • FIG. 28 is a block diagram of a moving image decoding apparatus 2800 according to the fourth embodiment.
  • The differences are that a prediction direction deriving unit 2801 (2001) is newly added and that prediction direction derivation information 2851 (2051) is output from the prediction direction deriving unit 2801 (2001) to the prediction selection switch 2710 (111).
  • In addition, the intra unidirectional prediction image generation unit 2707 (108) and the intra bidirectional prediction image generation unit 2708 (109) are extended to 128 directions. Specifically, the 180-degree angular range used for direction prediction is divided into 128 steps, and a prediction direction is assigned approximately every 1.4 degrees.
  • The unidirectional prediction modes described in the third embodiment are the same as those in FIGS. 7A, 7B, 11A, and 11B. Since the other configuration is the same as that of the first embodiment, its description is omitted.
  • Hereinafter, intra prediction performed using the prediction direction deriving unit 2801 (2001) is referred to as the prediction direction derivation mode.
  • The reference image 2719 (124) output from the reference image memory 2706 (107) is input to the prediction direction deriving unit 2801 (2001).
  • The prediction direction deriving unit 2801 (2001) analyzes the input reference image 2719 (124) and generates the prediction direction derivation information 2851 (2051).
  • The prediction direction deriving unit 2801 (2001) will be described with reference to FIG. 21. As shown in FIG. 21, it includes a left reference pixel line edge deriving unit 2101, an upper reference pixel line edge deriving unit 2102, and a prediction direction derivation information generation unit 2103.
  • The left reference pixel line edge deriving unit 2101 performs edge detection processing on the reference pixel lines located to the left of the prediction target pixel block and derives an edge direction.
  • The upper reference pixel line edge deriving unit 2102 performs edge detection processing on the reference pixel lines located above the prediction target pixel block and derives an edge direction.
  • FIG. 22 shows an example of the pixels used by the left reference pixel line edge deriving unit 2101 and the upper reference pixel line edge deriving unit 2102.
  • The left reference pixel line edge deriving unit 2101 uses the two lines indicated by diagonal hatching from upper right to lower left located on the left side of the prediction target pixel block.
  • The upper reference pixel line edge deriving unit 2102 uses the two lines indicated by diagonal hatching from upper left to lower right located above the prediction target pixel block. Two lines are used in the present embodiment, but one line, three lines, or more lines may be used. In the drawing, A denotes an example in which the edge direction is derived using the left reference pixel lines, and B denotes an example in which it is derived using the upper reference pixel lines.
  • Both processing units detect the edge intensity using an operator as shown in Equation (13).
  • Gx indicates the edge strength in the horizontal direction (x coordinate system), and Gy indicates the edge strength in the vertical direction (y coordinate system).
  • Any operator such as a Sobel operator, a Prewitt operator, or a Kirsch operator may be used as the edge detection operator.
  • When the operator of Equation (13) is applied to a reference pixel line, an edge direction vector is derived for each pixel. Equation (14) is used to derive the optimum edge direction from these edge vectors.
  • ⟨a, b⟩ represents the inner product of two vectors, and the unit vector and the edge strength (direction vector) are expressed by Equation (15).
  • The representative edge angle can be calculated by optimizing Equation (14) using Equation (15); a sketch of this derivation is given below.
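  • Since Equations (13) to (15) are not reproduced in this text, the following sketch only assumes a 3x3 Sobel operator for (Gx, Gy) and a representative angle chosen to maximize the summed squared inner product of a unit direction vector with the per-pixel gradients, quantized to the 128 candidate angles of this embodiment.

```c
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Hedged sketch: per-pixel gradient by a 3x3 Sobel operator. img points to
 * 8-bit luma samples with the given stride; (x, y) must be an interior pixel. */
static void sobel_gradient(const unsigned char *img, int stride, int x, int y,
                           double *gx, double *gy)
{
    const unsigned char *p = img + y * stride + x;
    *gx = (p[-stride + 1] - p[-stride - 1])
        + 2.0 * (p[1] - p[-1])
        + (p[stride + 1] - p[stride - 1]);
    *gy = (p[stride - 1] - p[-stride - 1])
        + 2.0 * (p[stride] - p[-stride])
        + (p[stride + 1] - p[-stride + 1]);
}

/* Representative edge angle: the candidate direction u(theta) that maximizes
 * the sum over all pixels of the squared inner product <u, g>, with 128
 * candidate directions spanning 180 degrees (about 1.4 degree steps). */
static int representative_edge_angle(const double *gx, const double *gy, int n)
{
    int best = 0;
    double best_score = -1.0;
    for (int k = 0; k < 128; k++) {
        double theta = (double)k * M_PI / 128.0;
        double ux = cos(theta), uy = sin(theta);
        double score = 0.0;
        for (int i = 0; i < n; i++) {
            double d = ux * gx[i] + uy * gy[i]; /* inner product <u, g> */
            score += d * d;
        }
        if (score > best_score) { best_score = score; best = k; }
    }
    return best; /* index of the representative edge angle */
}
```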
  • The per-pixel edge strengths of each reference pixel line derived by the left reference pixel line edge deriving unit 2101 and the upper reference pixel line edge deriving unit 2102 are input to the prediction direction derivation information generation unit 2103.
  • The prediction direction derivation information generation unit 2103 calculates Equations (14) and (15) using all the input edge strengths and derives a unidirectional representative edge angle.
  • In addition, Equations (14) and (15) are calculated separately for the left reference pixel line and the upper reference pixel line, and a bidirectional representative edge angle holding these two representative edge angles is derived.
  • Next, the prediction direction derivation information generation unit 2103 derives a plurality of peripheral unidirectional representative edge angles that are angularly adjacent to the unidirectional representative edge angle. For example, assume that there are 128 types of edge angles arranged in order of angle.
  • When the representative edge angle is RDM, the peripheral unidirectional representative edge angles are expressed as RDM-1, RDM+1, RDM-2, RDM+2, and so on; a sketch of this enumeration is given below.
  • In this example, the number of peripheral unidirectional representative edge angles is 10 (i.e., +/-5).
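  • The enumeration of the peripheral angles can be sketched as follows; the wrap-around over the 128 angle indices is an assumption, since the text does not state how boundary indices are handled.

```c
/* Hedged sketch: enumerate peripheral unidirectional representative edge
 * angles around RDM in the order RDM-1, RDM+1, RDM-2, RDM+2, ... .
 * count = 10 yields the +/-5 neighborhood described in the text. */
void peripheral_angles(int rdm, int count, int out[])
{
    for (int k = 0; k < count; k++) {
        int step = k / 2 + 1;                       /* 1, 1, 2, 2, 3, 3, ... */
        int delta = (k % 2 == 0) ? -step : step;    /* -1, +1, -2, +2, ... */
        out[k] = ((rdm + delta) % 128 + 128) % 128; /* assumed wrap-around */
    }
}
```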
  • FIG. 23 shows an example of the prediction direction derivation information 2851 (2051).
  • The unidirectional representative edge angle is denoted RDM, the representative edge angle of the left reference pixel line in the bidirectional representative edge angle is denoted RDM_L0, and the representative edge angle of the upper reference pixel line is denoted RDM_L1.
  • RDMPredMode indicates the prediction mode derived by the prediction direction deriving unit 2801 (2001) in the present embodiment.
  • RDMBipredFlag indicates whether the prediction mode derived by the prediction direction deriving unit 2801 (2001) is bidirectional prediction.
  • RDMPredAngleIdL0 and RDMPredAngleIdL1 indicate which prediction angles the derived prediction mode corresponds to.
  • In addition to the modes shown in FIG. 23, the prediction direction derivation information 2851 (2051) includes the unidirectional representative edge angle and the two representative edge angles included in the bidirectional representative edge angle. The relationship between these prediction modes and TransformIdx will be described later.
  • The prediction direction derivation information 2851 (2051) generated by the prediction direction deriving unit 2801 (2001) is input to the prediction selection switch 2710 (111).
  • A predicted image 125 is generated by the intra unidirectional prediction image generation unit 2707 (108) or the intra bidirectional prediction image generation unit 2708 (109) according to the prediction mode selected here.
  • These predicted image generation units are the same as those in the first embodiment except that the number of prediction angles is extended to 128. For example, when RDMPredMode is 1, the first predicted image signal 851 and the second predicted image signal 852 in FIG. 8 are generated at the two unidirectional representative edge angles included in the bidirectional representative edge angle, averaged by the weighted average unit 801, and the predicted image 2722 (125) is output; a sketch of this averaging is given below.
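  • The averaging step can be sketched as follows; the equal 1:1 weighting and the rounding are assumptions, since the weights used by the weighted average unit 801 are not specified in this text.

```c
/* Hedged sketch of combining two unidirectional predicted images, generated
 * at the two representative edge angles, into one bidirectional prediction.
 * Equal weights with round-to-nearest are assumed. */
void weighted_average(const unsigned char *pred0, const unsigned char *pred1,
                      unsigned char *dst, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = (unsigned char)((pred0[i] + pred1[i] + 1) >> 1);
}
```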
  • The prediction selection unit 2714 controls the output terminal of the prediction selection switch 2710 based on the prediction information 2721 sent from the entropy decoding unit 2702 and the prediction direction derivation information 2851 input from the prediction direction deriving unit 2801.
  • In the present embodiment, intra prediction or inter prediction can be selected to generate the predicted image 2722, and a plurality of modes are defined for each of intra prediction and inter prediction. One of these prediction modes is input as the prediction information 2721.
  • As shown in FIGS. 11A and 11B, substantially two prediction modes are selected.
  • In the present embodiment, TransformIdx is selected in units of coding tree blocks (or of the first prediction unit included in the coding tree block). For example, assume that the N×N pixel block 0 shown in FIG. 2C is a prediction unit and that the 2N×2N pixel block is a coding tree block.
  • For the head prediction unit, TransformIdx is selected using the information of the reference pixel lines derived here.
  • For the remaining prediction units, following the TransformIdx of the head prediction unit, TransformIdx[0], is specified regardless of their prediction modes; a sketch is given below.
  • Alternatively, the TransformIdx to be used when a prediction mode described in FIG. 23 is selected may be determined in advance. For example, TransformIdx may simply be set to 0 to reduce the processing of deriving the prediction angle.
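  • The per-coding-tree-block selection can be sketched as follows; select_from_edge_info() is a hypothetical stand-in for the derivation from the reference pixel line information described above.

```c
/* Hedged sketch: the head prediction unit of a coding tree block selects its
 * TransformIdx from the derived reference pixel line information, and the
 * remaining prediction units simply follow TransformIdx[0]. */
extern int select_from_edge_info(int pu_index); /* hypothetical derivation */

void assign_transform_idx(int transform_idx[], int num_pu)
{
    transform_idx[0] = select_from_edge_info(0); /* head prediction unit */
    for (int i = 1; i < num_pu; i++)
        transform_idx[i] = transform_idx[0];     /* follow the head unit */
}
```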
  • FIG. 24 illustrates a slice header syntax 1506 according to this embodiment.
  • slice_derived_direction_intra_flag illustrated in FIG. 24 is a syntax element indicating the validity/invalidity of the prediction direction derivation method according to the present embodiment for the slice.
  • When slice_derived_direction_intra_flag is 0, the prediction selection switch 2710 connects its output terminal according to the first embodiment.
  • When slice_derived_direction_intra_flag is 1, the prediction direction derivation mode according to the present embodiment is valid over the entire area in the slice.
  • When slice_derived_direction_intra_flag is 1, the validity/invalidity of the prediction direction derivation mode according to the present embodiment may also be defined for each local region in the slice in the syntax of a lower layer (coding tree block, transform unit, etc.).
  • FIG. 25A shows an example of the prediction unit syntax.
  • intra_derived_direction_flag[i] is a flag indicating whether the prediction mode IntraPredMode applied to the prediction unit follows the first embodiment shown in FIGS. 11A and 11B or the fourth embodiment shown in FIG. 23. i indicates the position of the divided prediction unit.
  • When intra_split_flag is 0, i takes only 0; when intra_split_flag is 1, i takes 0 to 3.
  • When intra_derived_direction_flag[i] is 1, intra_direction_mode[i], which is information identifying the intra prediction mode used from among the prepared prediction modes, is encoded. As shown in FIG. 23, in this mode the bidirectional intra prediction modes and the unidirectional intra prediction modes are expressed in a mixed manner. These prediction modes may also be expressed and encoded as separate syntax elements.
  • Here, intra_direction_mode[i] is described as an example corresponding to RDMPredMode, but the prediction mode expression method may be changed based on RDMPredAngleIdL0.
  • For example, the prediction mode may be expressed by dividing it into two types of syntax elements, a 1-bit flag indicating the sign (+/-) and an index indicating the change. In this case, a new flag indicating whether or not bidirectional intra prediction is used may be prepared.
  • intra_direction_mode[i] may be decoded with a fixed length according to the number of prediction modes, or may be decoded using a predetermined code table. When intra_direction_mode[i] is 0, the prediction unit does not use the prediction direction derivation method according to the present embodiment, and decoding is performed according to the method described in the first embodiment.
  • FIG. 25B shows an example of the prediction unit syntax as another embodiment of the present invention.
  • In FIG. 25B, intra_direction_mode[i] is expressed by being divided into prev_intra_direction_mode[i] and rem_intra_direction_mode[i]. These syntax elements introduce prediction between prediction modes in the same manner as Equations (11) and (12).
  • prev_intra_direction_mode[i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent blocks and the intra prediction mode of the prediction unit are the same. When prev_intra_direction_mode[i] is 1, MostProbableMode and the intra prediction mode IntraPredMode are equal.
  • When prev_intra_direction_mode[i] is 0, MostProbableMode and the intra prediction mode IntraPredMode differ, and rem_intra_direction_mode[i], information specifying which mode other than MostProbableMode is the intra prediction mode IntraPredMode, is decoded.
  • rem_intra_direction_mode[i] may be decoded with a fixed length according to the number of prediction modes, or may be decoded using a predetermined code table.
  • An example of the prediction unit syntax as yet another embodiment is shown in FIG. 25C.
  • In FIG. 25C, the PredMode described in the first embodiment and the PredMode described in the second embodiment are integrated and expressed as one PredMode table.
  • The PredMode in this case is shown in FIGS. 26A, 26B, and 26C.
  • These syntax elements introduce prediction between prediction modes in the same manner as Equations (11) and (12).
  • prev_intra_luma_unipred_idx[i] is a flag indicating whether or not the prediction value MostProbableMode of the prediction mode calculated from the adjacent blocks and the intra prediction mode of the prediction unit are the same.
  • When prev_intra_luma_unipred_idx[i] is 1, MostProbableMode and the intra prediction mode IntraPredMode are equal. When prev_intra_luma_unipred_idx[i] is 0, MostProbableMode and the intra prediction mode IntraPredMode differ, and rem_intra_luma_unipred_mode[i], information specifying which mode other than MostProbableMode is the intra prediction mode IntraPredMode, is decoded.
  • rem_intra_luma_unipred_mode[i] may be decoded with a fixed length according to the number of prediction modes, or may be decoded using a predetermined code table.
  • In each embodiment, encoding and decoding may alternatively be performed sequentially from the upper right to the lower left, or so as to draw a spiral from the screen edge toward the center of the screen.
  • Since the position of the adjacent pixel block that can be referred to changes depending on the encoding order, the position may be changed to a usable position as appropriate.
  • The prediction target block does not have to have a uniform block shape. For example, the prediction target block size may be a 16×8 pixel block, an 8×16 pixel block, an 8×4 pixel block, a 4×8 pixel block, or the like.
  • Further, the code amount for encoding or decoding the division information increases with the number of divisions, so it is desirable to select the block size in consideration of the balance between the code amount of the division information and the quality of the locally decoded image or the decoded image.
  • In the embodiments, the prediction process has been described without distinguishing between the luminance signal and the color difference signal. When the prediction process differs between the luminance signal and the color difference signal, the same or different prediction methods may be used. If different prediction methods are used, the prediction method selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.
  • Likewise, the orthogonal transform process and the inverse orthogonal transform process have been described without distinguishing between the luminance signal and the color difference signal. When the orthogonal transform process differs between the luminance signal and the color difference signal, the same or different orthogonal transform methods may be used. If different orthogonal transform methods are used, the orthogonal transform method selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.
  • In the tables shown in the syntax configuration, syntax elements not defined in the present invention may be inserted between the rows, and descriptions relating to other conditional branches may be included.
  • The syntax table may also be divided into a plurality of tables, or a plurality of tables may be integrated. The same terms need not always be used and may be changed arbitrarily depending on the form of use.
  • As described above, each embodiment realizes high-efficiency intra prediction and the corresponding high-efficiency orthogonal transform and inverse orthogonal transform while alleviating the difficulty of hardware implementation and software implementation. Therefore, according to each embodiment, the coding efficiency is improved, and the subjective image quality is improved accordingly.
  • The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disk (CD-ROM, CD-R, DVD, etc.), a magneto-optical disk (MO, etc.), or a semiconductor memory.
  • The storage format may be any form.
  • The program for realizing the processing of each of the above embodiments may also be stored on a computer (server) connected to a network such as the Internet and downloaded to a computer (client) via the network.
  • The instructions shown in the processing procedures of the above embodiments can be executed based on a program, that is, software.
  • A general-purpose computer system that stores this program in advance and reads it can also obtain effects similar to those of the video encoding device and video decoding device of the above-described embodiments.
  • The instructions described in the above embodiments are recorded, as a program executable by a computer, on a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, CD-R, CD-RW, DVD-ROM, DVD±R, DVD±RW, etc.), a semiconductor memory, or a similar recording medium. As long as the recording medium is readable by the computer or by an embedded system, the storage format may be any form.
  • If the computer reads the program from the recording medium and causes the CPU to execute the instructions described in the program, operations similar to those of the video encoding device and video decoding device of the above-described embodiments can be realized.
  • When the computer acquires or reads the program, it may acquire or read it through a network.
  • In addition, based on the instructions of the program installed from the recording medium into the computer or embedded system, the OS (operating system), database management software, or middleware (MW) such as network software running on the computer may execute a part of each process for realizing the embodiments.
  • Further, the recording medium in the present invention is not limited to a medium independent of the computer or embedded system, and also includes a recording medium that stores, or temporarily stores, a program downloaded via a LAN, the Internet, or the like.
  • The number of recording media is not limited to one; the case where the processing of the present embodiments is executed from a plurality of media is also included in the recording medium in the present invention, and the media may have any configuration.
  • The computer or embedded system in the present invention is for executing each process of the present embodiments based on the program stored in the recording medium, and may have any configuration, such as a single device like a personal computer or microcomputer, or a system in which a plurality of devices are connected over a network.
  • The computer in the embodiments is not limited to a personal computer; it is a general term for devices and apparatuses capable of realizing the functions of the embodiments by a program, including an arithmetic processing device or a microcomputer included in information processing equipment.
  • Reference signs (partial list): ... restored transform coefficient; 122 ... restored prediction error; 123 ... restored image; 124 ... reference image; 125 ... predicted image; 126 ... prediction information; 127 ... transform information; 128 ... encoded data; 801 ... weighted average unit; 802 ... first unidirectional intra predicted image generation unit; 803 ... second unidirectional intra predicted image generation unit; 851 ... first predicted image; 852 ... second predicted image; 901, 1201 ... selection switch A; 902 ... 1D discrete cosine transform unit; 903 ... 1D discrete sine transform unit; 904 ... transpose unit; 905, 1205 ... selection switch B; 906 ... vertical transform unit; 907 ...
  • ... prediction direction derivation information; 2715 ... quantized transform coefficient (sequence); 2716 ... restored transform coefficient; 2717 ... restored prediction error; 2718 ... decoded image; 2719 ... reference image; 2721 ... prediction information; 2722 ... predicted image; 2724 ... decoded image; 2725 ... encoded data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

In a video encoding method according to one embodiment, a combination of one-dimensional transforms comprising only first orthogonal transforms is selected when intra prediction is performed in which two or more prediction modes use one or more reference pixel lines; and a combination of first orthogonal transforms and second orthogonal transforms is selected when intra prediction is performed in which the two or more prediction modes each use one and the same reference pixel line. A predicted image signal is generated using the two or more prediction modes. A two-dimensional transform is performed on a prediction difference signal obtained from the predicted image signal, using the selected combination of one-dimensional transforms, to generate transform coefficients. The transform coefficients and prediction information representing the combination of the two or more prediction modes are encoded.
PCT/JP2011/063737 2011-06-15 2011-06-15 Video encoding method, video decoding method, and device WO2012172667A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2011/063737 WO2012172667A1 (fr) Video encoding method, video decoding method, and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2011/063737 WO2012172667A1 (fr) Video encoding method, video decoding method, and device

Publications (1)

Publication Number Publication Date
WO2012172667A1 true WO2012172667A1 (fr) 2012-12-20

Family

ID=47356693

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2011/063737 WO2012172667A1 (fr) Video encoding method, video decoding method, and device

Country Status (1)

Country Link
WO (1) WO2012172667A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113411580A (zh) * 2016-05-13 2021-09-17 Sharp Corporation Image decoding device and method thereof, and image encoding device and method thereof
JP2021180492A (ja) * 2015-08-20 2021-11-18 Japan Broadcasting Corporation Image decoding device and image decoding method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100246672A1 (en) * 2009-03-31 2010-09-30 Sony Corporation Method and apparatus for hierarchical bi-directional intra-prediction in a video encoder

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100246672A1 (en) * 2009-03-31 2010-09-30 Sony Corporation Method and apparatus for hierarchical bi-directional intra-prediction in a video encoder

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANKUR SAXENA ET AL.: "Jointly optimal intra prediction and adaptive primary transform", JCTVC-C108, 7 October 2010 (2010-10-07) *
ANKUR SAXENA ET AL.: "Mode- dependent DCT/DST without 4*4 full matrix multiplication for intra prediction", JCTVC-E125, 20 March 2011 (2011-03-20) *
YAN YE ET AL.: "Improved h.264 intra coding based on bi-directional intra prediction, intra prediction, directional transform, and adaptive coefficient scanning", IMAGE PROCESSING, 2008. ICIP 2008. 15TH IEEE INTERNATIONAL CONFERENCE ON, 12 October 2008 (2008-10-12), pages 2116 - 2119 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021180492A (ja) * 2015-08-20 2021-11-18 Japan Broadcasting Corporation Image decoding device and image decoding method
JP7242768B2 (ja) 2015-08-20 2023-03-20 Japan Broadcasting Corporation Image decoding device and image decoding method
JP7445799B2 (ja) 2015-08-20 2024-03-07 Japan Broadcasting Corporation Image decoding device and image decoding method
CN113411580A (zh) * 2016-05-13 2021-09-17 Sharp Corporation Image decoding device and method thereof, and image encoding device and method thereof
CN113411580B (zh) * 2016-05-13 2024-01-30 Sharp Corporation Image decoding device and method thereof, and image encoding device and method thereof

Similar Documents

Publication Publication Date Title
US11936858B1 (en) Constrained position dependent intra prediction combination (PDPC)
US11509936B2 (en) JVET coding block structure with asymmetrical partitioning
US11902519B2 (en) Post-filtering for weighted angular prediction
KR102628889B1 (ko) Intra mode JVET coding
US11463708B2 (en) System and method of implementing multiple prediction models for local illumination compensation
WO2011125256A1 (fr) Image encoding method and image decoding method
KR20190123288A (ko) In-loop filtering method based on adaptive pixel classification criteria
JP2016187211A (ja) Video decoding device
KR20200112964A (ko) Method and apparatus for residual sign prediction in the transform domain
KR20200057082A (ko) Adaptive unequal weight planar prediction
WO2012035640A1 (fr) Moving image encoding method and moving image decoding method
WO2019126163A1 (fr) Système et procédé de construction d'un plan pour une prédiction plane
WO2012172667A1 (fr) Video encoding method, video decoding method, and device
CN113812156A (zh) Method for decoding video using simplified residual data coding in a video coding system, and device therefor
WO2012090286A1 (fr) Video image encoding method and video image decoding method
JP5367161B2 (ja) Image encoding method, device, and program
EP3446481B1 (fr) Structure de blocs de codages jvet avec partitionnement asymétrique
JP5649701B2 (ja) Image decoding method, device, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11867621

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11867621

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP