WO2018061503A1 - Encoding device, encoding method, decoding device, and decoding method - Google Patents

Encoding device, encoding method, decoding device, and decoding method

Info

Publication number
WO2018061503A1
Authority
WO
WIPO (PCT)
Prior art keywords
weighting factor
prediction
pixel
region
encoding
Prior art date
Application number
PCT/JP2017/029411
Other languages
English (en)
Japanese (ja)
Inventor
佐藤 数史 (Kazushi Sato)
Original Assignee
Dwango Co., Ltd. (株式会社ドワンゴ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dwango Co., Ltd.
Publication of WO2018061503A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: … using adaptive coding
    • H04N 19/102: … characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103: Selection of coding mode or of prediction mode
    • H04N 19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N 19/134: … characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136: Incoming video signal characteristics or properties
    • H04N 19/169: … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: … the unit being an image region, e.g. an object
    • H04N 19/174: … the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N 19/176: … the region being a block, e.g. a macroblock
    • H04N 19/186: … the unit being a colour or a chrominance component
    • H04N 19/46: Embedding additional information in the video signal during the compression process
    • H04N 19/50: … using predictive coding
    • H04N 19/503: … involving temporal prediction
    • H04N 19/51: Motion estimation or motion compensation
    • H04N 19/577: Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Definitions

  • Some aspects according to the present invention relate to an encoding device, an encoding method, a decoding device, and a decoding method for encoding and decoding an image.
  • AVC: H.264/AVC (hereinafter also referred to as “AVC”)
  • HEVC: High Efficiency Video Coding (hereinafter also referred to as “HEVC”)
  • JVET: Joint Video Exploration Team
  • A cost function value is calculated for each candidate prediction mode on the encoder (encoding device) side, and the prediction mode that minimizes this value is applied to the block to be encoded.
  • In the JM (Joint Model) reference software, cost functions with different calculation amounts are used in the high complexity mode and the low complexity mode.
  • In either mode, a cost function value is calculated for each candidate prediction mode, and the prediction mode that minimizes this value can be applied to the encoding target block (current block).
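  • For illustration, a minimal Python sketch of this kind of cost-function-based mode decision follows. The J = D + λR form matches the general shape of the cost functions used in reference software such as JM; the function names and candidate representation are assumptions for the sketch, not taken from the patent.

```python
# Sketch of cost-function-based mode decision (assumed names, not the patent's).

def sad(block, prediction):
    """Sum of absolute differences between original and predicted pixels."""
    return sum(abs(a - b) for a, b in zip(block, prediction))

def choose_mode(block, candidates, lam):
    """Return the candidate mode with the smallest cost J = D + lambda * R.

    candidates: iterable of (mode_name, predicted_pixels, estimated_bits).
    """
    best_mode, best_cost = None, float("inf")
    for mode, prediction, bits in candidates:
        cost = sad(block, prediction) + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode
```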
  • In moving picture coding schemes such as AVC and HEVC, a prediction method called bi-prediction, which predicts a block to be encoded using two reference images (reference pictures), is defined.
  • In bi-prediction, the predicted pixel value is calculated by the following equation (1):
    P[x] = (1/2) · P0[x + v0] + (1/2) · P1[x + v1] … (1)
    Here, x is the coordinate position of the pixel to be processed (a two-dimensional vector value including the x-coordinate and y-coordinate directions), v0 and v1 are motion vectors (two-dimensional vector values including the x-coordinate and y-coordinate directions), P[x] is the predicted value of the pixel in the processing target block, and P0 and P1 are pixel values in the two reference images. That is, the predicted pixel value is calculated by multiplying the pixel values of the two reference images by the same weight 1/2.
  • Non-Patent Document 1 proposes that the predicted pixel value be calculated using different weights according to the following equation (2):
    P[x] = w · P0[x + v0] + (1 − w) · P1[x + v1] … (2)
    Here, w is a weighting factor. In Non-Patent Document 1, the weighting factor w is limited to −1/4, 1/4, 3/8, 1/2, 5/8, 3/4, and 5/4, and an index value identifying w is explicitly transmitted for each prediction block (PU: Prediction Unit).
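  • A minimal sketch of equations (1) and (2) in Python follows; the picture representation and function names are assumptions for illustration, not the patent's.

```python
# Sketch of equations (1) and (2). Reference pictures are mappings from
# (x, y) positions to pixel values; names are illustrative.

def predict_fixed(p0, p1, x, v0, v1):
    """Equation (1): equal weights of 1/2 on both reference pictures."""
    x0 = (x[0] + v0[0], x[1] + v0[1])
    x1 = (x[0] + v1[0], x[1] + v1[1])
    return 0.5 * p0[x0] + 0.5 * p1[x1]

def predict_weighted(p0, p1, x, v0, v1, w):
    """Equation (2): weight w on P0 and (1 - w) on P1."""
    x0 = (x[0] + v0[0], x[1] + v0[1])
    x1 = (x[0] + v1[0], x[1] + v1[1])
    return w * p0[x0] + (1.0 - w) * p1[x1]

# Candidate weights listed for Non-Patent Document 1:
CANDIDATE_WEIGHTS = [-1/4, 1/4, 3/8, 1/2, 5/8, 3/4, 5/4]
```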
  • In the method of Non-Patent Document 1, the number of combinations of motion vectors and weighting factors is enormous. As a result, a large amount of calculation is required to find the optimal combination of motion vector and weighting factor (Problem 1).
  • In the method of Non-Patent Document 1, information on the weighting factor w must be explicitly transmitted in the encoded data (bitstream). For this reason, the amount of information in the encoded data increases and the coding efficiency decreases (Problem 2).
  • Furthermore, Non-Patent Document 1 does not consider how the weighting factor w should be applied to the luminance signal and the color difference signals.
  • Some aspects of the present invention have been made in view of the above problems, and one of their objects is to provide an encoding device, an encoding method, a decoding device, and a decoding method that enable suitable bidirectional prediction.
  • An encoding apparatus according to one aspect encodes an input moving image having a luminance signal and two color difference signals by predictive coding using bidirectional prediction with a first weighting factor and a second weighting factor. The apparatus includes a frame memory that stores a plurality of reference images including a first reference image and a second reference image, and means for obtaining a first motion vector that determines the position of a first region in the first reference image referred to by an encoding target region in an encoding target image of the input moving image, and a second motion vector that determines the position of a second region in the second reference image referred to by the encoding target region. The first weighting factor and the second weighting factor applied to the luminance signal and the first weighting factor and the second weighting factor applied to the color difference signals may take different values.
  • A decoding apparatus according to one aspect decodes encoded data in which an input moving image having a luminance signal and two color difference signals has been predictively encoded by bidirectional prediction using a first weighting factor and a second weighting factor. The apparatus includes: means for receiving the encoded data, which contains information on a first motion vector that determines the position of a first region in the first reference image referred to by a decoding target region of a decoding target image in the input moving image, information on a second motion vector that determines the position of a second region in the second reference image referred to by the decoding target region, and residual information based on the difference between the pixel values of the pixels in the decoding target region and the predicted pixel values; a frame memory that stores a plurality of reference images including the first reference image and the second reference image; weighting factor determining means for determining the first weighting factor applied to the first region referred to by the first motion vector and the second weighting factor applied to the second region referred to by the second motion vector; and means for calculating the predicted pixel value for the decoding target region by multiplying the pixel values of the pixels in the first region by the first weighting factor and multiplying the pixel values of the pixels in the second region by the second weighting factor.
  • An encoding method according to one aspect predictively encodes an input moving image having a luminance signal and two color difference signals by bidirectional prediction using a first weighting factor and a second weighting factor. The encoding device performs the steps of: storing a plurality of reference images including a first reference image and a second reference image; obtaining a first motion vector that determines the position of a first region in the first reference image referred to by an encoding target region in an encoding target image of the input moving image, and a second motion vector that determines the position of a second region in the second reference image referred to by the encoding target region; determining the first weighting factor applied to the first region referred to by the first motion vector and the second weighting factor applied to the second region referred to by the second motion vector; calculating the predicted pixel value for the encoding target region by multiplying the pixel values of the pixels in the first region by the first weighting factor and multiplying the pixel values of the pixels in the second region by the second weighting factor; and generating encoded data by encoding residual information relating to the difference between the pixel values of the pixels in the encoding target region and the predicted pixel values. The first weighting factor and the second weighting factor applied to the luminance signal and the first weighting factor and the second weighting factor applied to the color difference signals may take different values.
  • A decoding method according to one aspect decodes encoded data in which an input moving image having a luminance signal and two color difference signals has been predictively encoded by bidirectional prediction using a first weighting factor and a second weighting factor. The decoding device performs the steps of: receiving the encoded data, which contains information on a first motion vector that determines the position of a first region in the first reference image referred to by a decoding target region, information on a second motion vector that determines the position of a second region in the second reference image referred to by the decoding target region, and residual information based on the difference between the pixel values of the pixels in the decoding target region and the predicted pixel values; storing a plurality of reference images including the first reference image and the second reference image; determining the first weighting factor applied to the first region referred to by the first motion vector and the second weighting factor applied to the second region referred to by the second motion vector; and calculating the predicted pixel value for the decoding target region by multiplying the pixel values of the pixels in the first region by the first weighting factor and multiplying the pixel values of the pixels in the second region by the second weighting factor. The first weighting factor and the second weighting factor applied to the luminance signal and the first weighting factor and the second weighting factor applied to the color difference signals may take different values.
  • In this description, “unit”, “means”, “apparatus”, and “system” do not simply mean physical means; the functions of a “unit”, “means”, “apparatus”, or “system” may be realized by software. Furthermore, the functions of one “unit”, “means”, “apparatus”, or “system” may be realized by two or more physical means or devices, and the functions of two or more “units”, “means”, “apparatuses”, or “systems” may be realized by a single physical means or device.
  • FIG. 4 is a block diagram illustrating a partial functional configuration of the encoding device illustrated in FIG. 3. FIG. 5 is a block diagram showing the functional configuration of the decoding device according to the embodiment. FIG. 6 is a block diagram illustrating a partial functional configuration of the decoding device illustrated in FIG. 5. Further figures include a flowchart showing the flow of processing of the encoding device shown in FIG. 3, a flowchart showing the flow of processing of the decoding device shown in FIG. 5, a diagram showing a specific example of the structure of encoded data, and a block diagram illustrating a specific example of a hardware configuration in which the encoding device and the decoding device illustrated in FIGS. 3 and 5 can be implemented.
  • In moving picture coding schemes such as AVC and HEVC, a prediction scheme called bi-prediction, which predicts the processing target block using two reference images (reference pictures), is defined.
  • Here, bidirectional prediction will be briefly described with reference to FIG. 1. In bidirectional prediction, the pixel values of luminance and color difference in the prediction unit PU are predicted by referring to two previously decoded reference images P0 and P1.
  • The prediction unit PU refers to the reference images P0 and P1 using reference information r0 and r1 (hereinafter also referred to as reference information r). The motion vectors v0 and v1 (hereinafter also referred to as motion vector v) are assumed to be specified as directions relative to the prediction unit PU (vector values in the x-coordinate and y-coordinate directions).
  • Blocks B 0 and B 1 are blocks that are specified by the reference information r and the motion vector v and are referenced from the prediction unit PU.
  • In the bi-prediction of equation (1), the same weighting factor 1/2 is applied to the pixel values P0[x + v0] and P1[x + v1] in the two reference images P0 and P1, and the pixel value P[x] of the pixel in the prediction unit PU to be processed is calculated.
  • In equation (2), if the weighting factor w may take an arbitrary value, a large coding amount is required to transmit w, so the coding efficiency decreases. It is therefore conceivable to limit the weighting factor w to −1/4, 1/4, 3/8, 1/2, 5/8, 3/4, and 5/4 and to explicitly transmit the index value of w for each prediction unit PU. However, when bidirectional prediction is performed by such a method, the problems described above occur.
  • In this embodiment, at least one of the two methods described below is used to solve at least one of the two problems. The two methods can each be applied independently in the prediction process; it is not always necessary to use both in combination.
  • In the following, the reference images P are described as two reference images P0 and P1. However, the number of reference images P is not necessarily two; prediction may be performed using three or more reference images. In that case, reference information r, a motion vector v, and a weighting factor w can be prepared for each reference image.
  • In the following description, the weighting factors applied to the two reference images P0 and P1 are w and (1 − w), respectively; that is, the sum of the weighting factors applied to the two reference images is assumed to be 1. However, the sum of the weighting factors applied to the two reference images may also be other than 1.
  • Method 1 is mainly for solving the above-described Problem 1, that is, for reducing the amount of calculation required to find the combination of motion vector v and weighting factor w during the encoding process. Specifically, first, for the luminance signal component in the prediction unit PU, motion vectors v0 and v1 that give a favorable result under prediction according to equation (1) are calculated. In equation (1), the weighting factor applied to the two reference images P0 and P1 (hereinafter, the weighting factor applied only to the luminance signal is also denoted wY) is fixed to 1/2, so the amount of calculation required to determine the motion vector v is smaller than under equation (2).
  • The motion vector v from which a favorable result is obtained is, for example, the motion vector, among the possible candidates, that minimizes the sum over all pixels in the prediction unit PU of the absolute difference (residual) between the actual pixel value of each pixel in the prediction unit PU to be processed and the predicted pixel value P[x].
  • Next, using the motion vector v determined based on the luminance signal component, a weighting factor w that gives a favorable result under prediction according to equation (2) is calculated for the chrominance signals (hereinafter, the weighting factor applied only to the chrominance signals is also denoted wC). Since the motion vector v is now a fixed value, only w needs to be searched, so the amount of computation is drastically reduced compared with searching both.
  • The weighting factor w from which a favorable result is obtained is, for example, the weighting factor, among the possible candidates, that minimizes the sum over all pixels in the prediction unit PU of the absolute difference (residual) between the actual pixel value of each pixel in the prediction unit PU to be processed and the predicted pixel value P[x].
  • Here, if the possible values of the weighting factor w are limited to m/2^n (where m and n are integers), the weighting factor w can be calculated and applied by shift operations. In other words, no division is necessary when applying the weighting factor w, so the amount of calculation can be further reduced.
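  • The two-stage search of Method 1 can be sketched as follows. This is an illustrative reading of the text, assuming hypothetical helper callbacks for generating predictions; it is not the patent's implementation.

```python
# Sketch of Method 1: search motion vectors on luma with w fixed to 1/2,
# then search only the chroma weight with the motion vectors fixed.
# predict_luma / predict_chroma are assumed callbacks producing predicted
# pixel lists for a given (v0, v1) pair and weight.

def residual_sum(pixels, predicted):
    return sum(abs(a - b) for a, b in zip(pixels, predicted))

def search_motion_vectors(pu_luma, mv_candidates, predict_luma):
    """Stage 1: pick (v0, v1) minimizing the luma residual with w = 1/2."""
    return min(mv_candidates,
               key=lambda mv: residual_sum(pu_luma, predict_luma(mv, 0.5)))

def search_chroma_weight(pu_chroma, mv, predict_chroma, n=3):
    """Stage 2: with (v0, v1) fixed, pick w of the form m / 2**n.

    Restricting w to m / 2**n lets it be applied with integer shifts:
    (m * p0 + ((1 << n) - m) * p1) >> n, so no division is needed.
    """
    weights = [m / (1 << n) for m in range(-2, (1 << n) + 3)]  # -1/4 .. 5/4
    return min(weights,
               key=lambda w: residual_sum(pu_chroma, predict_chroma(mv, w)))
```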
  • In addition, since applying a weighting factor other than 1/2 increases the degree of freedom in generating the prediction image, the prediction accuracy can be improved (the residual can be reduced).
  • the color difference signal component includes two components (here, Cb and Cr), but independent weighting factors w Cb and w Cr may be calculated for Cb and Cr, respectively.
  • Alternatively, the weighting factor of one color difference signal (for example, the weighting factor wCb related to Cb, which is processed and transmitted before Cr) may be used to predict the weighting factor of the other color difference signal (for example, the weighting factor wCr related to Cr, which is transmitted after Cb).
  • In this description, the color space is expressed in YCbCr, but the color space representation is not limited to this; the processing described below can be applied in the same way to other color space representations.
  • The motion vector v and the weighting factor w calculated in this way are each predicted and indexed and then stored in the bitstream, so the coding efficiency can be improved. It is also conceivable not to include the weighting factor w in the bitstream at all by combining Method 1 with Method 2 described later.
  • In other words, the mechanism for sharing the weighting factor w between the encoding side and the decoding side can differ between the luminance signal and the color difference signals: for example, the luminance weighting factor wY may be encoded in the bitstream, while information on the chrominance weighting factor wC is omitted from the bitstream by applying Method 2 described later.
  • Method 2 is mainly for solving the above-described Problem 2, that is, for improving the coding efficiency related to the transmission of the weighting factor.
  • the technique will be described with reference to FIGS. 2a and 2b.
  • In FIG. 2a, the prediction unit PU, the motion vectors v0 and v1, the reference images P0 and P1, the reference information r0 and r1, and the blocks B0 and B1 have the same meanings and relationships as in FIG. 1.
  • In Method 2, to calculate the weighting factor w, the left adjacent pixels lap adjacent to the prediction unit PU on the left side, the upper adjacent pixels uap adjacent on the upper side, and the upper-left adjacent pixel ulap adjacent at the upper left (hereinafter also collectively referred to as adjacent pixels ap) are considered. Details of the relationship between the prediction unit PU, the left adjacent pixels lap, the upper adjacent pixels uap, and the upper-left adjacent pixel ulap are shown in FIG. 2b.
  • each pixel included in the adjacent pixel ap is drawn as a circle.
  • In FIG. 2b, the size of the prediction unit PU is 8 × 8 pixels. However, the size of the prediction unit PU is not limited to this and can be any size, for example 4 × 4, 16 × 16, 8 × 16, or 16 × 8 pixels.
  • the left adjacent pixel lap is 8 pixels adjacent to the left side of the prediction unit PU.
  • the upper adjacent pixel uap is eight pixels adjacent to the upper side of the prediction unit PU.
  • the upper left adjacent pixel ulap is one pixel adjacent to the prediction unit PU at the upper left.
  • In Method 2, a suitable weighting factor w is calculated based on the pixel values of the adjacent pixels ap adjacent on the left side and/or the upper side of the prediction unit PU, rather than on the prediction unit PU itself.
  • Specifically, where the adjacent pixels of the blocks B0 and B1 in the reference images P0 and P1 are ap0 and ap1, respectively, the weighting factor w is calculated so that the total residual between the adjacent pixels ap of the prediction unit PU and the weighted combination of the adjacent pixels ap0 and ap1 of the blocks B0 and B1 is minimized.
  • In Method 2, the weighting factor w is thus calculated based on the pixel values of the adjacent pixels ap of the prediction unit PU. If the same processing is performed in both the encoding device and the decoding device, the decoding device can calculate the weighting factor w without the encoding device including information on the weighting factor w (the value of w, an index determining w, prediction information for w, and so on) in the bitstream.
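  • A minimal sketch of this template-based derivation follows; since it uses only already-reconstructed adjacent pixels, the same function could in principle run identically on the encoder and decoder sides. The names and the weight grid are assumptions for illustration.

```python
# Sketch of Method 2: derive w from already-reconstructed template pixels
# so the decoder can repeat the computation without receiving w.

def derive_weight(ap, ap0, ap1, n=3):
    """Pick w = m / 2**n minimizing the template residual.

    ap  : adjacent (template) pixels of the current prediction unit PU
    ap0 : corresponding template pixels of block B0 in reference P0
    ap1 : corresponding template pixels of block B1 in reference P1
    """
    best_w, best_cost = 0.5, float("inf")
    for m in range(-2, (1 << n) + 3):              # weights from -1/4 to 5/4
        w = m / (1 << n)
        cost = sum(abs(t - (w * a + (1 - w) * b))
                   for t, a, b in zip(ap, ap0, ap1))
        if cost < best_cost:
            best_w, best_cost = w, cost
    return best_w
```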
  • Note that the image CP is divided into slices and/or tiles, and each slice and tile can be encoded/decoded independently. Therefore, even when trying to use the adjacent pixels ap for obtaining the weighting factor w of the prediction unit PU, they may be unavailable, for example because at least some of the adjacent pixels ap are not in the same slice and/or tile. Depending on the situation, the weighting factor w may therefore be calculated using only the upper adjacent pixels uap, or using only the left adjacent pixels lap. This point is described below with reference to FIGS. 2c and 2d.
  • A slice and/or tile includes one or more blocks (macroblocks or coding tree units (CTUs)).
  • The CTU is further recursively divided into one or more coding units (CUs). The CU is further divided into prediction units (PUs) for prediction processing and transform units (TUs) for transform processing.
  • In the example of FIG. 2c, the image CP includes 16 CTUs in the horizontal direction and 9 in the vertical direction, but the number of CTUs included in the image CP is not limited to this. The CUs, PUs, and TUs obtained by further dividing the CTUs are not shown.
  • the image CP can be divided into one or more slices and / or tiles.
  • the image CP is divided into four tiles T1 to T4 (hereinafter collectively referred to as tiles T) that are rectangular areas.
  • the tile T1 includes 35 CTUs surrounded by CTUs 0101, 0107, 0507, and 0501.
  • The tile T2 includes 45 CTUs surrounded by CTUs 0108, 0116, 0516, and 0508; the tile T3 includes 28 CTUs surrounded by CTUs 0601, 0607, 0907, and 0901; and the tile T4 includes 36 CTUs surrounded by CTUs 0608, 0616, 0916, and 0908.
  • Note that the number of tiles T into which the image CP is divided is not limited to four; the image can be divided into an arbitrary number of tiles.
  • In the example of FIG. 2c, the tile T1 is divided into two slices S1 and S2 (hereinafter collectively referred to as slices S). The slice S1 includes the 18 CTUs 0101 to 0107, 0201 to 0207, and 0301 to 0304, and the slice S2 includes the 17 CTUs 0305 to 0307, 0401 to 0407, and 0501 to 0507. The number of slices S into which a tile T is divided is not limited to two; it can be divided into an arbitrary number.
  • In the example of FIG. 2c, the tiles T are divided into slices S. However, the present invention is not limited to this; a slice S can also be divided into tiles T. Furthermore, a tile T is not necessarily divided into slices S, and a slice S is not necessarily divided into tiles T.
  • Each CTU in a tile T can be decoded independently of the other tiles T, and each CTU in a slice S can be decoded independently of the other slices S.
  • Basically, the processing of the CTUs proceeds from left to right and from top to bottom. For example, in the case of slice S1, CTU 0101 is processed first, and then the CTUs adjacent on the right are processed in order up to CTU 0107. When the processing up to CTU 0107 is completed, the boundary of the slice S is reached, so the leftmost CTU 0201 in the next row down is processed next.
  • Here, the CTUs in each tile T and the CTUs in each slice S must be processable independently of the other tiles T and slices S. Therefore, for example, the processing of CTU 0607 cannot depend on CTU 0507. That is, when calculating the weighting factor w for a prediction unit in CTU 0607, pixels in the upper adjacent CTU 0507 cannot be referred to. In such a case, only the pixels in CTU 0606, the CTU adjacent on the left, that is, the left adjacent pixels lap, are used, and the upper adjacent pixels uap are not used (see FIG. 2d).
  • For a CTU such as CTU 0101, where neither the left adjacent pixels nor the upper adjacent pixels are in the same slice and tile, the weighting factor w cannot be calculated based on the adjacent pixels ap. In this case, it is conceivable, for example, to include information on the weighting factor w in the encoded data, or to set the weighting factor in bidirectional prediction to a fixed value. In the latter case, the weighting factor w may be fixed to 1/2, for example.
  • Alternatively, when the upper adjacent pixels uap cannot be used, the weighting factor w may be obtained using only the left adjacent pixels lap, and when the left adjacent pixels lap cannot be used, the weighting factor w may be obtained using only the upper adjacent pixels uap.
  • Whether each adjacent pixel ap can be used may be determined on the fly on both the encoding side and the decoding side, but it is also possible to explicitly include information on the adjacent pixels ap used when calculating the weighting factor w in the encoded data. In this case, for example, whether only the upper adjacent pixels uap, only the left adjacent pixels lap, or all of the upper adjacent pixels uap, left adjacent pixels lap, and upper-left adjacent pixel ulap are used when calculating the weighting factor w may be given to the encoded data as flag information or the like.
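  • The availability fallback described above can be sketched as follows, with a hypothetical same_region() callback standing in for real slice/tile bookkeeping:

```python
# Sketch of the availability rule: use only neighbours in the same slice
# and tile, and fall back to a fixed w = 1/2 when no neighbour is usable
# (as for CTU 0101 in the example above).

def usable_template(pu_pos, uap, lap, same_region):
    """uap/lap: lists of (position, value); same_region: assumed callback."""
    template = []
    if uap and all(same_region(pu_pos, pos) for pos, _ in uap):
        template += [value for _, value in uap]    # upper neighbours usable
    if lap and all(same_region(pu_pos, pos) for pos, _ in lap):
        template += [value for _, value in lap]    # left neighbours usable
    return template                                # may be empty

def weight_or_default(template, derive):
    return derive(template) if template else 0.5   # fixed fallback weight
```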
  • the motion vectors v 0 and v 1 can be determined by various methods such as motion vector prediction using a motion vector of an adjacent block.
  • Regarding the weighting factor w, for example, it is conceivable to calculate the weighting factor w applied to the prediction unit PU so that the sum of the residuals of the luminance signals of the adjacent pixels ap is minimized, and then apply the same weighting factor w to the color difference signals. Alternatively, the weighting factor w applied to the prediction unit PU may be calculated so that the sum of the residuals of both the luminance signal and the color difference signals of the adjacent pixels ap becomes small.
  • Alternatively, the luminance weighting factor wY and the chrominance weighting factor wC may be obtained separately so that the residuals of the luminance signal and of the color difference signals each become small.
  • For the color difference signals Cb and Cr, it is conceivable to obtain the weighting factors wCb and wCr independently of each other. Furthermore, it is conceivable to predict wCr from the weighting factor wCb.
  • When Method 2 is used in combination with Method 1, the motion vectors v0 and v1 are calculated for the luminance signal with the weighting factor wY fixed to 1/2. Then, the weighting factor wC applied to the color difference signals of the prediction unit PU is calculated so that the sum of the residuals of the color difference signals of the adjacent pixels ap is minimized. In this case, different weighting factors wCb and wCr may be calculated for the color difference signals Cb and Cr so that the residuals of the adjacent pixels ap become small. Furthermore, wCr may be predicted from the obtained weighting factor wCb.
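  • A sketch of Methods 1 and 2 combined, under the assumptions just described, follows; all helper names are hypothetical.

```python
# Sketch of Method 1 combined with Method 2: fixed luma weight for the
# motion search, template-derived chroma weights, and w_Cr predicted
# from w_Cb. search_mv and derive_weight are assumed callbacks;
# templates maps each chroma component to its (ap, ap0, ap1) tuple.

def bi_pred_weights(search_mv, derive_weight, templates):
    w_y = 0.5                                  # Method 1: fixed luma weight
    mv = search_mv(w_y)                        # luma-only motion search
    w_cb = derive_weight(*templates["cb"])     # Method 2 on the Cb template
    w_cr = derive_weight(*templates["cr"])     # Method 2 on the Cr template
    delta_cr = w_cr - w_cb                     # w_Cr predicted from w_Cb
    return mv, w_y, w_cb, w_cr, delta_cr
```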
  • As shown in FIG. 3, the encoding apparatus 100 includes an input unit I1, an output unit O1, an A/D (analog/digital) conversion unit 101, a rearrangement buffer 103, an orthogonal transform unit 105, a quantization unit 107, an entropy encoding unit 109, an accumulation buffer 111, a rate control unit 113, an inverse quantization unit 115, an inverse orthogonal transform unit 117, a loop filter 119, a frame memory 121, an intra prediction unit 123, a motion prediction unit 125, and a bidirectional prediction weight coefficient calculation unit 127.
  • In the figure, the input/output of the main signals is represented by arrows; even where signals are exchanged between functional blocks, the arrows may be omitted. The same applies to FIGS. 4 to 6.
  • The A/D conversion unit 101 converts the analog image signal input from the input unit I1 into digital image data for each frame, and supplies the image data to the rearrangement buffer 103 in display order.
  • the rearrangement buffer 103 is a buffer for rearranging the order of image frame data (hereinafter also referred to as input image) for encoding.
  • the rearrangement buffer 103 outputs the input image to the arithmetic unit AD1 in the order of encoding.
  • the rearrangement buffer 103 also supplies the input image to be encoded to the intra prediction unit 123 and the motion prediction unit 125.
  • The image division unit 103a divides the input image and supplies the divided input image. More specifically, for example, as described with reference to FIG. 2c, the image division unit 103a may divide the input image into slices and/or tiles including one or more blocks (CTUs).
  • The computing unit AD1 subtracts the prediction image supplied from the intra prediction unit 123 or the motion prediction unit 125 from the input image to be encoded supplied from the rearrangement buffer 103, thereby obtaining a residual signal relating to a difference image consisting of the residual of each pixel value (which can include the luminance signal and the color difference signals).
  • the orthogonal transform unit 105 performs orthogonal transform such as discrete cosine transform (DCT) and Karhunen-Loeve transform on the residual signal calculated by the arithmetic unit AD1.
  • the method of orthogonal transformation is arbitrary.
  • the transform coefficient of the residual signal calculated by the orthogonal transform unit 105 is supplied to the quantization unit 107.
  • the quantization unit 107 quantizes the transform coefficient supplied from the orthogonal transform unit 105. At this time, the quantization unit 107 performs quantization based on the quantization parameter based on the information related to the target value of the coding amount supplied from the rate control unit 113. Note that the quantization method is arbitrary.
  • the entropy encoding unit 109 encodes the transform coefficient quantized by the quantization unit 107 using an arbitrary encoding method.
  • The entropy encoding unit 109 acquires, from the intra prediction unit 123 and the motion prediction unit 125, information indicating the prediction mode and various types of information used for prediction, for example, information on the division mode of the prediction unit PU and information on the motion vector v. It also acquires information such as the filter coefficients used from the loop filter 119. The entropy encoding unit 109 encodes this information by an arbitrary method and generates header information from the encoded information. The entropy encoding unit 109 supplies the encoded data obtained as a result of encoding to the accumulation buffer 111.
  • For the encoding method, for example, a variable-length coding method such as CAVLC (Context-Adaptive Variable Length Coding) or an arithmetic coding method such as CABAC (Context-Adaptive Binary Arithmetic Coding) can be used.
  • the accumulation buffer 111 is a buffer that temporarily stores the encoded data supplied from the entropy encoding unit 109.
  • the accumulation buffer 111 outputs the stored encoded data from the output unit O1 as a bit stream to a storage device or a transmission path (not shown).
  • The rate control unit 113 sets a target value of the coding amount based on the code amount of the encoded data accumulated in the accumulation buffer 111 so that overflow or underflow does not occur, and controls the rate of the quantization operation in the quantization unit 107.
  • the transform coefficient quantized by the quantization unit 107 is supplied not only to the entropy encoding unit 109 but also to the inverse quantization unit 115.
  • the inverse quantization unit 115 inversely quantizes the quantized transform coefficient.
  • any method can be used as long as it corresponds to the quantization method used in the quantization unit 107.
  • the inverse orthogonal transform unit 117 performs inverse orthogonal transform on the transform coefficient obtained by the inverse quantization by the inverse quantization unit 115.
  • the inverse orthogonal transform method any method can be used as long as it corresponds to the orthogonal transform method used in the orthogonal transform unit 105.
  • Thereby, the residual signal constituting the difference image, which was the input to the orthogonal transform unit 105, is restored.
  • the restored residual signal is supplied from the inverse orthogonal transform unit 117 to the arithmetic unit AD2.
  • The arithmetic unit AD2 adds the prediction image supplied from the intra prediction unit 123 or the motion prediction unit 125 to the residual signal supplied from the inverse orthogonal transform unit 117, thereby obtaining a locally restored decoded image.
  • the decoded image is supplied to the loop filter 119 or the frame memory 121.
  • the loop filter 119 performs various types of filter processing that can include deblocking filter processing, SAO (Sample Adaptive Offset) processing, and the like on the decoded image. For example, the loop filter 119 reduces block distortion by performing a deblocking filter process on the decoded image. Further, the loop filter 119 performs SAO processing on the decoded image after the deblocking filter processing, thereby reducing ringing that distorts pixel values around the edge and correcting pixel value deviation. In addition, the loop filter 119 may perform an arbitrary filter process for improving the image quality. The loop filter 119 supplies the decoded image after various filter processes to the frame memory 121.
  • the frame memory 121 stores the decoded image and supplies the decoded image as a reference image to the selection unit SW1 and the bi-directional prediction weight coefficient calculation unit 127.
  • the frame memory 121 can store a plurality of reference images by dividing them into two reference image groups of list 0 (list0) and list 1 (list1).
  • For the prediction of each block, any of the following can be applied: a prediction method that performs motion prediction with reference to a reference image stored in list 0, a prediction method that performs motion prediction with reference to a reference image stored in list 1, and a prediction method (bidirectional prediction) that performs motion prediction with reference to two reference images, one stored in list 0 and one stored in list 1.
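  • A minimal sketch of the two reference lists and the three prediction choices they enable follows; the data structures are purely illustrative, not the patent's.

```python
# Sketch of the two reference lists and the three prediction choices
# they enable; illustrative only.

from dataclasses import dataclass, field

@dataclass
class FrameMemory:
    list0: list = field(default_factory=list)   # first reference picture list
    list1: list = field(default_factory=list)   # second reference picture list

    def references(self, mode, idx0=0, idx1=0):
        if mode == "uni_l0":
            return (self.list0[idx0],)
        if mode == "uni_l1":
            return (self.list1[idx1],)
        if mode == "bi":                        # bidirectional prediction
            return (self.list0[idx0], self.list1[idx1])
        raise ValueError(f"unknown prediction mode: {mode}")
```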
  • the selection unit SW1 supplies the reference image supplied from the frame memory 121 to the motion prediction unit 125 or the intra prediction unit 123 according to the prediction mode to be applied. For example, when encoding by intra prediction (intra-screen prediction), SW1 outputs a reference image to the intra prediction unit 123. On the other hand, when encoding by inter prediction (motion prediction), SW1 outputs a reference image to the motion prediction unit 125.
  • The intra prediction unit 123 performs intra prediction, with the prediction unit PU as the processing unit, using the pixel values in the reference image (the picture being processed) supplied from the frame memory 121 via the selection unit SW1, to generate a prediction image. At this time, the intra prediction unit 123 performs intra prediction in each of a plurality of intra prediction modes prepared in advance and generates a prediction image for each. After that, the input image is read from the rearrangement buffer 103, and a prediction mode in which the difference (residual) between the input image and the prediction image is small is selected. More specifically, for example, a cost function prepared in advance is applied to the input image and each prediction image, and the prediction mode that minimizes the obtained cost function value can be set as the intra prediction mode applied to the prediction unit PU to be encoded.
  • The intra prediction unit 123 outputs various information necessary for intra prediction, such as information on the selected intra prediction mode, to the entropy encoding unit 109. Moreover, the prediction image generated in the intra prediction mode is output to the arithmetic unit AD2 via the selection unit SW2.
  • the motion prediction unit 125 performs a motion prediction process for each prediction unit PU, using the reference image supplied from the frame memory 121 via the selection unit SW1 and the input image supplied from the rearrangement buffer 103.
  • At this time, the motion prediction unit 125 performs inter prediction in each of a plurality of inter prediction modes prepared in advance and generates a prediction image for each. After that, the input image is read from the rearrangement buffer 103, and a prediction mode in which the difference (residual) between the input image and the prediction image is small is selected. More specifically, for example, a cost function prepared in advance is applied to the input image and each prediction image, and the prediction mode that minimizes the obtained cost function value can be set as the motion prediction mode applied to the prediction unit PU to be encoded.
  • the motion prediction unit 125 outputs various information necessary for motion prediction, such as information on the selected motion prediction mode, to the entropy encoding unit 109. Further, the prediction image generated in the inter prediction mode is output to the arithmetic unit AD2 via the selection unit SW2.
  • the motion prediction unit 125 can perform prediction using a weighting factor w that is not fixed to 1 ⁇ 2 when performing bidirectional prediction.
  • the bi-directional prediction weight coefficient calculator 127 calculates the weight coefficient w.
  • the motion prediction unit 125 includes a motion vector search unit 151, a prediction direction determination unit 153, a bidirectional prediction unit 155, a mode determination unit 157, and a predicted image generation unit 159.
  • the selection units SW1 and SW2 are not shown.
  • The motion vector search unit 151 searches for the two motion vectors v (v0 and v1) to be applied to the prediction unit PU to be processed, using the reference image supplied from the frame memory 121 and the input image supplied from the rearrangement buffer 103. Here, the motion vector v obtained for the reference image in list 0 is referred to as motion vector v0, and the motion vector v obtained for the reference image in list 1 is referred to as motion vector v1.
  • More specifically, the motion vector search unit 151 searches for a motion vector v referring to a region (block B) in the reference image P for which the difference from the pixel values of the pixels included in the prediction unit PU is minimized. At this time, for example, a predetermined cost function can be applied to all possible motion vectors v, and the motion vector v with the smallest cost function value can be adopted.
  • Note that the motion vector search unit 151 may determine the motion vector v in consideration of all of the luminance signal and the two color difference signals included in the prediction unit PU, or may determine the motion vector v in consideration of only the luminance signal.
  • The prediction for the prediction unit PU can also be performed by dividing the prediction unit PU into a plurality of parts. Therefore, the motion vector search unit 151 calculates a motion vector v for each division method. Hereinafter, the division method of the prediction unit PU is also referred to as a division mode.
  • the prediction direction determination unit 153 determines, for each division mode, which one of the motion vectors v 0 and v 1 supplied from the motion vector search unit 151 is used or both.
  • prediction using only one of the motion vectors v 0 and v 1 is referred to as unidirectional prediction
  • prediction using both is referred to as bidirectional prediction.
  • Whether to use unidirectional or bidirectional prediction can be determined, for example, by checking which of unidirectional prediction using motion vector v0, unidirectional prediction using motion vector v1, and bidirectional prediction using both motion vectors v0 and v1 gives the smallest difference from the input image. The prediction direction determination unit 153 may determine the prediction method in consideration of all of the luminance signal and the two color difference signals included in the prediction unit PU, or in consideration of only the luminance signal. Moreover, when calculating the predicted pixel values in bidirectional prediction in order to determine the prediction method, the predicted pixel values with the weighting factor w set to 1/2 may be used.
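  • The prediction-direction decision can be sketched as follows, with w provisionally fixed to 1/2 for the bi-prediction candidate as described above; the helper names are hypothetical.

```python
# Sketch of the prediction-direction decision: compare L0-only, L1-only,
# and bi-prediction (with w provisionally 1/2) by residual.

def choose_direction(pu, pred_l0, pred_l1):
    """pred_l0 / pred_l1: predicted pixels from v0 and v1 respectively."""
    def cost(pred):
        return sum(abs(a - b) for a, b in zip(pu, pred))

    pred_bi = [(a + b) / 2 for a, b in zip(pred_l0, pred_l1)]  # w = 1/2
    costs = {"uni_l0": cost(pred_l0),
             "uni_l1": cost(pred_l1),
             "bi": cost(pred_bi)}
    return min(costs, key=costs.get)
```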
  • The prediction direction information determined by the prediction direction determination unit 153 is used to control the selection units SW3 and SW4. Specifically, in the case of bidirectional prediction, the selection units SW3 and SW4 are controlled so that information on the motion vectors v0 and v1 used for prediction is supplied to the bidirectional prediction weight coefficient calculation unit 127 and the bidirectional prediction unit 155.
  • The bidirectional prediction weight coefficient calculation unit 127 calculates the weighting factor w to be applied in bidirectional prediction, using the input image to be encoded supplied from the rearrangement buffer 103 and the two reference images supplied from the frame memory 121 (one reference image from each of list 0 and list 1). This processing is performed for each division mode in which bidirectional prediction is performed. At this time, as described above, the bidirectional prediction weight coefficient calculation unit 127 can limit the possible values of the weighting factor w to m/2^n (where m and n are integers). In this case, the weighting factor w can be applied by shift operations alone (no division is required), so the amount of calculation can be reduced.
  • In addition, the bidirectional prediction weight coefficient calculation unit 127 may set the weighting factor applied to the luminance signal to 1/2 and calculate only the weighting factor wC for the color difference signals.
  • the bidirectional prediction weight coefficient calculation unit 127 may calculate different weight coefficients w Cb and w Cr for the two color difference signals.
  • Alternatively, a weighting factor w other than 1/2 that is applied to all of the luminance signal and the color difference signals may be calculated, or different weighting factors wY and wC may be calculated for the luminance signal and the two color difference signals.
  • Information on the calculated weighting factor w is output from the bidirectional prediction weighting factor calculation unit 127 to the bidirectional prediction unit 155 of the motion prediction unit 125.
  • More specifically, the bidirectional prediction weight coefficient calculation unit 127 can determine the weighting factor w so that the sum of the differences between the pixel values of the pixels in the prediction unit PU of the input image to be encoded and the pixel values of the pixels in the prediction unit PU calculated by equation (2) becomes small.
  • Alternatively, the weighting factor w that reduces the difference in the pixel values calculated by equation (2) may be calculated over the adjacent pixels ap (Method 2 described above). In the latter case, the weighting factor w can be calculated using only the adjacent pixels ap, which are not included in the prediction unit PU to be processed.
  • In this case, since the decoding device can calculate the weighting factor w without receiving the value of the weighting factor w or index information indicating it, information on the weighting factor w need not be included in the bitstream. For this reason, a flexible weighting factor w not fixed to 1/2 can be adopted while preventing the decrease in coding efficiency caused by transmitting information on the weighting factor w.
  • Note that the weighting factor w can also be set to a fixed value such as 1/2 (for example, when the adjacent pixels ap cannot be used).
  • the bidirectional prediction unit 155 performs bidirectional prediction using the motion vectors v 0 and v 1 supplied from the prediction direction determination unit 153 and the weight coefficient w supplied from the bidirectional prediction weight coefficient calculation unit 127. This processing is performed for each division mode in which bidirectional prediction is performed.
  • The mode determination unit 157 determines, from the motion vectors v and weighting factors w in each prediction mode (including the division modes) generated in this way, which prediction mode should be used for motion prediction. In this process, for example, a prediction mode that reduces the difference between the prediction image in each prediction mode and the input image read from the rearrangement buffer 103 is selected. More specifically, a cost function prepared in advance is applied to the input image and each prediction image, and the prediction mode that minimizes the obtained cost function value can be set as the motion prediction mode applied to the prediction unit PU to be encoded.
  • The prediction image generation unit 159 receives the information for prediction in the prediction mode determined by the mode determination unit 157, for example, the motion vector v and the prediction mode information, and reads the reference image information from the frame memory 121, thereby generating a prediction image. The generated prediction image is output to the arithmetic units AD1 and AD2.
  • The mode determination unit 157 outputs the determined motion vector v and prediction mode information to the entropy encoding unit 109. Note that, as described above, the weighting factor w can be calculated on the decoding side, so the mode determination unit 157 does not necessarily need to output information on the weighting factor w even when bidirectional prediction is performed.
  • However, it is also conceivable to output, to the entropy encoding unit 109, information indicating which adjacent pixels ap are used when calculating the weighting factor w, i.e., whether the upper adjacent pixels uap, the left adjacent pixels lap, or all of the upper adjacent pixels uap, the left adjacent pixels lap, and the upper-left adjacent pixel ulap are used, so that this information is included in the encoded data.
  • Alternatively, the mode determination unit 157 may output index information for determining the value of the weighting factor w, and the like, to the entropy encoding unit 109.
  • In addition, the mode determination unit 157 may output flag information indicating this to the entropy encoding unit 109. The flag information may be included for each prediction unit PU or for each slice.
  • As shown in FIG. 5, the decoding apparatus 200 includes an input unit I2, an accumulation buffer 201, an entropy decoding unit 203, an inverse quantization unit 205, an inverse orthogonal transform unit 207, a loop filter 209, a rearrangement buffer 211, a D/A (digital/analog) conversion unit 213, a frame memory 215, an intra prediction unit 217, a motion prediction unit 219, and a bidirectional prediction weight coefficient calculation unit 221.
  • the accumulation buffer 201 accumulates the encoded data (bit stream) input from the input unit I2, and appropriately outputs the encoded data to the entropy decoding unit 203.
  • The entropy decoding unit 203 decodes the information supplied from the accumulation buffer 201, which was encoded by the entropy encoding unit 109 in FIG. 3, by a method corresponding to the encoding method used at the time of encoding.
  • the quantized transform coefficient of the difference image obtained as a result of decoding is supplied from the entropy decoding unit 203 to the inverse quantization unit 205.
  • the inverse quantization unit 205 inversely quantizes the quantized transform coefficient of the difference image by a method corresponding to the quantization method in the quantization unit 107 in FIG. 3 to obtain a transform coefficient.
  • the inverse quantization method any method can be used as long as it corresponds to the quantization method used in the quantization unit 107.
  • the inverse orthogonal transform unit 207 performs inverse orthogonal transform on the transform coefficient obtained by inverse quantization by the inverse quantization unit 205.
  • the inverse orthogonal transform method any method can be used as long as it corresponds to the orthogonal transform method used in the orthogonal transform unit 105.
  • the residual signal constituting the difference image is restored.
  • the residual signal relating to the difference image is supplied to the arithmetic unit AD3.
  • the arithmetic unit AD3 adds the predicted image supplied from the intra prediction unit 217 or the motion prediction unit 219 to the difference image supplied from the inverse orthogonal transform unit 207, thereby obtaining image data related to the decoded image.
  • the arithmetic unit AD3 supplies the image data related to the decoded image to the loop filter 209.
  • the loop filter 209 performs various types of filter processing that can include deblocking filter processing, SAO processing, and the like on the decoded image input from the arithmetic unit AD3. For example, the loop filter 209 reduces block distortion by performing a deblocking filter process on the decoded image. In addition, the loop filter 209 performs SAO processing on the decoded image after the deblocking filter processing, thereby reducing ringing in which pixel values around the edge are distorted and correcting pixel values. In addition, the loop filter 209 may perform an arbitrary filter process for improving the image quality. The loop filter 209 supplies the decoded image after various filter processes to the rearrangement buffer 211 and the frame memory 215.
  • the rearrangement buffer 211 is a buffer for rearranging the decoded image (image frame data) after the filtering process supplied from the loop filter 209 in the display order.
  • the rearrangement buffer 211 outputs the decoded images to the D / A conversion unit 213 in the display order.
  • the D / A conversion unit 213 converts the image data for each frame of the digital signal supplied from the rearrangement buffer 211 into an analog signal as appropriate, and then outputs it from the output unit O2.
  • the frame memory 215 stores the decoded image supplied from the loop filter 209 and supplies the decoded image as a reference image to the selection unit SW5.
  • the frame memory 215 can store a plurality of reference images by dividing them into two reference image groups, List 0 and List 1.
  • For the prediction of each block, any of the following can be applied: a prediction method that performs motion prediction with reference to a reference image stored in list 0, a prediction method that performs motion prediction with reference to a reference image stored in list 1, and a prediction method (bidirectional prediction) that performs motion prediction with reference to two reference images, one stored in list 0 and one stored in list 1.
  • the selection unit SW5 supplies the reference image supplied from the frame memory 215 to the intra prediction unit 217 or the motion prediction unit 219 according to the applied prediction mode. For example, when decoding an intra-coded image, the selection unit SW5 supplies the reference image supplied from the frame memory 215 to the intra prediction unit 217. The selection unit SW5 outputs the reference image supplied from the frame memory 215 to the motion prediction unit 219 when decoding an image encoded by motion prediction.
  • the intra prediction unit 217 performs intra prediction using pixel values in the reference image (the picture being processed) supplied from the frame memory 215 via the selection unit SW5, and generates a predicted image with the prediction unit PU as the processing unit.
  • the intra prediction unit 217 decodes the prediction unit PU to be decoded using the intra prediction mode selected, based on the prediction mode information obtained from the entropy decoding unit 203, from among a plurality of prepared intra prediction modes. Thereby, a predicted image can be generated by the intra prediction mode that the intra prediction unit 123 in FIG. 3 used when encoding.
  • the motion prediction unit 219 performs motion prediction processing for each prediction unit PU using the reference image supplied from the frame memory 215 via the selection unit SW5.
  • the motion prediction unit 219 decodes the prediction unit PU to be decoded using the motion prediction mode selected, based on the prediction mode information obtained from the entropy decoding unit 203, from among a plurality of motion prediction modes prepared in advance.
  • Thereby, the motion prediction unit 219 can generate a predicted image by the same method that the motion prediction unit 125 of FIG. 3 used when encoding.
  • the motion prediction unit 219 can perform prediction using a weighting factor w that is not fixed to 1/2 when performing bidirectional prediction.
  • the bi-directional prediction weight coefficient calculation unit 221 calculates the weight coefficient w.
  • the motion prediction unit 219 includes a predicted image generation unit 251 and a prediction mode / motion vector buffer 253.
  • the selection units SW5 and SW6 are not shown.
  • the predicted image generation unit 251 generates a predicted image, with the prediction unit PU as the processing unit, using the reference image supplied from the frame memory 215.
  • the predicted image generation unit 251 supports a plurality of motion prediction modes, and decodes the prediction unit PU to be decoded according to the mode selected from among them based on the prediction mode information supplied from the prediction mode / motion vector buffer 253.
  • the predicted image generation unit 251 reads the motion vector v and the weighting factor w from the prediction mode / motion vector buffer 253 as necessary when generating a predicted image by motion prediction. Thereby, the motion prediction unit 219 can generate a predicted image by the same method that the motion prediction unit 125 of FIG. 3 used when encoding.
  • the generated predicted image is output to the arithmetic unit AD3.
  • the prediction mode / motion vector buffer 253 is a buffer that temporarily stores prediction mode information and motion vector information supplied from the entropy decoding unit 203.
  • When bidirectional prediction is applied, the prediction mode / motion vector buffer 253 passes the information on the motion vector v to the bidirectional prediction weight coefficient calculation unit 221, receives the weighting factor w from the bidirectional prediction weight coefficient calculation unit 221, and temporarily stores it.
  • the selection unit SW7 receives information on the prediction direction of the prediction unit PU to be processed from the prediction mode / motion vector buffer 253, and switches whether to pass the information on the motion vector v to the bidirectional prediction weight coefficient calculation unit 221. More specifically, when the prediction for the prediction unit PU is unidirectional prediction using only one of the motion vectors v0 and v1, the information on the motion vector v is not output to the bidirectional prediction weight coefficient calculation unit 221. On the other hand, when bidirectional prediction using both motion vectors v0 and v1 is performed on the prediction unit to be processed, the information on the motion vector v is output to the bidirectional prediction weight coefficient calculation unit 221.
  • the bidirectional prediction weight coefficient calculation unit 221 calculates a weight coefficient w used for bidirectional prediction. At this time, for example, when index information or the like for calculating the weighting factor w is received from the entropy decoding unit 203, the bidirectional prediction weighting factor calculation unit 221 calculates the weighting factor w based on the information.
  • Alternatively, the bidirectional prediction weight coefficient calculation unit 221 may calculate the weighting factor w based on the adjacent pixels ap of the prediction unit PU in the target picture and the two reference images (the reference images included in list 0 and list 1) supplied from the frame memory 215.
  • At this time, the weighting factor applied to the luminance signal may be fixed to 1/2, and the bidirectional prediction weight coefficient calculation unit 221 may calculate a weighting factor wC for only the color difference signals.
  • In this case, the bidirectional prediction weight coefficient calculation unit 221 may calculate different weighting factors wCb and wCr for the two color difference signals. Alternatively, as described in Method 2 above, a weighting factor w other than 1/2 that is applied to all of the luminance signal and the color difference signals may be calculated, or different weighting factors wY and wC may be calculated for the luminance signal and the two color difference signals. Information on the calculated weighting factor w is output from the bidirectional prediction weight coefficient calculation unit 221 to the prediction mode / motion vector buffer 253.
  • When information for determining the weighting factor w is included in the encoded data, the bidirectional prediction weight coefficient calculation unit 221 may calculate the weighting factor w based on that information. Furthermore, for example, when the encoded data includes a flag indicating that a fixed value such as 1/2 is to be used as the weighting factor w, the fixed value may simply be set as the weighting factor w without considering the adjacent pixels ap. A sketch of the adjacent-pixel derivation follows.
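  • A minimal sketch of the adjacent-pixel (template) derivation of w, assuming w is restricted to dyadic candidates m/2^n as described below for the encoder side; ap, ap0, and ap1 are the adjacent pixels of the PU and of the two reference blocks:

    import numpy as np

    def derive_weight_from_template(ap, ap0, ap1, n=3):
        # Try every dyadic candidate w = m / 2**n and keep the one that
        # best reproduces the already-decoded adjacent pixels ap of the
        # PU from the adjacent pixels ap0/ap1 of the reference blocks.
        # Encoder and decoder can both run this, so w itself need not
        # be transmitted in the bitstream.
        candidates = [m / 2**n for m in range(2**n + 1)]
        cost = lambda w: np.abs(
            ap - (w * ap0.astype(np.float64)
                  + (1 - w) * ap1.astype(np.float64))).sum()
        return min(candidates, key=cost)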
  • FIG. 7 is a flowchart showing a flow of processing related to encoding when a predicted image is generated by motion prediction.
  • predicted image generation can be performed not only by motion prediction but also by intra prediction, but description thereof is omitted here. The same applies to the flowchart of FIG. 8 described later.
  • each reference image is divided into two reference image groups of list 0 and list 1 in the frame memory 121 and stored.
  • the motion vector search unit 151 searches the reference images stored as list 0 in the frame memory 121 for the motion vector v0 (S703). As described above, this search can be performed, for example, as a search for the motion vector v0 indicating the block B0 whose residual with respect to the prediction unit PU is the smallest in the reference image; a brute-force sketch follows. At this time, for example, the search for the motion vector v0 may be performed considering only the luminance signal.
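  • An illustrative brute-force version of the search in S703/S705 (the actual search strategy of the motion vector search unit 151 is not restricted to this): minimize the SAD between the PU and same-sized blocks of the reference picture, on the luminance plane only.

    import numpy as np

    def full_search(pu, ref, center, radius=8):
        # pu: luma block to predict; ref: luma reference picture;
        # center: (y, x) of the co-located block; radius: search range.
        h, w = pu.shape
        cy, cx = center
        best_sad, best_v = None, (0, 0)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = cy + dy, cx + dx
                if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                    continue
                sad = np.abs(pu.astype(np.int32)
                             - ref[y:y + h, x:x + w].astype(np.int32)).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_v = sad, (dy, dx)
        return best_v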
  • Similarly, the motion vector search unit 151 searches the reference images stored as list 1 in the frame memory 121 for the motion vector v1, in the same manner as for the motion vector v0 (S705).
  • Next, the prediction direction determination unit 153 determines whether to use only one of the motion vectors v0 and v1 (unidirectional prediction) or both of them (bidirectional prediction) (S707).
  • When unidirectional prediction is performed (No in S709), the predicted image generation unit 159 generates a predicted image based on the motion vector v (S713).
  • When bidirectional prediction is performed (Yes in S709), the prediction direction determination unit 153 outputs the information on the motion vectors v0 and v1 to the bidirectional prediction weight coefficient calculation unit 127, and the bidirectional prediction weight coefficient calculation unit 127 obtains a weighting factor w using the motion vectors v0 and v1 (S711).
  • At this time, the weighting factor applied to the luminance signal may be fixed to 1/2, and the bidirectional prediction weight coefficient calculation unit 127 may calculate a weighting factor wC for only the color difference signals.
  • In this case, the bidirectional prediction weight coefficient calculation unit 127 may calculate different weighting factors wCb and wCr for the two color difference signals.
  • Alternatively, a weighting factor w other than 1/2 that is applied to all of the luminance signal and the color difference signals may be calculated, or different weighting factors wY and wC may be calculated for the luminance signal and the two color difference signals.
  • the possible values of the weighting factor w may be limited to m/2^n (where m and n are integers).
  • For example, the bidirectional prediction weight coefficient calculation unit 127 can determine the weighting factor w so that the difference between the pixel value of each pixel in the prediction unit PU of the input image to be encoded and the pixel value of each pixel of the prediction unit PU calculated by the above equation (2) becomes small.
  • Alternatively, the weighting factor w may be calculated so that the difference between the pixel values calculated by the above equation (2) and those of the adjacent pixels ap becomes small (Method 2 above); a sketch of the first, source-based selection follows.
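  • A minimal sketch of the source-based selection under the dyadic restriction above (equation (2) again assumed to be the weighted blend w * b0 + (1 - w) * b1): enumerate the candidates m/2^n and keep the one whose prediction is closest to the source PU. Restricting w to m/2^n also means the blend can be computed with integer shifts in a real implementation.

    import numpy as np

    def choose_weight_for_source(src_pu, b0, b1, n=3):
        # src_pu: block of the input image being encoded; b0/b1: blocks
        # pointed at by v0/v1 in the list 0 / list 1 references.
        candidates = [m / 2**n for m in range(2**n + 1)]
        sad = lambda w: np.abs(
            src_pu.astype(np.float64)
            - (w * b0.astype(np.float64)
               + (1 - w) * b1.astype(np.float64))).sum()
        return min(candidates, key=sad)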
  • When the motion vector v and the weighting factor w have been calculated in this way, the predicted image generation unit 159 generates a predicted image using these pieces of information and the reference image read from the frame memory 121 (S713).
  • the mode determination unit 157 outputs information necessary for generating the predicted image, for example, prediction mode information, information on the motion vector v, and the like to the entropy encoding unit 109.
  • the entropy encoding unit 109 generates encoded data including the prediction mode information and the information on the motion vector v received from the mode determination unit 157, and the residual signal generated from the difference image, which is the difference between the input image and the predicted image (S715). At this time, when the weighting factor w is generated based on the adjacent pixels ap, reference adjacent pixel information for determining which adjacent pixels ap are to be referred to when obtaining the weighting factor w may be included in the encoded data.
  • FIG. 8 is a flowchart illustrating a flow of processing related to decoding when a predicted image is generated using motion prediction.
  • the accumulation buffer 201 accumulates the encoded data input from the input unit I2, and the entropy decoding unit 203 sequentially decodes the encoded data (S801).
  • the encoded data can include, for example, prediction mode information related to the prediction unit PU, information on the motion vector v, and the like.
  • the image generated based on the encoded data is appropriately stored as a reference image in the frame memory 215 (S803).
  • each reference image is stored in the frame memory 215 by being divided into two reference image groups of list 0 and list 1.
  • the prediction mode / motion vector buffer 253 of the motion prediction unit 219 receives and stores, from the entropy decoding unit 203, the prediction mode information on the division mode and the prediction direction of the prediction unit PU, and the information on the motion vector v (S805). If the prediction direction of the prediction unit PU to be processed is unidirectional prediction (No in S807), the predicted image generation unit 251 generates a predicted image based on the motion vector v (S811).
  • If the prediction direction is bidirectional prediction (Yes in S807), the bidirectional prediction weight coefficient calculation unit 221 determines the weighting factor w to be applied (S809).
  • the weighting factor w can be received, for example, as index information from the entropy decoding unit 203.
  • Alternatively, using the adjacent pixels ap0 and ap1 adjacent to the blocks B0 and B1 specified by the motion vectors v0 and v1 in the reference images P0 and P1, the bidirectional prediction weight coefficient calculation unit 221 calculates the weighting factor w so that the difference between the value calculated by the above equation (2) and the values of the adjacent pixels ap of the prediction unit PU becomes small. At this time, whether the adjacent pixels ap to be referred to are the upper adjacent pixels uap, the left adjacent pixels lap, or all of the upper adjacent pixels uap, the left adjacent pixels lap, and the upper-left adjacent pixel ulap may be determined based on the reference adjacent pixel information included in the encoded data.
  • the generated weight coefficient w is stored in the prediction mode / motion vector buffer 253.
  • the predicted image generation unit 251 generates a predicted image by bidirectional prediction using the motion vectors v0 and v1 and the weighting factor w (S811); the decoder-side flow for one PU is sketched below.
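  • A condensed, illustrative view of S805-S811 for one PU, reusing bidirectional_prediction() and derive_weight_from_template() from the sketches above; the weight table indexed by w_index is an assumption, not a table defined in this description.

    def decode_pu_prediction(b0, b1, direction, w_index=None, template=None):
        # b0/b1: blocks already motion-compensated with the decoded
        # motion vectors from list 0 / list 1 (b1 is None for an
        # L0-only PU, b0 is None for an L1-only PU).
        if direction == "uni":                     # S807: No -> S811
            return b0 if b1 is None else b1
        if w_index is not None:                    # w signalled as an index
            w = [0.5, 0.25, 0.75, 0.375, 0.625][w_index]
        else:                                      # Method 2: derive from ap
            ap, ap0, ap1 = template
            w = derive_weight_from_template(ap, ap0, ap1)
        return bidirectional_prediction(b0, b1, w) # S809 + S811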
  • FIG. 9 is a diagram illustrating a specific example of the configuration of the encoded data 900.
  • the encoding device 100 and the decoding device 200 can process the image CP by dividing the image CP into slices S and / or tiles T. In the example of FIG. 9, the tile T is not considered.
  • the slice S is encoded as slice data 910.
  • the slice data 910 includes slice header information 911 and one or more coding tree units (CTU) 920.
  • Each CTU 920 includes CTU header information 921 and one or more coding units (CU) 930.
  • the CU 930 includes CU header information 931, a prediction unit (PU) 940, and a transform unit (TU) 950.
  • the TU 950 includes data related to the residual signal related to the difference between the image to be encoded and the predicted image (quantized and orthogonally transformed as appropriate).
  • the PU 940 includes various data related to the prediction process.
  • the PU 940 may include prediction mode information 941, image reference information 943, motion vector information 945, weighting factor information 947, and reference adjacent pixel information 949. Note that the PU 940 need not include all of this information: depending on whether the prediction method is motion prediction or intra prediction, whether the prediction is unidirectional or bidirectional, how the weighting factor w is determined, and so on, the information included in the PU 940 can be changed as appropriate.
  • the prediction mode information 941 is information for determining the prediction method applied to the PU 940. For example, information on whether intra prediction or motion prediction is used, and information on the division mode indicating how the PU 940 is divided, such as 2N×2N, 2N×N, or N×2N, can be included in the prediction mode information 941.
  • the image reference information 943 is information for specifying a reference image referred to by the PU 940 when performing motion prediction.
  • For example, the image reference information 943 can be index information within each of the reference image groups of list 0 and list 1.
  • the motion vector information 945 is information for specifying the motion vector v applied to the PU 940 when performing motion prediction.
  • the motion vector information 945 can include various types of information necessary for predicting the motion vector v.
  • the weighting factor information 947 is information for calculating the weighting factor w when bi-directional prediction is performed. For example, the value of the weighting factor w or an index indicating the value of the weighting factor w can be stored in the weighting factor information 947. As described in the above method 2, when the weighting factor w is obtained from the adjacent pixel ap, the weighting factor information 947 does not need to be included in the PU 940.
  • the weighting factor information 947 can include information on what weighting factor w is applied to each of the luminance signal and the color difference signals. For example, flag information indicating whether the weighting factor for the luminance signal is fixed to 1/2, whether the weighting factor for the color difference signals is fixed to 1/2, whether independent weighting factors are used for the luminance signal and the color difference signals, and the like may be included in the weighting factor information 947.
  • the flag information is not necessarily included in the weight coefficient information 947 in the PU 940. For example, it may be possible to include flag information related to application of the weighting factor w in the slice header information 911.
  • the reference adjacent pixel information 949 is information indicating which adjacent pixels ap are to be referred to when bidirectional prediction is performed and the weighting factor w is calculated from the adjacent pixels ap. For example, a flag indicating whether, when calculating the weighting factor w, only the upper adjacent pixels uap should be referred to, only the left adjacent pixels lap should be referred to, all of the upper adjacent pixels uap, the left adjacent pixels lap, and the upper-left adjacent pixel ulap should be referred to, or none of them should be (or can be) referred to can be included in the reference adjacent pixel information 949; a rough mirror of the PU 940 fields is sketched below.
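  • The syntax elements of PU 940 could be mirrored roughly as follows; the field names and types are illustrative assumptions, and every field is optional because, as noted above, which items are present depends on the prediction method and on how w is determined.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class PredictionUnitSyntax:
        prediction_mode_info: Optional[dict] = None    # 941: intra/inter, division mode
        image_reference_info: Optional[int] = None     # 943: index into list 0 / list 1
        motion_vector_info: Optional[tuple] = None     # 945: data for predicting v
        weight_coefficient_info: Optional[int] = None  # 947: value or index of w
        reference_adjacent_pixel_info: Optional[str] = None  # 949: uap / lap / all / none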
  • the image processing apparatus 1000 can be realized as a dedicated player that reproduces encoded video data (bitstream) stored in various storage media, such as a DVD (Digital Versatile Disc), Blu-ray (registered trademark) disc, HDD (Hard Disk Drive), or flash memory, or received from a network such as the Internet.
  • the image processing apparatus 1000 can also be realized as a camcorder or a recorder that encodes an input video or a video shot by a camera or the like as video data and stores it in various storage media.
  • the image processing apparatus 1000 may be a video distribution apparatus that outputs encoded video data to other apparatuses on the network.
  • Furthermore, the image processing apparatus 1000 can be realized as a personal computer or a mobile phone (whether a feature phone or a so-called smartphone) having functions such as reproduction, storage, and distribution of video data.
  • the image processing apparatus 1000 includes a control unit 1001, a communication interface (I / F) unit 1005, a data I / F unit 1007, a storage unit 1009, a display unit 1015, and an input unit 1017. Each part is connected via a bus line 1019.
  • the image processing apparatus 1000 of this example has both an encoding function and a decoding function, but only one of them may be provided.
  • Here, the encoding function (the function of the encoding device 100 illustrated in FIG. 3) and the decoding function (the functions of the decoding device 200 illustrated in FIG. 5) are realized by the encoding program 1011 and the decoding program 1013, which are programs; however, it is not always necessary to realize these functions as programs, and in that case the encoding program 1011 and the decoding program 1013 may be unnecessary.
  • For example, part of the functions of the encoding device 100 and the decoding device 200 shown in FIGS. 3 and 5 can be realized by the encoding program 1011 / decoding program 1013, and the other functions can be realized by an image processing processor 1003.
  • the control unit 1001 includes one or more processors 1003, a ROM (Read Only Memory, not shown), a RAM (Random Access Memory, not shown), and the like.
  • the control unit 1001 executes the encoding program 1011 and/or the decoding program 1013 stored in the storage unit 1009, and is thereby configured to be able to execute image processing related to encoding / decoding in addition to various general control functions. For example, at least a part of each function of the encoding device 100 shown in FIG. 3 and at least a part of each function of the decoding device 200 shown in FIG. 5 can be realized as the encoding program 1011 and the decoding program 1013, respectively.
  • the one or more processors 1003 may include a CPU (Central Processing Unit), an image processing processor for executing processing related to image encoding / decoding, and the like.
  • the CPU executes an encoding program 1011 and a decoding program 1013.
  • the image processing processor may include some or all of the functions of the encoding device 100 illustrated in FIG. 3 and/or some or all of the functions of the decoding device 200 illustrated in FIG. 5. In this case, these functions do not need to be included in the encoding program 1011 or the decoding program 1013.
  • the communication I / F unit 1005 is an interface for inputting / outputting image data to / from an external device by wire or wirelessly.
  • the image processing apparatus 1000 can decode video data input from the communication I / F unit 1005 and output video data obtained by encoding video from the communication I / F unit 1005 to an external device.
  • various communication methods such as LAN, USB (Universal Serial Bus), mobile phone communication, Bluetooth (registered trademark) communication, and the like can be considered as the communication method performed by the communication I / F unit 1005.
  • the data I/F unit 1007 is a device for inputting and outputting data to and from various external storage devices, such as optical discs (e.g., DVD and Blu-ray (registered trademark) discs), flash memories, and HDDs.
  • For example, a drive device that reads data stored in such storage devices can serve as the data I/F unit 1007.
  • the storage unit 1009 is a built-in nonvolatile storage medium such as an HDD or a flash memory.
  • the storage unit 1009 stores an encoding program 1011 and a decoding program 1013 for realizing an encoding and / or decoding function, in addition to a control program for realizing a function as a general information processing apparatus. Can do.
  • the display unit 1015 is a display device for displaying, for example, decoded video or video to be encoded. Specific examples of the display unit 1015 include a liquid crystal display and an organic EL (Electro-Luminescence) display.
  • the input unit 1017 is a device for accepting operation inputs as necessary. Specific examples of the input unit 1017 include a keyboard, a mouse, a touch panel, and various operation buttons.
  • the image processing apparatus 1000 does not necessarily include the display unit 1015 and the input unit 1017.
  • the display unit 1015 and the input unit 1017 may be connected to the image processing apparatus 1000 from the outside via various interfaces such as a USB and a display port.
  • the encoding device 100 and the decoding device 200 can perform bi-directional prediction using a weighting factor w other than 1/2.
  • If the encoding device 100 tried to obtain the optimal combination of the motion vectors v0 and v1 and the weighting factor w jointly, the amount of calculation would increase dramatically. Therefore, the encoding device 100 according to one embodiment first obtains the motion vectors v0 and v1 with the weighting factor fixed to 1/2 for the luminance signal, and then obtains a suitable weighting factor w for the color difference signals; this staged search is sketched below. As a result, the amount of calculation can be drastically reduced compared with the case of obtaining the optimum values from all combinations of the motion vectors v0 and v1 and the weighting factor w.
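  • A minimal sketch of the staged search, reusing full_search() and choose_weight_for_source() from above; extract_chroma(list_index, v) is a caller-supplied helper (an assumption) that returns the chroma block the vector v points at in the given reference list.

    def two_stage_estimation(pu_y, pu_c, ref0_y, ref1_y, center, extract_chroma):
        # Stage 1: with w fixed at 1/2, search v0 and v1 on luma only.
        v0 = full_search(pu_y, ref0_y, center)
        v1 = full_search(pu_y, ref1_y, center)
        # Stage 2: keep v0/v1 and search only the chroma weight w_C,
        # instead of jointly searching all (v0, v1, w) combinations.
        w_c = choose_weight_for_source(pu_c, extract_chroma(0, v0),
                                       extract_chroma(1, v1))
        return v0, v1, w_c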
  • the encoding device 100 and the decoding device 200 can calculate the weighting factor w based on the adjacent pixel ap of the prediction unit PU to be predicted. As a result, the same weighting factor w can be calculated by the encoding device 100 and the decoding device 200 without including the value and index of the weighting factor w in the bitstream. That is, it is possible to suppress an increase in the amount of coding related to transmission of the weight coefficient w.
  • the encoding device 100 and the decoding device 200 can apply different weighting factors w to the luminance signal and the color difference signals. Thereby, the prediction accuracy of the generated predicted image can be improved.
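  • A short sketch of per-component weighting, reusing bidirectional_prediction() from above; a single chroma weight w_c shared by Cb and Cr is assumed here, although the description also allows separate wCb and wCr:

    def predict_yuv(b0, b1, w_y, w_c):
        # b0/b1: dicts of numpy planes {"Y", "Cb", "Cr"} from the two
        # reference pictures; w_y blends luma, w_c blends both chroma planes.
        return {
            "Y":  bidirectional_prediction(b0["Y"],  b1["Y"],  w_y),
            "Cb": bidirectional_prediction(b0["Cb"], b1["Cb"], w_c),
            "Cr": bidirectional_prediction(b0["Cr"], b1["Cr"], w_c),
        }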


Abstract

The present invention relates to an encoding device, an encoding method, a decoding device, and a decoding method with which appropriate bidirectional prediction can be performed. The invention comprises: a frame memory for storing a plurality of reference images including a first reference image and a second reference image; means for finding first and second motion vectors that determine the positions of first and second regions in the first and second reference images designated by a region to be encoded among images to be encoded of an input moving image; weighting factor determination means for determining a first weighting factor applied to the first region and a second weighting factor applied to the second region; and means for multiplying the pixel values of the pixels in the first region by the first weighting factor and the pixel values of the pixels in the second region by the second weighting factor, thereby calculating predicted pixel values for the region to be encoded. The first and second weighting factors applied to a luminance signal and the first and second weighting factors applied to a color difference signal may take different values.
PCT/JP2017/029411 2016-09-27 2017-08-15 Encoding device, encoding method, decoding device, and decoding method WO2018061503A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016-188308 2016-09-27
JP2016188308A JP2018056699A (ja) 2016-09-27 2016-09-27 Encoding device, encoding method, decoding device, and decoding method

Publications (1)

Publication Number Publication Date
WO2018061503A1 true WO2018061503A1 (fr) 2018-04-05

Family

ID=61759438

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/029411 WO2018061503A1 (fr) 2016-09-27 2017-08-15 Encoding device, encoding method, decoding device, and decoding method

Country Status (2)

Country Link
JP (1) JP2018056699A (fr)
WO (1) WO2018061503A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3554080A1 (fr) * 2018-04-13 2019-10-16 InterDigital VC Holdings, Inc. Methods and devices for picture encoding and decoding
US11863774B2 (en) 2021-01-15 2024-01-02 Tencent America LLC Method and apparatus for video coding


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010035731A1 (fr) * 2008-09-24 2010-04-01 Sony Corporation Image processing apparatus and image processing method
JP2013500661A (ja) * 2009-07-30 2013-01-07 Thomson Licensing Method for decoding a stream of encoded data representing an image sequence, and method for encoding an image sequence
WO2013057782A1 (fr) * 2011-10-17 2013-04-25 Kabushiki Kaisha Toshiba Encoding method and decoding method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chun-Chi Chen et al., "Generalized bi-prediction for inter coding," Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, document JVET-C0047, 3rd Meeting, Geneva, CH, 17 May 2016, pages 1-4, XP030150142 *

Also Published As

Publication number Publication date
JP2018056699A (ja) 2018-04-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17855461

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17855461

Country of ref document: EP

Kind code of ref document: A1