WO2019208258A1 - Encoding device, encoding method, decoding device, and decoding method - Google Patents

Encoding device, encoding method, decoding device, and decoding method

Info

Publication number
WO2019208258A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
class
pixel
activity
image
Prior art date
Application number
PCT/JP2019/015907
Other languages
English (en)
Japanese (ja)
Inventor
優 池田
Original Assignee
ソニー株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニー株式会社
Publication of WO2019208258A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being a pixel
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/98 Adaptive-dynamic-range coding [ADRC]

Definitions

  • The present technology relates to an encoding device, an encoding method, a decoding device, and a decoding method, and more particularly, to an encoding device, an encoding method, a decoding device, and a decoding method that can, for example, improve encoding efficiency and image quality.
  • FVC (Future Video Coding)
  • HEVC (High Efficiency Video Coding)
  • deblocking filters as ILF (In-Loop Filter) used for image encoding and decoding
  • a bilateral filter (Bilateral Filter)
  • ALF (Adaptive Loop Filter)
  • GALF (Geometry transformation-based Adaptive Loop Filter)
  • JEM7 (Joint Exploration Test Model 7)
  • PCS (Picture Coding Symposium)
  • the present technology has been made in view of such a situation, and makes it possible to improve encoding efficiency and image quality.
  • The decoding device of the present technology includes: a decoding unit that decodes encoded data included in an encoded bitstream using a filter image and generates a decoded image; a class classification unit that classifies a target pixel of the decoded image generated by the decoding unit into one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; and a filter unit that generates the filter image by performing a filter process that applies, to the decoded image, a prediction formula for performing a product-sum operation on the tap coefficients of the class of the target pixel obtained by the class classification performed by the class classification unit and the pixels of the decoded image.
  • The decoding method of the present technology includes: decoding encoded data included in an encoded bitstream using a filter image to generate a decoded image; classifying a target pixel of the decoded image into one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; and generating the filter image by performing a filter process that applies, to the decoded image, a prediction formula for performing a product-sum operation on the tap coefficients of the class of the target pixel obtained by the class classification and the pixels of the decoded image.
  • the encoded data included in the encoded bitstream is decoded using the filter image, and a decoded image is generated.
  • Then, by performing ADRC (Adaptive Dynamic Range Coding) processing of the activity of a target pixel of the decoded image, class classification is performed to classify the target pixel into one of a plurality of classes.
  • Furthermore, a filter process is performed that applies, to the decoded image, a prediction formula for performing a product-sum operation on the tap coefficients of the class of the target pixel obtained by the class classification and the pixels of the decoded image, and the filter image is generated.
  • The encoding device of the present technology includes: a class classification unit that classifies a target pixel of a locally decoded decoded image into one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; a filter unit that generates a filter image by performing a filter process that applies, to the decoded image, a prediction formula for performing a product-sum operation on the tap coefficients of the class of the target pixel obtained by the class classification performed by the class classification unit and the pixels of the decoded image; and an encoding unit that encodes an original image using the filter image generated by the filter unit.
  • The encoding method of the present technology includes: classifying a target pixel of a locally decoded decoded image into one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; generating a filter image by performing a filter process that applies, to the decoded image, a prediction formula for performing a product-sum operation on the tap coefficients of the class of the target pixel obtained by the class classification and the pixels of the decoded image; and encoding an original image using the filter image.
  • In the encoding device and the encoding method of the present technology, by performing ADRC (Adaptive Dynamic Range Coding) processing of the activity of a target pixel of a locally decoded decoded image, the target pixel is classified into one of a plurality of classes.
  • Then, a filter process is performed that applies, to the decoded image, a prediction formula for performing a product-sum operation on the tap coefficients of the class of the target pixel obtained by the class classification and the pixels of the decoded image, thereby generating a filter image.
  • the original image is encoded using the filter image.
  • the encoding device and the decoding device may be independent devices, or may be internal blocks constituting one device.
  • the encoding device and the decoding device can be realized by causing a computer to execute a program.
  • the program can be provided by being transmitted through a transmission medium or by being recorded on a recording medium.
  • Brief description of the drawings: diagrams showing the transposed tap coefficients used for the filter processing of class 3, class 4, class 5, class 6, and class 7 pixels; a diagram explaining the filter process using degenerate tap coefficients; and a diagram showing an example of the classes of the class classification by the activity ADRC method performed according to the dynamic range DR.
  • A block diagram illustrating a configuration example of a class classification unit 11.
  • A flowchart explaining an example of the filter process as the class classification prediction process performed by the class classification prediction filter 10, and a flowchart explaining an example of the class classification process by the activity ADRC method performed in step S12.
  • A block diagram showing an outline of an embodiment of an image processing system to which the present technology is applied, with flowcharts illustrating an overview of the encoding process of the encoding device 60 and an outline of the decoding process of the decoding device 70.
  • A block diagram illustrating a detailed configuration example of the encoding device 60, a flowchart illustrating an example of the encoding process of the encoding device 60, and a flowchart explaining an example of the predictive encoding process of step S105.
  • A block diagram illustrating a detailed configuration example of the decoding device 70, a flowchart explaining an example of the decoding process of the decoding device 70, and a flowchart explaining an example of the predictive decoding process of step S205.
  • A block diagram illustrating a configuration example of an embodiment of a computer to which the present technology is applied.
  • Reference 1 AVC standard ("Advanced video coding for generic audiovisual services", ITU-T H.264 (04/2017))
  • Reference 2 HEVC standard ("High efficiency video coding", ITU-T H.265 (12/2016))
  • Reference 3 FVC Algorithm Description (Algorithm description of Joint Exploration Test Model 7 (JEM7), 2017-08-19)
  • The contents described in the above-mentioned references also serve as grounds for judging support requirements.
  • For example, even if the Quad-Tree Block Structure described in Reference 1 or the QTBT (Quad Tree Plus Binary Tree) Block Structure described in Reference 3 is not directly described in the embodiment, it is within the scope of disclosure of the present technology and satisfies the support requirements of the claims. Similarly, even technical terms such as Parsing, Syntax, and Semantics, if not directly described in the embodiment, are within the scope of disclosure of the present technology and satisfy the support requirements of the claims.
  • A "block" (other than a block indicating a specific processing unit) used in the description as a partial region or processing unit of an image (picture) indicates an arbitrary partial region in the picture unless otherwise specified, and its size, shape, characteristics, and the like are not limited.
  • The "block" includes, for example, TB (Transform Block), TU (Transform Unit), PB (Prediction Block), PU (Prediction Unit), SCU (Smallest Coding Unit), CU (Coding Unit), LCU (Largest Coding Unit), CTB (Coding Tree Block), CTU (Coding Tree Unit), transform block, sub-block, macroblock, tile, slice, and the like.
  • the block size may be specified indirectly.
  • the block size may be designated using identification information for identifying the size.
  • the block size may be specified by a ratio or difference with the size of a reference block (for example, LCU or SCU).
  • the designation of the block size includes designation of a block size range (for example, designation of an allowable block size range).
  • the encoded data is data obtained by encoding an image, for example, data obtained by orthogonally transforming and quantizing an image (residual thereof).
  • the encoded bit stream is a bit stream including encoded data, and includes encoding information related to encoding as necessary.
  • The encoding information includes at least information necessary for decoding the encoded data, that is, for example, a quantization parameter (QP) when quantization is performed in the encoding, and a motion vector or the like when motion compensation is performed in the predictive encoding.
  • The acquirable information is information that can be obtained from the encoded bitstream. Therefore, the acquirable information is information that can be acquired by both an encoding device that encodes an image and generates an encoded bitstream and a decoding device that decodes the encoded bitstream into an image.
  • the acquirable information includes, for example, encoded information included in the encoded bit stream and an image feature amount of an image obtained by decoding encoded data included in the encoded bit stream.
  • the prediction formula is a polynomial that predicts the second data from the first data.
  • the prediction formula is a polynomial for predicting the second image from the first image.
  • Each term of the prediction formula, which is such a polynomial, is composed of the product of one tap coefficient and one or more prediction taps. Therefore, the prediction formula is an expression for performing a product-sum operation of tap coefficients and prediction taps.
  • For example, when the pixel (pixel value) used as the i-th prediction tap for prediction is x_i, the i-th tap coefficient is w_i, the pixel (predicted value of the pixel value) of the second image is y', and a polynomial consisting only of first-order terms is adopted as the prediction formula, the prediction formula is expressed as y' = Σ w_i x_i, where Σ represents summation over i.
  • The tap coefficients w_i constituting the prediction formula are obtained by learning that statistically minimizes the error y' − y between the value y' obtained by the prediction formula and the true value y.
  • As a method of such learning for obtaining the tap coefficients, there is the least squares method.
  • In the tap coefficient learning, using a student image corresponding to the first image to which the prediction formula is applied as student data (the inputs x_i to the prediction formula) and the true values y corresponding to the second image as teacher data, normal equations are obtained by adding up the coefficients of the terms (coefficient summation), and by solving the normal equations, the tap coefficients that minimize the sum of squared errors of the predicted values y' are obtained. A concrete sketch of this learning is given below.
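  • As an illustration of the tap coefficient learning described above, the following is a minimal sketch (not the patent's implementation): it builds the least-squares system from pairs of student and teacher pixels and solves it for the tap coefficients of the first-order prediction formula y' = Σ w_i x_i. The flat image arrays, the cross-shaped prediction-tap layout, and the use of numpy's least-squares solver are assumptions made only for this example.

      import numpy as np

      def learn_tap_coefficients(student, teacher, tap_offsets):
          """Least-squares tap coefficient learning: find w minimizing the sum of
          squared errors (y' - y)^2, where y' = sum_i w_i * x_i."""
          h, w = student.shape
          rows, targets = [], []
          for y in range(1, h - 1):              # skip borders so all taps exist
              for x in range(1, w - 1):
                  taps = [student[y + dy, x + dx] for dy, dx in tap_offsets]
                  rows.append(taps)              # prediction taps x_i for this pixel
                  targets.append(teacher[y, x])  # teacher (true) value y
          A = np.asarray(rows, dtype=float)
          b = np.asarray(targets, dtype=float)
          coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)   # solves the normal equations
          return coeffs

      def apply_prediction(student, coeffs, tap_offsets, y, x):
          """Apply the first-order prediction formula y' = sum_i w_i * x_i to one pixel."""
          return sum(c * student[y + dy, x + dx]
                     for c, (dy, dx) in zip(coeffs, tap_offsets))

      # Hypothetical cross-shaped prediction tap around the target pixel.
      offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
      rng = np.random.default_rng(0)
      teacher_img = rng.integers(0, 256, (32, 32)).astype(float)
      student_img = teacher_img + rng.normal(0, 5, teacher_img.shape)  # degraded image
      w_learned = learn_tap_coefficients(student_img, teacher_img, offsets)
      print(np.round(w_learned, 3))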
  • the prediction process is a process for predicting the second image by applying a prediction formula to the first image.
  • In the prediction process, the predicted value of the second image is obtained by performing the product-sum operation of the prediction formula using the pixels (pixel values) of the first image.
  • Performing the product-sum operation using the first image can be regarded as a filter process that applies a filter to the first image, and the prediction process that performs the product-sum operation of the prediction formula (as the calculation of the prediction formula) can therefore be said to be a kind of filter process.
  • the filter image means an image obtained as a result of the filtering process.
  • the second image (predicted value) obtained from the first image by the filter processing as the prediction processing is a filter image.
  • the tap coefficient is a coefficient constituting each term of a polynomial that is a prediction formula, and corresponds to a filter coefficient that is multiplied by a signal to be filtered in a tap of a digital filter.
  • the prediction tap is information such as a pixel (pixel value) used for calculation of the prediction formula, and is multiplied by the tap coefficient in the prediction formula.
  • the prediction tap includes not only the pixel (its pixel value) itself but also a value obtained from the pixel, for example, the sum or average value of the pixels (its pixel value) in a certain block.
  • Selecting a pixel or the like as a prediction tap used in the calculation of the prediction formula is equivalent to laying (arranging) a connection line for supplying an input signal to a tap of a digital filter; selecting a pixel as a prediction tap used for the calculation of the prediction formula is therefore also referred to as "stretching a prediction tap".
  • Class classification means classifying (clustering) pixels into one of a plurality of classes.
  • the classification can be performed using, for example, pixels in the peripheral area of the target pixel (pixel values thereof) and encoding information related to the target pixel.
  • Examples of the encoding information related to the target pixel include a quantization parameter used for quantization of the target pixel, DF (Deblocking Filter) information related to a deblocking filter applied to the target pixel, and the like.
  • The DF information is, for example, information indicating which of the strong filter and the weak filter was applied by the deblocking filter, or that neither was applied.
  • the class classification prediction process is a filter process as a prediction process performed for each class.
  • The basic principle of the class classification prediction process is described in, for example, Japanese Patent No. 4449489.
  • a high-order term is a term having a product of two or more prediction taps (as pixels) among terms constituting a polynomial as a prediction formula.
  • the D-order term is a term having a product of D prediction taps among terms constituting a polynomial as a prediction formula.
  • the primary term is a term having one prediction tap
  • the secondary term is a term having a product of two prediction taps.
  • the prediction taps that take the products may be the same prediction tap (pixel).
  • the D-th order coefficient means a tap coefficient constituting the D-th order term.
  • the D-th tap means a prediction tap (as a pixel) constituting the D-th order term.
  • a certain pixel is a D-order tap and may be a D'-order tap different from the D-order tap.
  • the tap structure of the D-order tap and the tap structure of the D′-order tap different from the D-order tap need not be the same.
  • the DC prediction formula is a prediction formula including a DC term.
  • the DC term is a product term of a value representing the DC component of the image as a prediction tap and a tap coefficient among terms constituting a polynomial as a prediction formula.
  • the DC tap means a prediction tap of the DC term, that is, a value representing the DC component.
  • DC coefficient means the tap coefficient of DC term.
  • the primary prediction formula is a prediction formula consisting of only the primary terms.
  • the high-order prediction formula is a prediction formula including a high-order term, that is, a prediction formula consisting of a primary term and a second-order or higher-order term, or a prediction formula consisting of only a second-order or higher-order term.
  • For example, when the i-th prediction tap (pixel value or the like) used for prediction is x_i, the i-th tap coefficient is w_i, and the pixel (predicted value of the pixel value) of the second image obtained by the prediction formula is y', the DC prediction formula obtained by adding a DC term to the first-order prediction formula is, for example, the formula y' = Σ w_i x_i + w_DCB DCB, where w_DCB represents the DC coefficient and DCB represents the DC tap.
  • the tap coefficients of the primary prediction formula, the high-order prediction formula, and the DC prediction formula can all be obtained by performing tap coefficient learning using the least square method as described above.
  • a primary prediction formula is adopted as the prediction formula.
  • The tap structure means the arrangement of the pixels serving as the prediction taps (for example, relative to the position of the target pixel). It can also be said that the tap structure is the way the prediction taps are stretched. Considering a state in which the tap coefficient to be multiplied by a pixel is placed at the position of that pixel constituting the prediction tap, the tap structure can also be said to be the arrangement of the tap coefficients. Therefore, the tap structure means both the arrangement of the pixels constituting the prediction taps of the target pixel and the arrangement of the tap coefficients in the state where each tap coefficient is placed at the position of the pixel by which it is multiplied.
  • ADRC (Adaptive Dynamic Range Coding) processing is processing that (re)quantizes a plurality of data as values within the dynamic range DR of the plurality of data.
  • In L-bit ADRC processing, for example, a value representing the difference between the maximum value Max and the minimum value Min of the plurality of data is obtained as the dynamic range DR of the plurality of data, the minimum value Min is subtracted from each of the plurality of data, and the subtracted value is divided (requantized) by DR / 2^L.
  • Then, a binary code obtained by arranging the ADRC codes, which are the values of each of the plurality of data after the ADRC processing, in a predetermined order is output as the processing result of the ADRC processing.
  • As the dynamic range DR of the plurality of data, the difference itself between the maximum value Max and the minimum value Min of the plurality of data, a value obtained by adding 1 to that difference, or the like can be employed. In the present embodiment, the value obtained by adding 1 to the difference between the maximum value Max and the minimum value Min of the plurality of data is adopted as the dynamic range DR of the plurality of data.
  • In 1-bit ADRC processing, each data is made into 1 bit (binarized). That is, in the 1-bit ADRC processing, each of the plurality of data is quantized to 1 if it is equal to or greater than (or greater than) a threshold, which is the average value of the maximum value Max and the minimum value Min of the plurality of data, or the value Min + DR/2 obtained by adding 1/2 of the dynamic range DR to the minimum value Min, and is quantized to 0 otherwise. Then, a bit string in which the 1-bit values (ADRC codes) after quantization are arranged in a predetermined order is output as a binary code.
  • Activity code is an ADRC code (value after activity ADRC processing) obtained by ADRC processing multiple activities.
  • A transposed tap coefficient is a tap coefficient obtained by moving the tap coefficients of a tap structure in line symmetry about a predetermined axis, such as an axis at 90 degrees to the horizontal (the vertical direction) or at 45 degrees to the horizontal.
  • a decoded image is an image obtained by decoding encoded data obtained by encoding an original image.
  • The decoded image includes, when the original image is predictively encoded by the encoding device, an image obtained by local decoding in the predictive encoding. That is, in the encoding device, when the original image is predictively encoded, the prediction image and the (decoded) residual are added in the local decoding, and the result of that addition is a decoded image.
  • Note that the decoded image that is the result of adding the predicted image and the residual is the target of the ILF filter processing, and the decoded image after the ILF filter processing is also the filter image.
  • FIG. 1 is a diagram showing an example of class classification performed by performing ADRC processing on pixels (values).
  • the classification method performed by ADRC processing of pixels is also referred to as pixel ADRC method.
  • In the class classification by the pixel ADRC method, a plurality of pixels in the peripheral region of the target pixel, for example, the 8 pixels adjacent to the target pixel at the upper left, top, upper right, left, right, lower left, bottom, and lower right, are used (their pixel values) as a class tap, which is the information used for the class classification, and are subjected to ADRC processing.
  • The ADRC processing performed in the class classification is not limited to 1-bit ADRC processing, and L-bit ADRC processing with an arbitrary number of bits L of 2 or more can also be employed.
  • For example, suppose that the pixel values of the pixels adjacent to the target pixel at the upper left, top, upper right, left, right, lower left, bottom, and lower right are 150, 10, 10, 200, 10, 10, 10, and 20, respectively.
  • the decimal part is rounded down in the calculation.
  • v >> b represents v shifted right by b bits (divided by 2^b).
  • When the pixel value of each pixel adjacent to the target pixel at the upper left, top, upper right, left, right, lower left, bottom, and lower right as the class tap is represented by pel(*), the ADRC code of the pixel value pel(*) is set to 1 if the pixel value pel(*) is equal to or greater than the threshold th, and is set to 0 if the pixel value pel(*) is not equal to or greater than the threshold th.
  • the ADRC codes of the pixel values 150, 10, 10, 200, 10, 10, 10, and 20 in FIG. 1 are 1,0,0,1,0,0,0,0.
  • In the pixel ADRC method, the ADRC codes 1, 0, 0, 1, 0, 0, 0, 0 of the pixel values pel(*) of the pixels adjacent to the target pixel as the class tap are arranged in the order of the pixels adjacent at the upper left, top, upper right, left, right, lower left, bottom, and lower right of the target pixel, and the resulting bit string, for example 100_10_000, is output as the binary code that is the result of the ADRC processing.
  • a binary code obtained by ADRC processing is converted into a class code representing a class of a target pixel, and is output as a result of class classification of the target pixel.
  • a decimal value representing a binary code as a value corresponding to a binary code obtained by ADRC processing is output as a class code representing a class of a pixel of interest.
  • For example, when the binary code is 100_10_000, the decimal value 144 corresponding to the binary code 100_10_000 is output as the class code.
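  • To make the pixel ADRC classification concrete, the following minimal sketch reproduces the example above (class tap 150, 10, 10, 200, 10, 10, 10, 20 giving binary code 100_10_000 and class code 144). It assumes DR = Max − Min + 1 and threshold th = Min + (DR >> 1), consistent with the definitions given earlier; the exact rounding used in the patent is an assumption.

      def pixel_adrc_class(class_tap):
          """1-bit ADRC of an 8-pixel class tap -> class code in the range 0..255."""
          mx, mn = max(class_tap), min(class_tap)
          dr = mx - mn + 1                 # dynamic range DR = Max - Min + 1
          th = mn + (dr >> 1)              # threshold th = Min + DR/2 (fraction dropped)
          code = 0
          for p in class_tap:              # ADRC code of each pixel: 1 if pel >= th
              code = (code << 1) | (1 if p >= th else 0)
          return code                      # bit string read as a decimal class code

      # Neighbour pixel values in the order upper left, top, upper right, left,
      # right, lower left, bottom, lower right (the example in the text).
      tap = [150, 10, 10, 200, 10, 10, 10, 20]
      print(pixel_adrc_class(tap))         # -> 144 (binary 10010000)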
  • FIG. 2 is a diagram for explaining the class number (total number) of classes obtained by class classification by the pixel ADRC method of FIG.
  • In the class classification by the pixel ADRC method using the 8 pixels as the class tap and 1-bit ADRC processing, the number of classes is 2^8 = 256.
  • If the filter process as the class classification prediction process that performs this 256-class class classification is employed as the ILF of an encoding device and a decoding device that perform predictive encoding and decoding, the image quality of the decoded image can be improved.
  • However, when the tap coefficients are included in the encoded bitstream and transmitted from the encoding device to the decoding device, the tap coefficients become overhead; therefore, if the data amount of the tap coefficients is large, the encoding efficiency decreases.
  • The encoding device and the decoding device can also store common tap coefficients in advance; however, if the data amount of the tap coefficients is large, a memory with a large storage capacity is required to store the tap coefficients.
  • Therefore, in the present technology, class classification based on the activity ADRC method is adopted as the class classification performed in the class classification prediction process, thereby aiming to improve the image quality of the decoded image, for example, the PSNR (Peak Signal-to-Noise Ratio), and to improve the encoding efficiency.
  • Activity ADRC method is a classification method that is performed by ADRC processing image activity.
  • FIG. 3 is a diagram illustrating an example of a method for obtaining a (space) activity for a target pixel.
  • When the class classification prediction process is employed as the ILF filter process of the encoding device or the decoding device, for example, as shown in FIG. 3, the class classification of a decoded image (including a decoded image obtained by local decoding in the encoding device) can be performed with 2 × 2 pixels in the horizontal and vertical directions as the classification unit of the class classification.
  • 2 ⁇ 2 pixels of the classification unit are classified into the same class.
  • As a method of classifying the 2 × 2 pixels of the classification unit into the same class, for example, there is a method of performing the class classification for one of the 2 × 2 pixels, such as the upper-left pixel, and adopting the class of that one pixel as it is for the other three pixels.
  • Alternatively, the class classification may be performed for each of the 2 × 2 pixels of the classification unit. In the following, with the upper-left pixel of the 2 × 2 pixels of the classification unit as the pixel of interest, the activity in each of a plurality of directions starting from the pixel of interest is obtained.
  • As the plurality of directions, for example, four directions are adopted: the upward direction as the vertical direction starting from the target pixel, the left direction as the horizontal direction, the upper-left direction as the first diagonal direction, and the upper-right direction as the second diagonal direction (different from the first diagonal direction).
  • the upward direction, the left direction, the upper left direction, and the upper right direction are also referred to as a V direction, an H direction, a D0 direction, and a D1 direction, respectively.
  • The directions that are point-symmetrical (opposite) to the V direction, H direction, D0 direction, and D1 direction about the target pixel are referred to as the V' direction, H' direction, D0' direction, and D1' direction, respectively.
  • The V' direction, H' direction, D0' direction, and D1' direction are the downward direction as the vertical direction starting from the target pixel, the right direction as the horizontal direction, the lower-right direction as the first diagonal direction, and the lower-left direction as the second diagonal direction.
  • As the plurality of directions, directions other than the V direction, H direction, D0 direction, and D1 direction, for example, a direction between the V direction and the D0 direction or a direction between the V direction and the D1 direction, can also be adopted.
  • The activity A(D) in the D direction of the target pixel can be obtained by applying, for example, a Laplacian filter to the decoded image including the target pixel.
  • That is, the activities A(V), A(H), A(D0), and A(D1) in the V direction, H direction, D0 direction, and D1 direction of the target pixel can be obtained, for example, according to the following equations:
  • A(V) = abs((L[y][x] << 1) - L[y-1][x] - L[y+1][x])
  • A(H) = abs((L[y][x] << 1) - L[y][x-1] - L[y][x+1])
  • A(D0) = abs((L[y][x] << 1) - L[y-1][x-1] - L[y+1][x+1])
  • A(D1) = abs((L[y][x] << 1) - L[y+1][x-1] - L[y-1][x+1])   ... (1)
  • L[y][x] represents the pixel value (luminance value) of the pixel at the position of the y-th row and x-th column of the decoded image; here, the pixel at the position of the y-th row and x-th column of the decoded image is the target pixel.
  • abs (v) represents the absolute value of v
  • v << b represents v shifted left by b bits (multiplied by 2^b).
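  • The per-pixel activities of Expression (1) can be transcribed directly into code; the following sketch does so (the small test image and the assumption that all accessed neighbours lie inside the image are for illustration only).

      def pixel_activities(L, y, x):
          """Activities of Expression (1) for the pixel at row y, column x:
          A(V), A(H), A(D0), A(D1) obtained with a Laplacian-like filter."""
          c2 = L[y][x] << 1                                   # L[y][x] shifted left by 1 bit
          a_v  = abs(c2 - L[y - 1][x]     - L[y + 1][x])      # vertical (V) direction
          a_h  = abs(c2 - L[y][x - 1]     - L[y][x + 1])      # horizontal (H) direction
          a_d0 = abs(c2 - L[y - 1][x - 1] - L[y + 1][x + 1])  # first diagonal (D0)
          a_d1 = abs(c2 - L[y + 1][x - 1] - L[y - 1][x + 1])  # second diagonal (D1)
          return a_v, a_h, a_d0, a_d1

      img = [[10, 10, 10],
             [10, 50, 10],
             [10, 10, 10]]
      print(pixel_activities(img, 1, 1))                      # (80, 80, 80, 80)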
  • The activities of Expression (1) may be used as they are as the activities A(V), A(H), A(D0), and A(D1) in the V direction, H direction, D0 direction, and D1 direction of the pixel of interest, but here, a value obtained by adding the activity of the pixel of interest and the activities of the pixels in the peripheral region of the pixel of interest (hereinafter also referred to as the activity sum) is used as the final activity of the pixel of interest.
  • That is, for example, a range of 6 × 6 pixels centered on the 2 × 2 pixel classification unit including the target pixel is taken as an addition window, and the activity sum, which is the value obtained by adding A(V) of Expression (1) for the pixels in the addition window, can be used as the final activity in the V direction of the target pixel. The same applies to the H direction, D0 direction, and D1 direction of the target pixel.
  • Hereinafter, for the D direction of the pixel of interest (here, the V direction, H direction, D0 direction, and D1 direction), the activity sum of the pixel of interest obtained by adding the activity of the pixel of interest and the activities of the pixels in the peripheral region of the pixel of interest is also written as A(D). The activity sum is also simply referred to as activity.
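  • Building on pixel_activities from the previous sketch, the activity sums over a 6 × 6 addition window can be computed as below; the exact placement of the window relative to the 2 × 2 classification unit (here rows y−2 to y+3 and columns x−2 to x+3 around the upper-left pixel of the unit) is an assumption made for this illustration.

      def activity_sums(L, y, x):
          """Activity sums A(D) for the pixel of interest at (y, x): the per-pixel
          activities of Expression (1) added over a 6x6 addition window."""
          sums = [0, 0, 0, 0]                    # A(V), A(H), A(D0), A(D1)
          for yy in range(y - 2, y + 4):         # assumed 6x6 window: rows y-2 .. y+3
              for xx in range(x - 2, x + 4):     #                     cols x-2 .. x+3
                  for k, a in enumerate(pixel_activities(L, yy, xx)):
                      sums[k] += a
          return tuple(sums)

      img = [[(r * 7 + c * 13) % 256 for c in range(12)] for r in range(12)]
      print(activity_sums(img, 5, 5))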
  • FIG. 4 is a diagram for explaining how to represent the activity of the pixel of interest and that the activity used for class classification is degenerated using the point symmetry of the activity.
  • In FIG. 4, the activity (activity sum) A(D) in the D direction of the pixel of interest (the portion indicated by C in the figure) is represented by a rectangle adjacent to the pixel of interest in the D direction.
  • For example, the rectangle adjacent above the target pixel (C) represents the activity (activity sum) A(V) in the V direction of the target pixel.
  • The class classification by the activity ADRC method can be performed (almost) equivalently to the class classification by the pixel ADRC method using the 8 pixels (directly) adjacent to the target pixel as the class tap, by using as the class tap the activities (activity sums) A(V), A(H), A(D0), A(D1), A(V'), A(H'), A(D0'), and A(D1') in the 8 directions of the target pixel, that is, the V direction, H direction, D0 direction, D1 direction, V' direction, H' direction, D0' direction, and D1' direction.
  • However, if the activities A(V), A(H), A(D0), A(D1), A(V'), A(H'), A(D0'), and A(D1') in the 8 directions of the target pixel are used as the class tap, the number of classes is 256, the data amount of the tap coefficients increases, and the encoding efficiency decreases.
  • the class tap used for class classification by the activity ADRC method is degenerated using the point symmetry of the activity.
  • That is, due to the nature of images, the activity of the target pixel in the D direction generally has point symmetry, that is, it (substantially) coincides with the activity in the direction point-symmetrical to the D direction (the direction opposite by 180 degrees).
  • Therefore, for the V direction and the V' direction of the target pixel, for example, only the activity A(V) in the V direction is used for the class tap.
  • Similarly, for the H direction and the H' direction, only the activity A(H) in the H direction, for the D0 direction and the D0' direction, only the activity A(D0) in the D0 direction, and for the D1 direction and the D1' direction, only the activity A(D1) in the D1 direction is used for the class tap.
  • That is, the class tap used for the class classification by the activity ADRC method is degenerated, as shown in FIG. 4, from the activities A(V), A(H), A(D0), A(D1), A(V'), A(H'), A(D0'), and A(D1') in the 8 directions of the V direction, H direction, D0 direction, D1 direction, V' direction, H' direction, D0' direction, and D1' direction into the activities A(V), A(H), A(D0), and A(D1) in the 4 directions of the V direction, H direction, D0 direction, and D1 direction.
  • Even with the class tap after degeneration, pixels having point symmetry of activity can be classified with (almost) the same performance as when the class tap before degeneration is used.
  • FIG. 5 is a diagram showing an example of class classification by the activity ADRC method performed by degenerating class taps as described in FIG.
  • In the class classification by the activity ADRC method, the activities (activity sums) A(V), A(H), A(D0), and A(D1) in each of the 4 directions of the target pixel, that is, the V direction, H direction, D0 direction, and D1 direction, are subjected to, for example, 1-bit ADRC processing.
  • For example, assume that the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction as the class tap of the target pixel are 150, 10, 10, and 200, respectively.
  • In the 1-bit ADRC processing, if the activity (activity sum) A(D) as the class tap is equal to or greater than the threshold th (or greater than the threshold th), the activity code, which is the ADRC code of the activity A(D), is set to 1; otherwise, the activity code, which is the ADRC code of the activity A(D), is set to 0.
  • the activity codes of the activities (activity sums) 150, 10, 10, and 200 in FIG. 5 are 1,0,0,1.
  • In the activity ADRC method, the activity codes 1, 0, 0, 1, which are the ADRC codes of the activities in the D0 direction, V direction, D1 direction, and H direction of the target pixel as the class tap, are arranged in the order of the D0 direction, V direction, D1 direction, and H direction, and the resulting bit string, for example 1001, is output as the binary code that is the result of the ADRC processing.
  • a binary code obtained by ADRC processing is converted into a class code representing the class of the target pixel and output as a result of class classification of the target pixel.
  • a decimal value representing a binary code as a value corresponding to a binary code obtained by ADRC processing is output as a class code representing a class of a pixel of interest.
  • a decimal value 9 corresponding to the binary code 1001 is output as a class code.
  • the conversion of binary code to class code is not limited to conversion to a decimal value representing the binary code.
  • the conversion of the binary code into the class code can be performed according to an arbitrary conversion rule.
  • When the activities A(D0), A(V), A(D1), and A(H) in the 4 directions of the D0 direction, V direction, D1 direction, and H direction of the target pixel are used as the class tap and 1-bit ADRC processing is performed, the number of classes can be reduced from 256 classes to 16 classes while maintaining (almost) the same class classification performance as before the reduction. As a result, the image quality of the decoded image can be improved as in the case of the 256-class classification, and the data amount of the tap coefficients can be reduced to improve the encoding efficiency.
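  • A minimal sketch of the 16-class activity ADRC classification just described, reproducing the example in which the activities 150, 10, 10, 200 in the D0, V, D1, H order give the binary code 1001 and the class code 9. The DR definition and threshold rule mirror the 1-bit ADRC described earlier and are assumptions to that extent.

      def activity_adrc_class(a_d0, a_v, a_d1, a_h):
          """1-bit ADRC of the 4 activity sums (class tap) -> class code in 0..15."""
          acts = [a_d0, a_v, a_d1, a_h]          # arranged in D0, V, D1, H order
          mx, mn = max(acts), min(acts)
          dr = mx - mn + 1                       # dynamic range DR (Max - Min + 1 assumed)
          th = mn + (dr >> 1)                    # threshold th = Min + DR/2
          code = 0
          for a in acts:                         # activity code: 1 if a >= th else 0
              code = (code << 1) | (1 if a >= th else 0)
          return code

      print(activity_adrc_class(150, 10, 10, 200))   # -> 9 (binary 1001)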
  • In the 1-bit ADRC processing, when the activity A(D) is equal to or greater than the threshold th, the activity code (ADRC code) of the activity A(D) is set to 1. Therefore, when the dynamic range DR is 0, that is, when the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel are all the same, the activity codes of the activities A(D0), A(V), A(D1), and A(H) of the target pixel are all 1, and the binary code is 1111.
  • In this case, the binary code 0000, in which the ADRC codes of the activities A(D0), A(V), A(D1), and A(H) of the target pixel in the D0 direction, V direction, D1 direction, and H direction are all 0, does not occur.
  • Conversely, when the activity code of the activity A(D) is set to 1 only if the activity A(D) is greater than the threshold th, the ADRC codes of the activities A(D0), A(V), A(D1), and A(H) of the target pixel in the D0 direction, V direction, D1 direction, and H direction are all 0 when the dynamic range DR is 0, and the binary code is 0000; in that case, the binary code 1111, in which the activity codes are all 1, does not occur.
  • Therefore, when the activities A(D0), A(V), A(D1), and A(H) in the 4 directions of the D0 direction, V direction, D1 direction, and H direction of the target pixel are adopted as the class tap and the class classification is performed by 1-bit ADRC processing, the number of classes is substantially 15, excluding from the 16 classes the 1 class for the binary code that does not occur (cannot be obtained by the ADRC processing).
  • In a flat part of an image, the activities in all directions are small and nearly equal, so in the class classification using the activities in a plurality of directions (the class classification using the activities in a plurality of directions as the class tap), the pixels of the flat part are preferably classified into the same class.
  • However, if the ADRC processing is simply performed on such small, nearly equal activities, the pixels in the flat part of the image may each be classified into a different class and are not necessarily classified into the same class.
  • Therefore, in the class classification by the activity ADRC method, the class classification can be performed according to not only the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel but also the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H).
  • That is, when the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is less than a threshold TH_DR, the binary code 0000 or 1111 is taken as the result of the ADRC processing, and the binary code 0000 or 1111 is converted into the class (class code) that is assigned to the binary code 0000 or 1111, that is, the class for the case where the dynamic range DR is less than the threshold TH_DR, and that class is output as the result of the class classification.
  • On the other hand, when the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is not less than the threshold TH_DR, as described in FIG. 5, the activities A(D0), A(V), A(D1), and A(H) are subjected to ADRC processing, the binary code obtained as a result of the ADRC processing is converted into the class (class code) assigned to that binary code, and that class is output as the result of the class classification.
  • The class classification by the activity ADRC method performed according to the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) can be said to be a class classification that combines, as so-called subclass classifications, a class classification using the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) and a class classification using the activities A(D0), A(V), A(D1), and A(H) themselves.
  • In the subclass classification using the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) (hereinafter also referred to as DR subclass classification), the target pixel is classified, depending on whether or not the dynamic range DR is less than the threshold TH_DR, into either a class in which the dynamic range DR is less than the threshold TH_DR or a class in which the dynamic range DR is not less than the threshold TH_DR. Hereinafter, a class classified according to whether or not the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is less than the threshold TH_DR is also referred to as a DR class.
  • When the target pixel is classified into the DR class in which the dynamic range DR is not less than the threshold TH_DR, the subclass classification using the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel (hereinafter also referred to as direction subclass classification) is performed.
  • In the direction subclass classification, the ADRC processing of the activities A(D0), A(V), A(D1), and A(H) is performed, and the target pixel is classified into the class assigned to the binary code, other than 0000 and 1111, obtained by the ADRC processing.
  • Hereinafter, a class into which the target pixel is classified by the ADRC processing of the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel is also referred to as a direction class.
  • In the class classification that combines the DR subclass classification and the direction subclass classification, that is, the class classification using the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction and their dynamic range DR, the DR subclass classification is performed first.
  • In the DR subclass classification, when the target pixel is classified into the DR class in which the dynamic range DR is less than the threshold TH_DR, the direction subclass classification is not performed, so no direction class is obtained, and the final class of the target pixel is determined according to the DR class only.
  • On the other hand, when the target pixel is classified in the DR subclass classification into the DR class in which the dynamic range DR is not less than the threshold TH_DR, the direction subclass classification is performed, and the target pixel is classified into a direction class. Then, the final class of the target pixel is determined according to the DR class and the direction class of the target pixel. That is, for example, an integrated class obtained by integrating the DR class and the direction class of the target pixel is determined as the final class of the target pixel. A sketch of this two-stage classification follows.
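  • The two-stage classification (DR subclass classification followed by direction subclass classification) can be sketched as follows. The value TH_DR = 32 and the convention of mapping the flat case to a single class 0 (as in FIG. 9) are illustrative assumptions consistent with the description; the direction class here is the raw binary code before any degeneration by transposition.

      TH_DR = 32   # assumed threshold for the DR subclass classification

      def classify_by_activity_adrc(a_d0, a_v, a_d1, a_h):
          """Two-stage class classification: DR subclass, then direction subclass."""
          acts = [a_d0, a_v, a_d1, a_h]
          dr = max(acts) - min(acts) + 1
          if dr < TH_DR:
              return 0                       # DR class alone decides the final class
          th = min(acts) + (dr >> 1)         # direction subclass: 1-bit ADRC
          code = 0
          for a in acts:
              code = (code << 1) | (1 if a >= th else 0)
          return code                        # direction class (a code in 0001..1110)

      print(classify_by_activity_adrc(20, 21, 20, 22))     # flat part -> class 0
      print(classify_by_activity_adrc(150, 10, 10, 200))   # -> 9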
  • FIG. 6 is a diagram for explaining that the tap coefficient used in the class classification prediction process is degenerated using the point symmetry of the image activity.
  • In FIG. 6, a rectangle with the number #n represents the tap coefficient w_n used in the first-order prediction formula whose prediction taps are taken from the pixels in the range of 9 × 9 pixels (horizontal × vertical) centered on the target pixel.
  • The tap structure of the tap coefficients w_n (the tap structure of the prediction taps) is a rhombus (diamond shape) whose horizontal and vertical diagonals are each 9 pixels long, as shown in FIG. 6.
  • Therefore, the calculation of the prediction formula uses the 41 tap coefficients w_n: w_4, w_12 to w_14, w_20 to w_24, w_28 to w_34, w_36 to w_44, w_46 to w_52, w_56 to w_60, w_66 to w_68, and w_76.
  • Among the 41 tap coefficients, the central tap coefficient w_40 is the tap coefficient for the target pixel (the tap coefficient that is multiplied by the (pixel value of the) target pixel in the prediction formula).
  • As described above, an image generally has point symmetry of activity, in which the activities in point-symmetrical directions (almost) coincide.
  • The tap coefficients obtained by performing tap coefficient learning using images having such point symmetry of activity tend to have (substantially) the same values at point-symmetrical positions.
  • The present technology uses the fact that tap coefficients at point-symmetrical positions tend to have the same value due to the point symmetry of activity to degenerate the tap coefficients (reduce the (substantial) number of tap coefficients).
  • That is, of the 41 tap coefficients, the 20 tap coefficients w_41 to w_44, w_46 to w_52, w_56 to w_60, w_66 to w_68, and w_76 that follow, in raster scan order, the tap coefficient w_40 for the target pixel are eliminated, and the tap coefficients w_n are thereby degenerated.
  • The tap coefficients after degeneration obtained by degenerating the tap coefficients are the 21 tap coefficients w_4, w_12 to w_14, w_20 to w_24, w_28 to w_34, and w_36 to w_40 (the hatched portions in FIG. 6).
  • the encoding efficiency can be further improved by reducing the tap coefficients as described above.
  • the tap coefficient after degeneration is also referred to as a degenerate tap coefficient.
  • the degenerate tap coefficient is expanded and used.
  • FIG. 7 is a diagram for explaining the development of the degenerate tap coefficients.
  • In the expansion, for each tap coefficient w_n eliminated by the degeneration, the degenerate tap coefficient at the position point-symmetrical to the position of the tap coefficient w_n about the position of the tap coefficient w_40 is used as the tap coefficient w_n, whereby the degenerate tap coefficients are expanded into 41 tap coefficients.
  • the expanded tap coefficient obtained by expanding the degenerate tap coefficient is also referred to as an expanded tap coefficient.
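  • A sketch of the expansion of the degenerate tap coefficients by point symmetry: for the diamond-shaped 41-tap structure, each position after the center (in raster-scan order) reuses the coefficient of the position point-symmetrical to it about the center. The enumeration of the diamond positions below is an assumption made for this illustration; only the point-symmetry rule comes from the text, and the numbering does not reproduce the w_n indices of FIG. 6.

      def diamond_positions(radius=4):
          """The 41 prediction-tap positions: a diamond (|dy| + |dx| <= radius)
          centred on the target pixel, listed in raster-scan order."""
          return [(dy, dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)
                  if abs(dy) + abs(dx) <= radius]

      def expand_tap_coefficients(degenerate):
          """Expand the 21 degenerate coefficients into the full 41 expanded coefficients."""
          pos = diamond_positions()
          half = len(pos) // 2                                        # 20 positions before the centre
          coeff = {pos[i]: degenerate[i] for i in range(half + 1)}    # first 21 positions
          for dy, dx in pos[half + 1:]:                               # remaining 20 positions
              coeff[(dy, dx)] = coeff[(-dy, -dx)]                     # copy point-symmetrical coefficient
          return [coeff[p] for p in pos]

      expanded = expand_tap_coefficients(list(range(21)))             # dummy degenerate coefficients
      print(len(expanded), expanded[:5], expanded[-5:])               # 41 [0..4] [4, 3, 2, 1, 0]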
  • FIG. 8 is a diagram for explaining an example of transposition of expanded tap coefficients when performing filter processing as prediction processing using expanded tap coefficients.
  • FIG. 8A shows the expansion tap coefficient.
  • the j-th position from the left and the i-th position from the top is represented as position (i, j).
  • The expanded tap coefficients after transposition shown in FIG. 8 are the expanded tap coefficients obtained by performing the transposition of each transposition mode on the expanded tap coefficients.
  • the transposition mode represents a transposition method.
  • the expanded tap coefficient after transposition is also referred to as a transposed tap coefficient hereinafter.
  • FIG. 8B shows the transposed tap coefficient obtained by transposing the transpose mode 0.
  • Transposition mode 0 is a mode in which no transposition is performed, that is, a mode in which the expanded tap coefficient at position (i, j) is moved to position (i, j); the transposed tap coefficients of transposition mode 0 therefore match the expanded tap coefficients.
  • The transposed tap coefficients of transposition mode 1 correspond to the tap coefficients obtained by moving the expanded tap coefficients in line symmetry with respect to a straight line at 45 degrees to the horizontal, and the transposed tap coefficients of transposition mode 2 correspond to the tap coefficients obtained by moving the expanded tap coefficients in line symmetry with respect to a straight line at 90 degrees to the horizontal.
  • Transposition mode 3 is a mode in which the transpositions of transposition modes 1 and 2 are both performed, and the transposed tap coefficients of transposition mode 3 match the tap coefficients obtained by moving the expanded tap coefficients in line symmetry with respect to the straight line at 45 degrees to the horizontal and further in line symmetry with respect to the straight line at 90 degrees to the horizontal.
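  • The four transposition modes can be sketched as simple flips of the square grid holding the expanded tap coefficients. The concrete assignment below (mode 1 = flip about the 45-degree diagonal, i.e. a matrix transpose; mode 2 = flip about the vertical axis; mode 3 = both) follows the description above, but which of the two 45-degree diagonals is meant is an assumption of this sketch.

      def transpose_coefficients(grid, mode):
          """Apply a transposition mode to a square grid of expanded tap coefficients.
          mode 0: identity; mode 1: flip about the 45-degree diagonal (matrix transpose);
          mode 2: flip about the vertical axis; mode 3: mode 1 followed by mode 2."""
          n = len(grid)
          if mode & 1:                                   # 45-degree diagonal flip
              grid = [[grid[j][i] for j in range(n)] for i in range(n)]
          if mode & 2:                                   # vertical-axis flip (mirror each row)
              grid = [row[::-1] for row in grid]
          return grid

      g = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
      for m in range(4):
          print("mode", m, transpose_coefficients(g, m))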
  • Now, suppose that an appropriate filter process can be performed on a certain target pixel using the transposed tap coefficients of transposition mode 0 (that is, an appropriate predicted value is obtained from the prediction formula using those transposed tap coefficients).
  • In this case, it is assumed that an appropriate filter process can be performed using the transposed tap coefficients of transposition mode 1 for a pixel whose activity matches the activity of the target pixel after the transposition of transposition mode 1. Likewise, it is assumed that an appropriate filter process can be performed using the transposed tap coefficients of transposition mode 2 for a pixel whose activity matches after the transposition of transposition mode 2, and using the transposed tap coefficients of transposition mode 3 for a pixel whose activity matches after the transposition of transposition mode 3.
  • That is, for pixels whose activities match by the transposition of some transposition mode, the filter process can be performed using the same expanded tap coefficients, by using the transposed tap coefficients obtained by performing on those expanded tap coefficients the transposition of the transposition mode that makes the activities match.
  • Therefore, pixels whose activities match by the transposition of any of the transposition modes 0 to 3 are classified into the same class, and for the tap coefficients of that class, the transposed tap coefficients obtained by performing the transposition of any of the transposition modes 0 to 3 are used for the filter process. In this way, the number of classes can be degenerated (reduced).
  • Hereinafter, degenerating the number of classes by classifying pixels whose activities match by the transposition of any of the transposition modes 0 to 3 into the same class is also referred to as degeneration of the number of classes by transposition.
  • the number of classes obtained by the class classification by the activity ADRC method described in FIG. 5 can be reduced from 16 classes to 8 classes.
  • FIG. 9 is a diagram for explaining an example of class classification by the activity ADRC method in which the number of classes is reduced from 16 classes to 8 classes.
  • the horizontal axis represents a class (class code thereof), and the vertical axis represents a transposition mode.
  • FIG. 9 shows, for each class, the code pattern in which the activity codes (the values after the 1-bit ADRC processing) of the activities (activity sums) A(V), A(H), A(D0), and A(D1) in the 4 directions of the target pixel (indicated by C in the figure), that is, the V direction, H direction, D0 direction, and D1 direction, are placed in each of the 4 directions of the V direction, H direction, D0 direction, and D1 direction and in the point-symmetrical directions, the V' direction, H' direction, D0' direction, and D1' direction.
  • In the class classification by the activity ADRC method in which the number of classes is degenerated by transposition, pixels whose code patterns become the same by the transposition of any of the transposition modes 0 to 3 are classified into the same class.
  • For example, for the binary codes 1100, 1001, 0110, and 0011, in which the activity codes of the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction are arranged in this order, the code patterns become the same by performing the transpositions of transposition modes 0 to 3, respectively. For this reason, the binary codes 1100, 1001, 0110, and 0011 are assigned to the same class.
  • In FIG. 9, class 4 (the class of class code 4) is assigned to the binary codes 1100, 1001, 0110, and 0011, so pixels whose binary code is 1100, 1001, 0110, or 0011 are all classified into the same class 4.
  • Similarly, for example, the binary codes 1000 and 0010, in which the activity codes of the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction are arranged in this order, are assigned to the same class 6. For this reason, pixels whose binary code is 1000 and pixels whose binary code is 0010 are both classified into the same class 6.
  • In FIG. 9, a class is assigned to each binary code, and a transposition mode is also assigned. That is, in each class, transposition mode 0 is assigned to one of the binary codes assigned to that class.
  • To each of the other binary codes assigned to the class, the transposition mode (one of transposition modes 1 to 3) whose transposition, when performed on the code pattern of the binary code to which transposition mode 0 is assigned, yields the code pattern of that binary code, is assigned.
  • Therefore, the code pattern of a binary code to which any of the transposition modes 1 to 3 is assigned matches the code pattern obtained by performing the transposition of the transposition mode assigned to that binary code on the code pattern of the binary code to which transposition mode 0 is assigned.
  • For example, the code pattern obtained by performing the transposition of transposition mode 2 on the code pattern of the binary code 1100, to which transposition mode 0 of class 4 is assigned, matches the code pattern of the binary code 0110, to which transposition mode 2 of class 4 is assigned.
  • the number of classes obtained by the class classification by the activity ADRC method described in FIG. 5 is reduced from 16 classes to 8 classes 0 to 7 shown in FIG. Can do.
  • In FIG. 9, class 0 is assigned to the binary codes 0000 and 1111.
  • The class classification by the activity ADRC method can be performed according to the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H).
  • In this case, class 0 is used as a class (DR class) for which the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is less than a threshold TH DR.
  • That is, the binary codes 0000 and 1111 are binary codes that are obtained by the ADRC processing only when the activities A(D0), A(V), A(D1), and A(H) are all equal, or binary codes that cannot be obtained by the ADRC processing.
  • To the binary codes 0000 and 1111, class (DR class) 0, for which the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is less than the threshold TH DR, and transposition mode 0 are assigned.
  • When the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is less than the threshold TH DR, class 0, assigned to the binary codes 0000 and 1111, is output as the result of the class classification, and transposition mode 0, assigned to the binary codes 0000 and 1111, is output.
  • When the dynamic range DR is equal to or greater than the threshold TH DR, the activities A(D0), A(V), A(D1), and A(H) are subjected to the ADRC processing, and, among the classes assigned to the binary codes 0001 to 1110, the class assigned to the binary code obtained as a result of the ADRC processing is output as the result of the class classification. Likewise, among the transposition modes assigned to the binary codes, the transposition mode assigned to the binary code obtained as a result of the ADRC processing is output. A minimal lookup for this code derivation is sketched below.
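  • The mapping from a 4-bit binary code to a (class, transposition mode) pair can be held as a simple lookup table. The Python sketch below is a minimal illustration; only the assignments stated explicitly above (class 0 for 0000/1111, class 4 for 1100/1001/0110/0011, class 6 for 1000/0010) are filled in, the modes assumed for 1001, 0011, 1000, and 0010 are illustrative, and the entries for the remaining binary codes of FIG. 9 are not reproduced here.

```python
# Minimal sketch of the code-derivation lookup, assuming a dict from the 4-bit
# binary code (activity codes of A(D0), A(V), A(D1), A(H), in that order) to a
# (class, transposition_mode) pair.
CODE_TO_CLASS_MODE = {
    # DR class 0: used when the dynamic range DR is below TH_DR.
    "0000": (0, 0), "1111": (0, 0),
    # Class 4: 1100 is mode 0 and 0110 is mode 2 (as stated above);
    # the modes for 1001 and 0011 are assumed here for illustration only.
    "1100": (4, 0), "1001": (4, 1), "0110": (4, 2), "0011": (4, 3),
    # Class 6: the text states modes 0 and 2 exist; which code gets which
    # mode is an assumption here.
    "1000": (6, 0), "0010": (6, 2),
    # ... the remaining codes map to classes 1, 2, 3, 5 and 7 with their
    # transposition modes as defined in FIG. 9 (not shown in this excerpt).
}

def derive_class_and_mode(binary_code: str) -> tuple[int, int]:
    """Return the (class, transposition mode) assigned to a binary code."""
    return CODE_TO_CLASS_MODE[binary_code]
```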
  • The threshold TH DR of the dynamic range DR can be set in advance to a fixed value such as 16, 32, or 64. Alternatively, the threshold TH DR can be set to a value such as 16, 32, or 64 according to a quantization parameter such as the slice QP, for example, to a larger value as the quantization parameter increases, as sketched below.
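  • As one possible reading of this rule, the sketch below picks TH_DR from the quantization parameter; the specific QP boundaries (here 30 and 40) are illustrative assumptions, not values given in the text.

```python
def select_th_dr(slice_qp: int) -> int:
    """Choose the dynamic-range threshold TH_DR from the slice QP.

    The candidate values 16, 32 and 64 follow the text; the QP boundaries
    below are assumptions made for illustration only.
    """
    if slice_qp < 30:
        return 16
    if slice_qp < 40:
        return 32
    return 64
```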
  • FIG. 10 illustrates classes 1 to 7 in the case where the dynamic range DR is not less than the threshold TH DR among the 8 classes after degeneration obtained by the class classification based on the activity ADRC method performed in accordance with the dynamic range DR.
  • Classes 1 to 5 are classes of pixels in which the activity in the V direction or the H direction is dominant, and classes 3 to 7 are classes of pixels in which the activity in the D0 direction or the D1 direction is dominant.
  • If the dynamic range DR is equal to or greater than the threshold TH DR, a pixel in which the activity in the V direction or the H direction is dominant is classified, depending on its class and transposition mode, into one of the 11 patterns surrounded by B1 in FIG. 10. Similarly, if the dynamic range DR is equal to or greater than the threshold TH DR, a pixel in which the activity in the D0 direction or the D1 direction is dominant is classified, depending on its class and transposition mode, into one of the 11 patterns surrounded by B2 in FIG. 10.
  • According to the activity ADRC method, the direction of the activity can be distinguished in detail from the activities in the four directions H, V, D0, and D1, for example, the case where the activities in the V direction and the H direction are larger than the activities in the D0 direction and the D1 direction (class 1, transposition mode 0), or the case where the activities in the D0 direction, the V direction, and the D1 direction are larger than the activity in the H direction (class 3, transposition mode 0).
  • Therefore, with the activity ADRC method, it is possible to classify the direction of activity in more detail than with the GALF class classification described in Non-Patent Document 2 and to perform filter processing appropriate to that direction. As a result, the image quality of the filter image obtained by the filter processing can be improved.
  • FIG. 11 is a diagram showing transposed tap coefficients used for class 1 pixel filtering.
  • For the filter processing of class 1 pixels, the transposed tap coefficients obtained by performing the transposition of transposition mode 0 on the expanded tap coefficients obtained by expanding the class 1 degenerate tap coefficients as described above, that is, the expanded tap coefficients obtained by expanding the class 1 degenerate tap coefficients, are used.
  • Similarly, for the filter processing of pixels in class 0 transposition mode 0, the expanded tap coefficients obtained by expanding the class 0 degenerate tap coefficients are used.
  • FIG. 12 is a diagram showing transposed tap coefficients used for class 2 pixel filtering.
  • In class 2, transposition modes 0 and 1 exist.
  • For the filter processing of pixels in class 2 transposition mode 0, the transposed tap coefficients obtained by performing the transposition of transposition mode 0 on the expanded tap coefficients obtained by expanding the class 2 degenerate tap coefficients, that is, the expanded tap coefficients obtained by expanding the class 2 degenerate tap coefficients, are used.
  • For the filter processing of pixels in class 2 transposition mode 1, the transposed tap coefficients obtained by performing the transposition of transposition mode 1 on the expanded tap coefficients obtained by expanding the class 2 degenerate tap coefficients are used.
  • FIG. 13 is a diagram showing transposed tap coefficients used for class 3 pixel filtering.
  • In class 3, transposition modes 0 and 1 exist.
  • For the filter processing of pixels in class 3 transposition mode 0, the transposed tap coefficients obtained by performing the transposition of transposition mode 0 on the expanded tap coefficients obtained by expanding the class 3 degenerate tap coefficients, that is, the expanded tap coefficients obtained by expanding the class 3 degenerate tap coefficients, are used.
  • For the filter processing of pixels in class 3 transposition mode 1, the transposed tap coefficients obtained by performing the transposition of transposition mode 1 on the expanded tap coefficients obtained by expanding the class 3 degenerate tap coefficients are used.
  • FIG. 14 is a diagram showing transposed tap coefficients used for class 4 pixel filtering.
  • In class 4, transposition modes 0 to 3 exist.
  • For the filter processing of pixels in class 4 transposition mode 0, the transposed tap coefficients obtained by performing the transposition of transposition mode 0 on the expanded tap coefficients obtained by expanding the class 4 degenerate tap coefficients, that is, the expanded tap coefficients obtained by expanding the class 4 degenerate tap coefficients, are used.
  • For the filter processing of pixels in class 4 transposition modes 1 to 3, the transposed tap coefficients obtained by performing the transposition of transposition modes 1 to 3, respectively, on the expanded tap coefficients obtained by expanding the class 4 degenerate tap coefficients are used.
  • FIG. 15 is a diagram showing transposed tap coefficients used for class 5 pixel filtering.
  • In class 5, transposition modes 0 and 2 exist.
  • For the filter processing of pixels in class 5 transposition mode 0, the transposed tap coefficients obtained by performing the transposition of transposition mode 0 on the expanded tap coefficients obtained by expanding the class 5 degenerate tap coefficients, that is, the expanded tap coefficients obtained by expanding the class 5 degenerate tap coefficients, are used.
  • For the filter processing of pixels in class 5 transposition mode 2, the transposed tap coefficients obtained by performing the transposition of transposition mode 2 on the expanded tap coefficients obtained by expanding the class 5 degenerate tap coefficients are used.
  • FIG. 16 is a diagram showing transposed tap coefficients used for class 6 pixel filtering.
  • In class 6, transposition modes 0 and 2 exist.
  • For the filter processing of pixels in class 6 transposition mode 0, the transposed tap coefficients obtained by performing the transposition of transposition mode 0 on the expanded tap coefficients obtained by expanding the class 6 degenerate tap coefficients, that is, the expanded tap coefficients obtained by expanding the class 6 degenerate tap coefficients, are used.
  • For the filter processing of pixels in class 6 transposition mode 2, the transposed tap coefficients obtained by performing the transposition of transposition mode 2 on the expanded tap coefficients obtained by expanding the class 6 degenerate tap coefficients are used.
  • FIG. 17 is a diagram showing transposed tap coefficients used for class 7 pixel filtering.
  • For the filter processing of class 7 pixels, the transposed tap coefficients obtained by performing the transposition of transposition mode 0 on the expanded tap coefficients obtained by expanding the class 7 degenerate tap coefficients, that is, the expanded tap coefficients obtained by expanding the class 7 degenerate tap coefficients, are used.
  • FIG. 18 is a diagram for explaining filter processing as prediction processing using degenerate tap coefficients.
  • In FIG. 18, the transposition mode obtained by the class classification of the target pixel by the activity ADRC method is transposition mode 0.
  • In the filter processing, the expanded tap coefficients obtained by expanding the degenerate tap coefficients wn of the class of the target pixel are used as the transposed tap coefficients, and the primary prediction formula is calculated.
  • In this case, the primary prediction formula is y' = w4(x4 + x76) + w12(x12 + x68) + w13(x13 + x67) + w14(x14 + x66) + w20(x20 + x60) + w21(x21 + x59) + w22(x22 + x58) + w23(x23 + x57) + w24(x24 + x56) + w28(x28 + x52) + w29(x29 + x51) + w30(x30 + x50) + w31(x31 + x49) + w32(x32 + x48) + w33(x33 + x47) + w34(x34 + x46) + w36(x36 + x44) + …
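  • Reading the visible terms, each coefficient multiplies the sum of a prediction-tap pixel and the pixel at the point-symmetric position about the tap center (the paired indices sum to 80, which would be consistent with an 81-pixel tap whose center index is 40). The Python sketch below evaluates such a point-symmetric primary prediction formula; the tap count of 81 and the treatment of the center pixel are assumptions made for illustration.

```python
def apply_symmetric_prediction(taps, coeffs, tap_count=81):
    """Evaluate a point-symmetric primary prediction formula (sketch).

    taps   : list of prediction-tap pixel values x0..x[tap_count-1], ordered so
             that index i and index (tap_count - 1 - i) are point-symmetric
             about the target pixel.
    coeffs : dict {tap_index: w} holding one coefficient per symmetric pair
             (and optionally the center tap), i.e. the expanded/transposed
             tap coefficients derived from the degenerate tap coefficients.
    """
    center = (tap_count - 1) // 2            # index 40 for an 81-pixel tap (assumed)
    y = 0.0
    for n, w in coeffs.items():
        if n == center:
            y += w * taps[n]                 # center pixel has no symmetric partner
        else:
            y += w * (taps[n] + taps[tap_count - 1 - n])
    return y
```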
  • FIG. 19 is a diagram showing an example of class classification by the activity ADRC method performed in accordance with the dynamic range DR.
  • FIG. 19A shows a first example of class classification based on the activity ADRC method performed in accordance with the dynamic range DR.
  • In the first example, when the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is less than the threshold TH DR, the target pixel is subclassified into the Low class, which is the DR class whose dynamic range DR is less than the threshold TH DR.
  • When the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is not less than the threshold TH DR, the target pixel is subclassified into the Middle class, which is the DR class whose dynamic range DR is not less than the threshold TH DR.
  • Then, the target pixel is classified into a class (hereinafter also referred to as a final class) according to the DR class and the direction class.
  • the pixel of interest in which the DR class is subclassed into the low class is classified into the final class 0.
  • the pixel of interest in which the DR class is subclassed into the middle class is classified into one of the final classes 1 to 7 depending on the direction class of the pixel of interest.
  • the pixel of interest is classified into one of the eight final classes as described above.
  • FIG. 19B shows a second example of class classification based on the activity ADRC method performed in accordance with the dynamic range DR.
  • In the second example, when the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is less than a first threshold TH DR1, the target pixel is subclassified into the Low class, which is the DR class whose dynamic range DR is less than the first threshold TH DR1.
  • When the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is not less than the first threshold TH DR1 and is less than a second threshold TH DR2, the target pixel is subclassified into the Middle class, which is the DR class whose dynamic range DR is not less than the first threshold TH DR1 and less than the second threshold TH DR2.
  • When the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is not less than the second threshold TH DR2, the target pixel is subclassified into the High class, which is the DR class whose dynamic range DR is not less than the second threshold TH DR2.
  • Further, when the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H) is equal to or greater than the first threshold TH DR1, the activities A(D0), A(V), A(D1), and A(H) are subjected to the ADRC processing, and the target pixel is subclassified into, among the direction classes 1 to 7 described above, the direction class assigned to the binary code obtained as a result of the ADRC processing.
  • the target pixel is classified into the final class according to the DR class and the direction class.
  • the pixel of interest in which the DR class is subclassed into the low class is classified into the final class 0.
  • the pixel of interest in which the DR class is subclassed into the middle class is classified into one of the final classes 1 to 7 depending on the direction class of the pixel of interest.
  • the pixel of interest in which the DR class is subclassed to the high class is classified into one of the final classes 8 to 14 depending on the direction class of the pixel of interest.
  • the pixel of interest is classified into one of the 15 final classes as described above.
  • In FIG. 19, the pixel of interest is subclassified into one of two or three DR classes according to the dynamic range DR of the activities A(D0), A(V), A(D1), and A(H), but the pixel of interest can also be subclassified into one of four or more DR classes. A minimal sketch of the second example is given below.
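  • As a concrete illustration of the second example, the following Python sketch maps the dynamic range DR and the direction class (1 to 7) to one of the 15 final classes; the function and variable names are hypothetical, and the direction-class derivation itself is assumed to be available from the ADRC-based classification described above.

```python
def classify_final(dr: int, direction_class: int,
                   th_dr1: int, th_dr2: int) -> int:
    """Map the dynamic range DR and the direction class (1..7) to a final class.

    Final class 0      : Low DR class    (DR < TH_DR1)
    Final classes 1-7  : Middle DR class (TH_DR1 <= DR < TH_DR2), by direction class
    Final classes 8-14 : High DR class   (DR >= TH_DR2), by direction class
    """
    if dr < th_dr1:
        return 0                          # Low class  -> final class 0
    if dr < th_dr2:
        return direction_class            # Middle class -> final classes 1..7
    return direction_class + 7            # High class -> final classes 8..14
```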
  • FIG. 20 shows a configuration example of a class classification prediction filter that performs class classification by the activity ADRC method and performs filter processing as class classification prediction processing to which a prediction formula using a tap coefficient of a class obtained by the class classification is applied. It is a block diagram.
  • the class classification prediction filter 10 includes a class classification unit 11, a tap coefficient acquisition unit 12, and a prediction unit 13.
  • the class classification unit 11 and the prediction unit 13 are supplied with a target image (for example, a decoded image) to be subjected to filter processing.
  • the class classification unit 11 sequentially selects a pixel of the target image as a target pixel, classifies the target pixel by the activity ADRC method, obtains a class (final class) and a transposition mode of the target pixel, and tap coefficients Supply to the acquisition unit 12.
  • The tap coefficient acquisition unit 12 stores the degenerate tap coefficients for each class obtained by tap coefficient learning, and acquires, according to the class and the transposition mode of the pixel of interest from the class classification unit 11, the tap coefficients used for the filter processing as the prediction processing.
  • the tap coefficient acquisition unit 12 expands the degenerated tap coefficient of the class of the target pixel from the class classification unit 11 and generates a developed tap coefficient. Further, the tap coefficient acquisition unit 12 transposes the expanded tap coefficient according to the transposition mode of the target pixel from the class classification unit 11 to generate a transposed tap coefficient. As described above, the tap coefficient acquisition unit 12 generates a transposed tap coefficient corresponding to the class of the target pixel and the transposed mode, and supplies the generated transposed tap coefficient to the prediction unit 13.
  • The prediction unit 13 performs filter processing as prediction processing that applies, to the target image, a prediction formula using the transposed tap coefficients corresponding to the class and transposition mode of the pixel of interest from the tap coefficient acquisition unit 12, and outputs the filter image generated by the filter processing.
  • That is, the prediction unit 13 selects, for example, a plurality of pixels in the vicinity of the target pixel among the pixels of the target image as the prediction taps of the target pixel. Furthermore, the prediction unit 13 performs prediction processing that applies a prediction formula composed of the transposed tap coefficients corresponding to the class and transposition mode of the target pixel to the target image, that is, to the pixels serving as the prediction taps of the target pixel.
  • By this prediction processing, a predicted value y' of a pixel (pixel value) of an image corresponding to a teacher image (for example, an original image for a decoded image) is obtained.
  • The prediction unit 13 then generates and outputs a filter image composed of such predicted values; a minimal sketch of this pipeline follows below.
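  • The overall flow of the class classification prediction filter (class classification unit 11, tap coefficient acquisition unit 12, prediction unit 13) can be sketched as follows in Python. The expansion and transposition routines are placeholders, since their exact geometry depends on the tap shape defined elsewhere in the document; all function names are hypothetical.

```python
def classification_prediction_filter(decoded_image, degenerate_coeffs,
                                     classify, expand, transpose,
                                     gather_prediction_taps):
    """Sketch of the class classification prediction filter of FIG. 20.

    classify(image, pixel)            -> (final_class, transposition_mode)
    expand(degenerate_coeff_vector)   -> expanded tap coefficients
    transpose(expanded, mode)         -> transposed tap coefficients
    gather_prediction_taps(image, px) -> list of prediction-tap pixel values
    All of these are assumed helpers standing in for the processing units
    described in the text.
    """
    filter_image = {}
    for pixel in decoded_image.pixels():                    # target pixels in turn
        final_class, mode = classify(decoded_image, pixel)  # class classification unit 11
        expanded = expand(degenerate_coeffs[final_class])   # tap coefficient acquisition unit 12
        transposed = transpose(expanded, mode)
        taps = gather_prediction_taps(decoded_image, pixel) # prediction unit 13
        # Primary prediction formula: product-sum of coefficients and tap pixels.
        filter_image[pixel] = sum(w * x for w, x in zip(transposed, taps))
    return filter_image
```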
  • the degenerate tap coefficients for each class stored in the tap coefficient acquisition unit 12 can be supplied to the class classification prediction filter 10 from the outside.
  • the class classification prediction filter 10 includes a learning unit 21 that performs tap coefficient learning.
  • In the learning unit 21, the degenerate tap coefficients for each class obtained by performing tap coefficient learning using a teacher image and a student image can be stored in the tap coefficient acquisition unit 12.
  • When the class classification prediction filter 10 is applied to an encoding apparatus, an original image to be encoded can be adopted as the teacher image, and a decoded image obtained by encoding and locally decoding the original image can be adopted as the student image.
  • In the tap coefficient learning of the learning unit 21, class classification by the activity ADRC method similar to that of the class classification unit 11 is performed using the decoded image as the student image, and, for each class obtained by the class classification, a degenerate tap coefficient that statistically minimizes the prediction error of the predicted value of the teacher image obtained from the prediction formula composed of the prediction taps and the transposed tap coefficients transposed according to the transposition mode, which is also obtained by the class classification, is determined by the method of least squares.
  • the class classification prediction filter 10 incorporating the learning unit 21 is also referred to as a class classification prediction filter 10 with a learning function.
  • FIG. 21 is a block diagram illustrating a configuration example of the class classification unit 11 of FIG.
  • the class classification unit 11 includes an activity calculation unit 31, an ADRC processing unit 32, and a code derivation unit 33.
  • the activity calculation unit 31 is supplied with, for example, a decoded image as a target image.
  • the activity calculation unit 31 sequentially selects a pixel (for example, a predetermined pixel) of the decoded image as a target pixel, and activity (activity) in the D0 direction, the V direction, the D1 direction, and the H direction of the target pixel. Sum) A (D0), A (V), A (D1), A (H) are obtained and supplied to the ADRC processing unit 32.
  • the ADRC processing unit 32 calculates the activity A (D0), A (V), A (D1), and A (H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel from the activity calculation unit 31. Performs 1-bit ADRC processing. Then, the ADRC processing unit 32 supplies the code deriving unit 33 with a binary code for the target pixel obtained by the 1-bit ADRC process performed for the target pixel.
  • the code deriving unit 33 derives (generates) a class (final class) and a transposition mode assigned to the binary code according to the binary code for the pixel of interest from the ADRC processing unit 32, and classifies and transposes the pixel of interest. Output as mode.
  • the class and transposition mode of the target pixel output from the code deriving unit 33 are supplied to the tap coefficient acquisition unit 12 (FIG. 20).
  • FIG. 22 is a flowchart illustrating an example of filter processing as class classification prediction processing performed by the class classification prediction filter 10 of FIG.
  • In step S11, the class classification unit 11 sequentially selects a pixel of the decoded image as the target image to be the target pixel (the pixel to be subjected to class classification), and the process proceeds to step S12.
  • In step S12, the class classification unit 11 classifies the target pixel by the activity ADRC method, obtains the class (final class) and the transposition mode of the target pixel, supplies them to the tap coefficient acquisition unit 12, and the process proceeds to step S13.
  • In step S13, the tap coefficient acquisition unit 12 acquires, from among the degenerate tap coefficients for each class, the degenerate tap coefficients of the class of the pixel of interest from the class classification unit 11, and the process proceeds to step S14.
  • In step S14, the tap coefficient acquisition unit 12 expands the degenerate tap coefficients of the class of the pixel of interest to generate expanded tap coefficients. Further, the tap coefficient acquisition unit 12 transposes the expanded tap coefficients according to the transposition mode of the target pixel from the class classification unit 11 to generate transposed tap coefficients. The tap coefficient acquisition unit 12 then supplies the transposed tap coefficients corresponding to the class and transposition mode of the target pixel generated in this way to the prediction unit 13, and the process proceeds to step S15.
  • In step S15, the prediction unit 13 performs filter processing as prediction processing in which a prediction formula configured using the transposed tap coefficients corresponding to the class and transposition mode of the pixel of interest from the tap coefficient acquisition unit 12 is applied to the decoded image.
  • That is, the prediction unit 13 selects pixels to be the prediction taps of the target pixel from the decoded image, calculates the primary prediction formula using the prediction taps and the transposed tap coefficients corresponding to the class and transposition mode of the target pixel, and thereby obtains the predicted value of the pixel (pixel value) of the original image corresponding to the target pixel. The prediction unit 13 then generates and outputs a filter image composed of such predicted values.
  • FIG. 23 is a flowchart for explaining an example of class classification processing by the activity ADRC method performed in step S12 of FIG.
  • step S21 the activity calculation unit 31 of the class classification unit 11 (FIG. 21) uses the decoded image, and the activity (activity sum) A (D0) of the pixel of interest in the D0 direction, the V direction, the D1 direction, and the H direction. ), A (V), A (D1), A (H) are obtained and supplied to the ADRC processing unit 32, and the process proceeds to step S22.
  • In step S22, the ADRC processing unit 32 obtains the maximum value Max and the minimum value Min of the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel from the activity calculation unit 31, and the process proceeds to step S23.
  • In step S23, the ADRC processing unit 32 uses the maximum value Max and the minimum value Min to obtain the dynamic range DR = Max − Min + 1 of the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel, and the process proceeds to step S24.
  • In step S25, the ADRC processing unit 32 sequentially selects the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel as the activity of interest A, and the process proceeds to step S26.
  • step S26 the ADRC processing unit 32 determines whether or not the attention activity A is equal to or greater than the threshold th.
  • If it is determined in step S26 that the activity of interest A is equal to or greater than the threshold th, the process proceeds to step S27, where the ADRC processing unit 32 sets the activity code of the activity of interest A to 1, and the process proceeds to step S29.
  • If it is determined in step S26 that the activity of interest A is not equal to or greater than the threshold th, the process proceeds to step S28, where the ADRC processing unit 32 sets the activity code of the activity of interest A to 0, and the process proceeds to step S29.
  • In step S29, the ADRC processing unit 32 arranges the activity codes of the activities A(D0), A(V), A(D1), and A(H) in the D0 direction, V direction, D1 direction, and H direction of the target pixel in that order to generate the binary code for the pixel of interest. Further, in step S29, the ADRC processing unit 32 supplies the binary code for the pixel of interest to the code deriving unit 33, and the process proceeds to step S30.
  • step S30 the code deriving unit 33 obtains (derived) the class (final class) and transposition mode assigned to the binary code in accordance with the binary code for the target pixel from the ADRC processing unit 32, and the target pixel. Are supplied to the tap coefficient acquisition unit 12 (FIG. 20) as the class and the transposition mode, and the process returns.
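  • The steps S21 to S30 above can be summarized in a short Python sketch. The activity computation itself and the class/mode lookup table are assumed to be provided elsewhere; in addition, the threshold th used for the 1-bit ADRC comparison is taken here as the midpoint Min + DR/2, which is a common choice for 1-bit ADRC but is an assumption, since the step that derives th is not reproduced in this excerpt.

```python
def adrc_classify(a_d0, a_v, a_d1, a_h, code_to_class_mode):
    """1-bit ADRC classification of the four directional activities (steps S21-S30).

    a_d0, a_v, a_d1, a_h : activity sums of the target pixel in the D0, V, D1, H
                           directions (step S21, computed elsewhere).
    code_to_class_mode   : lookup from the 4-bit binary code to (class, mode),
                           e.g. the CODE_TO_CLASS_MODE table sketched earlier.
    """
    activities = [a_d0, a_v, a_d1, a_h]
    a_max, a_min = max(activities), min(activities)   # step S22
    dr = a_max - a_min + 1                            # step S23: DR = Max - Min + 1
    th = a_min + dr / 2                               # assumed 1-bit ADRC threshold
    # Steps S25-S29: one activity code per direction, arranged in D0, V, D1, H order.
    binary_code = "".join("1" if a >= th else "0" for a in activities)
    # Step S30: derive the class (final class) and transposition mode from the code.
    return code_to_class_mode[binary_code]
```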
  • FIG. 24 is a block diagram showing an outline of an embodiment of an image processing system to which the present technology is applied.
  • the image processing system includes an encoding device 60 and a decoding device 70.
  • the encoding device 60 includes an encoding unit 61, a local decoding unit 62, and a filter unit 63.
  • the encoding unit 61 is supplied with an original image (data) that is an image to be encoded and a filter image from the filter unit 63.
  • The encoding unit 61 encodes the original image (by predictive encoding) in units of a predetermined block such as a CU using the filter image from the filter unit 63, and supplies the encoded data obtained by the encoding to the local decoding unit 62.
  • the encoding unit 61 subtracts the predicted image of the original image obtained by performing motion compensation of the filter image from the filter unit 63 from the original image, and encodes the residual obtained as a result.
  • the filter information is supplied from the filter unit 63 to the encoding unit 61.
  • the encoding unit 61 generates and transmits (transmits) an encoded bit stream including the encoded data and the filter information from the filter unit 63.
  • the local decoding unit 62 is supplied with encoded data from the encoding unit 61 and is also supplied with a filter image from the filter unit 63.
  • the local decoding unit 62 performs local decoding of the encoded data from the encoding unit 61 using the filter image from the filter unit 63, and supplies the (local) decoded image obtained as a result to the filter unit 63.
  • the local decoding unit 62 decodes the encoded data from the encoding unit 61 into a residual, and uses the predicted image of the original image obtained by performing motion compensation of the filter image from the filter unit 63 as the residual. By adding, a decoded image obtained by decoding the original image is generated.
  • the filter unit 63 is configured similarly to the class classification prediction filter 10 with a learning function, for example, and includes a class classification unit 64 that performs class classification by the activity ADRC method.
  • the filter unit 63 performs tap coefficient learning using the decoded image from the local decoding unit 62 and the original image for the decoded image as a student image and a teacher image, and obtains a degenerate tap coefficient for each class.
  • The filter unit 63 performs class classification by the activity ADRC method by performing ADRC processing of the activity of the target pixel of the decoded image using the decoded image from the local decoding unit 62 in the class classification unit 64. Furthermore, the filter unit 63 performs filter processing as prediction processing that applies, to the decoded image, a prediction formula that performs a product-sum operation between the tap coefficients (transposed tap coefficients) of the class of the pixel of interest obtained by the class classification of the class classification unit 64 and the pixels of the decoded image.
  • the filter unit 63 supplies the filter image obtained by the filter process to the encoding unit 61 and the local decoding unit 62. Furthermore, the filter unit 63 supplies the degenerate tap coefficient for each class obtained by tap coefficient learning to the encoding unit 61 as filter information.
  • the decoding device 70 includes a parsing unit 71, a decoding unit 72, and a filter unit 73.
  • the parsing unit 71 receives and parses the encoded bitstream transmitted by the encoding device 60, and supplies filter information obtained by the parsing to the filter unit 73. Further, the parsing unit 71 supplies the encoded data included in the encoded bit stream to the decoding unit 72.
  • the decoding unit 72 is supplied with encoded data from the parsing unit 71 and also with a filter image from the filter unit 73.
  • The decoding unit 72 decodes the encoded data from the parsing unit 71 using the filter image from the filter unit 73, for example, in units of a predetermined block such as a CU, and supplies the decoded image obtained as a result to the filter unit 73.
  • the decoding unit 72 decodes the encoded data from the parsing unit 71 into a residual and performs motion compensation of the filter image from the filter unit 73 on the residual. By adding the predicted images of the original image, a decoded image obtained by decoding the original image is generated.
  • the filter unit 73 is configured in the same manner as the class classification prediction filter 10 without a learning function, for example, and includes a class classification unit 74 that performs class classification by the activity ADRC method.
  • the filter unit 73 performs a filtering process similar to that of the filter unit 63 on the decoded image from the decoding unit 72, generates a filtered image, and supplies the filtered image to the decoding unit 72.
  • The filter unit 73 performs class classification by the activity ADRC method by performing ADRC processing of the activity of the pixel of interest of the decoded image using the decoded image from the decoding unit 72 in the class classification unit 74. Further, the filter unit 73 performs filter processing as prediction processing that applies, to the decoded image, a prediction formula that performs a product-sum operation between the transposed tap coefficients of the class of the pixel of interest obtained by the class classification of the class classification unit 74 and the pixels of the decoded image.
  • the transposed tap coefficient used for the filter processing is generated from the degenerate tap coefficient for each class included in the filter information from the parsing unit 71.
  • The filter unit 73 supplies the filter image obtained by the filter processing to the decoding unit 72 and outputs it as the final decoded image obtained by decoding the original image.
  • the filter units 63 and 73 preliminarily store degenerate tap coefficients for each class obtained by performing tap coefficient learning using an image corresponding to the original image and an image corresponding to the decoded image as a teacher image and a student image. It can be stored (preset).
  • the filter unit 63 can be configured similarly to the class classification prediction filter 10 without the learning function.
  • FIG. 25 is a flowchart for explaining an outline of the encoding process of the encoding device 60 of FIG.
  • step S61 the encoding unit 61 (FIG. 24) performs (predictive) encoding of the original image using the filter image from the filter unit 63, and supplies the encoded data obtained by the encoding to the local decoding unit 62. Then, the process proceeds to step S62.
  • In step S62, the local decoding unit 62 performs local decoding of the encoded data from the encoding unit 61 using the filter image from the filter unit 63, supplies the (local) decoded image obtained as a result to the filter unit 63, and the process proceeds to step S63.
  • step S63 the filter unit 63 performs tap coefficient learning using the decoded image from the local decoding unit 62 and the original image for the decoded image as a student image and a teacher image, and obtains a degenerate tap coefficient for each class.
  • the process proceeds to step S64.
  • In step S64, the filter unit 63 performs filter processing as class classification prediction processing on the decoded image from the local decoding unit 62 using the degenerate tap coefficients for each class obtained by the tap coefficient learning, and generates a filter image.
  • That is, the class classification unit 64 of the filter unit 63 classifies the target pixel of the decoded image from the local decoding unit 62 by the activity ADRC method. Further, the filter unit 63 performs filter processing as prediction processing that applies, to the decoded image, a prediction formula that performs a product-sum operation between the pixels of the decoded image and the transposed tap coefficients generated from the degenerate tap coefficients of the class of the pixel of interest among the degenerate tap coefficients for each class obtained by the tap coefficient learning, and generates a filter image. The filter image is supplied from the filter unit 63 to the encoding unit 61 and the local decoding unit 62.
  • the filter unit 63 supplies the degenerate tap coefficient for each class obtained by tap coefficient learning to the encoding unit 61 as filter information.
  • In step S65, the encoding unit 61 generates and transmits an encoded bitstream including, as filter information, the degenerate tap coefficients for each class from the filter unit 63. An end-to-end sketch of this per-frame flow is given below.
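  • Putting steps S61 to S65 together, one possible per-frame sketch of the encoder-side flow in Python is shown below; the helper names (encode_frame, local_decode, tap_coefficient_learning, classification_prediction_filter, write_bitstream) are hypothetical stand-ins for the encoding unit 61, the local decoding unit 62, and the filter unit 63.

```python
def encode_one_frame(original, prev_filter_image, state):
    """Sketch of the per-frame encoding flow of FIG. 25 (steps S61-S65)."""
    # S61: predictive encoding of the original image using the previous filter image.
    encoded_data = state.encode_frame(original, prev_filter_image)
    # S62: local decoding of the encoded data, again using the filter image.
    decoded = state.local_decode(encoded_data, prev_filter_image)
    # S63: tap coefficient learning (student = decoded image, teacher = original).
    degenerate_coeffs = state.tap_coefficient_learning(student=decoded, teacher=original)
    # S64: class classification prediction filtering of the decoded image.
    filter_image = state.classification_prediction_filter(decoded, degenerate_coeffs)
    # S65: the encoded bitstream carries the encoded data and, as filter
    #      information, the degenerate tap coefficients for each class.
    bitstream = state.write_bitstream(encoded_data, filter_info=degenerate_coeffs)
    return bitstream, filter_image
```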
  • FIG. 26 is a flowchart for explaining the outline of the decoding process of the decoding device 70 of FIG.
  • the process according to the flowchart of FIG. 26 is performed in units of frames, for example, similarly to the encoding process of FIG.
  • In step S81, the parsing unit 71 (FIG. 24) receives the encoded bitstream transmitted from the encoding device 60, parses the degenerate tap coefficients for each class as the filter information included in the encoded bitstream, and supplies them to the filter unit 73. Further, the parsing unit 71 supplies the encoded data included in the encoded bitstream to the decoding unit 72, and the process proceeds from step S81 to step S82.
  • In step S82, the decoding unit 72 decodes the encoded data from the parsing unit 71 using the filter image from the filter unit 73, supplies the decoded image obtained as a result to the filter unit 73, and the process proceeds to step S83.
  • In step S83, the filter unit 73 performs filter processing as class classification prediction processing on the decoded image from the decoding unit 72 using the degenerate tap coefficients for each class as the filter information from the parsing unit 71, and generates a filter image.
  • That is, the class classification unit 74 of the filter unit 73 classifies the target pixel of the decoded image from the decoding unit 72 by the activity ADRC method. Further, the filter unit 73 performs filter processing as prediction processing that applies, to the decoded image, a prediction formula that performs a product-sum operation between the pixels of the decoded image and the transposed tap coefficients generated from the degenerate tap coefficients of the class of the target pixel among the degenerate tap coefficients for each class from the parsing unit 71, and generates a filter image.
  • the filter image is supplied from the filter unit 73 to the decoding unit 72 and is output as a final decoded image obtained by decoding the original image.
  • the filter image supplied from the filter unit 73 to the decoding unit 72 is used, for example, in the process of step S82 performed on the next frame of the decoded image.
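  • The decoder-side counterpart (steps S81 to S83) can be sketched in the same style; again, the helper names are hypothetical.

```python
def decode_one_frame(bitstream, prev_filter_image, state):
    """Sketch of the per-frame decoding flow of FIG. 26 (steps S81-S83)."""
    # S81: parse the filter information (degenerate tap coefficients per class)
    #      and the encoded data from the encoded bitstream.
    encoded_data, degenerate_coeffs = state.parse_bitstream(bitstream)
    # S82: decode the encoded data using the previous filter image.
    decoded = state.decode(encoded_data, prev_filter_image)
    # S83: the same class classification prediction filtering as on the encoder
    #      side, driven by the transmitted degenerate tap coefficients.
    filter_image = state.classification_prediction_filter(decoded, degenerate_coeffs)
    return filter_image   # also output as the final decoded image
```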
  • FIG. 27 is a block diagram illustrating a detailed configuration example of the encoding device 60 of FIG.
  • the encoding device 60 includes an A / D conversion unit 101, a rearrangement buffer 102, a calculation unit 103, an orthogonal transformation unit 104, a quantization unit 105, a lossless encoding unit 106, and a storage buffer 107. Furthermore, the encoding device 60 includes an inverse quantization unit 108, an inverse orthogonal transform unit 109, a calculation unit 110, an ILF 111, a frame memory 112, a selection unit 113, an intra prediction unit 114, a motion prediction compensation unit 115, and a predicted image selection unit 116. And a rate control unit 117.
  • the A / D conversion unit 101 A / D converts the analog signal original image into a digital signal original image, and supplies the converted image to the rearrangement buffer 102 for storage.
  • The rearrangement buffer 102 rearranges the frames of the original image from the display order into the encoding (decoding) order according to the GOP (Group Of Picture) structure, and supplies them to the calculation unit 103, the intra prediction unit 114, the motion prediction/compensation unit 115, and the ILF 111.
  • The calculation unit 103 subtracts the predicted image supplied from the intra prediction unit 114 or the motion prediction/compensation unit 115 via the predicted image selection unit 116 from the original image from the rearrangement buffer 102, and supplies the residual (prediction residual) obtained by the subtraction to the orthogonal transform unit 104.
  • the calculation unit 103 subtracts the predicted image supplied from the motion prediction / compensation unit 115 from the original image read from the rearrangement buffer 102.
  • the orthogonal transform unit 104 performs orthogonal transform such as discrete cosine transform and Karhunen-Loeve transform on the residual supplied from the computation unit 103. Note that this orthogonal transformation method is arbitrary.
  • The orthogonal transform unit 104 supplies the orthogonal transform coefficients obtained by the orthogonal transform to the quantization unit 105.
  • the quantization unit 105 quantizes the orthogonal transform coefficient supplied from the orthogonal transform unit 104.
  • the quantization unit 105 sets the quantization parameter QP based on the code amount target value (code amount target value) supplied from the rate control unit 117 and quantizes the orthogonal transform coefficient. Note that this quantization method is arbitrary.
  • the quantization unit 105 supplies the encoded data that is the quantized orthogonal transform coefficient to the lossless encoding unit 106.
  • The lossless encoding unit 106 encodes the quantized orthogonal transform coefficients as the encoded data from the quantization unit 105 by a predetermined lossless encoding method. Since the orthogonal transform coefficients are quantized under the control of the rate control unit 117, the code amount of the encoded bitstream obtained by the lossless encoding of the lossless encoding unit 106 becomes the code amount target value set by the rate control unit 117 (or approximates the code amount target value).
  • The lossless encoding unit 106 acquires, from each block, the encoding information necessary for decoding by the decoding device 70 out of the encoding information related to the predictive encoding by the encoding device 60.
  • Here, the encoding information includes, for example, the prediction mode, motion information such as motion vectors, the code amount target value, the quantization parameter QP, the picture type (I, P, B), and information on blocks such as the CU (Coding Unit) and CTU (Coding Tree Unit).
  • the prediction mode can be acquired from the intra prediction unit 114 or the motion prediction / compensation unit 115.
  • the motion information can be acquired from the motion prediction / compensation unit 115.
  • the lossless encoding unit 106 acquires encoding information, and also acquires a degenerate tap coefficient for each class as filter information regarding filter processing in the ILF 111 from the ILF 111.
  • The lossless encoding unit 106 encodes the encoding information and the filter information by variable length coding such as CAVLC (Context-Adaptive Variable Length Coding), arithmetic coding such as CABAC (Context-Adaptive Binary Arithmetic Coding), or another lossless encoding scheme, generates an encoded bitstream including the encoded encoding information and filter information together with the encoded data from the quantization unit 105, and supplies the encoded bitstream to the accumulation buffer 107.
  • the accumulation buffer 107 temporarily accumulates the encoded bit stream supplied from the lossless encoding unit 106.
  • the encoded bit stream stored in the storage buffer 107 is read and transmitted at a predetermined timing.
  • the encoded data that is the orthogonal transform coefficient quantized by the quantization unit 105 is supplied to the lossless encoding unit 106 and also to the inverse quantization unit 108.
  • the inverse quantization unit 108 inverse quantizes the quantized orthogonal transform coefficient by a method corresponding to the quantization by the quantization unit 105, and the orthogonal transform coefficient obtained by the inverse quantization is sent to the inverse orthogonal transform unit 109. Supply.
  • The inverse orthogonal transform unit 109 performs inverse orthogonal transform on the orthogonal transform coefficients supplied from the inverse quantization unit 108 by a method corresponding to the orthogonal transform processing by the orthogonal transform unit 104, and supplies the residual obtained as a result of the inverse orthogonal transform to the calculation unit 110.
  • The calculation unit 110 adds the predicted image supplied from the intra prediction unit 114 or the motion prediction/compensation unit 115 via the predicted image selection unit 116 to the residual supplied from the inverse orthogonal transform unit 109, thereby obtaining and outputting a decoded image in which (part of) the original image is decoded.
  • the decoded image output from the calculation unit 110 is supplied to the ILF 111.
  • The ILF 111 is configured in the same manner as the class classification prediction filter 10 with a learning function (FIG. 20), and, by performing filter processing as class classification prediction processing, functions as one or more of a deblocking filter, an adaptive offset filter, a bilateral filter, and an ALF.
  • When the ILF 111 is caused to function as two or more of a deblocking filter, an adaptive offset filter, a bilateral filter, and an ALF, the arrangement order of the two or more filters is arbitrary.
  • the ILF 111 is supplied with a decoded image from the arithmetic unit 110 and an original image for the decoded image from the rearrangement buffer 102.
  • the ILF 111 performs tap coefficient learning using, for example, the decoded image from the arithmetic unit 110 and the original image from the rearrangement buffer 102 as a student image and a teacher image, respectively, and obtains degenerate tap coefficients for each class.
  • In the tap coefficient learning, class classification by the activity ADRC method is performed using the decoded image as the student image, and, for each class obtained by the class classification, a degenerate tap coefficient that statistically minimizes the prediction error of the predicted value of the original image as the teacher image, obtained by a prediction formula composed of the prediction taps and the transposed tap coefficients transposed according to the transposition mode also obtained by the class classification, is determined by the method of least squares, as sketched below.
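  • As one way to read this least-squares step, the sketch below accumulates per-class normal equations over all training pixels and solves them with NumPy. How the prediction taps are reordered according to the transposition mode and folded onto the reduced (degenerate) coefficient positions depends on the tap geometry defined elsewhere, so the fold_taps helper is an assumed placeholder.

```python
import numpy as np

def learn_degenerate_coeffs(samples, num_classes, num_coeffs, fold_taps):
    """Least-squares tap coefficient learning (sketch).

    samples   : iterable of (final_class, mode, prediction_taps, teacher_pixel),
                one entry per target pixel of the student (decoded) image.
    fold_taps : assumed helper that reorders the prediction taps according to
                the transposition mode and sums shared / point-symmetric
                positions, yielding one value per degenerate coefficient.
    Returns an array of shape (num_classes, num_coeffs) holding the degenerate
    tap coefficients that minimize the squared prediction error per class.
    """
    A = np.zeros((num_classes, num_coeffs, num_coeffs))  # normal-equation matrices
    b = np.zeros((num_classes, num_coeffs))              # right-hand sides
    for cls, mode, taps, teacher in samples:
        x = fold_taps(taps, mode)                        # length num_coeffs
        A[cls] += np.outer(x, x)
        b[cls] += teacher * x
    # lstsq is used instead of solve so that classes with few samples do not fail.
    return np.stack([np.linalg.lstsq(A[c], b[c], rcond=None)[0]
                     for c in range(num_classes)])
```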
  • the ILF 111 supplies the degenerate tap coefficient for each class obtained by tap coefficient learning to the lossless encoding unit 106 as filter information.
  • the ILF 111 sequentially selects the pixels of the decoded image from the calculation unit 110 as the target pixel.
  • the ILF 111 classifies the target pixel by the activity ADRC method, and obtains the class and transposition mode of the target pixel.
  • the ILF 111 expands the degenerated tap coefficient of the class of the target pixel among the degenerated tap coefficients for each class obtained by the tap coefficient learning, and further transposes the expanded tap coefficient obtained by the expansion of the target pixel. By transposing according to the mode, a transposition tap coefficient corresponding to the class of the pixel of interest and the transposition mode is generated.
  • Then, the ILF 111 selects pixels in the vicinity of the target pixel from the decoded image as the prediction taps, performs filter processing as prediction processing that applies, to the decoded image, a prediction formula that performs a product-sum operation between the transposed tap coefficients corresponding to the class and transposition mode of the target pixel and the pixels of the decoded image serving as the prediction taps, and generates a filter image.
  • the filter image generated by the ILF 111 is supplied to the frame memory 112.
  • the frame memory 112 temporarily stores the filter image supplied from the ILF 111.
  • the filter image stored in the frame memory 112 is supplied to the selection unit 113 as a reference image used for generating a predicted image at a necessary timing.
  • the selection unit 113 selects a supply destination of the reference image supplied from the frame memory 112. For example, when intra prediction is performed in the intra prediction unit 114, the selection unit 113 supplies the reference image supplied from the frame memory 112 to the intra prediction unit 114. For example, when inter prediction is performed in the motion prediction / compensation unit 115, the selection unit 113 supplies the reference image supplied from the frame memory 112 to the motion prediction / compensation unit 115.
  • The intra prediction unit 114 performs intra prediction (in-screen prediction) using the original image supplied from the rearrangement buffer 102 and the reference image supplied from the frame memory 112 via the selection unit 113, using, for example, a PU (Prediction Unit) as a processing unit.
  • the intra prediction unit 114 selects an optimal intra prediction mode based on a predetermined cost function (for example, RD cost), and sends a prediction image generated in the optimal intra prediction mode to the prediction image selection unit 116. Supply. Further, as described above, the intra prediction unit 114 appropriately supplies a prediction mode indicating the intra prediction mode selected based on the cost function to the lossless encoding unit 106 and the like.
  • The motion prediction/compensation unit 115 performs motion prediction (inter prediction) using the original image supplied from the rearrangement buffer 102 and the reference image supplied from the frame memory 112 via the selection unit 113, using, for example, a PU as a processing unit. Furthermore, the motion prediction/compensation unit 115 performs motion compensation according to the motion vectors detected by the motion prediction, and generates a predicted image. The motion prediction/compensation unit 115 performs the inter prediction in a plurality of inter prediction modes prepared in advance and generates predicted images.
  • the motion prediction / compensation unit 115 selects an optimal inter prediction mode based on a predetermined cost function of the prediction image obtained for each of the plurality of inter prediction modes. Furthermore, the motion prediction / compensation unit 115 supplies the predicted image generated in the optimal inter prediction mode to the predicted image selection unit 116.
  • The motion prediction/compensation unit 115 supplies, to the lossless encoding unit 106, the prediction mode indicating the inter prediction mode selected based on the cost function, motion information such as the motion vectors necessary for decoding the encoded data encoded in that inter prediction mode, and the like.
  • The predicted image selection unit 116 selects the supply source (the intra prediction unit 114 or the motion prediction/compensation unit 115) of the predicted image to be supplied to the calculation units 103 and 110, and supplies the predicted image supplied from the selected supply source to the calculation units 103 and 110.
  • the rate control unit 117 controls the quantization operation rate of the quantization unit 105 based on the code amount of the encoded bitstream stored in the storage buffer 107 so that overflow or underflow does not occur. That is, the rate control unit 117 sets the target code amount of the encoded bit stream so as not to cause overflow and underflow of the accumulation buffer 107, and supplies the target code amount to the quantization unit 105.
  • Note that the calculation unit 103 to the lossless encoding unit 106 correspond to the encoding unit 61 in FIG. 24, the inverse quantization unit 108 to the calculation unit 110 correspond to the local decoding unit 62 in FIG. 24, and the ILF 111 corresponds to the filter unit 63 in FIG. 24, respectively.
  • FIG. 28 is a flowchart for explaining an example of the encoding process of the encoding device 60 of FIG.
  • Note that the ILF 111 temporarily stores the decoded image supplied from the calculation unit 110, and also temporarily stores the original image, supplied from the rearrangement buffer 102, that corresponds to the decoded image supplied from the calculation unit 110.
  • step S101 the encoding device 60 (a control unit (not shown)) determines whether or not the current timing is an update timing for updating the filter information.
  • The update timing of the filter information can be determined in advance as, for example, a timing for every one or more frames (pictures), every one or more sequences, every one or more slices, or every one or more lines of a predetermined block such as a CTU.
  • As the update timing of the filter information, in addition to a periodic (fixed) timing such as a timing for every one or more frames (pictures), a dynamic timing can be adopted, such as a timing at which the S/N of the filter image becomes equal to or less than a threshold (a timing at which the error of the filter image with respect to the original image becomes equal to or greater than a threshold), or a timing at which the residual (the sum of its absolute values) becomes equal to or greater than a threshold.
  • Here, it is assumed that the ILF 111 performs tap coefficient learning using one frame of the decoded image and the original image, and that the timing of every frame is the update timing of the filter information, as in the sketch below.
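  • A small sketch of the update-timing decision is given below; the periodic interval and the thresholds are illustrative parameters, not values fixed by the text.

```python
def is_update_timing(frame_index, period=1, snr=None, snr_threshold=None):
    """Decide whether the filter information should be updated (step S101 sketch).

    period        : periodic update interval in frames (1 = every frame, as assumed above).
    snr           : optional S/N of the latest filter image with respect to the original.
    snr_threshold : optional threshold for the dynamic update condition.
    """
    if frame_index % period == 0:          # periodic (fixed) timing
        return True
    if snr is not None and snr_threshold is not None and snr <= snr_threshold:
        return True                        # dynamic timing: filter image quality dropped
    return False
```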
  • step S101 If it is determined in step S101 that the current timing is not the update timing of the filter information, the process skips steps S102 to S104 and proceeds to step S106.
  • step S101 If it is determined in step S101 that the current timing is the update timing of the filter information, the process proceeds to step S102, and the ILF 111 performs tap coefficient learning for obtaining a degenerate tap coefficient for each class.
  • the ILF 111 uses, for example, the decoded image and the original image (here, the latest one-frame decoded image and original image supplied to the ILF 111) stored between the previous update timing and the current update timing. Then, tap coefficient learning is performed to obtain a degenerate tap coefficient for each class.
  • step S102 the ILF 111 supplies the degenerate tap coefficient for each class to the lossless encoding unit 106 as filter information.
  • the lossless encoding unit 106 sets the filter information from the ILF 111 as a transmission target, and the process proceeds from step S103 to step S104.
  • the filter information set as the transmission target is included in the encoded bitstream and transmitted in the predictive encoding process performed in step S105 described later.
  • step S104 the ILF 111 updates the degenerate tap coefficient used for the class classification prediction process with the degenerate tap coefficient for each class obtained by the latest tap coefficient learning in step S102, and the process proceeds to step S105.
  • step S105 the predictive encoding process of the original image is performed, and the encoding process ends.
  • FIG. 29 is a flowchart illustrating an example of the predictive encoding process in step S105 of FIG.
  • step S111 the A / D conversion unit 101 performs A / D conversion of the original image, supplies the original image to the rearrangement buffer 102, and the process proceeds to step S112.
  • step S112 the rearrangement buffer 102 stores the original images from the A / D conversion unit 101, rearranges them in the encoding order, and outputs them, and the process proceeds to step S113.
  • step S113 the intra prediction unit 114 performs an intra prediction process in the intra prediction mode, and the process proceeds to step S114.
  • step S114 the motion prediction / compensation unit 115 performs an inter motion prediction process for performing motion prediction and motion compensation in the inter prediction mode, and the process proceeds to step S115.
  • In the intra prediction process and the inter motion prediction process, the cost functions of various prediction modes are calculated and predicted images are generated.
  • step S115 the predicted image selection unit 116 determines an optimal prediction mode based on each cost function obtained by the intra prediction unit 114 and the motion prediction compensation unit 115. Then, the predicted image selection unit 116 selects and outputs the predicted image of the optimal prediction mode among the predicted image generated by the intra prediction unit 114 and the predicted image generated by the motion prediction / compensation unit 115, and performs processing. Advances from step S115 to step S116.
  • In step S116, the calculation unit 103 calculates the residual between the encoding target image, which is the original image output from the rearrangement buffer 102, and the predicted image output from the predicted image selection unit 116, supplies it to the orthogonal transform unit 104, and the process proceeds to step S117.
  • step S117 the orthogonal transform unit 104 orthogonally transforms the residual from the operation unit 103, supplies the resulting orthogonal transform coefficient to the quantization unit 105, and the process proceeds to step S118.
  • step S118 the quantization unit 105 quantizes the orthogonal transform coefficient from the orthogonal transform unit 104, and supplies the quantized coefficient obtained by the quantization to the lossless encoding unit 106 and the inverse quantization unit 108.
  • the process proceeds to step S119.
  • step S119 the inverse quantization unit 108 inversely quantizes the quantization coefficient from the quantization unit 105, supplies the resulting orthogonal transform coefficient to the inverse orthogonal transform unit 109, and the process proceeds to step S120. move on.
  • step S120 the inverse orthogonal transform unit 109 performs the inverse orthogonal transform on the orthogonal transform coefficient from the inverse quantization unit 108, supplies the residual obtained as a result to the arithmetic unit 110, and the process proceeds to step S121. .
  • In step S121, the calculation unit 110 adds the residual from the inverse orthogonal transform unit 109 and the predicted image output from the predicted image selection unit 116 to generate a decoded image corresponding to the original image that was the target of the residual calculation in the calculation unit 103. The calculation unit 110 supplies the decoded image to the ILF 111, and the process proceeds from step S121 to step S122.
  • In step S122, the ILF 111 performs filter processing as class classification prediction processing on the decoded image from the calculation unit 110, supplies the filter image obtained by the filter processing to the frame memory 112, and the process proceeds from step S122 to step S123.
  • In the filter processing of step S122, the same processing as that of the class classification prediction filter 10 (FIG. 20) is performed.
  • the ILF 111 classifies the target pixel of the decoded image from the calculation unit 110 using the activity ADRC method, and obtains the class (final class) and transposition mode of the target pixel. Furthermore, the ILF 111 expands the degenerated tap coefficients of the class of the pixel of interest among the degenerated tap coefficients for each class updated in step S104 of FIG. 28, and generates expanded tap coefficients. Then, the ILF 111 transposes the expanded tap coefficient according to the transposition mode of the target pixel obtained by the class classification of the target pixel, and generates a transposed tap coefficient.
  • the ILF 111 performs a filtering process as a prediction process in which a prediction formula configured using a transposed tap coefficient corresponding to the class of the target pixel and the transposed mode is applied to the decoded image, and generates a filtered image.
  • the filter image is supplied from the ILF 111 to the frame memory 112.
  • step S123 the frame memory 112 stores the filter image supplied from the ILF 111, and the process proceeds to step S124.
  • the filtered image stored in the frame memory 112 is used as a reference image from which a predicted image is generated in steps S113 and S114.
  • In step S124, the lossless encoding unit 106 encodes the quantization coefficient from the quantization unit 105 into encoded data, and generates an encoded bitstream including the encoded data. Furthermore, the lossless encoding unit 106 encodes, as necessary, encoding information such as the quantization parameter QP used for the quantization in the quantization unit 105, the prediction mode obtained by the intra prediction processing in the intra prediction unit 114, and the prediction mode and motion information obtained by the inter motion prediction processing in the motion prediction/compensation unit 115, and includes the encoded information in the encoded bitstream.
  • Furthermore, the lossless encoding unit 106 encodes, as necessary, the filter information set as the transmission target in step S103 of FIG. 28, and includes it in the encoded bitstream. Then, the lossless encoding unit 106 supplies the encoded bitstream to the accumulation buffer 107, and the process proceeds from step S124 to step S125.
  • In step S125, the accumulation buffer 107 accumulates the encoded bitstream from the lossless encoding unit 106, and the process proceeds to step S126.
  • The encoded bitstream accumulated in the accumulation buffer 107 is read out and transmitted as appropriate.
  • In step S126, the rate control unit 117 controls the rate of the quantization operation of the quantization unit 105 based on the code amount (generated code amount) of the encoded bitstream accumulated in the accumulation buffer 107 so that neither overflow nor underflow occurs, and the encoding process ends.
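  • The description above only states that the rate control unit 117 adjusts the quantization so that the accumulation buffer 107 neither overflows nor underflows; no concrete control law is given. The toy sketch below, which simply nudges a quantization parameter according to buffer fullness, is therefore an assumption for illustration and not the actual behaviour of the rate control unit 117.

    def adjust_qp(qp, buffer_bits, buffer_capacity, qp_min=0, qp_max=51):
        """Toy rate control: raise the quantization parameter QP when the accumulation
        buffer is close to overflowing and lower it when it is close to underflowing."""
        fullness = buffer_bits / buffer_capacity
        if fullness > 0.8:        # risk of overflow -> coarser quantization, fewer bits
            qp += 1
        elif fullness < 0.2:      # risk of underflow -> finer quantization, more bits
            qp -= 1
        return max(qp_min, min(qp_max, qp))

    print(adjust_qp(qp=30, buffer_bits=900_000, buffer_capacity=1_000_000))  # prints 31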
  • FIG. 30 is a block diagram illustrating a detailed configuration example of the decoding device 70 of FIG. 24.
  • The decoding device 70 includes an accumulation buffer 201, a lossless decoding unit 202, an inverse quantization unit 203, an inverse orthogonal transform unit 204, a calculation unit 205, an ILF 206, a rearrangement buffer 207, and a D/A conversion unit 208. The decoding device 70 further includes a frame memory 210, a selection unit 211, an intra prediction unit 212, a motion prediction/compensation unit 213, and a selection unit 214.
  • the accumulation buffer 201 temporarily accumulates the encoded bit stream transmitted from the encoding device 60, and supplies the encoded bit stream to the lossless decoding unit 202 at a predetermined timing.
  • the lossless decoding unit 202 receives the encoded bit stream from the accumulation buffer 201 and decodes it using a method corresponding to the encoding method of the lossless encoding unit 106 in FIG.
  • the lossless decoding unit 202 supplies a quantization coefficient as encoded data included in the decoding result of the encoded bitstream to the inverse quantization unit 203.
  • the lossless decoding unit 202 has a function of performing parsing.
  • The lossless decoding unit 202 parses necessary encoding information and filter information included in the decoding result of the encoded bitstream, supplies the encoding information to the intra prediction unit 212, the motion prediction/compensation unit 213, and other necessary blocks, and supplies the filter information to the ILF 206.
  • The inverse quantization unit 203 inversely quantizes the quantization coefficient as the encoded data from the lossless decoding unit 202 by a method corresponding to the quantization method of the quantization unit 105 in FIG. 27, and supplies the orthogonal transform coefficient obtained by the inverse quantization to the inverse orthogonal transform unit 204.
  • The inverse orthogonal transform unit 204 performs inverse orthogonal transform on the orthogonal transform coefficient supplied from the inverse quantization unit 203 by a method corresponding to the orthogonal transform method of the orthogonal transform unit 104 in FIG. 27, and supplies the resulting residual to the calculation unit 205.
  • the calculation unit 205 is also supplied with a predicted image from the intra prediction unit 212 or the motion prediction compensation unit 213 via the selection unit 214.
  • the calculation unit 205 adds the residual from the inverse orthogonal transform unit 204 and the predicted image from the selection unit 214, generates a decoded image, and supplies the decoded image to the ILF 206.
  • The ILF 206 is configured, for example, in the same manner as the class classification prediction filter 10 (FIG. 20) without the learning function, and by performing filter processing as class classification prediction processing, it functions as one of a deblocking filter, an adaptive offset filter, a bilateral filter, and an ALF, or as two or more of these filters, as with the ILF 111 in FIG. 27.
  • the ILF 206 sequentially selects the pixels of the decoded image from the calculation unit 205 as the target pixel.
  • The ILF 206 classifies the target pixel by the activity ADRC method, and obtains the class and the transposition mode of the target pixel. Furthermore, the ILF 206 expands the degenerate tap coefficients of the class of the target pixel among the degenerate tap coefficients for each class as the filter information supplied from the lossless decoding unit 202, and transposes the expanded tap coefficients obtained by the expansion according to the transposition mode of the target pixel, thereby generating transposed tap coefficients corresponding to the class and the transposition mode of the target pixel.
  • Then, the ILF 206 selects pixels in the vicinity of the target pixel from the decoded image as prediction taps, performs filter processing as prediction processing in which a prediction formula that performs a product-sum operation between the transposed tap coefficients corresponding to the class and the transposition mode of the target pixel and the pixels of the decoded image serving as the prediction taps is applied to the decoded image, and generates and outputs a filter image.
  • the filter image output by the ILF 206 is the same image as the filter image output by the ILF 111 in FIG. 27, and is supplied to the rearrangement buffer 207 and the frame memory 210.
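  • The filtering described above boils down to a product-sum operation between the tap coefficients selected for the class and transposition mode of the target pixel and the prediction-tap pixels of the decoded image. The sketch below applies such a prediction formula with a small square tap around the target pixel; the tap shape, the coefficient values, and the interpretation of the transposition modes as simple flips and rotations of the coefficient kernel are assumptions made for this example.

    import numpy as np

    def transpose_taps(kernel, mode):
        """Illustrative transpositions of a square tap-coefficient kernel; the actual
        correspondence between transposition modes and coefficient rearrangements is
        defined elsewhere in this description."""
        if mode == 0:
            return kernel                        # identity
        if mode == 1:
            return kernel.T                      # reflection across the main diagonal
        if mode == 2:
            return kernel[::-1, ::-1]            # 180-degree rotation (point symmetry)
        return kernel[::-1, ::-1].T              # combination of the two

    def filter_pixel(decoded, y, x, taps):
        """Prediction formula: product-sum of the tap coefficients and the prediction-tap
        pixels of the decoded image around the target pixel."""
        r = taps.shape[0] // 2
        patch = decoded[y - r:y + r + 1, x - r:x + r + 1]
        return float(np.sum(taps * patch))

    decoded = np.full((9, 9), 100.0)
    decoded[4, 4] = 140.0                                    # a small bump at the target pixel
    taps = np.zeros((3, 3))
    taps[1, 1] = 0.6
    taps[0, 1] = taps[2, 1] = taps[1, 0] = taps[1, 2] = 0.1  # coefficients sum to 1.0
    taps = transpose_taps(taps, mode=0)
    print(filter_pixel(decoded, 4, 4, taps))                 # 124.0: the bump is smoothed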
  • the rearrangement buffer 207 temporarily stores the filter image supplied from the ILF 206, rearranges the sequence of frames (pictures) of the filter image from the encoding (decoding) order to the display order, and supplies the rearranged image to the D / A conversion unit 208. .
  • the D / A converter 208 D / A converts the filter image supplied from the rearrangement buffer 207, and outputs and displays it on a display (not shown).
  • The frame memory 210 temporarily stores the filter image supplied from the ILF 206. Furthermore, at a predetermined timing or based on an external request from the intra prediction unit 212, the motion prediction/compensation unit 213, or the like, the frame memory 210 supplies the stored filter image as a reference image used for generating a predicted image.
  • the selection unit 211 selects a supply destination of the reference image supplied from the frame memory 210.
  • the selection unit 211 supplies the reference image supplied from the frame memory 210 to the intra prediction unit 212 when decoding an intra-coded image.
  • the selection unit 211 supplies the reference image supplied from the frame memory 210 to the motion prediction / compensation unit 213 when decoding an inter-encoded image.
  • The intra prediction unit 212 performs intra prediction in the intra prediction mode used in the intra prediction unit 114 of FIG. 27, in accordance with the prediction mode included in the encoding information supplied from the lossless decoding unit 202, using the reference image supplied from the frame memory 210 via the selection unit 211. Then, the intra prediction unit 212 supplies the predicted image obtained by the intra prediction to the selection unit 214.
  • The motion prediction/compensation unit 213 performs inter prediction in the inter prediction mode used in the motion prediction/compensation unit 115 of FIG. 27, in accordance with the prediction mode included in the encoding information supplied from the lossless decoding unit 202, using the reference image supplied from the frame memory 210 via the selection unit 211. The inter prediction is performed using the motion information included in the encoding information supplied from the lossless decoding unit 202 as necessary.
  • the motion prediction / compensation unit 213 supplies a prediction image obtained by inter prediction to the selection unit 214.
  • the selection unit 214 selects a prediction image supplied from the intra prediction unit 212 or a prediction image supplied from the motion prediction / compensation unit 213 and supplies the selected prediction image to the calculation unit 205.
  • Note that the lossless decoding unit 202 corresponds to the parsing unit 71 in FIG. 24, the inverse quantization unit 203 through the calculation unit 205 correspond to the decoding unit 72 in FIG. 24, and the ILF 206 corresponds to the filter unit 73 in FIG. 24.
  • FIG. 31 is a flowchart for explaining an example of the decoding process of the decoding device 70 of FIG. 30.
  • In step S201, the accumulation buffer 201 temporarily accumulates the encoded bitstream transmitted from the encoding device 60, supplies it to the lossless decoding unit 202 as appropriate, and the process proceeds to step S202.
  • In step S202, the lossless decoding unit 202 receives and decodes the encoded bitstream supplied from the accumulation buffer 201, and supplies the quantization coefficient as the encoded data included in the decoding result of the encoded bitstream to the inverse quantization unit 203.
  • the lossless decoding unit 202 parses the filter information and the encoding information when the decoding result of the encoded bitstream includes the filter information and the encoding information. Then, the lossless decoding unit 202 supplies necessary encoding information to the intra prediction unit 212, the motion prediction / compensation unit 213, and other necessary blocks. Further, the lossless decoding unit 202 supplies the filter information to the ILF 206.
  • Thereafter, the process proceeds from step S202 to step S203, and the ILF 206 determines whether or not the degenerate tap coefficients for each class as the filter information have been supplied from the lossless decoding unit 202.
  • If it is determined in step S203 that the degenerate tap coefficients for each class as the filter information have not been supplied, the process skips step S204 and proceeds to step S205.
  • If it is determined in step S203 that the degenerate tap coefficients for each class as the filter information have been supplied, the process proceeds to step S204, and the ILF 206 acquires the degenerate tap coefficients for each class as the filter information from the lossless decoding unit 202. Furthermore, the ILF 206 updates the degenerate tap coefficients used for the class classification prediction processing with the degenerate tap coefficients for each class as the filter information from the lossless decoding unit 202.
  • In step S205, predictive decoding processing is performed, and the decoding process ends.
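  • Steps S203 and S204 amount to keeping a per-class table of degenerate tap coefficients inside the ILF 206 and overwriting it only when new filter information has actually been parsed from the bitstream; otherwise the previously received coefficients remain in use. The sketch below shows only that bookkeeping; the class name, the dictionary layout, and the method names are hypothetical.

    class CoefficientStore:
        """Per-class degenerate tap coefficients used by the class classification
        prediction processing (class and method names here are illustrative)."""

        def __init__(self):
            self.per_class = {}                       # class index -> coefficient list

        def update_from_filter_info(self, filter_info):
            """Mirrors steps S203/S204: overwrite the stored coefficients only when
            filter information has actually been parsed from the bitstream."""
            if filter_info is None:                   # step S203: nothing supplied
                return False
            self.per_class.update(filter_info)        # step S204: update the coefficients
            return True

    store = CoefficientStore()
    print(store.update_from_filter_info(None))                            # False: keep old coefficients
    print(store.update_from_filter_info({0: [0.6, 0.1, 0.1, 0.1, 0.1]}))  # True: coefficients replaced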
  • FIG. 32 is a flowchart for explaining an example of the predictive decoding process in step S205 of FIG.
  • In step S211, the inverse quantization unit 203 inversely quantizes the quantization coefficient from the lossless decoding unit 202, supplies the resulting orthogonal transform coefficient to the inverse orthogonal transform unit 204, and the process proceeds to step S212.
  • In step S212, the inverse orthogonal transform unit 204 performs inverse orthogonal transform on the orthogonal transform coefficient from the inverse quantization unit 203, supplies the resulting residual to the calculation unit 205, and the process proceeds to step S213.
  • In step S213, the intra prediction unit 212 or the motion prediction/compensation unit 213 performs intra prediction processing or inter motion prediction processing for generating a predicted image, using the reference image supplied from the frame memory 210 via the selection unit 211 and the encoding information supplied from the lossless decoding unit 202. Then, the intra prediction unit 212 or the motion prediction/compensation unit 213 supplies the predicted image obtained by the intra prediction processing or the inter motion prediction processing to the selection unit 214, and the process proceeds from step S213 to step S214.
  • In step S214, the selection unit 214 selects the predicted image supplied from the intra prediction unit 212 or the motion prediction/compensation unit 213, supplies it to the calculation unit 205, and the process proceeds to step S215.
  • In step S215, the calculation unit 205 adds the residual from the inverse orthogonal transform unit 204 and the predicted image from the selection unit 214 to generate a decoded image. Then, the calculation unit 205 supplies the decoded image to the ILF 206, and the process proceeds from step S215 to step S216.
  • In step S216, the ILF 206 performs filter processing as class classification prediction processing on the decoded image from the calculation unit 205, supplies the filter image obtained by the filter processing to the rearrangement buffer 207 and the frame memory 210, and the process proceeds from step S216 to step S217.
  • In step S216, the same processing as that of the class classification prediction filter 10 (FIG. 20) is performed.
  • That is, the ILF 206 classifies the target pixel of the decoded image from the calculation unit 205 by the activity ADRC method, and obtains the class (final class) and the transposition mode of the target pixel. Furthermore, the ILF 206 expands the degenerate tap coefficients of the class of the target pixel among the degenerate tap coefficients for each class updated in step S204 of FIG. 31, and generates expanded tap coefficients. Then, the ILF 206 transposes the expanded tap coefficients according to the transposition mode of the target pixel obtained by the class classification, and generates transposed tap coefficients.
  • The ILF 206 performs filter processing as prediction processing in which a prediction formula configured using the transposed tap coefficients corresponding to the class and the transposition mode of the target pixel is applied to the decoded image, and generates a filter image.
  • the filter image is supplied from the ILF 206 to the rearrangement buffer 207 and the frame memory 210.
  • In step S217, the rearrangement buffer 207 temporarily stores the filter image supplied from the ILF 206. Furthermore, the rearrangement buffer 207 rearranges the stored filter images into the display order and supplies them to the D/A conversion unit 208, and the process proceeds from step S217 to step S218.
  • In step S218, the D/A conversion unit 208 D/A converts the filter image from the rearrangement buffer 207, and the process proceeds to step S219.
  • the filter image after D / A conversion is output and displayed on a display (not shown).
  • In step S219, the frame memory 210 stores the filter image supplied from the ILF 206, and the decoding process ends.
  • the filtered image stored in the frame memory 210 is used as a reference image from which a predicted image is generated in the intra prediction process or the inter motion prediction process in step S213.
  • Note that the class classification prediction filter 10 (FIG. 20) can also be applied, for example, to an interpolation filter used for generating the predicted images in the motion prediction/compensation units 115 and 213 of the encoding device 60 and the decoding device 70.
  • As the activity of an image, an activity in the spatial direction (an activity within a frame (picture)) as well as an activity in the time direction can be used.
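  • As a rough example of the two kinds of activity mentioned above, a spatial activity can be measured from neighbouring pixels within one frame and a temporal activity from co-located pixels of consecutive frames. The sum-of-absolute-differences measures in the sketch below are assumptions for illustration and are not the specific activity definitions used elsewhere in this description.

    import numpy as np

    def spatial_activity(frame, y, x):
        """Activity within a picture: absolute differences with the four nearest neighbours."""
        c = float(frame[y, x])
        return (abs(c - frame[y - 1, x]) + abs(c - frame[y + 1, x])
                + abs(c - frame[y, x - 1]) + abs(c - frame[y, x + 1]))

    def temporal_activity(previous_frame, current_frame, y, x):
        """Activity in the time direction: difference between co-located pixels of two frames."""
        return abs(float(current_frame[y, x]) - previous_frame[y, x])

    rng = np.random.default_rng(2)
    frame0 = rng.integers(0, 256, (8, 8)).astype(float)
    frame1 = frame0 + rng.normal(0.0, 2.0, (8, 8))
    print(spatial_activity(frame1, 4, 4), temporal_activity(frame0, frame1, 4, 4))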
  • FIG. 33 is a block diagram illustrating a configuration example of an embodiment of a computer in which a program for executing the series of processes described above is installed.
  • the program can be recorded in advance on a hard disk 905 or a ROM 903 as a recording medium built in the computer.
  • the program can be stored (recorded) in a removable recording medium 911.
  • a removable recording medium 911 can be provided as so-called package software.
  • examples of the removable recording medium 911 include a flexible disk, a CD-ROM (Compact Disc Read Only Memory), a MO (Magneto Optical) disc, a DVD (Digital Versatile Disc), a magnetic disc, and a semiconductor memory.
  • In addition to being installed on the computer from the removable recording medium 911 as described above, the program can be downloaded to the computer via a communication network or a broadcast network and installed on the built-in hard disk 905. That is, the program can be transferred wirelessly from a download site to the computer via an artificial satellite for digital satellite broadcasting, or can be transferred by wire to the computer via a network such as a LAN (Local Area Network) or the Internet.
  • the computer includes a CPU (Central Processing Unit) 902, and an input / output interface 910 is connected to the CPU 902 via a bus 901.
  • When a command is input, for example, by the user operating the input unit 907 via the input/output interface 910, the CPU 902 executes a program stored in the ROM (Read Only Memory) 903 accordingly. Alternatively, the CPU 902 loads a program stored in the hard disk 905 into the RAM (Random Access Memory) 904 and executes it.
  • The CPU 902 thus performs the processing according to the flowcharts described above or the processing performed by the configurations of the block diagrams described above. Then, as necessary, the CPU 902 outputs the processing result from the output unit 906 via the input/output interface 910, transmits it from the communication unit 908, or records it on the hard disk 905, for example.
  • the input unit 907 includes a keyboard, a mouse, a microphone, and the like.
  • the output unit 906 includes an LCD (Liquid Crystal Display), a speaker, and the like.
  • Here, the processing performed by the computer in accordance with the program does not necessarily have to be performed in chronological order along the sequence described in the flowcharts. That is, the processing performed by the computer in accordance with the program also includes processing executed in parallel or individually (for example, parallel processing or processing by objects).
  • The program may be processed by one computer (processor), or may be processed in a distributed manner by a plurality of computers. Furthermore, the program may be transferred to and executed by a remote computer.
  • In this specification, a system means a set of a plurality of components (devices, modules (parts), and the like), and it does not matter whether or not all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
  • the present technology can take a cloud computing configuration in which one function is shared by a plurality of devices via a network and is jointly processed.
  • each step described in the above flowchart can be executed by one device or can be shared by a plurality of devices.
  • the plurality of processes included in the one step can be executed by being shared by a plurality of apparatuses in addition to being executed by one apparatus.
  • The present technology can be applied to any image encoding/decoding method. That is, unless inconsistent with the present technology described above, the specifications of the various processes related to image encoding/decoding, such as transform (inverse transform), quantization (inverse quantization), encoding (decoding), and prediction, are arbitrary and are not limited to the examples described above. Moreover, unless inconsistent with the present technology described above, some of these processes may be omitted.
  • the data unit in which various information described above is set and the data unit targeted by various processes are arbitrary and are not limited to the examples described above.
  • For example, these pieces of information and processes may be set for each TU (Transform Unit), TB (Transform Block), PU (Prediction Unit), PB (Prediction Block), CU (Coding Unit), LCU (Largest Coding Unit), sub-block, block, tile, slice, picture, sequence, or component, or may target the data of those data units.
  • Of course, this data unit can be set for each piece of information or each process, and it is not necessary for the data units of all the information and processes to be unified.
  • Note that the storage location of these pieces of information is arbitrary, and the information may be stored in the header, a parameter set, or the like of the above-described data unit. Moreover, the information may be stored in a plurality of places.
  • Control information related to the present technology described in each of the above embodiments may be transmitted from the encoding side to the decoding side. For example, control information (for example, enabled_flag) that controls whether or not application of the present technology described above is permitted (or prohibited) may be transmitted. Further, for example, control information indicating a target to which the present technology is applied (or a target to which the present technology is not applied) may be transmitted. For example, control information designating a block size (an upper limit, a lower limit, or both), a frame, a component, a layer, or the like for which application of the present technology is permitted (or prohibited) may be transmitted.
  • the block size may be specified indirectly.
  • the block size may be designated using identification information for identifying the size.
  • the block size may be specified by a ratio or difference with the size of a reference block (for example, LCU or SCU).
  • the designation of the block size includes designation of a block size range (for example, designation of an allowable block size range).
  • Note that, in this specification, a “flag” is information for identifying a plurality of states, and includes not only information used for identifying two states, true (1) and false (0), but also information capable of identifying three or more states. Therefore, the value that this “flag” can take may be, for example, the two values of 1/0, or may be three or more values. That is, the number of bits constituting this “flag” is arbitrary, and may be one bit or a plurality of bits.
  • the identification information includes not only the form in which the identification information is included in the bitstream but also the form in which the difference information of the identification information with respect to certain reference information is included in the bitstream.
  • the “flag” and “identification information” include not only the information but also difference information with respect to the reference information.
  • Note that the present technology can also take the following configurations.
  • <1> A decoding device including: a decoding unit that decodes encoded data included in an encoded bitstream using a filter image and generates a decoded image; a class classification unit that performs class classification of classifying a target pixel of the decoded image generated by the decoding unit into any one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; and a filter unit that performs filter processing of applying, to the decoded image, a prediction formula that performs a product-sum operation on a tap coefficient of the class of the target pixel obtained by the class classification performed by the class classification unit and pixels of the decoded image, and generates the filter image.
  • <2> The decoding device according to <1>, wherein the class classification unit performs ADRC processing of an activity in each of a plurality of directions, that is, four or more directions, starting from the target pixel, and performs the class classification according to activity codes that are the values of the activities after the ADRC processing.
  • <3> The decoding device according to <2>, wherein the class classification unit performs ADRC processing of an activity in each of four directions, that is, a vertical direction, a horizontal direction, a first diagonal direction, and a second diagonal direction, starting from the target pixel.
  • <4> The decoding device according to <3>, wherein the class classification unit performs the class classification according to a binary code in which the activity codes obtained by the ADRC processing of the activities in the four directions are arranged.
  • <5> The decoding device according to <4>, wherein, when code patterns obtained by arranging the activity codes of the four directions at the positions of the four directions become the same as each other by a predetermined transposition of point-symmetric positions, the class classification unit classifies the pixels from which those activity codes are obtained into the same class.
  • <6> The decoding device according to <5>, wherein the class classification unit obtains a class of the target pixel and a transposition mode representing the predetermined transposition according to the binary code.
  • <7> The decoding device according to <6>, wherein the tap coefficients at point-symmetric positions are the same, and the filter unit performs the filter processing using transposed tap coefficients obtained by transposing the tap coefficients according to the transposition mode.
  • <8> The decoding device according to any one of <3> to <7>, wherein the class classification unit performs the class classification according to a dynamic range of the activities in the four directions.
  • <9> The decoding device according to <8>, wherein the class classification unit performs the class classification of a pixel whose dynamic range of the activities in the four directions is equal to or greater than a threshold according to the activity codes obtained by the ADRC processing of the activities in the four directions.
  • <10> The decoding device according to <9>, wherein the class classification unit sets the threshold according to a quantization parameter.
  • <11> The decoding device according to any one of <1> to <10>, wherein the class classification unit obtains a final activity of the target pixel by adding the activity of the target pixel and activities of pixels in a peripheral region of the target pixel (see the illustrative sketch that follows these configurations).
  • <12> The decoding device according to any one of <1> to <11>, further including a parsing unit that parses the tap coefficient included in the encoded bitstream.
  • <13> The decoding device according to any one of <1> to <12>, wherein the decoding unit decodes the encoded data using a CU (Coding Unit) of a Quad-Tree Block Structure or a QTBT (Quad Tree Plus Binary Tree) Block Structure as a processing unit.
  • <14> A decoding method including: decoding encoded data included in an encoded bitstream using a filter image and generating a decoded image; performing class classification of classifying a target pixel of the decoded image into any one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; and performing filter processing of applying, to the decoded image, a prediction formula that performs a product-sum operation on a tap coefficient of the class of the target pixel obtained by the class classification and pixels of the decoded image, and generating the filter image.
  • <15> An encoding device including: a class classification unit that performs class classification of classifying a target pixel of a locally decoded image into any one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; a filter unit that performs filter processing of applying, to the decoded image, a prediction formula that performs a product-sum operation on a tap coefficient of the class of the target pixel obtained by the class classification performed by the class classification unit and pixels of the decoded image, and generates a filter image; and an encoding unit that encodes an original image using the filter image.
  • <16> The encoding device according to <15>, wherein the class classification unit performs ADRC processing of an activity in each of a plurality of directions, that is, four or more directions, starting from the target pixel, and performs the class classification according to activity codes that are the values of the activities after the ADRC processing.
  • <17> The encoding device according to <16>, wherein the class classification unit performs ADRC processing of an activity in each of four directions, that is, a vertical direction, a horizontal direction, a first diagonal direction, and a second diagonal direction, starting from the target pixel.
  • <18> The encoding device according to <17>, wherein the class classification unit performs the class classification according to a binary code in which the activity codes obtained by the ADRC processing of the activities in the four directions are arranged.
  • <19> The encoding device according to <18>, wherein, when code patterns obtained by arranging the activity codes of the four directions at the positions of the four directions become the same as each other by a predetermined transposition of point-symmetric positions, the class classification unit classifies the pixels from which those activity codes are obtained into the same class.
  • <20> The encoding device according to <19>, wherein the class classification unit obtains a class of the target pixel and a transposition mode representing the predetermined transposition according to the binary code.
  • <21> The encoding device according to <20>, wherein the tap coefficients at point-symmetric positions are the same, and the filter unit performs the filter processing using transposed tap coefficients obtained by transposing the tap coefficients according to the transposition mode.
  • <22> The encoding device according to any one of <17> to <21>, wherein the class classification unit performs the class classification according to a dynamic range of the activities in the four directions.
  • <23> The encoding device according to <22>, wherein the class classification unit performs the class classification of a pixel whose dynamic range of the activities in the four directions is equal to or greater than a threshold according to the activity codes obtained by the ADRC processing of the activities in the four directions.
  • <24> The encoding device according to <23>, wherein the class classification unit sets the threshold according to a quantization parameter.
  • <25> The encoding device according to any one of <15> to <24>, wherein the class classification unit obtains a final activity of the target pixel by adding the activity of the target pixel and activities of pixels in a peripheral region of the target pixel.
  • <26> The encoding device according to any one of <15> to <25>, wherein the filter unit obtains, using the original image and the decoded image, the tap coefficient that statistically minimizes an error of a predicted value of the original image obtained by the prediction formula, and the encoding unit generates an encoded bitstream including the encoded data obtained by encoding the original image and the tap coefficient.
  • <27> The encoding device according to any one of <15> to <26>, wherein the encoding unit encodes the original image using a CU (Coding Unit) of a Quad-Tree Block Structure or a QTBT (Quad Tree Plus Binary Tree) Block Structure as a processing unit.
  • <28> An encoding method including: performing class classification of classifying a target pixel of a locally decoded image into any one of a plurality of classes by performing ADRC (Adaptive Dynamic Range Coding) processing of an activity of the target pixel; performing filter processing of applying, to the decoded image, a prediction formula that performs a product-sum operation on a tap coefficient of the class of the target pixel obtained by the class classification and pixels of the decoded image, and generating a filter image; and encoding an original image using the filter image.
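  • The following minimal sketch (not one of the enumerated configurations) illustrates the activity dynamic-range threshold, the quantization-parameter-dependent threshold, and the peripheral-region aggregation of activities described in the configurations above. The exact threshold mapping and the handling of pixels whose dynamic range falls below the threshold are assumptions made for this example.

    import numpy as np

    def qp_dependent_threshold(qp, base=4.0):
        """Threshold that grows with the quantization parameter; the exact mapping is not
        given in this description, so an exponential-style growth is assumed here."""
        return base * (2.0 ** (qp / 12.0))

    def use_activity_codes(acts, qp):
        """Only pixels whose activity dynamic range reaches the threshold are classified by
        their activity codes; other pixels could fall into a separate class (an assumption)."""
        dynamic_range = acts.max() - acts.min()
        return dynamic_range >= qp_dependent_threshold(qp)

    def final_activity(activity_map, y, x, radius=1):
        """Final activity of the target pixel: its own activity plus the activities of the
        pixels in a small peripheral region around it."""
        region = activity_map[y - radius:y + radius + 1, x - radius:x + radius + 1]
        return float(region.sum())

    activities = np.array([2.0, 18.0, 5.0, 7.0])          # activities in the four directions
    print(use_activity_codes(activities, qp=22))          # True: dynamic range 16 exceeds the threshold
    print(use_activity_codes(activities, qp=40))          # False: the threshold grows with QP
    activity_map = np.ones((5, 5))
    print(final_activity(activity_map, 2, 2))             # 9.0 for a 3x3 peripheral region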

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present technology relates to an encoding device, an encoding method, a decoding device, and a decoding method that make it possible to improve encoding efficiency and image quality. The encoding device and the decoding device perform class classification of classifying a pixel of interest of a decoded image into any one of a plurality of classes by subjecting an activity of the pixel of interest to adaptive dynamic range coding (ADRC) processing. The encoding device and the decoding device then perform a filtering process in which a prediction formula for performing a product-sum operation between a tap coefficient of the class of the pixel of interest obtained by the class classification and a pixel of the decoded image is applied to the decoded image, and generate a filtered image. The present technology can be applied to encoding or decoding of an image.
PCT/JP2019/015907 2018-04-26 2019-04-12 Dispositif de codage, procédé de codage, dispositif de décodage et procédé de décodage WO2019208258A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018-084983 2018-04-26
JP2018084983A JP2021114636A (ja) 2018-04-26 2018-04-26 符号化装置、符号化方法、復号装置、及び、復号方法

Publications (1)

Publication Number Publication Date
WO2019208258A1 true WO2019208258A1 (fr) 2019-10-31

Family

ID=68295405

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/015907 WO2019208258A1 (fr) 2018-04-26 2019-04-12 Dispositif de codage, procédé de codage, dispositif de décodage et procédé de décodage

Country Status (3)

Country Link
JP (1) JP2021114636A (fr)
TW (1) TW201946471A (fr)
WO (1) WO2019208258A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023237094A1 (fr) * 2022-06-11 2023-12-14 Beijing Bytedance Network Technology Co., Ltd. Prises étendues utilisant différentes sources pour un filtre à boucle adaptatif dans un codage vidéo

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10191353A (ja) * 1996-12-24 1998-07-21 Sony Corp 画像符号化装置および画像符号化方法、画像復号化装置および画像復号化方法、並びに記録媒体
JP2014532375A (ja) * 2011-10-13 2014-12-04 クゥアルコム・インコーポレイテッドQualcomm Incorporated ビデオコーディングにおいて適応ループフィルタとマージされたサンプル適応オフセット
JP2017523668A (ja) * 2014-06-13 2017-08-17 インテル コーポレイション ビデオ符号化用の高コンテンツ適応型品質回復フィルタ処理のためのシステムおよび方法
WO2017183479A1 (fr) * 2016-04-22 2017-10-26 ソニー株式会社 Dispositif d'encodage et procédé d'encodage, et dispositif de décodage et procédé de décodage

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHONG, I. S. ET AL.: "AHG6: Comparison of block adatpive (BA) and region adaptive (RA) ALF", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG 16 WP 3, 27 April 2012 (2012-04-27), pages 1 - 10, XP030052953 *
IKAI, TOMOHIRO ET AL.: "CE8. 1:Block based Adaptive Loop Filter with flexible syntax and additional BA mode by Sharp and Qualcomm", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG 16 WP3, 14 July 2011 (2011-07-14), Torino, IT, pages 1 - 6, XP030049373 *

Also Published As

Publication number Publication date
JP2021114636A (ja) 2021-08-05
TW201946471A (zh) 2019-12-01

Similar Documents

Publication Publication Date Title
US11936858B1 (en) Constrained position dependent intra prediction combination (PDPC)
US10841614B2 (en) Low-complexity intra prediction for video coding
US9210442B2 (en) Efficient transform unit representation
US9462271B2 (en) Moving image encoding device, moving image decoding device, moving image coding method, and moving image decoding method
US20120128064A1 (en) Image processing device and method
WO2019220947A1 (fr) Dispositif de codage, procédé de codage, dispositif de décodage et procédé de décodage
WO2019208258A1 (fr) Dispositif de codage, procédé de codage, dispositif de décodage et procédé de décodage
US20210168407A1 (en) Encoding device, encoding method, decoding device, and decoding method
US11451833B2 (en) Encoding device, encoding method, decoding device, and decoding method
WO2020145143A1 (fr) Dispositif et procédé de traitement d'informations
WO2019131161A1 (fr) Dispositif de codage, procédé de codage, dispositif de décodage et procédé de décodage
WO2019198519A1 (fr) Dispositif de traitement de données et procédé de traitement de données
EP2941000B1 (fr) Dispositif de codage vidéo, procédé de codage vidéo et programme de codage vidéo
WO2020008910A1 (fr) Dispositif de codage, procédé de codage, dispositif de décodage, et procédé de décodage
WO2011161823A1 (fr) Procédé de vidéocodage et procédé de décodage vidéo
JP2001320587A (ja) データ処理装置およびデータ処理方法、並びに記録媒体
WO2020262370A1 (fr) Dispositif et procédé de traitement d'image
JP2001320277A (ja) データ処理装置およびデータ処理方法、並びに記録媒体
WO2020066643A1 (fr) Dispositif de codage, procédé de codage, dispositif de décodage, et procédé de décodage
KR20240094036A (ko) 화상 처리 장치 및 화상 처리 방법
US20160119630A1 (en) Video coding device, video coding method, and video coding program
JP2006311265A (ja) 映像符号化装置、映像復号化装置、映像符号化方法、映像復号化方法、映像符号化プログラムおよび映像復号化プログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19794036

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19794036

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP