WO2019141258A1 - Video encoding method, video decoding method, device, and system - Google Patents

Video encoding method, video decoding method, device, and system

Info

Publication number
WO2019141258A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
target image
distorted
picture
distortion
Prior art date
Application number
PCT/CN2019/072417
Other languages
English (en)
Chinese (zh)
Inventor
周璐璐
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司 filed Critical 杭州海康威视数字技术股份有限公司
Publication of WO2019141258A1 publication Critical patent/WO2019141258A1/fr

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/117: Filters, e.g. for pre-processing or post-processing
    • H04N19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/184: Adaptive coding characterised by the coding unit being bits, e.g. of the compressed video stream
    • H04N19/61: Transform coding in combination with predictive coding
    • H04N19/80: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82: Filtering operations involving filtering within a prediction loop
    • H04N19/85: Coding using pre-processing or post-processing specially adapted for video compression

Definitions

  • the present application relates to the field of video coding and decoding, and in particular, to a video coding method, a video decoding method, apparatus, and system.
  • When encoding an original video picture, the original video picture is processed multiple times to obtain a reconstructed picture.
  • the reconstructed picture can be used as a reference picture for encoding subsequent original video pictures.
  • the reconstructed picture obtained after the original video picture is processed multiple times may have a pixel offset from the original video picture, that is, the reconstructed picture is distorted, resulting in visual impairments or artifacts. These distortions affect the subjective and objective quality of the reconstructed picture; since the reconstructed picture is used as the reference picture for video coding, they also affect the prediction accuracy of subsequent coding and the size of the final bit stream.
  • the embodiment of the present application provides a video encoding method, a video decoding method, a device, and a system.
  • the technical solution is as follows:
  • an embodiment of the present application provides a video encoding method, where the method includes:
  • the distorted picture is distorted relative to a current original video picture input to the coding system, and the side information component represents distortion features of the distorted picture relative to the current original video picture;
  • inputting the distorted picture and the side information component into a convolutional neural network model for filtering, to obtain a first de-distorted image block corresponding to a target image block, where the first de-distorted image block is obtained by filtering the distorted picture with the side information component as a guide;
  • the target image block is any distorted image block included in the distorted picture;
  • the inputting the distorted picture and the side information component into a convolutional neural network model for filtering to obtain the first de-distorted image block corresponding to the target image block includes:
  • dividing the distorted picture to obtain the distorted image blocks included in the distorted picture, and inputting the target image block and the side information component corresponding to the target image block into the convolutional neural network model for filtering, to obtain the first de-distorted image block, the target image block being any of the distorted image blocks included in the distorted picture; or
  • inputting the distorted picture and the side information component into the convolutional neural network model for filtering to obtain a de-distorted picture, and dividing the de-distorted picture to obtain the first de-distorted image block corresponding to the target image block, the target image block being any distorted image block included in the distorted picture.
  • the method further includes:
  • the selecting one image block from the at least one image block corresponding to the target image block includes:
  • the selecting an image block from the at least one image block according to the original image block corresponding to the target image block in the current original video image comprises:
  • An image block having the smallest difference from the original image block corresponding to the target image block is selected from the at least one image block.
  • the video bitstream further includes a filtered flag map corresponding to the distorted picture, where the method further includes:
  • an embodiment of the present application provides a video decoding method, where the method includes:
  • the target image block being any of the distorted image blocks included in the distorted picture, the distorted picture being distorted relative to the original video picture, before encoding, that corresponds to the video bit stream input to the decoding system;
  • the subsequently received video bitstream is decoded according to the de-distorted image block corresponding to the target image block.
  • the method further includes:
  • the target image block is input to the filter for filtering processing to obtain a de-distorted image block corresponding to the target image block;
  • the target image block is determined as a de-distorted image block corresponding to the target image block.
  • the current entropy decoded data includes a filtered flag map, where the filtered flag map includes flag information corresponding to each of the distorted image blocks, and the flag information corresponding to a distorted image block is used to identify the data type of that distorted image block.
  • the acquiring the data type of the target image block includes:
  • the current entropy decoded data includes location and encoding information of each coding unit in the original video picture
  • an embodiment of the present application provides a video encoding apparatus, where the apparatus includes:
  • an acquiring module configured to obtain a distorted picture and a side information component corresponding to the distorted picture, where the distorted picture is distorted relative to a current original video picture input to the coding system, and the side information component represents distortion features of the distorted picture relative to the current original video picture;
  • a filtering module configured to input the distorted picture and the side information component into a convolutional neural network model for filtering, to obtain a first de-distorted image block corresponding to a target image block, where the first de-distorted image block is obtained by filtering the distorted picture with the side information component as a guide, and the target image block is any distorted image block included in the distorted picture;
  • a selection module configured to select one image block from at least one image block corresponding to the target image block as a target de-distorted image block corresponding to the target image block, where the at least one image block includes the first de-distorted image block corresponding to the target image block and/or the target image block;
  • an encoding module configured to encode the original video picture after the current original video picture according to the target de-distorted image block corresponding to the target image block to obtain a video bit stream.
  • the filtering module includes:
  • a first filtering unit configured to divide the distorted picture to obtain the distorted image blocks included in the distorted picture, and to input a target image block and a side information component corresponding to the target image block into a convolutional neural network model for filtering, to obtain a first de-distorted image block corresponding to the target image block, where the target image block is any distorted image block included in the distorted picture; or
  • a second filtering unit configured to input the distorted picture and the side information component into a convolutional neural network model for filtering to obtain a de-distorted picture, and to divide the de-distorted picture to obtain the first de-distorted image block corresponding to the target image block, the target image block being any distorted image block included in the distorted picture.
  • the filtering module is further configured to:
  • the selecting module includes:
  • a first selecting unit configured to select one image block from the at least one image block according to the original image block corresponding to the target image block in the current original video image
  • a second selecting unit configured to select one image block from the at least one image block according to encoding information of each coding unit included in the target image block.
  • the first selecting unit is configured to:
  • An image block having the smallest difference from the original image block corresponding to the target image block is selected from the at least one image block.
  • the video bitstream further includes a filtered flag map corresponding to the distorted picture
  • the apparatus further includes:
  • a filling module configured to fill, into the filtered flag map, the flag information for identifying the data type of the target de-distorted image block.
  • an embodiment of the present application provides a video decoding apparatus, where the apparatus includes:
  • an obtaining module configured to acquire a target image block and the data type of the target image block, where the target image block is any distorted image block included in the distorted picture, and the distorted picture is distorted relative to the original video picture, before encoding, that corresponds to the video bit stream input to the decoding system;
  • a generating module configured to generate a side information component corresponding to the target image block when the data type indicates data filtered by a convolutional neural network model, where the side information component represents distortion features of the target image block relative to the corresponding original image block in the original video picture;
  • a filtering module configured to input the target image block and the side information component into a convolutional neural network model for convolution filtering, to obtain a de-distorted image block corresponding to the target image block, where the de-distorted image block is obtained by filtering the target image block with the side information component as a guide;
  • the decoding module is further configured to decode a subsequently received video bitstream according to the de-distorted image block corresponding to the target image block.
  • the filtering module is further configured to:
  • the target image block is input to the filter for filtering processing to obtain a de-distorted image block corresponding to the target image block;
  • the target image block is determined as a de-distorted image block corresponding to the target image block.
  • the current entropy decoded data includes a filtered flag map, where the filtered flag map includes flag information corresponding to each of the distorted image blocks, and the flag information corresponding to a distorted image block is used to identify the data type of that distorted image block.
  • the obtaining module includes:
  • a reading unit configured to read, according to a position of the target image block in the distorted picture, the flag information corresponding to the target image block from the filtered flag map;
  • a first determining unit configured to determine, according to the flag information, a data type corresponding to the target image block.
  • the obtaining module is configured to:
  • the current entropy decoded data includes location and encoding information of each coding unit in the original video picture
  • the obtaining module includes:
  • a second determining unit configured to determine, according to a location of the target image block in the distorted picture and a location of each coding unit in the original video picture, each coding unit included in the target image block;
  • a third determining unit configured to determine, according to the encoding information of each coding unit included in the target image block, a data type corresponding to the target image block.
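  • As an illustration of the two acquisition paths just listed (reading the filtered flag map, or deriving the type from the coding units' encoding information), here is a hypothetical Python sketch; the DataType enum, the flag-map layout, and the decision rule in the second variant are assumptions for illustration, not the patent's specification.

```python
from enum import Enum

class DataType(Enum):
    CNN_FILTERED = 1   # output of the convolutional neural network model
    OTHER_FILTER = 2   # output of one of the other filters
    UNFILTERED = 3     # the distorted image block itself

def data_type_from_flag_map(flag_map, block_row, block_col):
    """Variant 1: read the flag information at the block's position in
    the filtered flag map carried in the entropy decoded data."""
    return DataType(flag_map[block_row][block_col])

def data_type_from_coding_info(coding_units_in_block):
    """Variant 2: derive the type from the encoding information of the
    coding units contained in the block (decision rule is an assumption)."""
    if all(cu.get("cnn_filtered") for cu in coding_units_in_block):
        return DataType.CNN_FILTERED
    return DataType.UNFILTERED
```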
  • an embodiment of the present application provides a video encoding method, where the method includes:
  • the method further includes:
  • the target image block is input to the filter for filtering processing to obtain a de-distorted image block corresponding to the target image block;
  • the target image block is determined as a de-distorted image block corresponding to the target image block.
  • the embodiment of the present application provides a video encoding apparatus, where the apparatus includes:
  • an obtaining module configured to obtain the distorted image blocks included in the distorted picture, where the distorted picture is distorted relative to a current original video picture input to the encoding system;
  • a determining module configured to determine, according to encoding information of the coding units included in a target image block, a data type corresponding to the target image block, where the target image block is any one of the distorted image blocks;
  • a generating module configured to generate a side information component corresponding to the target image block when the data type indicates data filtered by a convolutional neural network model, where the side information component represents distortion features of the target image block relative to the corresponding original image block in the original video picture;
  • a filtering module configured to input the target image block and the side information component into a convolutional neural network model for convolution filtering, to obtain a de-distorted image block corresponding to the target image block, where the de-distorted image block is obtained by filtering the target image block with the side information component as a guide;
  • an encoding module configured to encode the original video picture after the current original video picture according to the de-distorted image block corresponding to the distorted image block in the distorted picture to obtain a video bit stream.
  • the filtering module is further configured to:
  • the target image block is input to the filter for filtering processing to obtain a de-distorted image block corresponding to the target image block;
  • the target image block is determined as a de-distorted image block corresponding to the target image block.
  • an embodiment of the present application provides a computer readable storage medium, where the computer readable storage medium stores a computer program, and when the computer program is executed by a processor, the video encoding method provided by the first aspect or any implementation of the first aspect is implemented.
  • an embodiment of the present application provides a computer readable storage medium, where the computer readable storage medium stores a computer program, and when the computer program is executed by a processor, the video decoding method provided by the second aspect or any implementation of the second aspect is implemented.
  • In a ninth aspect, an embodiment of the present application provides a codec system, where the system includes the video encoding apparatus provided by the first aspect and the video decoding apparatus provided by the second aspect; or
  • the system includes the video encoding device provided by the sixth aspect and the video decoding device as provided by the second aspect.
  • Selecting the image block in this way not only improves the filtering performance, but also improves the de-distortion performance during the video encoding process.
  • FIG. 1 is a flowchart of a video encoding method according to an embodiment of the present application
  • FIG. 2 is a flowchart of another video encoding method provided by an embodiment of the present application.
  • FIG. 3 is a structural block diagram of a video encoding system according to an embodiment of the present application.
  • FIG. 4 is a structural block diagram of another video encoding system according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of side information components provided by an embodiment of the present application.
  • FIG. 6 is a second schematic diagram of side information components provided by an embodiment of the present application.
  • FIG. 7 is a system architecture diagram of a technical solution provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of data flow of a technical solution provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of obtaining a distortion image color component of a distorted image according to an embodiment of the present application.
  • FIG. 10 is a flowchart of a method for removing distortion of a distorted image provided by an embodiment of the present application
  • FIG. 11 is a flowchart of a method for training a convolutional neural network model provided by an embodiment of the present application.
  • FIG. 12 is a flowchart of a video decoding method according to an embodiment of the present application.
  • FIG. 13 is a flowchart of another video decoding method according to an embodiment of the present disclosure.
  • FIG. 14 is a structural block diagram of a video decoding system according to an embodiment of the present disclosure.
  • FIG. 15 is a structural block diagram of another video decoding system according to an embodiment of the present disclosure.
  • FIG. 16 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present application.
  • FIG. 17 is a schematic structural diagram of a video decoding apparatus according to an embodiment of the present disclosure.
  • FIG. 18 is a flowchart of another video encoding method according to an embodiment of the present application.
  • FIG. 19 is a flowchart of another video encoding method according to an embodiment of the present application.
  • FIG. 20 is a structural diagram of another video encoding apparatus according to an embodiment of the present disclosure.
  • FIG. 21 is a schematic structural diagram of a video codec system according to an embodiment of the present disclosure.
  • FIG. 22 is a schematic structural diagram of a device according to an embodiment of the present application.
  • an embodiment of the present application provides a video encoding method, where the method includes:
  • Step 101 Acquire a distorted picture generated when the current original video picture is encoded.
  • the coding here includes performing prediction, transform, and quantization on the current original video picture to obtain prediction data and residual information, performing entropy coding according to the prediction data and residual information to obtain a video bit stream, and performing reconstruction according to the prediction data and the residual information to obtain a reconstructed picture.
  • the distorted picture is the reconstructed picture or the picture after filtering the reconstructed picture.
  • Step 102: Generate a side information component corresponding to the distorted picture, where the side information component represents distortion features of the distorted picture relative to the current original video picture.
  • Step 103: Input the distorted picture and the side information component into the convolutional neural network model for filtering, to obtain a first de-distorted image block corresponding to each distorted image block included in the distorted picture.
  • the first de-distorted image block is obtained by filtering the distorted picture with the side information component as a guide.
  • the convolutional neural network model is obtained by training based on a preset training set.
  • the preset training set includes an original sample picture, a plurality of distorted pictures corresponding to the original sample picture, and side information components corresponding to each distorted picture.
  • Step 104: Select one image block from at least one image block corresponding to the distorted image block as a target de-distorted image block corresponding to the distorted image block, where the at least one image block includes the first de-distorted image block corresponding to the distorted image block and/or the distorted image block.
  • the image block having the smallest difference from the original image block may be selected from the at least one image block as a target de-distorted image block with reference to the original image block corresponding to the distorted image block.
  • the coding information of the coding unit may reflect the original image information corresponding to the coding unit in the original video picture, so that an image block having a small difference from the original image block may be selected as the target de-distorted image block according to the coding information.
  • Step 105 Encode the original video picture after the current original video picture according to the target de-distorted image block corresponding to the distorted image block to obtain a video bit stream.
  • the target de-distorted image blocks corresponding to the distorted image blocks may be composed into a reference picture frame, and when that reference picture is selected for encoding an original video picture to be encoded after the current original video picture, the original video picture to be encoded may be encoded according to the reference picture to obtain a video bit stream.
  • In this way, the distorted picture is filtered to obtain a first de-distorted image block corresponding to each distorted image block of the distorted picture, and then, from among the distorted image block and its corresponding first de-distorted image block, the image block with the smallest difference from the original image is selected as the final filtered image block, which not only improves the filtering performance but also improves the de-distortion performance in the video encoding process.
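  • To make the selection step concrete, here is a minimal Python/NumPy sketch of choosing, among candidate blocks, the one closest to the original image block; the SSD metric follows the description given later in this document, while the function names and array layout are illustrative assumptions.

```python
import numpy as np

def ssd(block_a: np.ndarray, block_b: np.ndarray) -> float:
    """Sum of Squared Differences between two equally sized image blocks."""
    diff = block_a.astype(np.int64) - block_b.astype(np.int64)
    return float(np.sum(diff * diff))

def select_target_block(candidates: list, original_block: np.ndarray) -> np.ndarray:
    """Pick the candidate (the distorted block itself, the CNN output,
    or another filter's output) with the smallest SSD to the original block."""
    return min(candidates, key=lambda blk: ssd(blk, original_block))
```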
  • the detailed implementation process of the method may include:
  • Step 201 Acquire a distorted picture generated during video encoding.
  • a reconstructed picture may be generated during the video encoding process, and the distorted picture may be the reconstructed picture, or may be a picture obtained by filtering the reconstructed picture.
  • the video coding system includes a prediction module, an adder, a transform unit, a quantization unit, an entropy encoder, an inverse quantization unit, an inverse transform unit, a reconstruction unit, a CNN (convolutional neural network) model, a buffer, and other parts.
  • the encoding process of the video coding system is as follows: the current original video picture is input into the prediction module and the adder, and the prediction module predicts the input current original video picture according to the reference picture in the buffer to obtain prediction data.
  • the prediction module includes an intra prediction unit, a motion estimation and motion compensation unit, and a switch.
  • the intra prediction unit may perform intra prediction on the current original video picture to obtain intra prediction data
  • the motion estimation and motion compensation unit performs inter prediction on the current original video picture according to the reference picture buffered in the buffer to obtain inter prediction data, and the switch selects either the intra prediction data or the inter prediction data and outputs it to the adder and the reconstruction unit.
  • the intra prediction data may include intra mode information
  • the inter prediction data may include inter mode information.
  • the adder generates prediction error information according to the prediction data and the current original video picture; the transform unit transforms the prediction error information and outputs the transformed prediction error information to the quantization unit; the quantization unit quantizes the transformed prediction error information according to the quantization parameter to obtain residual information, and outputs the residual information to the entropy encoder and the inverse quantization unit; the entropy encoder encodes the residual information, the prediction data, and other information to form a video bitstream, where the video bitstream may include the encoding information of each coding unit in the original video picture.
  • the inverse quantization unit and the inverse transform unit respectively perform inverse quantization and inverse transform processing on the residual information to obtain prediction error information, and input the prediction error information into the reconstruction unit; the reconstruction unit generates the reconstructed picture according to the prediction error information and the prediction data.
  • the reconstructed picture generated by the reconstructing unit may be acquired, and the reconstructed picture is taken as a distorted picture.
  • a filter may be connected between the convolutional neural network model and the reconstruction unit, and the filter may also filter the reconstructed picture generated by the reconstruction unit, and output the filtered reconstructed picture.
  • the filtered reconstructed picture may be obtained, and the filtered reconstructed picture is taken as a distorted picture.
  • Step 202 Divide the distorted picture to obtain a plurality of distorted image blocks included in the distorted picture.
  • the distorted picture is divided into a plurality of distorted image blocks, and each image block is then filtered separately, as sketched below. This reduces the resources required for filtering the picture, thereby enabling the device to meet the resource requirements of the filtering.
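  • A minimal sketch of this block division, assuming a single-channel picture stored as a NumPy array; the block size and the handling of border blocks are assumptions, since the patent does not fix them.

```python
import numpy as np

def split_into_blocks(picture: np.ndarray, block_h: int, block_w: int) -> list:
    """Divide an (H, W) picture into non-overlapping blocks; border blocks
    may be smaller when H or W is not a multiple of the block size."""
    h, w = picture.shape[:2]
    blocks = []
    for top in range(0, h, block_h):
        for left in range(0, w, block_w):
            blocks.append(picture[top:top + block_h, left:left + block_w])
    return blocks
```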
  • a pre-trained convolutional neural network model and at least one filter are provided; the convolutional neural network model is used to filter the distorted image blocks, and each of the at least one filter is also used to filter the distorted image blocks.
  • the at least one filter may be a convolutional neural network filter, an Adaptive Loop Filter (ALF), or the like.
  • Step 202 is an optional step, that is, step 202 may not be performed, and the entire frame distortion picture may be directly filtered during filtering.
  • Step 203: Generate a side information component corresponding to the target image block, where the side information component represents distortion features of the target image block relative to the corresponding original image block in the original video picture, and the target image block is any distorted image block in the distorted picture.
  • the side information component corresponding to the target image block is used as input data to the convolutional neural network model.
  • the side information component corresponding to the target image block may be obtained according to the quantization parameter or the coding information of each coding unit included in the target image block.
  • In another embodiment, a side information component corresponding to the distorted picture is generated, and that side information component represents the distortion features of the distorted picture relative to the original video picture.
  • the process of generating the side information component corresponding to the distorted picture is the same as that of generating the side information component corresponding to the target image block; only the generation of the side information component corresponding to the target image block is described below. To obtain the side information component of the distorted picture, it is only necessary to replace the target image block in the following content with the distorted picture.
  • As for the side information component, it represents the distortion features of the target image block relative to the original image block in the original picture, and it is an expression of the distortion features determined by the image processing process.
  • the above distortion feature may include at least one of the following distortion features:
  • the side information component can represent the degree of distortion of the distorted target image block relative to its corresponding original image block in the original picture.
  • the side information component may also represent the distorted position of the distorted target image block relative to the original picture, and the side information component may include the boundary coordinates of each coding unit in the target image block.
  • an image is usually divided into a plurality of non-overlapping, non-fixed-size coding units, and the coding units are separately subjected to predictive coding and different degrees of quantization processing.
  • the distortion between coding units is usually not consistent, and pixel discontinuities usually occur at the boundaries of coding units; therefore, the boundary coordinates of the coding units can be used as a priori side information to characterize the distortion position.
  • the side information component may also represent the distortion type of the distorted target image block relative to the original picture, and the side information component may include the prediction mode of each coding unit in the target image block.
  • the prediction mode of the coding unit may be used as a side information component that characterizes the distortion type.
  • the side information component may be a combination of one or more of the foregoing distortion degree, distortion position, and distortion type, and each may be indicated by one or more parameters. For example, after image processing, the distortion degree of the distorted target image block may be represented by one physically meaningful parameter, or by two parameters with different physical meanings; correspondingly, one or more parameters representing the distortion degree can be used as the side information component according to actual needs, that is, as input data to the convolutional neural network model.
  • the side information component of the target image block may be a side information guide map, which is a matrix with the same width and height as the target image block.
  • the side information component includes a side information value for each pixel of the target image block, where the position of a pixel's side information value in the guide map is the same as the position of that pixel in the target image block.
  • the matrix structure of the side information component is the same as the matrix structure of the distorted target image block color component, where the coordinates [0, 0] and [0, 1] represent the distortion position and the matrix element value 1 represents the distortion degree; that is, the side information component can simultaneously indicate the distortion degree and the distortion position.
  • the coordinates [0, 0], [0, 1], [2, 0], [2, 4] represent the distortion position
  • the element values 1 and 2 of the matrix represent the distortion type; that is, the side information component can simultaneously indicate the distortion type and the distortion position.
  • two side information components respectively illustrated in FIG. 5 and FIG. 6 may be included.
  • the target image block is also a matrix, with each element in the matrix being the distorted image color component of the pixel in the target image block.
  • the distorted image color component of a pixel may include the color component of any one or more of the three channels Y, U, and V.
  • the side information component may include side information components respectively corresponding to each of the distortion image color components. That is to say: the side information component of the pixel in the side information component of the target image block includes the side information component corresponding to each of the distortion image color components in the pixel.
  • This step can be implemented by the following two steps, respectively.
  • Step 2031 Determine, for the target image block to be processed, a distortion level value of each pixel in the target image block.
  • for different image processing manners, the physical parameter indicating the degree of distortion may also differ; therefore, in this step, a distortion degree value corresponding to the image processing manner may be determined to accurately represent the pixel distortion degree.
  • the value of the distortion level can be as follows:
  • Mode 1: when the quantization parameter of each coding unit in the target image block is known, that is, the quantization parameter of each coding unit in the target image block can be obtained, the quantization parameter of the coding unit where each pixel of the target image block is located is determined as the distortion degree value of that pixel;
  • the quantization parameter of each coding unit in the target image block is included in the quantization unit in the video coding system, so the quantization parameter of each coding unit in the target image block can be acquired from the quantization unit.
  • Mode 2: for a target image block obtained by the codec, the coding information of each coding unit in the target image block is known, that is, the coding information of each coding unit in the target image block can be obtained; the quantization parameter of each coding unit is calculated according to this coding information, and the quantization parameter of the coding unit where each pixel of the target image block is located is determined as the distortion degree value of that pixel.
  • the encoding information of each coding unit is included in the current original video picture, and the coding information of each coding unit in the target image block may be obtained from the current original video picture.
  • Step 2032: Based on the position of each pixel in the target image block, generate the side information component corresponding to the target image block from the obtained distortion degree value of each pixel, where each component value included in the side information component corresponds to the pixel at the same position in the target image block, and the position of a pixel's side information value in the side information component is the same as the position of that pixel in the target image block.
  • Since each component value included in the side information component corresponds to a pixel at the same position in the target image block, the side information component has the same structure as the distorted image color component of the target image block; that is, the matrix representing the side information component and the matrix representing the color component of the target image block are of the same shape.
  • In one implementation, based on the position of each pixel in the target image block, the acquired distortion degree value of each pixel may be directly determined as the component value at the same position in the side information component corresponding to the target image block; that is, the distortion degree value of each pixel is directly used as the component value corresponding to that pixel.
  • In another implementation, the acquired distortion degree value of each pixel may first be normalized based on the pixel value range of the target image block to obtain a processed distortion degree value whose value range is the same as the pixel value range; the processed distortion degree value of each pixel is then determined as the component value at the same position in the side information component corresponding to the target image block.
  • the distortion degree value of a pixel can be normalized by the following formula:
  • norm(x) = (x - QP_MIN) / (QP_MAX - QP_MIN) × (PIXEL_MAX - PIXEL_MIN) + PIXEL_MIN
  • where norm(x) is the processed distortion degree value obtained after normalization, x is the distortion degree value of the pixel, the pixel value range of the target image block is [PIXEL_MIN, PIXEL_MAX], and the value range of the pixel distortion degree is [QP_MIN, QP_MAX].
  • In this way, the side information guide map corresponding to the target image block is generated; the side information guide map indicates the distortion degree of the target image block through its side information components, and it has the same width and height as the target image block.
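  • As a concrete illustration, a small Python/NumPy sketch of building such a guide map from per-coding-unit quantization parameters, using the normalization formula above; the helper names, the coding-unit tuple layout, and the HEVC-style QP range of [0, 51] are assumptions, not taken from the patent.

```python
import numpy as np

def normalize_qp(qp_map, qp_range=(0, 51), pixel_range=(0, 255)):
    """norm(x) = (x - QP_MIN) / (QP_MAX - QP_MIN)
                 × (PIXEL_MAX - PIXEL_MIN) + PIXEL_MIN"""
    qp_min, qp_max = qp_range
    px_min, px_max = pixel_range
    return (qp_map - qp_min) / (qp_max - qp_min) * (px_max - px_min) + px_min

def side_info_guide_map(block_shape, coding_units):
    """Build a guide map the same size as the target image block;
    `coding_units` is assumed to be a list of (top, left, height, width, qp)
    tuples covering the block."""
    guide = np.zeros(block_shape, dtype=np.float32)
    for top, left, h, w, qp in coding_units:
        guide[top:top + h, left:left + w] = qp  # each pixel gets its CU's QP
    return normalize_qp(guide)
```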
  • Step 204: Input the target image block and the side information component into the convolutional neural network model for filtering, to obtain the first de-distorted image block.
  • a convolutional neural network model includes: a side information component generation module 11, a convolutional neural network 12, and a network training module 13;
  • the convolutional neural network 12 may include the following three-layer structure:
  • the input layer processing unit 121 is configured to receive the input of the convolutional neural network, which in this solution includes the distorted image color component of the target image block and the side information component of the target image block, and to perform the first layer of convolution filtering on the input data;
  • the hidden layer processing unit 122 performs at least one layer of convolution filtering processing on the output data of the input layer processing unit 121;
  • the output layer processing unit 123 performs convolution filtering processing on the output data of the hidden layer processing unit 122, and outputs the result as a de-distorted image color component for generating a de-distorted image block.
  • FIG. 8 is a schematic diagram of a data flow for implementing the solution, in which a distorted image color component of a target image block and an edge information component of a target image block are input as input data to a pre-trained convolutional neural network model.
  • the convolutional neural network model can be represented by a convolutional neural network of a preset structure and a configured network parameter set, and the input data is subjected to the convolution filtering processing of the input layer, the hidden layer, and the output layer to obtain a de-distorted image block.
  • the input data of the convolutional neural network model may include one or more side information components according to actual needs, and may also include one or more distorted image color components, for example, at least one of the Y, U, and V color components.
  • In the de-distortion process, one or more color components of the distorted image may be used as input data; for example, when two color components of the distorted image are taken as input data, the corresponding two de-distorted image color components are output.
  • Since an image stores data for each of its pixels, the required data can be extracted from the stored data of each pixel when the distorted image color component of the distorted image is obtained; for example, the value of the Y color component of each pixel is extracted, thereby obtaining the Y color component of the distorted image.
  • In the left picture of FIG. 9, [0, 0] and [0, 1] are positions, and Y, U, and V are the three channel distorted image color components of the pixels; position [0, 0] holds the stored data of one pixel, and that stored data includes the three channel color components Y, U, and V. In the right picture of FIG. 9, [0, 0] and [0, 1] are still positions, and Y is the Y-channel distorted image color component.
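  • A one-line NumPy sketch of this extraction, assuming the distorted image is stored as an (H, W, 3) array with per-pixel (Y, U, V) values as in the left picture of FIG. 9:

```python
import numpy as np

def extract_y_component(packed_yuv: np.ndarray) -> np.ndarray:
    """Return the (H, W) plane of Y values from an (H, W, 3) array
    whose last axis stores (Y, U, V) per pixel."""
    return packed_yuv[..., 0]
```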
  • this step may specifically include the following processing steps:
  • the scheme is described by taking the structure of the convolutional neural network model including the input layer, the hidden layer, and the output layer as an example.
  • Step 61: The distorted image color component of the target image block and the generated side information component are used as input data to the pre-established convolutional neural network model, and the input layer performs the first layer of convolution filtering to obtain and output image blocks represented in sparse form.
  • the input data can be input to the network through respective channels.
  • For example, the target image block color component Y of c_y channels and the side information component M of c_m channels are combined along the channel dimension to form input data I of c_y + c_m channels, and multidimensional convolution filtering and nonlinear mapping are performed on the input data I using the following formula to generate n_1 image blocks represented in sparse form:
  • F_1(I) = g(W_1 * I + B_1)
  • where F_1(I) is the output of the input layer (the image blocks represented in sparse form), I is the input of the convolution layer in the input layer, * is the convolution operation, W_1 is the weight coefficient of the convolution layer filter bank of the input layer, B_1 is the offset coefficient of the convolution layer filter bank of the input layer, and g() is a nonlinear mapping function.
  • W_1 corresponds to n_1 convolution filters; that is, n_1 convolution filters are applied to the input of the convolution layer of the input layer, and n_1 image blocks are output. The convolution kernel of each convolution filter is of size c_1 × f_1 × f_1, where c_1 is the number of input channels and f_1 is the spatial size of each convolution kernel.
  • For example, g() may be a ReLU (Rectified Linear Unit) function.
  • Step 62: The hidden layer performs further high-dimensional mapping on the image blocks F_1(I) output by the input layer in sparse form, and obtains and outputs high-dimensional image blocks.
  • the number of convolution layers, the connection mode of the convolution layers, the attributes of the convolution layers, and the like included in the hidden layer are not limited, and various structures known at present may be adopted, but the hidden layer includes at least one convolution layer.
  • For example, the hidden layer contains N-1 (N >= 2) convolution layers, and the hidden layer processing is represented by:
  • F_i(I) = g(W_i * F_{i-1}(I) + B_i), i ∈ {2, 3, ..., N}
  • where F_i(I) represents the output of the i-th convolution layer in the convolutional neural network, * is the convolution operation, W_i is the weight coefficient of the i-th layer's convolution filter bank, B_i is the offset coefficient of the i-th layer's convolution filter bank, and g() is a nonlinear mapping function.
  • W_i corresponds to n_i convolution filters; that is, n_i convolution filters are applied to the input of the i-th convolution layer, and n_i image blocks are output. The convolution kernel of each convolution filter is of size c_i × f_i × f_i, where c_i is the number of input channels and f_i is the spatial size of each convolution kernel.
  • Step 63: The output layer aggregates the high-dimensional image blocks F_N(I) output by the hidden layer, and outputs the de-distorted image color component of the target image block, which is used to generate the first de-distorted image block.
  • the structure of the output layer is not limited in the embodiment of the present invention; the output layer may be a Residual Learning structure, a Direct Learning structure, or another structure.
  • the processing using the Residual Learning structure is as follows:
  • F(I) = W_{N+1} * F_N(I) + B_{N+1} + Y
  • where F(I) is the de-distorted image color component output by the output layer, F_N(I) is the output of the hidden layer (a high-dimensional image block), * is the convolution operation, W_{N+1} is the weight coefficient of the convolution layer filter bank of the output layer, B_{N+1} is the offset coefficient of the convolution layer filter bank of the output layer, and Y is the distorted image color component that has not undergone convolution filtering and is to be de-distorted.
  • W_{N+1} corresponds to n_{N+1} convolution filters; that is, n_{N+1} convolution filters are applied to the input of the (N+1)-th convolution layer, and n_{N+1} image blocks are output. n_{N+1} is the number of output de-distorted image color components, generally equal to the number of input distorted image color components; if only one de-distorted image color component is output, n_{N+1} is generally 1. The convolution kernel of each convolution filter is of size c_{N+1} × f_{N+1} × f_{N+1}, where c_{N+1} is the number of input channels and f_{N+1} is the spatial size of each convolution kernel.
  • If the Direct Learning structure is used, the de-distorted image color component is directly output, that is, the first de-distorted image block is obtained, and the output layer processing can be expressed by the following formula:
  • F(I) = W_{N+1} * F_N(I) + B_{N+1}
  • where F(I) is the output of the output layer, F_N(I) is the output of the hidden layer, * is the convolution operation, W_{N+1} is the weight coefficient of the convolution layer filter bank of the output layer, and B_{N+1} is the offset coefficient of the convolution layer filter bank of the output layer.
  • W_{N+1} corresponds to n_{N+1} convolution filters; that is, n_{N+1} convolution filters are applied to the input of the (N+1)-th convolution layer, and n_{N+1} image blocks are output. n_{N+1} is the number of output de-distorted image color components, generally equal to the number of input distorted image color components; if only one de-distorted image color component is output, n_{N+1} is generally 1. The convolution kernel of each convolution filter is of size c_{N+1} × f_{N+1} × f_{N+1}, where c_{N+1} is the number of input channels and f_{N+1} is the spatial size of each convolution kernel.
  • For example, when the output layer adopts a Residual Learning structure, the output layer includes a convolution layer.
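  • To tie steps 61 to 63 together, here is a compact PyTorch sketch of the three-part network, assuming a ReLU nonlinearity and a Residual Learning output layer; the channel counts, number of hidden layers, and kernel sizes are illustrative assumptions, since the patent leaves them open.

```python
import torch
import torch.nn as nn

class DeDistortionCNN(nn.Module):
    """Sketch of the input/hidden/output structure described above,
    with a Residual Learning output layer."""

    def __init__(self, color_channels=1, side_channels=1,
                 n_features=64, hidden_layers=3, kernel=3):
        super().__init__()
        pad = kernel // 2
        # Input layer: F_1(I) = g(W_1 * I + B_1), with g = ReLU
        self.input_layer = nn.Sequential(
            nn.Conv2d(color_channels + side_channels, n_features,
                      kernel, padding=pad),
            nn.ReLU())
        # Hidden layers: F_i(I) = g(W_i * F_{i-1}(I) + B_i)
        self.hidden = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(n_features, n_features, kernel, padding=pad),
                nn.ReLU())
            for _ in range(hidden_layers)])
        # Output layer: a single convolution producing the residual
        self.output_layer = nn.Conv2d(n_features, color_channels,
                                      kernel, padding=pad)

    def forward(self, color, side_info):
        # Combine color and side information along the channel dimension
        i = torch.cat([color, side_info], dim=1)
        residual = self.output_layer(self.hidden(self.input_layer(i)))
        # Residual Learning: F(I) = W_{N+1} * F_N(I) + B_{N+1} + Y
        return color + residual
```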
  • In addition, a method for training the convolutional neural network model is also proposed, as shown in FIG. 11, which specifically includes the following processing steps:
  • Step 71 Acquire a preset training set, where the preset training set includes an original sample image, and a distortion image color component of the plurality of distortion images corresponding to the original sample image, and an edge information component corresponding to each of the distortion images, where the distortion image corresponds to The side information component represents the distorted feature of the distorted image relative to the original sample image.
  • the distortion characteristics of the plurality of distorted images are different.
  • the original sample image (i.e., an undistorted natural image) may be subjected to image processing with different degrees of distortion to obtain the distorted images corresponding to each original sample image, and, according to the steps in the de-distortion method described above, a side information component is generated for each distorted image; then, for each original sample image, the original sample image, a distorted image corresponding to it, and the side information component corresponding to that distorted image are combined into an image pair, and these image pairs make up the preset training set Ω. Since the original sample image is subjected to image processing with different degrees of distortion, one original sample image may correspond to a plurality of distorted images.
  • the training set may include an original sample image, and performing image processing on the original sample image to obtain a plurality of distortion images having different distortion characteristics, and side information components corresponding to each distortion image; that is, the training set includes the An original sample image, the plurality of distorted images corresponding to the one original sample image and the side information components corresponding to each of the distorted images.
  • the training set may also include a plurality of original sample images, respectively performing image processing on each of the original sample images to obtain a plurality of distorted images having different distortion characteristics, and side information components corresponding to each distorted image; that is, the training set includes Each of the original sample images, the plurality of distorted images corresponding to the original sample image and the side information component corresponding to each of the distorted images of the original sample image.
  • Step 72 Initialize the parameters of the network parameter set of the convolutional neural network CNN for the convolutional neural network CNN of the preset structure.
  • the initialized parameter set may be represented by ⁇ 1 , and the initialized parameters may be set according to actual needs and experience.
  • the training-related high-level parameters such as the learning rate, the gradient descent algorithm, and the like, may be appropriately set.
  • various methods in the prior art may be used, and detailed descriptions are not provided herein.
  • Step 73 Perform forward calculation.
  • convolution filtering is performed by inputting the distorted image color component of each distorted image in the preset training set and the corresponding side information component into the convolutional neural network of the preset structure, to obtain the de-distorted image color component corresponding to each distorted image.
  • the forward calculation of the convolutional neural network CNN with the parameter set Θ_i is performed on the preset training set Ω, and the output F(Y) of the convolutional neural network is obtained, that is, the de-distorted image color component corresponding to each distorted image.
  • In the first forward calculation, the current parameter set is Θ_1; in each subsequent forward calculation, the current parameter set Θ_i is obtained by adjusting the previously used parameter set Θ_{i-1}.
  • Step 74: Determine a loss value for the plurality of original sample images based on the original image color components of the plurality of original sample images and the obtained de-distorted image color components. For example, the mean squared error (MSE) may be used as the loss function:
  • L(Θ_i) = (1/H) Σ_{h=1}^{H} ||F(I_h; Θ_i) - X_h||²
  • where H represents the number of image pairs selected from the preset training set in a single training pass, I_h represents the input data formed by combining the side information component and the distorted image color component corresponding to the h-th distorted image, F(I_h; Θ_i) represents the de-distorted image color component computed by the forward calculation of the convolutional neural network CNN under the parameter set Θ_i for the h-th distorted image, X_h represents the original image color component corresponding to the h-th distorted image, and i is the number of forward calculations performed so far.
  • Step 75 Determine whether the convolutional neural network of the preset structure adopting the current parameter set is converged based on the loss value. If not, proceed to step 76. If it converges, proceed to step 77.
  • For example, convergence may be determined based on the loss value being less than a preset loss value threshold: convergence is determined when the loss value of each original sample image in the plurality of original sample images is less than the preset loss value threshold, or when the loss value of any original sample image in the plurality of original sample images is less than the preset loss value threshold.
  • Alternatively, convergence may be determined based on the difference between the currently calculated loss value and the previously calculated loss value: the difference between the loss value of each original sample image obtained this time and the loss value of that original sample image obtained last time is calculated, and convergence is determined when the difference for each original sample image is less than a preset change threshold, or when the difference for any original sample image is less than the preset change threshold; the convergence criterion is not limited herein.
• Step 76: Adjust the parameters in the current parameter set to obtain an adjusted parameter set, then return to step 73 for the next forward calculation.
• for example, the back-propagation algorithm may be used to adjust the parameters in the current parameter set.
• Step 77: Take the current parameter set as the final output parameter set θfinal, and use the convolutional neural network of the preset structure with θfinal as the trained convolutional neural network model.
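• Putting steps 73 to 77 together, a training loop consistent with the procedure above might look like the sketch below; it reuses the hypothetical DeDistortionCNN, optimizer, and mse_loss from the earlier sketches, and the thresholds are illustrative:

```python
def train(model, optimizer, loader, loss_threshold=1e-4, change_threshold=1e-7):
    prev_loss = None
    for inputs, originals in loader:            # pairs (I_h, X_h) from the training set
        outputs = model(inputs)                 # step 73: forward calculation
        loss = mse_loss(outputs, originals)     # step 74: loss value L(theta_i)
        # Step 75: convergence test on the loss value or on its change.
        converged = loss.item() < loss_threshold or (
            prev_loss is not None
            and abs(prev_loss - loss.item()) < change_threshold
        )
        if converged:
            break                               # step 77: keep the current parameters
        optimizer.zero_grad()
        loss.backward()                         # step 76: back-propagation
        optimizer.step()
        prev_loss = loss.item()
    return model.state_dict()                   # the final parameter set theta_final
```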
• alternatively, the distorted picture and the side information component corresponding to the distorted picture may be input into the convolutional neural network model for convolution filtering to obtain a de-distorted picture, and the de-distorted picture is then divided to obtain the first de-distorted image block corresponding to each distorted image block in the distorted picture.
• Step 205: Input the target image block into at least one filter for filtering, obtaining the second de-distorted image block output by each filter.
• alternatively, the distorted picture may be input into the at least one filter for filtering to obtain the de-distorted picture output by each filter; the de-distorted picture output by each filter is then divided to obtain the second de-distorted image block corresponding to each distorted image block.
• Step 206: Select one image block from the at least one image block as the target de-distorted image block corresponding to the target image block.
  • the at least one image block may comprise a first de-distorted image block and/or a target image block.
  • the at least one image block may further include each second de-distorted image block.
• in a first implementation, one image block is selected from the at least one image block according to the original image block corresponding to the target image block.
• specifically, a difference value between each image block in the at least one image block and the original image block corresponding to the target image block may be calculated separately, and the image block with the smallest difference from that original image block is selected from the at least one image block.
• the difference value between an image block and the original image block corresponding to the target image block may be a Sum of Squared Differences (SSD) value, i.e., the sum of the squared differences between the estimated values and the values being estimated, as sketched below.
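• As a sketch of step 206, the SSD computation and the minimum-difference selection might look as follows, with blocks represented as numpy arrays:

```python
import numpy as np

def ssd(block_a: np.ndarray, block_b: np.ndarray) -> int:
    # Sum of Squared Differences between two equally sized blocks.
    diff = block_a.astype(np.int64) - block_b.astype(np.int64)
    return int((diff * diff).sum())

def select_target_block(candidates, original_block):
    # Step 206: among the candidate image blocks, pick the one with the
    # smallest SSD relative to the original image block.
    return min(candidates, key=lambda blk: ssd(blk, original_block))
```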
• in addition, flag information identifying the data type of the target de-distorted image block may be filled into a filtered flag map according to the position of the target image block in the distorted picture.
• the data type of the target de-distorted image block may indicate data output by convolutional neural network model filtering, data output by a certain filter among the at least one filter, or the target image block itself.
• in a second implementation, one image block is selected from the at least one image block according to the coding information of each coding unit included in the target image block.
  • the video bitstream output by the entropy encoder includes location and encoding information of each coding unit in the current original video picture.
• each coding unit included in the target image block may be determined according to the position of the target image block in the distorted picture and the position of each coding unit in the current original video picture; one image block is then selected from the at least one image block according to the coding information of each coding unit included in the target image block, as sketched below.
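• A sketch of this position-based mapping follows; representing the target image block and the coding units as (x, y, width, height) rectangles is an assumption for illustration:

```python
def units_in_block(block_rect, coding_units):
    # block_rect  : (x, y, w, h) of the target image block in the distorted picture
    # coding_units: list of dicts, each with 'x', 'y', 'w', 'h' and coding information
    bx, by, bw, bh = block_rect

    def overlaps(cu):
        # A coding unit belongs to the block if their rectangles intersect.
        return not (cu['x'] >= bx + bw or cu['x'] + cu['w'] <= bx or
                    cu['y'] >= by + bh or cu['y'] + cu['h'] <= by)

    return [cu for cu in coding_units if overlaps(cu)]
```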
• the coding information of a coding unit included in the target image block may be a prediction mode and/or a motion vector, etc., and the selection result for the target image block is derived from one or more pieces of coding information. For example, if the coding units exceeding a preset first ratio in the target image block use the intra coding mode, the first de-distorted image block filtered by the convolutional neural network model is selected; if the coding units exceeding a preset second ratio are coded using skip mode (SKIP), the target image block itself is selected, where the second ratio is smaller than the first ratio; otherwise, the second de-distorted image block output by a certain filter is selected. A sketch of this decision rule follows.
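• The decision rule of this example might be sketched as follows; the concrete ratio values and the 'mode' field are assumptions, the application requiring only that the second ratio be smaller than the first:

```python
def choose_by_coding_info(units, first_ratio=0.5, second_ratio=0.3):
    # units: coding units inside the target image block, each with a 'mode' field.
    n = len(units)
    intra_share = sum(1 for u in units if u['mode'] == 'intra') / n
    skip_share = sum(1 for u in units if u['mode'] == 'skip') / n
    if intra_share > first_ratio:
        return 'cnn'      # first de-distorted image block (CNN-filtered)
    if skip_share > second_ratio:
        return 'target'   # the target image block itself
    return 'filter'       # second de-distorted image block from a filter
```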
• in the above manner, the target de-distorted image block corresponding to each distorted image block in the distorted picture is obtained.
• in the embodiment of the present application, since at least one image block is obtained for each distorted image block, and this set includes the distorted image block itself as well as the de-distorted image blocks obtained by filtering with different filters, the image block with the smallest difference from the original image block can be selected from the at least one image block as the target de-distorted image block, either according to the original image block corresponding to the distorted image block or according to the coding information of each coding unit included in the distorted image block, thereby improving the filtering performance and quality as well as the de-distortion performance.
• Step 207: Encode the original video picture to be encoded according to the target de-distorted image block corresponding to each distorted image block included in the distorted picture, to obtain a video bitstream.
• specifically, the target de-distorted image block corresponding to each distorted image block is filled into a blank reference picture, and the reference picture is cached in the buffer. When this reference picture is selected, the original video picture to be encoded may be encoded using it to obtain a video bitstream, where the original video picture to be encoded refers to an original video picture that has not yet been encoded and may be an original video picture following the current original video picture. A sketch of the filling step follows.
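• Filling the selected blocks into a blank reference picture might look like the following sketch, assuming a grayscale numpy picture and blocks carried with their top-left positions:

```python
import numpy as np

def build_reference_picture(height, width, placed_blocks):
    # placed_blocks: iterable of ((x, y), block) pairs, block being a 2-D array.
    ref = np.zeros((height, width), dtype=np.uint8)   # blank reference picture
    for (x, y), block in placed_blocks:
        h, w = block.shape
        ref[y:y + h, x:x + w] = block
    return ref
```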
• in the embodiment of the present application, the target image block is filtered to obtain the first de-distorted image block corresponding to a distorted image block of the distorted picture, and the at least one filter is used to filter the target image block to obtain the second de-distorted image block output by each filter. The image block with the smallest difference from the original image block corresponding to the target image block is then selected from the target image block, the first de-distorted image block, and the second de-distorted image blocks as the final filtered image block, which not only improves the filtering performance but also improves the de-distortion performance in the video encoding process.
• in addition, the side information component is fed into the convolutional neural network model, which improves the generalization ability of the convolutional neural network model.
  • an embodiment of the present application provides a video decoding method, where the method includes:
• Step 301: Entropy-decode the received video bitstream to obtain current entropy decoded data.
• Step 302: Acquire each distorted image block included in the distorted picture, where the distorted picture is generated when the current entropy decoded data is decoded.
• the distorted picture has distortion relative to the pre-encoding original video picture corresponding to the video bitstream input into the decoding system.
• Step 303: Determine the data type corresponding to the target image block according to the current entropy decoded data, where the target image block is a distorted image block in the distorted picture.
• Step 304: When the data type represents data filtered by the convolutional neural network model, generate the side information component corresponding to the target image block.
• the side information component represents the distortion feature of the target image block relative to its corresponding original image block in the original video picture, where the original video picture is the video picture corresponding to the current entropy decoded data.
• Step 305: Input the target image block and the side information component into the convolutional neural network model for convolution filtering to obtain the de-distorted image block corresponding to the target image block.
  • the de-distorted image block is obtained by filtering the target image block by using the side information component as a guide.
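• The application does not fix a single formula for the side information component; as one hedged example, since the quantization parameter reflects the strength of quantization distortion, the component could be a constant plane carrying the normalized quantization parameter of the block, where the normalization constant below is an assumption:

```python
import numpy as np

def side_info_component(block_shape, qp, qp_max=51.0):
    # One plausible side information component: a plane of the same size as the
    # target image block, filled with the normalized quantization parameter,
    # which reflects the distortion strength introduced by quantization.
    return np.full(block_shape, qp / qp_max, dtype=np.float32)
```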
  • the convolutional neural network model is obtained by training based on a preset training set, and the preset training set includes an original sample picture, a plurality of distorted pictures corresponding to the original sample picture, and side information components corresponding to each distorted picture.
• Step 306: Decode the subsequently received video bitstream according to the de-distorted image block corresponding to each distorted image block included in the distorted picture.
• in the embodiment of the present application, the data type corresponding to the target image block is determined according to the current entropy decoded data, and the filtering method is selected according to the data type, thereby improving the filtering performance and the de-distortion performance in the video decoding process.
  • the detailed implementation process of the method may include:
• Step 401: Entropy-decode the received video bitstream to obtain current entropy decoded data.
• Step 402: Obtain each distorted image block included in the distorted picture, where the distorted picture is generated when the current entropy decoded data is decoded.
• the distorted picture has distortion relative to the pre-encoding original video picture corresponding to the video bitstream input into the decoding system.
  • a reconstructed picture may be generated during the video decoding process, and the distorted picture may be the reconstructed picture, or may be a picture obtained by filtering the reconstructed picture.
• the video decoding system includes a prediction module, an entropy decoder, an inverse quantization unit, an inverse transform unit, a reconstruction unit, a CNN (convolutional neural network) model, and a buffer.
• the decoding process of the video decoding system is as follows: the received video bitstream is input into the entropy decoder, which decodes the bitstream to obtain entropy decoded data; the entropy decoded data includes mode information, quantization parameters, residual information, and the coding information and/or filtered flag map of each coding unit included in the original video picture. The mode information is input into the prediction module, the quantization parameter is input into the convolutional neural network model, and the residual information is input into the inverse quantization unit.
• the prediction module performs prediction according to the input mode information and the reference picture in the buffer to obtain prediction data, and inputs the prediction data into the reconstruction unit.
  • the prediction module includes an intra prediction unit, a motion compensation unit, and a switch, and the mode information may include intra mode information and inter mode information.
• the intra prediction unit may obtain intra prediction data by performing intra prediction using the intra mode information;
• the motion compensation unit may obtain inter prediction data by performing inter prediction on the inter mode information according to the reference picture cached in the buffer, and the switch selects either the intra prediction data or the inter prediction data for output to the reconstruction unit.
• the inverse quantization unit and the inverse transform unit respectively perform inverse quantization and inverse transform on the residual information to obtain prediction error information and input it into the reconstruction unit; the reconstruction unit generates the reconstructed picture from the prediction error information and the prediction data, as sketched below.
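• A sketch of the reconstruction unit's computation follows; the inverse quantization and inverse transform are passed in as functions, since their exact form depends on the codec:

```python
import numpy as np

def reconstruct(residual_coeffs, prediction, inverse_quantize, inverse_transform):
    # Inverse-quantize and inverse-transform the residual information to obtain
    # the prediction error, then add the prediction data and clip to pixel range.
    prediction_error = inverse_transform(inverse_quantize(residual_coeffs))
    reconstructed = prediction.astype(np.int32) + prediction_error
    return np.clip(reconstructed, 0, 255).astype(np.uint8)
```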
  • the reconstructed picture generated by the reconstructing unit may be acquired, and the reconstructed picture is taken as a distorted picture.
  • a filter may be connected between the convolutional neural network model and the reconstruction unit, and the filter may further filter the reconstructed picture generated by the reconstruction unit, and output the filtered reconstructed picture.
  • the filtered reconstructed picture may be obtained, and the filtered reconstructed picture is taken as a distorted picture.
• Step 403: Determine the data type corresponding to the target image block according to the current entropy decoded data, where the target image block is a distorted image block in the distorted picture.
• in a first implementation, the current entropy decoded data includes a filtered flag map containing flag information corresponding to each distorted image block in the distorted picture; the flag information corresponding to a distorted image block identifies the data type corresponding to that distorted image block.
• accordingly, this step may be: reading the flag information corresponding to the target image block from the filtered flag map according to the position of the target image block in the distorted picture, and determining the data type corresponding to the target image block according to the flag information, as sketched below.
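• Reading the flag might look like the following sketch; the raster layout of the flag map and the concrete flag values are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical encodings of the data type carried by the filtered flag map.
FLAG_CNN, FLAG_FILTER, FLAG_TARGET = 0, 1, 2

def data_type_for_block(flag_map: np.ndarray, x: int, y: int, block_size: int):
    # flag_map holds one flag per distorted image block, indexed by block row/column;
    # (x, y) is the top-left position of the target image block in the distorted picture.
    return flag_map[y // block_size, x // block_size]
```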
• alternatively, in a second implementation, the current entropy decoded data includes the location and coding information of each coding unit in the original video picture.
• in that case, this step may be: determining each coding unit included in the target image block according to the position of the target image block in the distorted picture and the position of each coding unit in the original video picture, and then determining the data type corresponding to the target image block according to the coding information of each coding unit included in the target image block.
• the coding information of a coding unit included in the target image block may be a prediction mode and/or a motion vector, etc. For example, if the coding units exceeding a preset first ratio in the target image block use the intra coding mode, the data type is determined to be data filtered by the convolutional neural network model; if the coding units exceeding a preset second ratio in the target image block are coded using skip mode (SKIP), the data type is determined to be data filtered by a filter, where the second ratio is smaller than the first ratio; otherwise, the data type is determined to be the target image block itself.
• Step 404: When the data type represents data filtered by the convolutional neural network model, generate the side information component corresponding to the target image block.
• the side information component represents the distortion feature of the target image block relative to its corresponding original image block in the original video picture, where the original video picture is the video picture corresponding to the current entropy decoded data.
• for the detailed implementation process of generating the side information component corresponding to the target image block, refer to the related content of step 203 in the embodiment shown in FIG. 2, which is not described in detail here.
• Step 405: Input the target image block and the side information component into the convolutional neural network model for convolution filtering to obtain the de-distorted image block corresponding to the target image block.
• Step 406: When the data type represents data output by a certain filter, input the target image block into that filter for filtering to obtain the de-distorted image block.
• Step 407: When the data type represents the target image block itself, determine the target image block as the de-distorted image block.
• in this way, the de-distorted image block corresponding to each distorted image block in the distorted picture is obtained.
• Step 408: Decode the subsequently received video bitstream according to the de-distorted image block corresponding to each distorted image block included in the distorted picture.
• specifically, the de-distorted image block corresponding to each distorted image block is filled into a blank reference picture, and the reference picture is stored in the buffer, so that the subsequently received video bitstream is decoded using the reference picture in the buffer.
• in the embodiment of the present application, the data type corresponding to the target image block is determined according to the current entropy decoded data, and the filtering method is selected according to the data type, thereby improving the filtering performance and the de-distortion performance in the video decoding process.
  • an embodiment of the present application provides a video encoding apparatus 500, where the apparatus 500 includes:
• an obtaining module 501, configured to obtain a distorted picture and the side information component corresponding to the distorted picture, where the distorted picture has distortion relative to the current original video picture input into the coding system, and the side information component represents the distortion characteristics of the distorted picture relative to the current original video picture;
• a filtering module 502, configured to input the distorted picture and the side information component into the convolutional neural network model for filtering to obtain the first de-distorted image block corresponding to the target image block, where the first de-distorted image block is obtained by filtering the distorted picture with the side information component as a guide, and the target image block is any distorted image block included in the distorted picture;
• a selection module 503, configured to select one image block from at least one image block corresponding to the target image block as the target de-distorted image block corresponding to the target image block, where the at least one image block includes the first de-distorted image block corresponding to the target image block and/or the target image block; and
• an encoding module 504, configured to encode the original video picture following the current original video picture according to the target de-distorted image block corresponding to the target image block, to obtain a video bitstream.
• the distorted picture is generated when the current original video picture is encoded; the convolutional neural network model is obtained by training on a preset training set, where the preset training set includes original sample pictures, the multiple distorted pictures corresponding to each original sample picture, and the side information component corresponding to each of those distorted pictures.
  • the filtering module 502 includes:
• a first filtering unit, configured to divide the distorted picture to obtain the distorted image blocks included in the distorted picture, and to input the target image block and the side information component corresponding to the target image block into the convolutional neural network model for filtering, obtaining the first de-distorted image block corresponding to the target image block, where the target image block is any distorted image block included in the distorted picture; or
• a second filtering unit, configured to input the distorted picture and the side information component into the convolutional neural network model for filtering to obtain a de-distorted picture, and to divide the de-distorted picture to obtain the first de-distorted image block corresponding to the target image block, the target image block being any distorted image block included in the distorted picture.
• the filtering module is further configured to: input the target image block into at least one filter for filtering to obtain the second de-distorted image block output by each filter, the at least one image block further including each second de-distorted image block.
  • the selecting module 503 includes:
• a first selecting unit, configured to select one image block from the at least one image block according to the original image block corresponding to the target image block in the current original video picture; or
  • a second selecting unit configured to select one image block from the at least one image block according to encoding information of each coding unit included in the target image block.
  • the first selecting unit is configured to:
• select, from the at least one image block, the image block having the smallest difference value from the original image block corresponding to the target image block.
  • the video bitstream further includes a filtered flag map corresponding to the distorted picture
  • the apparatus further includes:
• a module configured to fill, into the filtered flag map according to the position of the target image block in the distorted picture, the flag information identifying the data type of the target de-distorted image block.
• in the embodiment of the present application, the target image block is filtered to obtain the first de-distorted image block corresponding to the target image block, and the image block with the smallest difference from the original image block is then selected from the target image block and/or the first de-distorted image block as the final filtered image block, which not only improves the filtering performance but also improves the de-distortion performance in the video encoding process.
  • an embodiment of the present application provides a video decoding apparatus 600, where the apparatus 600 includes:
• an obtaining module 601, configured to obtain a target image block and the data type of the target image block, where the target image block is any distorted image block included in the distorted picture, and the distorted picture has distortion relative to the pre-encoding original video picture corresponding to the video bitstream input into the decoding system;
• a generating module 602, configured to generate the side information component corresponding to the target image block when the data type represents data filtered by the convolutional neural network model, where the side information component represents the distortion feature of the target image block relative to its corresponding original image block in the original video picture;
• a filtering module 603, configured to input the target image block and the side information component into the convolutional neural network model for convolution filtering to obtain the de-distorted image block corresponding to the target image block, where the de-distorted image block is obtained by filtering the target image block with the side information component as a guide; and
• a decoding module 604, configured to decode the subsequently received video bitstream according to the de-distorted image block corresponding to the target image block.
  • the decoding module 604 is configured to perform entropy decoding on the received video bitstream to obtain current entropy decoded data.
  • the side information component corresponding to the target image block represents a distortion feature of the target image block relative to the original image block corresponding to the original video picture
  • the original video picture is a video picture corresponding to the current entropy decoded data.
  • the filtering module 603 is further configured to:
• when the data type represents data output by a certain filter, input the target image block into that filter for filtering to obtain the de-distorted image block corresponding to the target image block; and
• when the data type represents the target image block itself, determine the target image block as the de-distorted image block corresponding to the target image block.
• the current entropy decoded data includes a filtered flag map, where the filtered flag map includes flag information corresponding to each distorted image block, and the flag information corresponding to a distorted image block is used to identify the data type corresponding to that distorted image block.
  • the obtaining module 601 includes:
  • a reading unit configured to read, according to the position of the target image block in the distorted picture, the flag information corresponding to the target image block from the filtered flag map;
  • a first determining unit configured to determine, according to the flag information, a data type corresponding to the target image block.
• optionally, the current entropy decoded data includes the location and coding information of each coding unit in the original video picture, and the obtaining module 601 includes:
  • a second determining unit configured to determine, according to a location of the target image block in the distorted picture and a location of each coding unit in the original video picture, each coding unit included in the target image block;
  • a third determining unit configured to determine, according to the encoding information of each coding unit included in the target image block, a data type corresponding to the target image block.
• in the embodiment of the present application, the data type corresponding to the target image block is determined according to the current entropy decoded data, and the filtering method is selected according to the data type, thereby improving the filtering performance and the de-distortion performance in the video decoding process.
  • an embodiment of the present application provides a video encoding method, where the method includes:
• Step 701: Acquire the distorted image blocks included in the distorted picture, the distorted picture having distortion relative to the current original video picture input into the coding system.
  • the distorted picture is generated when the original video picture is encoded.
• Step 702: Determine the data type corresponding to the target image block according to the coding information of the coding units included in the target image block, where the target image block is any of the distorted image blocks.
• Step 703: When the data type represents data filtered by the convolutional neural network model, generate the side information component corresponding to the target image block, where the side information component represents the distortion characteristics of the target image block relative to its corresponding original image block in the original video picture.
• Step 704: Input the target image block and the side information component into the convolutional neural network model for convolution filtering to obtain the de-distorted image block corresponding to the target image block, the convolutional neural network model being trained on a preset training set.
• the preset training set includes an original sample picture, a plurality of distorted pictures corresponding to the original sample picture, and the side information component corresponding to each of those distorted pictures.
• Step 705: Encode the original video picture following the current original video picture according to the de-distorted image block corresponding to each distorted image block in the distorted picture, to obtain a video bitstream.
• in the embodiment of the present application, the data type corresponding to the target image block is determined according to the coding information of the coding units included in the distorted image block, and the filtering method is selected according to the data type, thereby improving the filtering performance and the de-distortion performance in the video encoding process.
  • the detailed implementation process of the method may include:
  • Steps 801-802 are the same as steps 201-202 in the embodiment shown in FIG. 2, and will not be described in detail herein.
• Step 803: Determine the data type corresponding to the target image block according to the coding information of the coding units included in the target image block, where the target image block is any distorted image block in the distorted picture.
  • the video bitstream is obtained when the current original video picture is video-encoded, and the video bitstream includes the location and coding information of each coding unit in the current original video picture.
• each coding unit included in the target image block may be determined according to the position of the target image block in the distorted picture and the position of each coding unit in the original video picture; the data type corresponding to the target image block is then determined according to the coding information of each coding unit included in the target image block.
• the coding information of a coding unit included in the target image block may be a prediction mode and/or a motion vector, etc. For example, if the coding units exceeding a preset first ratio in the target image block use the intra coding mode, the data type is determined to be data filtered by the convolutional neural network model; if the coding units exceeding a preset second ratio in the target image block are coded using skip mode (SKIP), the data type is determined to be data filtered by a filter, where the second ratio is smaller than the first ratio; otherwise, the data type is determined to be the target image block itself.
• Step 804: When the data type represents data filtered by the convolutional neural network model, generate the side information component corresponding to the target image block.
• the side information component represents the distortion feature of the target image block relative to its corresponding original image block in the original video picture, where the original video picture is the current original video picture input into the coding system.
• for the detailed implementation process of generating the side information component corresponding to the target image block, refer to the related content of step 203 in the embodiment shown in FIG. 2, which is not described in detail here.
• Step 805: Input the target image block and the side information component into the convolutional neural network model for convolution filtering to obtain the de-distorted image block corresponding to the target image block.
• for the detailed implementation process of the convolution filtering performed by the convolutional neural network model, refer to the related content of step 204 in the embodiment shown in FIG. 2, which is not described in detail here.
• Step 806: When the data type represents data output by a certain filter, input the target image block into that filter for filtering to obtain the de-distorted image block.
• Step 807: When the data type represents the target image block itself, determine the target image block as the de-distorted image block.
• in this way, the de-distorted image blocks corresponding to the respective distorted image blocks in the distorted picture are obtained.
• Step 808: Encode the original video picture following the current original video picture according to the de-distorted image block corresponding to each distorted image block included in the distorted picture, to obtain a video bitstream.
• specifically, the de-distorted image block corresponding to each distorted image block is filled into a blank reference picture, and the reference picture is cached in the buffer.
• when the reference picture is selected, the original video picture to be encoded can be encoded using it to obtain a video bitstream.
• in the embodiment of the present application, the data type corresponding to the target image block is determined according to the coding information of each coding unit included in the target image block, and the filtering method is selected according to the data type. Since the coding information reflects the original image information of the coding unit in the original video picture, the coding method can determine, before filtering, the filtering method (i.e., the data type) that yields small distortion; selecting the filter according to the data type not only improves the filtering performance but also improves the de-distortion performance in the video encoding process.
  • an embodiment of the present application provides a video encoding apparatus 900, where the apparatus 900 includes:
• an obtaining module 901, configured to obtain the distorted image blocks included in the distorted picture, where the distorted picture has distortion relative to the current original video picture input into the coding system, the distorted picture being generated when the original video picture is encoded;
• a determining module 902, configured to determine, according to the coding information of the coding units included in the target image block, the data type corresponding to the target image block, where the target image block is any one of the distorted image blocks;
• a generating module 903, configured to generate the side information component corresponding to the target image block when the data type represents data filtered by the convolutional neural network model, where the side information component represents the distortion feature of the target image block relative to its corresponding original image block in the original video picture;
• a filtering module 904, configured to input the target image block and the side information component into the convolutional neural network model for convolution filtering to obtain the de-distorted image block corresponding to the target image block, where the de-distorted image block is obtained by filtering the target image block with the side information component as a guide; and
• an encoding module 905, configured to encode the original video picture following the current original video picture according to the de-distorted image block corresponding to the distorted image block in the distorted picture, to obtain a video bitstream.
• the convolutional neural network model is obtained by training on a preset training set, where the preset training set includes an original sample picture, a plurality of distorted pictures corresponding to the original sample picture, and the side information component corresponding to each distorted picture.
  • the filtering module 904 is further configured to:
• when the data type represents data output by a certain filter, input the target image block into that filter for filtering to obtain the de-distorted image block corresponding to the target image block; and
• when the data type represents the target image block itself, determine the target image block as the de-distorted image block corresponding to the target image block.
• in the embodiment of the present application, the data type corresponding to the target image block is determined according to the coding information of each coding unit included in the target image block, and the filtering method is selected according to the data type, thereby improving the filtering performance and the de-distortion performance in the video encoding process.
• an embodiment of the present application provides a codec system 1000, which includes the video encoding apparatus 1001 provided in the embodiment shown in FIG. 16 and the video decoding apparatus 1002 provided in the embodiment shown in FIG.; or
• the system 1000 includes the video encoding apparatus 1001 provided in the embodiment shown in FIG. 20 and the video decoding apparatus 1002 provided in the embodiment shown in FIG.
  • FIG. 22 is a block diagram showing the structure of a terminal 1100 according to an exemplary embodiment of the present invention.
• the terminal 1100 can be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer.
  • Terminal 1100 may also be referred to as a user device, a portable terminal, a laptop terminal, a desktop terminal, and the like.
  • the terminal 1100 includes a processor 1101 and a memory 1102.
  • the processor 1101 can include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
• the processor 1101 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array).
  • the processor 1101 may also include a main processor and a coprocessor.
• the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state.
• the processor 1101 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 1101 may further include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
  • Memory 1102 can include one or more computer readable storage media, which can be non-transitory. Memory 1102 can also include high speed random access memory, as well as non-volatile memory, such as one or more disk storage devices, flash storage devices. In some embodiments, the non-transitory computer readable storage medium in the memory 1102 is configured to store at least one instruction for execution by the processor 1101 to implement the video encoding provided by the method embodiments of the present application. Method or video decoding method.
  • the terminal 1100 further optionally includes: a peripheral device interface 1103 and at least one peripheral device.
  • the processor 1101, the memory 1102, and the peripheral device interface 1103 may be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 1103 via a bus, signal line or circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 1104, a touch display screen 1105, a camera 1106, an audio circuit 1107, a positioning component 1108, and a power source 1109.
  • the peripheral device interface 1103 can be used to connect at least one peripheral device associated with an I/O (Input/Output) to the processor 1101 and the memory 1102.
• in some embodiments, the processor 1101, the memory 1102, and the peripheral interface 1103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1101, the memory 1102, and the peripheral interface 1103 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the RF circuit 1104 is configured to receive and transmit an RF (Radio Frequency) signal, also called an electromagnetic signal.
  • the RF circuit 1104 communicates with the communication network and other communication devices via electromagnetic signals.
  • the radio frequency circuit 1104 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal.
  • the radio frequency circuit 1104 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like.
  • the radio frequency circuit 1104 can communicate with other terminals via at least one wireless communication protocol.
  • the wireless communication protocols include, but are not limited to, the World Wide Web, a metropolitan area network, an intranet, generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • the radio frequency circuit 1104 may further include an NFC (Near Field Communication) related circuit, which is not limited in this application.
  • the display 1105 is used to display a UI (User Interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
• the display 1105 also has the ability to capture touch signals on or above its surface.
  • the touch signal can be input to the processor 1101 as a control signal for processing.
  • the display 1105 can also be used to provide virtual buttons and/or virtual keyboards, also referred to as soft buttons and/or soft keyboards.
• in some embodiments, there may be one display screen 1105, disposed on the front panel of the terminal 1100; in other embodiments, there may be at least two display screens 1105, disposed on different surfaces of the terminal 1100 or in a folded design; in still other embodiments, the display screen 1105 may be a flexible display screen disposed on a curved or folded surface of the terminal 1100. The display screen 1105 can even be set to a non-rectangular irregular pattern, i.e., a shaped screen.
  • the display 1105 can be made of a material such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
  • Camera component 1106 is used to capture images or video.
  • camera assembly 1106 includes a front camera and a rear camera.
  • the front camera is placed on the front panel of the terminal, and the rear camera is placed on the back of the terminal.
• in some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize a background blur function through fusion of the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting through combination of the main camera and the wide-angle camera, or other integrated shooting functions.
  • the camera assembly 1106 can also include a flash.
  • the flash can be a monochrome temperature flash or a two-color temperature flash.
  • the two-color temperature flash is a combination of a warm flash and a cool flash that can be used for light compensation at different color temperatures.
  • the audio circuit 1107 can include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals for input to the processor 1101 for processing, or to the RF circuit 1104 for voice communication.
• in some embodiments, there may be multiple microphones, disposed at different parts of the terminal 1100.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is then used to convert electrical signals from the processor 1101 or the RF circuit 1104 into sound waves.
  • the speaker can be a conventional film speaker or a piezoelectric ceramic speaker.
  • the audio circuit 1107 can also include a headphone jack.
  • the positioning component 1108 is configured to locate the current geographic location of the terminal 1100 to implement navigation or LBS (Location Based Service).
• the positioning component 1108 can be a positioning component based on the US GPS (Global Positioning System), the Chinese BeiDou system, the Russian GLONASS system, or the European Galileo system.
  • a power supply 1109 is used to power various components in the terminal 1100.
  • the power source 1109 can be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
  • the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery.
  • a wired rechargeable battery is a battery that is charged by a wired line
  • a wireless rechargeable battery is a battery that is charged by a wireless coil.
  • the rechargeable battery can also be used to support fast charging technology.
  • terminal 1100 also includes one or more sensors 810.
  • the one or more sensors 810 include, but are not limited to, an acceleration sensor 811, a gyro sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.
  • the acceleration sensor 811 can detect the magnitude of the acceleration on the three coordinate axes of the coordinate system established by the terminal 1100.
  • the acceleration sensor 811 can be used to detect components of gravity acceleration on three coordinate axes.
  • the processor 1101 can control the touch display screen 1105 to display the user interface in a landscape view or a portrait view according to the gravity acceleration signal collected by the acceleration sensor 811.
  • the acceleration sensor 811 can also be used for the acquisition of game or user motion data.
  • the gyro sensor 812 can detect the body direction and the rotation angle of the terminal 1100, and the gyro sensor 812 can cooperate with the acceleration sensor 811 to collect the 3D motion of the user to the terminal 1100. Based on the data collected by the gyro sensor 812, the processor 1101 can implement functions such as motion sensing (such as changing the UI according to the user's tilting operation), image stabilization at the time of shooting, game control, and inertial navigation.
  • the pressure sensor 813 may be disposed at a side border of the terminal 1100 and/or a lower layer of the touch display screen 1105.
• when the pressure sensor 813 is disposed on the side frame of the terminal 1100, the user's holding signal on the terminal 1100 can be detected, and the processor 1101 performs left/right-hand recognition or a shortcut operation according to the holding signal collected by the pressure sensor 813.
• when the pressure sensor 813 is disposed at the lower layer of the touch display screen 1105, the processor 1101 controls the operability controls on the UI according to the user's pressure operation on the touch display screen 1105.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 814 is used to collect the fingerprint of the user.
  • the processor 1101 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 identifies the identity of the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 1101 authorizes the user to perform related sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying and changing settings, and the like.
• the fingerprint sensor 814 can be disposed on the front, back, or side of the terminal 1100. When a physical button or manufacturer logo is provided on the terminal 1100, the fingerprint sensor 814 can be integrated with the physical button or manufacturer logo.
  • Optical sensor 815 is used to collect ambient light intensity.
  • the processor 1101 can control the display brightness of the touch display 1105 based on the ambient light intensity acquired by the optical sensor 815. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1105 is raised; when the ambient light intensity is low, the display brightness of the touch display screen 1105 is lowered.
  • the processor 1101 can also dynamically adjust the shooting parameters of the camera assembly 1106 according to the ambient light intensity collected by the optical sensor 815.
• Proximity sensor 816, also referred to as a distance sensor, is typically disposed on the front panel of terminal 1100. The proximity sensor 816 is used to collect the distance between the user and the front side of the terminal 1100. In one embodiment, when the proximity sensor 816 detects that the distance between the user and the front side of the terminal 1100 gradually decreases, the processor 1101 controls the touch display screen 1105 to switch from the bright-screen state to the off-screen state; when the proximity sensor 816 detects that the distance between the user and the front side of the terminal 1100 gradually increases, the processor 1101 controls the touch display screen 1105 to switch from the off-screen state to the bright-screen state.
• those skilled in the art will understand that the structure shown in FIG. 22 does not constitute a limitation on the terminal 1100, which may include more or fewer components than illustrated, combine some components, or adopt a different component arrangement.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to a video encoding method, a video decoding method, a device, and a system, belonging to the field of video encoding and decoding. The method comprises: acquiring a distorted picture and a side information component corresponding to the distorted picture, and inputting the distorted picture and the side information component into a convolutional neural network model for filtering processing so as to obtain a first de-distorted image block corresponding to a distorted image block included in the distorted picture; selecting one image block from at least one image block corresponding to the distorted image block as the target de-distorted image block of the distorted image block, the at least one image block comprising the first de-distorted image block corresponding to the distorted image block and/or the distorted image block; and encoding, according to the target de-distorted image block corresponding to each distorted image block included in the distorted picture, an original video picture following the current original video picture to obtain a video bitstream. The present invention improves image de-distortion performance.
PCT/CN2019/072417 2018-01-18 2019-01-18 Procédé de codage vidéo, procédé de décodage vidéo, dispositif, et système WO2019141258A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810050810.6 2018-01-18
CN201810050810.6A CN110062226B (zh) 2018-01-18 2018-01-18 一种视频编码方法、视频解码方法、装置、系统及介质

Publications (1)

Publication Number Publication Date
WO2019141258A1 true WO2019141258A1 (fr) 2019-07-25

Family

ID=67301287

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/072417 WO2019141258A1 (fr) 2018-01-18 2019-01-18 Procédé de codage vidéo, procédé de décodage vidéo, dispositif, et système

Country Status (2)

Country Link
CN (1) CN110062226B (fr)
WO (1) WO2019141258A1 (fr)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05191796A (ja) * 1992-01-10 1993-07-30 Sharp Corp ブロック歪補正器
CN103621083A (zh) * 2011-06-30 2014-03-05 三菱电机株式会社 图像编码装置、图像解码装置、图像编码方法以及图像解码方法
WO2017036370A1 (fr) * 2015-09-03 2017-03-09 Mediatek Inc. Procédé et appareil de traitement basé sur un réseau neuronal dans un codage vidéo
CN107197260A (zh) * 2017-06-12 2017-09-22 清华大学深圳研究生院 基于卷积神经网络的视频编码后置滤波方法
CN108932697A (zh) * 2017-05-26 2018-12-04 杭州海康威视数字技术股份有限公司 一种失真图像的去失真方法、装置及电子设备
CN109120937A (zh) * 2017-06-26 2019-01-01 杭州海康威视数字技术股份有限公司 一种视频编码方法、解码方法、装置及电子设备
CN109151475A (zh) * 2017-06-27 2019-01-04 杭州海康威视数字技术股份有限公司 一种视频编码方法、解码方法、装置及电子设备

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113132755A (zh) * 2019-12-31 2021-07-16 北京大学 一种可扩展人机协同图像编码方法及编码系统
CN113132755B (zh) * 2019-12-31 2022-04-01 北京大学 可扩展人机协同图像编码方法及系统、解码器训练方法

Also Published As

Publication number Publication date
CN110062226B (zh) 2021-06-11
CN110062226A (zh) 2019-07-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19741850

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19741850

Country of ref document: EP

Kind code of ref document: A1