WO2023169303A1 - Encoding and decoding method, apparatus, device, storage medium, and computer program product - Google Patents

Encoding and decoding method, apparatus, device, storage medium, and computer program product

Info

Publication number
WO2023169303A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
probability distribution
image
super
network
Prior art date
Application number
PCT/CN2023/079340
Other languages
English (en)
French (fr)
Inventor
毛珏
赵寅
于德权
张恋
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023169303A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding

Definitions

  • the present application relates to the field of coding and decoding technology, and in particular to a coding and decoding method, device, equipment, storage medium and computer program product.
  • Image compression technology can achieve effective transmission and storage of image information, and plays an important role in the current media era where the types and amounts of image information are increasing.
  • Image compression technology includes encoding and decoding images, and encoding and decoding performance reflects image quality and is an element that needs to be considered in image compression technology.
  • The image feature y of the image is extracted through an image feature extraction network, and y is quantized according to the quantization step size q to obtain the quantized image feature ys.
  • The image feature ys is input into the super-encoding network to determine the super-prior feature zs, and zs is encoded into the code stream through entropy coding.
  • Entropy decoding is performed on the super-prior feature zs in the code stream to obtain the super-prior feature zs’. Based on zs’, the probability distribution parameters of the image feature ys are obtained through the probability distribution estimation network.
  • Based on these probability distribution parameters, the image feature ys is encoded into the code stream through entropy coding.
  • The decoding process is symmetrical to the encoding process. A large part of the compression is achieved by the quantization operation, which therefore has a great impact on encoding and decoding performance.
  • The quantization operation used during encoding and decoding needs to match the code rate.
  • In a multi-code-rate scenario, different quantization step sizes are therefore often needed during encoding and decoding.
  • Different quantization step sizes cause significant differences in the numerical range of the quantized image feature ys.
  • Because the numerical range of ys varies greatly across code rates, the probability distribution estimation network is more difficult to train.
  • Network training is unstable, and it is difficult to train a probability distribution estimation network with good performance, which degrades encoding and decoding performance.
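The range problem described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that quantization is ys = round(y / q); the function name, the feature values and the step sizes are hypothetical and are not taken from the application.

```python
def quantize(y, q):
    """Quantize each feature value with step q (assumed convention: round(y / q))."""
    return [round(v / q) for v in y]

# Hypothetical unquantized feature values; their range does not depend on q.
y = [-12.4, -3.1, 0.2, 5.7, 11.9]

for q in (0.5, 2.0, 8.0):  # hypothetical steps for three code rates
    ys = quantize(y, q)
    print(f"q={q}: ys={ys}, range=[{min(ys)}, {max(ys)}]")
```

The same y spans roughly 50 integer levels at the smallest step and only a handful at the largest, which is the spread that a network estimating the distribution of ys directly would have to cope with.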
  • Embodiments of the present application provide an encoding and decoding method, apparatus, device, storage medium and computer program product that reduce the training difficulty of the probability distribution estimation network even in a multi-code-rate scenario, so that network training is stable and a network with good performance can be trained, thereby improving encoding and decoding performance.
  • In a first aspect, this application provides an encoding method, including:
  • determining a first image feature and a second image feature of an image to be encoded, where the first image feature is the image feature obtained by quantizing the second image feature according to a first quantization step size; determining a first super-prior feature of the second image feature; encoding the first super-prior feature into the code stream; determining first probability distribution parameters through a probability distribution estimation network based on the first super-prior feature; quantizing the first probability distribution parameters according to the first quantization step size to obtain second probability distribution parameters; and encoding the first image feature into the code stream based on the second probability distribution parameters.
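A minimal sketch of why scaling the distribution parameters by the quantization step can work. This section does not state the distribution family or the exact operation, so the sketch assumes a Gaussian model (a common choice in learned image compression), assumes quantization is ys = round(y / q), and assumes that "quantizing" the parameters means dividing the mean and scale by q. All function names and numbers are hypothetical.

```python
import math

def gaussian_cdf(x, mu, sigma):
    """CDF of a normal distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def scale_params(mu, sigma, q):
    """Assumed form of 'quantizing' the parameters: divide mean and scale by q."""
    return mu / q, sigma / q

def symbol_probability(k, mu_s, sigma_s):
    """Probability that the quantized feature equals the integer k."""
    return gaussian_cdf(k + 0.5, mu_s, sigma_s) - gaussian_cdf(k - 0.5, mu_s, sigma_s)

# Hypothetical first probability distribution parameters for the unquantized feature y.
mu, sigma, q = 3.2, 1.5, 2.0
mu_s, sigma_s = scale_params(mu, sigma, q)  # second probability distribution parameters

k = round(4.1 / q)  # quantized value of a hypothetical feature y = 4.1
p_scaled = symbol_probability(k, mu_s, sigma_s)

# Same probability computed on the unquantized axis: y in [(k - 0.5) q, (k + 0.5) q).
p_direct = gaussian_cdf((k + 0.5) * q, mu, sigma) - gaussian_cdf((k - 0.5) * q, mu, sigma)
print(k, p_scaled, p_direct)
```

Under these assumptions the two probabilities agree, so the network only has to model the unquantized feature, and the step size enters afterwards as a simple rescaling.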
  • determining the first super a priori feature of the second image feature includes: inputting the second image feature into a super coding network to obtain the first super a priori feature. That is, the encoding end inputs the unquantized image features into the super-encoding network to obtain the first super-prior features of the unquantized image features.
  • determining the first super-prior feature of the second image feature includes: inverse quantizing the first image feature according to the first quantization step size to obtain the third image feature of the image; and inputting the third image feature into the super-encoding network to obtain the first super-prior feature. That is, the encoding end inputs the dequantized image feature into the super-encoding network, and the resulting super-prior feature of the dequantized image feature is treated as the first super-prior feature of the unquantized image feature.
  • determining the first probability distribution parameters through the probability distribution estimation network includes: inputting the third image feature of the image into the context network to obtain the context feature of the third image feature, where the third image feature is the image feature obtained by inverse quantizing the first image feature according to the first quantization step size; determining the first a priori feature based on the first super-prior feature; and inputting the first a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters. That is, the encoding end extracts context features from the dequantized image feature and then combines the context features with the first a priori feature to determine the first probability distribution parameters, which helps improve the accuracy of probability estimation.
  • determining the first probability distribution parameters through the probability distribution estimation network includes: inputting the first image feature into the context network to obtain the context feature of the first image feature; determining the first a priori feature based on the first super-prior feature; quantizing the first a priori feature according to the second quantization step size to obtain the second a priori feature; and inputting the second a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters. That is, the encoding end can also extract context features from the quantized image feature and add a quantization operation on the first a priori feature to obtain the second a priori feature, so that the first probability distribution parameters are determined from the second a priori feature and the context feature. This can also improve the accuracy of probability estimation to a certain extent.
  • the first quantization step size is obtained through a gain network based on the code rate of the image, and the gain network is used to determine the quantization step sizes corresponding to multiple code rates. That is to say, the quantization step size is obtained through network learning, and the quantization step size can better match the code rate, which is beneficial to improving encoding and decoding performance.
  • the second aspect provides a decoding method, which includes:
  • parsing the first super-prior feature of the image to be decoded from the code stream; determining first probability distribution parameters through the probability distribution estimation network based on the first super-prior feature, where the first probability distribution parameters represent the probability distribution of the unquantized image feature of the image; quantizing the first probability distribution parameters according to the first quantization step size to obtain second probability distribution parameters; parsing the first image feature of the image from the code stream based on the second probability distribution parameters; and inverse quantizing the first image feature according to the first quantization step size to reconstruct the image.
  • the first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size.
  • performing inverse quantization on the first image feature according to the first quantization step size to reconstruct the image includes: performing inverse quantization on the first image feature according to the first quantization step size to obtain the third image feature of the image; and reconstructing the image based on the third image feature.
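The relationship between the second image feature (unquantized), the first image feature (quantized) and the third image feature (dequantized) can be sketched as a round trip. As an assumption for illustration only, quantization is taken to be round(y / q) and inverse quantization to be multiplication by q; the values are hypothetical.

```python
def quantize(y, q):
    """Assumed quantization convention: divide by the step and round."""
    return [round(v / q) for v in y]

def dequantize(ys, q):
    """Assumed inverse quantization: multiply the integer symbols by the step."""
    return [k * q for k in ys]

y = [-12.4, -3.1, 0.2, 5.7, 11.9]  # hypothetical second image feature
q = 2.0                            # hypothetical first quantization step size

ys = quantize(y, q)                # first image feature (what is entropy coded)
y_hat = dequantize(ys, q)          # third image feature (what the decoder reconstructs from)
max_err = max(abs(a - b) for a, b in zip(y, y_hat))
print(ys, y_hat, max_err)
```

Under this convention the dequantized feature differs from the unquantized one by at most q / 2 per element, which is why the third image feature can stand in for the second image feature at the decoder.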
  • the first probability distribution parameter is a probability distribution parameter of multiple feature points
  • the first super-prior feature is a super-prior feature of the multiple feature points.
  • based on the image features of the decoded feature points in the first image feature, determine the context feature of the first feature point; based on the super-prior feature of the first feature point, determine the first a priori feature of the first feature point; and based on the first a priori feature and the context feature of the first feature point, determine the probability distribution parameters of the first feature point through the probability distribution estimation network, that is, determine the probability distribution parameters of the first feature point within the first probability distribution parameters. Estimating the probability distribution with the help of context features helps improve the accuracy of probability estimation.
  • determining the context feature of the first feature point based on the image features of the decoded feature points in the first image feature includes: determining the peripheral feature points of the first feature point from the decoded feature points; inverse quantizing, according to the first quantization step size, the image features of the peripheral feature points in the first image feature to obtain the peripheral features of the first feature point; and inputting the peripheral features of the first feature point into the context network to obtain the context feature of the first feature point. Determining the probability distribution parameters of the first feature point through the probability distribution estimation network, based on the first a priori feature and the context feature of the first feature point, includes: inputting the first a priori feature and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameters of the first feature point. That is, the decoder extracts context features from the inverse quantized image features and then combines the context features with the first a priori feature to determine the first probability distribution parameters.
  • determining the context feature of the first feature point based on the image features of the decoded feature points in the first image feature includes: determining the peripheral feature points of the first feature point from the decoded feature points; and inputting the image features of the peripheral feature points in the first image feature into the context network to obtain the context feature of the first feature point. Determining the probability distribution parameters of the first feature point through the probability distribution estimation network, based on the first a priori feature and the context feature of the first feature point, includes: quantizing the first a priori feature of the first feature point according to the second quantization step size to obtain the second a priori feature of the first feature point; and inputting the second a priori feature and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameters of the first feature point.
  • That is, the decoder extracts context features from the quantized image features and adds a quantization operation on the first a priori feature to obtain the second a priori feature, so that the first probability distribution parameters are determined by combining the second a priori feature and the context features.
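The two ways of gathering context input described above (from inverse quantized peripheral features, or from the quantized ones as they are) can be sketched as follows. The raster decoding order, the 3x3 causal window and all names are assumptions made for illustration; this section does not specify which peripheral feature points are used, and the context network itself is left out.

```python
def causal_neighbors(i, j, width):
    """Positions decoded before (i, j) in raster order within a 3x3 window.
    Both the window shape and the raster order are illustrative assumptions."""
    candidates = [(i - 1, j - 1), (i - 1, j), (i - 1, j + 1), (i, j - 1)]
    return [(r, c) for r, c in candidates if r >= 0 and 0 <= c < width]

def peripheral_features(decoded, i, j, width, q=None):
    """Collect the features of the decoded peripheral feature points.
    With q given, they are inverse quantized first (first variant);
    with q omitted, the quantized values are returned as-is (second variant)."""
    values = [decoded[pos] for pos in causal_neighbors(i, j, width)]
    return [v * q for v in values] if q is not None else values

# Hypothetical already-decoded symbols of a feature map that is 3 points wide.
decoded = {(0, 0): 3, (0, 1): -1, (0, 2): 0, (1, 0): 2}

dequantized_input = peripheral_features(decoded, 1, 1, width=3, q=2.0)
quantized_input = peripheral_features(decoded, 1, 1, width=3)
print(dequantized_input, quantized_input)
```

In the second variant the context input stays on the quantized scale, which is presumably why the a priori feature is also quantized with the second quantization step size before both are fed to the probability distribution estimation network.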
  • the third aspect provides an encoding method, which includes:
  • determining a first image feature and a second image feature of an image to be encoded, where the first image feature is the image feature obtained by quantizing the second image feature according to a first quantization step size; determining a first super-prior feature of the second image feature; encoding the first super-prior feature into the code stream; determining second probability distribution parameters through a second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features; and encoding the first image feature into the code stream based on the second probability distribution parameters.
  • In this scheme, the super-prior feature of the unquantized image feature is also determined, but the second probability distribution parameters are obtained directly through the second probability distribution estimation network.
  • The second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size. In this scheme, therefore, only the first probability distribution estimation network needs to be trained. Even in a multi-code-rate scenario, the numerical range of unquantized image features is relatively stable and is not affected by the quantization step size, so the first probability distribution estimation network is less difficult to train, network training is stable, and a first probability distribution estimation network with good performance can be obtained, which helps improve encoding and decoding performance.
  • the first probability distribution estimation network is the probability distribution estimation network in the above-mentioned first aspect or second aspect.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.
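A minimal sketch of why rescaling the last layer is equivalent to rescaling the network output. It models the last convolutional layer at a single feature point as a linear map (as a 1x1 convolution would be) and assumes no nonlinearity follows it. The name `factor` is a hypothetical stand-in for whatever scaling the first quantization step size implies under the codec's convention; the weights, biases and inputs are made-up numbers.

```python
def linear_layer(x, weights, bias):
    """Last layer at one feature point: out[o] = sum_i W[o][i] * x[i] + b[o]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def scale_last_layer(weights, bias, factor):
    """Build the second network's last layer by scaling both weights and biases."""
    return [[w * factor for w in row] for row in weights], [b * factor for b in bias]

weights = [[0.5, -1.0], [2.0, 0.25]]  # hypothetical last-layer weights
bias = [0.1, -0.3]                    # hypothetical last-layer biases
x = [1.5, -2.0]                       # hypothetical input to the last layer
factor = 0.5                          # hypothetical scaling derived from the step size

# First network followed by an explicit rescaling of its output parameters.
out_then_scale = [v * factor for v in linear_layer(x, weights, bias)]

# Second network: same computation with the scaling folded into the last layer.
w2, b2 = scale_last_layer(weights, bias, factor)
scaled_net_out = linear_layer(x, w2, b2)
print(out_then_scale, scaled_net_out)
```

Both paths give the same values, so only the first network needs training, and one scaled copy of its last layer per quantization step size serves each code rate. The bias must be scaled together with the weights, otherwise the equality breaks.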
  • determining the first super a priori feature of the second image feature includes: inputting the second image feature into a super coding network to obtain the first super a priori feature. That is, the encoding end inputs the unquantized image features into the super-encoding network to obtain the first super-prior features of the unquantized image features.
  • determining the first super-prior feature of the second image feature includes: inverse quantizing the first image feature according to the first quantization step size to obtain the third image feature of the image; and inputting the third image feature into the super-encoding network to obtain the first super-prior feature. That is, the encoding end inputs the dequantized image feature into the super-encoding network, and the resulting super-prior feature of the dequantized image feature is treated as the first super-prior feature of the unquantized image feature.
  • determining the second probability distribution parameters through the second probability distribution estimation network includes: inputting the third image feature of the image into the context network to obtain the context feature of the third image feature, where the third image feature is the image feature obtained by inverse quantizing the first image feature according to the first quantization step size; determining the first a priori feature based on the first super-prior feature; and inputting the first a priori feature and the context feature into the second probability distribution estimation network to obtain the second probability distribution parameters. That is, the encoding end extracts context features from the dequantized image feature and then combines the context features with the first a priori feature to determine the second probability distribution parameters, which helps improve the accuracy of probability estimation.
  • determining the second probability distribution parameters through the second probability distribution estimation network includes: inputting the first image feature into the context network to obtain the context feature of the first image feature; determining the first a priori feature based on the first super-prior feature; quantizing the first a priori feature according to the second quantization step size to obtain the second a priori feature; and inputting the second a priori feature and the context feature into the second probability distribution estimation network to obtain the second probability distribution parameters. That is, the encoding end can also extract context features from the quantized image feature and add a quantization operation on the first a priori feature to obtain the second a priori feature, so that the second probability distribution parameters are determined from the second a priori feature and the context feature. This can also improve the accuracy of probability estimation to a certain extent.
  • the fourth aspect provides a decoding method, which includes:
  • parsing the first super-prior feature of the image to be decoded from the code stream; determining second probability distribution parameters through a second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features; parsing the first image feature of the image from the code stream based on the second probability distribution parameters; and inverse quantizing the first image feature according to the first quantization step size to reconstruct the image.
  • In this scheme, the second probability distribution parameters are obtained directly through the second probability distribution estimation network.
  • The second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size. In this scheme, only the first probability distribution estimation network needs to be trained.
  • The first probability distribution estimation network is less difficult to train, network training is stable, and a first probability distribution estimation network with good performance can be obtained, which helps improve encoding and decoding performance.
  • the first probability distribution estimation network is the probability distribution estimation network in the above-mentioned first aspect or second aspect.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.
  • the first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size.
  • performing inverse quantization on the first image feature according to the first quantization step size to reconstruct the image includes: performing inverse quantization on the first image feature according to the first quantization step size to obtain the third image feature of the image; and reconstructing the image based on the third image feature.
  • the second probability distribution parameter is a probability distribution parameter of multiple feature points
  • the first super-prior feature is a super-prior feature of the multiple feature points.
  • determining the second probability distribution parameters through the second probability distribution estimation network includes performing the following operations on a first feature point, which is any one of the multiple feature points, to determine the probability distribution parameters of the first feature point: determining the context feature of the first feature point based on the image features of the decoded feature points in the first image feature; determining the first a priori feature of the first feature point based on the super-prior feature of the first feature point; and determining the probability distribution parameters of the first feature point through the second probability distribution estimation network based on the first a priori feature and the context feature of the first feature point, that is, determining the probability distribution parameters of the first feature point within the second probability distribution parameters. Estimating the probability distribution with the help of context features helps improve the accuracy of probability estimation.
  • determining the context feature of the first feature point based on the image features of the decoded feature points in the first image feature includes: determining the peripheral feature points of the first feature point from the decoded feature points; inverse quantizing, according to the first quantization step size, the image features of the peripheral feature points in the first image feature to obtain the peripheral features of the first feature point; and inputting the peripheral features of the first feature point into the context network to obtain the context feature of the first feature point. Determining the probability distribution parameters of the first feature point through the second probability distribution estimation network, based on the first a priori feature and the context feature of the first feature point, includes: inputting the first a priori feature and the context feature of the first feature point into the second probability distribution estimation network to obtain the probability distribution parameters of the first feature point. That is, the decoder extracts context features from the dequantized image features and then combines the context features with the first a priori feature to determine the second probability distribution parameters.
  • determining the context feature of the first feature point based on the image features of the decoded feature points in the first image feature includes: determining the peripheral feature points of the first feature point from the decoded feature points; and inputting the image features of the peripheral feature points in the first image feature into the context network to obtain the context feature of the first feature point. Determining the probability distribution parameters of the first feature point through the second probability distribution estimation network, based on the first a priori feature and the context feature of the first feature point, includes: quantizing the first a priori feature of the first feature point according to the second quantization step size to obtain the second a priori feature of the first feature point; and inputting the second a priori feature and the context feature of the first feature point into the second probability distribution estimation network to obtain the probability distribution parameters of the first feature point. That is, the decoder extracts context features from the quantized image features and adds a quantization operation on the first a priori feature to obtain the second a priori feature, so that the second probability distribution parameters are determined by combining the second a priori feature and the context features.
  • In a fifth aspect, an encoding device is provided, which has the function of implementing the behavior of the encoding method in the above first aspect.
  • the encoding device includes one or more modules, and the one or more modules are used to implement the encoding method provided in the first aspect.
  • That is, an encoding device is provided, which includes:
  • a first determination module, used to determine the first image feature and the second image feature of the image to be encoded, where the first image feature is the image feature obtained by quantizing the second image feature according to the first quantization step size;
  • a second determination module, used to determine the first super-prior feature of the second image feature;
  • a first encoding module, used to encode the first super-prior feature into the code stream;
  • a probability estimation module, used to determine the first probability distribution parameters through the probability distribution estimation network based on the first super-prior feature;
  • a quantization module, used to quantize the first probability distribution parameters according to the first quantization step size to obtain the second probability distribution parameters;
  • a second encoding module, used to encode the first image feature into the code stream based on the second probability distribution parameters.
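How the modules listed above might be wired together can be sketched as a small pipeline. This is only a structural illustration under stated assumptions: the class and field names are hypothetical, the super-encoding network and probability distribution estimation network are replaced by trivial stand-in callables, quantization is assumed to be round(y / q), parameter quantization is assumed to be division by q, and the two entropy-encoding modules are omitted (the sketch returns what would be handed to them).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Feature = List[float]

@dataclass
class EncoderPipeline:
    # Stand-in for the super-encoding network used by the second determination module.
    hyper_encoder: Callable[[Feature], Feature]
    # Stand-in for the probability distribution estimation network (returns mean, scale).
    estimate_params: Callable[[Feature], Tuple[Feature, Feature]]

    def run(self, y: Feature, q: float) -> Dict[str, object]:
        ys = [round(v / q) for v in y]       # first determination module (assumed rounding)
        z = self.hyper_encoder(y)            # second determination module
        mu, sigma = self.estimate_params(z)  # probability estimation module
        mu_s = [m / q for m in mu]           # quantization module (assumed: divide by q)
        sigma_s = [s / q for s in sigma]
        # The first and second encoding modules would entropy-code z and ys here.
        return {"hyperprior": z, "symbols": ys, "params": (mu_s, sigma_s)}

# Trivial stand-ins so the sketch runs; real networks would replace both lambdas.
pipeline = EncoderPipeline(
    hyper_encoder=lambda y: [sum(y) / len(y)],
    estimate_params=lambda z: ([z[0]] * 3, [2.0] * 3),
)
result = pipeline.run([5.0, -3.0, 1.0], q=4.0)
print(result)
```

Passing the networks in as callables keeps the step-dependent operations (feature quantization and parameter rescaling) visibly separate from the step-independent networks.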
  • the second determination module includes:
  • the first super-encoding sub-module is used to input the second image feature into the super-encoding network to obtain the first super-prior feature.
  • the second determination module includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the second super-encoding submodule is used to input the third image feature into the super-encoding network to obtain the first super-prior feature.
  • the probability estimation module includes:
  • the context submodule is used to input the third image feature of the image into the context network to obtain the context feature of the third image feature, where the third image feature is the image feature obtained by inverse quantizing the first image feature according to the first quantization step size;
  • the first determination sub-module is used to determine the first a priori feature based on the first super a priori feature
  • the first probability estimation submodule is used to input the first a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters.
  • the probability estimation module includes:
  • the context submodule is used to input the first image feature into the context network to obtain the context feature of the first image feature;
  • the second determination sub-module is used to determine the first a priori feature based on the first super a priori feature
  • the quantization submodule is used to quantize the first a priori feature according to the second quantization step size to obtain the second a priori feature;
  • the second probability estimation submodule is used to input the second a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters.
  • a sixth aspect provides a decoding device, the decoding device having the function of implementing the behavior of the decoding method in the above second aspect.
  • the decoding device includes one or more modules, and the one or more modules are used to implement the decoding method provided in the second aspect.
  • That is, a decoding device is provided, which includes:
  • the first parsing module is used to parse the first super-priori feature of the image to be decoded from the code stream;
  • a probability estimation module configured to determine a first probability distribution parameter through a probability distribution estimation network based on the first super-prior feature, where the first probability distribution parameter represents a probability distribution of unquantized image features of the image;
  • a quantization module used to quantize the first probability distribution parameters according to the first quantization step size to obtain the second probability distribution parameters
  • the second parsing module is used to parse the first image feature of the image from the code stream based on the second probability distribution parameter
  • the reconstruction module is used to dequantize the first image feature according to the first quantization step size to reconstruct the image.
  • the first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size.
  • the reconstruction module includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the reconstruction submodule is used to reconstruct the image based on the third image feature.
  • the first probability distribution parameter is a probability distribution parameter of multiple feature points
  • the first super-prior feature is a super-prior feature of the multiple feature points.
  • the probability estimation module includes: a context sub-module, a first determination sub-module and a probability estimation sub-module;
  • the probability distribution parameters of the first feature point are determined through the context sub-module, the first determination sub-module and the probability estimation sub-module, and the first feature point is any one of the plurality of feature points; wherein,
  • the context submodule is used to determine the context feature of the first feature point based on the image feature of the decoded feature point in the first image feature;
  • the first determination sub-module is used to determine the first a priori feature of the first feature point based on the super a priori feature of the first feature point;
  • the probability estimation submodule is used to determine the probability distribution parameters of the first feature point through a probability distribution estimation network based on the first a priori feature of the first feature point and the context feature of the first feature point.
  • the context submodule is used to:
  • according to the first quantization step size, perform inverse quantization on the image features of the peripheral feature points in the first image feature to obtain the peripheral features of the first feature point;
  • a probability distribution estimation network including:
  • the first a priori feature of the first feature point and the contextual feature of the first feature point are input into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point.
  • the context submodule is used to:
  • a probability distribution estimation network including:
  • the first a priori feature of the first feature point is quantized to obtain the second a priori feature of the first feature point.
  • the second a priori feature of the first feature point and the context feature of the first feature point are input into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point.
  • an encoding device has the function of realizing the behavior of the encoding method in the above third aspect.
  • the encoding device includes one or more modules, and the one or more modules are used to implement the encoding method provided in the third aspect.
  • an encoding device which device includes:
  • the first determination module is used to determine the first image feature and the second image feature of the image to be encoded.
  • the first image feature is the image feature obtained by quantizing the second image feature according to the first quantization step;
  • a second determination module configured to determine the first super-prior feature of the second image feature;
  • the first encoding module is used to encode the first super-prior feature into the code stream;
  • a probability estimation module configured to determine the second probability distribution parameters through the second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of the first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
  • the second encoding module is used to encode the first image feature into the code stream based on the second probability distribution parameter.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • the last layer of the first probability distribution estimation network is a convolutional layer
  • the network parameters of the convolutional layer include weights and biases.
  • the second determination module includes:
  • the first super-encoding sub-module is used to input the second image feature into the super-encoding network to obtain the first super-prior feature.
  • the second determination module includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the second super-encoding submodule is used to input the third image feature into the super-encoding network to obtain the first super-prior feature.
  • An eighth aspect provides a decoding device, the decoding device having the function of implementing the behavior of the decoding method in the fourth aspect.
  • the decoding device includes one or more modules, and the one or more modules are used to implement the decoding method provided in the fourth aspect.
  • a decoding device which device includes:
  • the first parsing module is used to parse the first super-prior feature of the image to be decoded from the code stream;
  • a probability estimation module configured to determine the second probability distribution parameters through the second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of the first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
  • the second parsing module is used to parse the first image feature of the image from the code stream based on the second probability distribution parameter
  • the reconstruction module is used to dequantize the first image feature according to the first quantization step size to reconstruct the image.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • the last layer of the first probability distribution estimation network is a convolutional layer
  • the network parameters of the convolutional layer include weights and biases.
  • the first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size.
  • the reconstruction module includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the reconstruction submodule is used to reconstruct the image based on the third image feature.
  • a ninth aspect provides an encoding end device.
  • the encoding end device includes a processor and a memory.
  • the memory is used to store a program for executing the encoding method provided in the first aspect and/or the third aspect, and to store data used to implement the encoding method provided in the first aspect and/or the third aspect.
  • the processor is configured to execute a program stored in the memory.
  • the encoding end device may also include a communication bus, which is used to establish a connection between the processor and the memory.
  • a tenth aspect provides a decoding end device. The decoding end device includes a processor and a memory.
  • the memory is used to store a program for executing the decoding method provided in the second aspect and/or the fourth aspect, and to store data used to implement the decoding method provided in the second aspect and/or the fourth aspect.
  • the processor is configured to execute a program stored in the memory.
  • the decoding end device may also include a communication bus, which is used to establish a connection between the processor and the memory.
  • a computer-readable storage medium is provided. Instructions are stored in the computer-readable storage medium. When the instructions are run on a computer, the computer is caused to execute the encoding method described in the first aspect or the third aspect, or to perform the decoding method described in the second aspect or the fourth aspect.
  • a twelfth aspect provides a computer program product containing instructions that, when run on a computer, cause the computer to execute the encoding method described in the first or third aspect, or to execute the decoding method described in the second or fourth aspect.
  • in the first scheme, the first probability distribution parameters are determined through the probability distribution estimation network; these parameters characterize the probability distribution of unquantized image features.
  • the first probability distribution parameters are then quantized according to the first quantization step size (i.e., the quantization step size used to quantize the image features), thereby obtaining second probability distribution parameters that characterize the probability distribution of the quantized image features.
  • in the second scheme, the super-prior features of the unquantized image features are also determined, but the second probability distribution parameters are obtained directly through the second probability distribution estimation network.
  • the second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size, and the first probability distribution estimation network is the probability distribution estimation network in the first scheme.
  • the decoding process is symmetrical to the encoding process. It can be seen that in these two solutions, it is enough to train the first probability distribution estimation network (used to determine the probability distribution parameters of unquantized image features). Even in a multi-bitrate scenario, since the numerical range of unquantized image features is relatively stable and is not affected by the quantization step, that is, the input numerical range of the first probability distribution estimation network will not change with different bitrates. Therefore, the training difficulty of the first probability distribution estimation network is less, the network training is stable, and the first probability distribution estimation network with good performance can be trained, which is beneficial to improving the encoding and decoding performance.
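  • The relationship between the two schemes can be sketched numerically. The example below assumes that the last layer of the first probability distribution estimation network is a 1×1 convolution and that processing by the first quantization step size amounts to multiplying by it; both are illustrative assumptions, as are the layer sizes and variable names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical last layer of the first probability distribution estimation
# network, modelled as a 1x1 convolution (a per-point linear map).
in_channels, out_channels = 8, 4
weight = rng.normal(size=(out_channels, in_channels))
bias = rng.normal(size=out_channels)

def last_layer(hidden, w, b):
    """Apply a 1x1 convolution to hidden features of shape (H, W, C_in)."""
    return hidden @ w.T + b

hidden = rng.normal(size=(5, 6, in_channels))  # activations before the last layer
q1 = 0.5                                       # first quantization step size

# First scheme: estimate parameters for unquantized features, then scale by q1.
params_scheme_1 = q1 * last_layer(hidden, weight, bias)

# Second scheme: fold q1 into the weights and biases of the last layer.
params_scheme_2 = last_layer(hidden, q1 * weight, q1 * bias)

schemes_agree = np.allclose(params_scheme_1, params_scheme_2)
```

  • Because the last layer is linear, scaling its weights and biases is equivalent to scaling its output, so only the first network needs to be trained in either case.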
  • Figure 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • Figure 2 is a schematic diagram of another implementation environment provided by the embodiment of the present application.
  • Figure 3 is a schematic diagram of another implementation environment provided by the embodiment of the present application.
  • Figure 4 is a flow chart of an encoding method provided by an embodiment of the present application.
  • Figure 5 is a schematic structural diagram of an image feature extraction network provided by an embodiment of the present application.
  • Figure 6 is a flow chart of a coding and decoding method provided by an embodiment of the present application.
  • Figure 7 is a flow chart of another encoding and decoding method provided by an embodiment of the present application.
  • Figure 8 is a flow chart of yet another encoding and decoding method provided by an embodiment of the present application.
  • Figure 9 is a flow chart of yet another encoding and decoding method provided by an embodiment of the present application.
  • Figure 10 is a flow chart of yet another encoding and decoding method provided by an embodiment of the present application.
  • Figure 11 is a flow chart of another encoding method provided by an embodiment of the present application.
  • Figure 12 is a flow chart of a decoding method provided by an embodiment of the present application.
  • Figure 13 is a flow chart of another decoding method provided by an embodiment of the present application.
  • Figure 14 is a schematic structural diagram of an encoding device provided by an embodiment of the present application.
  • Figure 15 is a schematic structural diagram of a decoding device provided by an embodiment of the present application.
  • Figure 16 is a schematic structural diagram of another encoding device provided by an embodiment of the present application.
  • Figure 17 is a schematic structural diagram of another decoding device provided by an embodiment of the present application.
  • Figure 18 is a schematic block diagram of a coding and decoding device provided by an embodiment of the present application.
  • Code rate: in image compression, the coding length required to encode a unit pixel. The higher the code rate, the better the image reconstruction quality.
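  • As a small illustration of this definition, the code rate can be computed as bits per pixel. The function name and the image dimensions below are hypothetical.

```python
def bits_per_pixel(bitstream_bytes, width, height):
    """Average number of coded bits spent per pixel of the image."""
    return 8 * bitstream_bytes / (width * height)

# A hypothetical 768x512 image compressed into 49152 bytes.
rate = bits_per_pixel(49152, 768, 512)
```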
  • Convolutional neural network (CNN): a CNN includes convolutional layers and may also include activation layers (such as the rectified linear unit (ReLU) and the parametric ReLU (PReLU)), pooling layers, batch normalization (BN) layers, fully connected layers, etc.
  • Typical CNNs include LeNet, AlexNet, VGGNet, ResNet, etc.
  • a basic CNN can include a backbone network and a head network.
  • Complex CNNs can include backbone networks, neck networks, and head networks.
  • Feature map: the three-dimensional data output by the convolutional layers, activation layers, pooling layers, batch normalization layers, etc. in a convolutional neural network.
  • the three dimensions are called width, height and channel.
  • a feature map includes image features of multiple feature points.
  • Backbone network: the first part of the convolutional neural network. Its function is to extract multi-scale feature maps from the input image. It usually consists of convolutional layers, pooling layers, activation layers, etc., and does not contain a fully connected layer. Usually, the feature maps output by layers of the backbone network closer to the input image have a larger resolution (width, height) but a smaller number of channels.
  • Typical backbone networks include VGG-16, ResNet-50, ResNeXt-101, etc.
  • Head network: the last part of the convolutional neural network. Its function is to process the feature maps to obtain the prediction result output by the neural network.
  • Common head networks include fully connected layers, softmax modules, etc.
  • Neck network: the middle part of the convolutional neural network. Its function is to further integrate the feature maps generated by the backbone network to obtain new feature maps.
  • Common neck networks include the feature pyramid network (FPN) used in the faster region-based convolutional neural network (Faster R-CNN).
  • Figure 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • the implementation environment includes an encoding end 101 and a decoding end 102.
  • the encoding end 101 is used to compress the image according to the encoding method provided by the embodiment of the present application
  • the decoding end 102 is used to decode the image according to the decoding method provided by the embodiment of the present application.
  • the encoding end 101 includes an encoder for compressing images
  • the decoding end 102 includes a decoder for decoding images.
  • the encoding end 101 and the decoding end 102 communicate through internal connections of the device or a network.
  • the encoding end 101 and the decoding end 102 communicate through external connections or wireless networks.
  • the encoding end 101 may also be called the source device, and the decoding end 102 may also be called the destination device.
  • FIG. 2 is a schematic diagram of another implementation environment provided by the embodiment of the present application.
  • the implementation environment includes a source device 10 , a destination device 20 , a link 30 and a storage device 40 .
  • source device 10 may generate an encoded image. Therefore, the source device 10 may also be called an image encoding device or an encoding end.
  • Destination device 20 may decode the encoded image generated by source device 10 . Therefore, the destination device 20 may also be called an image decoding device or a decoding terminal.
  • Link 30 may receive the encoded image generated by source device 10 and may transmit the encoded image to destination device 20 .
  • the storage device 40 can receive the encoded image generated by the source device 10 and can store the encoded image.
  • the destination device 20 can directly obtain the encoded image from the storage device 40 .
  • storage device 40 may correspond to a file server or another intermediate storage device that may save the encoded images generated by source device 10, in which case destination device 20 may obtain the encoded images stored on storage device 40 via streaming or download.
  • Source device 10 and destination device 20 may each include one or more processors and memory coupled to the one or more processors. The memory may include random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures accessible by a computer.
  • both the source device 10 and the destination device 20 may include a mobile phone, a smart phone, a personal digital assistant (PDA), a wearable device, a pocket PC (PPC), a tablet computer, a smart car, a smart television, a smart speaker, a desktop computer, a mobile computing device, a notebook (e.g., laptop) computer, a set-top box, a telephone handset such as a so-called "smart" phone, a television, a camera, a display device, a digital media player, a video game console, a vehicle-mounted computer, or the like.
  • Link 30 may include one or more media or devices capable of transmitting encoded images from source device 10 to destination device 20 .
  • link 30 may include one or more communication media capable of enabling source device 10 to send encoded images directly to destination device 20 in real time.
  • the source device 10 may modulate the encoded image based on a communication standard, which may be a wireless communication protocol or the like, and may send the modulated image to the destination device 20 .
  • the one or more communication media may include wireless and/or wired communication media.
  • the one or more communication media may include a radio frequency (radio frequency, RF) spectrum or one or more physical transmission lines.
  • the one or more communication media may form part of a packet-based network, which may be a local area network, a wide area network, a global network (eg, the Internet), or the like.
  • the one or more communication media may include routers, switches, base stations, or other equipment that facilitates communication from the source device 10 to the destination device 20, etc., which are not specifically limited in the embodiments of the present application.
  • the storage device 40 can store the received encoded image sent by the source device 10 , and the destination device 20 can directly obtain the encoded image from the storage device 40 .
  • the storage device 40 may include any of a variety of distributed or locally accessed data storage media.
  • any of the multiple distributed or locally accessed data storage media may be a hard drive, a Blu-ray disc, a digital versatile disc (DVD), a compact disc read-only memory (CD-ROM), flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded images, etc.
  • storage device 40 may correspond to a file server or another intermediate storage device that may hold encoded images generated by source device 10, and destination device 20 may stream or download the images stored on storage device 40.
  • the file server may be any type of server capable of storing encoded images and sending the encoded images to destination device 20 .
  • the file server may include a network server, a file transfer protocol (FTP) server, a network attached storage (network attached storage, NAS) device or a local disk drive, etc.
  • Destination device 20 may obtain the encoded images over any standard data connection, including an Internet connection.
  • Any standard data connection may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both that is suitable for retrieving encoded images stored on a file server.
  • the transmission of the encoded images from the storage device 40 may be a streaming transmission, a download transmission, or a combination of both.
  • the implementation environment shown in Figure 2 is only one possible implementation. The technology of the embodiments of the present application can be applied not only to the source device 10 shown in Figure 2, which encodes images, and to the destination device 20, which decodes encoded images, but also to other devices that can encode images and decode encoded images, which is not specifically limited in the embodiments of the present application.
  • the source device 10 includes a data source 120 , an encoder 100 and an output interface 140 .
  • the output interface 140 may include a modulator/demodulator (modem) and/or a transmitter, where the transmitter may also be referred to as an emitter.
  • Data source 120 may include an image capture device (e.g., a video camera), an archive containing previously captured images, a feed interface for receiving images from an image content provider, and/or a computer graphics system for generating images, or a combination of these image sources.
  • the data source 120 may send an image to the encoder 100, and the encoder 100 may encode the image received from the data source 120 to obtain an encoded image.
  • the encoder can send the encoded image to the output interface.
  • source device 10 sends the encoded images directly to destination device 20 via output interface 140 .
  • the encoded images may also be stored on storage device 40 for later retrieval by destination device 20 for decoding and/or display.
  • the destination device 20 includes an input interface 240 , a decoder 200 and a display device 220 .
  • input interface 240 includes a receiver and/or modem.
  • the input interface 240 may receive the encoded image via the link 30 and/or from the storage device 40 and then send it to the decoder 200.
  • the decoder 200 may decode the received encoded image to obtain a decoded image.
  • the decoder may send the decoded image to display device 220 .
  • Display device 220 may be integrated with destination device 20 or may be external to destination device 20 .
  • display device 220 displays the decoded image.
  • the display device 220 may be any of a variety of types of display devices.
  • the display device 220 may be a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display device.
  • encoder 100 and decoder 200 may each be integrated with an audio encoder and an audio decoder, and may include appropriate multiplexer-demultiplexer (MUX-DEMUX) units or other hardware and software to handle the encoding of both audio and video in a common data stream or in separate data streams.
  • the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol, or other protocols such as user datagram protocol (UDP), if applicable.
  • the encoder 100 and the decoder 200 may each be any of the following circuits: one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the technology of the embodiments of the present application is implemented partially in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium, and may execute the instructions in hardware using one or more processors to implement the technology of the embodiments of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered one or more processors. Each of the encoder 100 and the decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in the respective device.
  • Embodiments of the present application may generally refer to encoder 100 as “signaling” or “sending” certain information to another device, such as decoder 200.
  • the term “signaling” or “sending” may generally refer to the transmission of syntax elements and/or other data used to decode compressed images. This transmission can occur in real time or near real time. Alternatively, it may occur over a period of time, for example when syntax elements are stored in an encoded bitstream on a computer-readable storage medium at encoding time; the decoding device may then retrieve the syntax elements at any time after they have been stored on that medium.
  • Figure 3 is a schematic diagram of yet another implementation environment provided by an embodiment of the present application.
  • the encoding and decoding method provided by the embodiment of the present application is applied to the virtual reality streaming scenario.
  • the implementation environment includes the encoding end and the decoding end.
  • the encoding end includes the video collection and pre-processing module (also called the pre-processing module), the video encoding module and the sending module.
  • the decoding end includes the receiving module, the code stream decoding module and the rendering and display module.
  • the acquisition module at the encoding end collects video, which includes multiple frames of images to be encoded, and then preprocesses each frame of image through the preprocessing module. Then, through the video encoding module, each frame of image is encoded using the encoding method provided by the embodiment of the present application to obtain a code stream.
  • the sending module sends the code stream to the decoding end through the transmission network.
  • the receiving module at the decoding end first receives the code stream, and then uses the decoding module to decode the code stream using the decoding method provided in the embodiment of the present application to obtain image information, and then renders and displays the image information through the rendering and display module.
  • the encoding end can also store the code stream after getting it.
  • the encoding and decoding methods provided by the embodiments of the present application can be applied to a variety of scenarios.
  • the encoded and decoded images in various scenarios can be images included in image files or images included in video files.
  • the encoded and decoded images can be images in RGB, YUV444, YUV420 and other formats. It should be noted that, in conjunction with the implementation environments shown in Figures 1, 2 and 3, any of the encoding methods below can be executed by the encoding end. Any of the decoding methods below can be performed by the decoding end.
  • Figure 4 is a flow chart of an encoding method provided by an embodiment of the present application. This method is applied on the encoding side. Please refer to Figure 4. The method includes the following steps.
  • Step 401 Determine the first image feature and the second image feature of the image to be encoded.
  • the first image feature is the image feature obtained by quantizing the second image feature according to the first quantization step size.
  • the encoding end inputs the image to be encoded into the image feature extraction network to obtain the second image feature of the image.
  • the second image feature is an unquantized image feature.
  • the encoding end quantizes the second image feature according to the first quantization step size to obtain the first image feature, which is the quantized image feature.
  • both the first image feature and the second image feature include image features of multiple feature points.
  • the image feature of each feature point in the first image feature can be called the first feature value of the corresponding feature point, and the image feature of each feature point in the second image feature can be called the second feature value of the corresponding feature point.
  • the image feature extraction network is a convolutional neural network
  • the first image feature is represented by a first feature map
  • the second image feature is represented by a second feature map
  • both the first feature map and the second feature map have multiple feature points.
  • the image feature extraction network in the embodiment of the present application is pre-trained, and the network structure and training method of the image feature extraction network are not limited in the embodiment of the present application.
  • the image feature extraction network can be a fully connected network or the above-mentioned convolutional neural network, and the convolution in the convolutional neural network can be 2D convolution or 3D convolution.
  • the embodiments of the present application do not limit the number of network layers included in the image feature extraction network and the number of nodes in each layer.
  • FIG. 5 is a schematic structural diagram of an image feature extraction network provided by an embodiment of the present application.
  • the image feature extraction network is a convolutional neural network, which includes four convolutional layers (Conv) and three generalized divisive normalization (GDN) layers arranged in an interleaved cascade.
  • the convolution kernel size of each convolutional layer is 5×5, the number of channels of the output feature map is M, and each convolutional layer downsamples the width and height by a factor of 2. For example, for an input of size 16W×16H×3, the size of the feature map output by the convolutional neural network is W×H×M.
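  • The size relationship in this example can be checked with a short calculation. The sketch assumes padding such that each convolutional layer divides the spatial size by its stride (rounding up); the function name is hypothetical and `out_channels=192` is only a placeholder for M.

```python
def output_feature_shape(width, height, num_layers=4, stride=2, out_channels=192):
    """Shape (W, H, M) after `num_layers` convolutions with the given stride.

    Assumes "same"-style padding, so each layer divides the spatial size by
    the stride, rounding up. out_channels is a placeholder for M.
    """
    for _ in range(num_layers):
        width = -(-width // stride)    # ceiling division
        height = -(-height // stride)
    return width, height, out_channels

# Four stride-2 layers reduce each spatial dimension by a factor of 16.
shape = output_feature_shape(768, 512)
```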
  • the structure of the convolutional neural network shown in Figure 5 is not used to limit the embodiments of the present application, for example, the size of the convolution kernel, the number of channels of the feature map, the downsampling multiple, the number of downsampling, the number of convolution layers, etc. All can be adjusted.
  • the above-mentioned first quantization step size is obtained through a gain network based on the code rate of the image, and the gain network is used to determine the quantization step sizes corresponding to multiple code rates.
  • the encoding end determines the first quality factor based on the code rate of the image, and inputs the first quality factor into the gain network to obtain the first quantization step size.
  • different code rates correspond to different quality factors, and different quantization step sizes can be obtained through the gain network.
  • the mapping relationship between the code rate and the quantization step size is stored in advance, and the corresponding quantization step size is obtained from the mapping relationship based on the code rate of the image as the first quantization step size.
  • the first quality factor is determined based on the code rate of the image to be encoded, and then the first quantization step size corresponding to the first quality factor is obtained from the mapping relationship between the quality factor and the quantization step size.
  • the quality factor can also be replaced by a quantization parameter.
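  • The lookup-based alternatives can be sketched as two small mappings. All thresholds and step sizes below are placeholders chosen for illustration, and the function names are hypothetical; a gain network would replace the table with learned values.

```python
# Hypothetical mapping from quality factor to quantization step size.
# The numbers are placeholders, not values from the application.
STEP_BY_QUALITY = {1: 0.25, 2: 0.5, 3: 1.0, 4: 2.0}

def quality_factor_for_rate(target_bpp):
    """Pick a quality factor for a target code rate (illustrative thresholds)."""
    thresholds = [(0.25, 1), (0.5, 2), (1.0, 3)]
    for max_bpp, quality in thresholds:
        if target_bpp <= max_bpp:
            return quality
    return 4

def first_quantization_step(target_bpp):
    """Code rate -> first quality factor -> first quantization step size."""
    return STEP_BY_QUALITY[quality_factor_for_rate(target_bpp)]

step = first_quantization_step(0.4)
```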
  • many quantization processing methods can be used in the above implementation process, such as uniform quantization or scalar quantization.
  • the scalar quantization may also include an offset, that is, the data to be quantized (such as the second image feature) is first shifted by the offset, and then scalar quantization is performed according to the quantization step size.
  • the quantization processing performed on image features in this embodiment of the present application includes quantization and rounding operations.
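  • a minimal sketch of quantization followed by rounding is given below. It assumes that quantization divides by the step size (consistent with the parameter scaling described for step 405) and that the offset is subtracted before scaling; both are assumptions, and the function names are illustrative only.

```python
def quantize(values, step, offset=0.0):
    """Remove the offset, scale by the step size, then round to integers."""
    scaled = [(v - offset) / step for v in values]   # quantization
    # Python's round() rounds exact halves to the nearest even integer.
    return [round(s) for s in scaled]                # rounding


def dequantize(symbols, step, offset=0.0):
    """Inverse of quantize, up to the rounding error."""
    return [s * step + offset for s in symbols]


symbols = quantize([0.3, 1.2, 3.9], step=2.0)
restored = dequantize(symbols, step=2.0)
```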
  • the second image feature is represented by the feature map y
  • the value range of the second image feature is within the interval [0,100]
  • the first quantization step size is represented by q1
  • q1 is 0.5
  • the first image feature is represented by the feature map y_s'
  • the encoding end quantizes the feature value of each feature point in the feature map y to obtain the feature map y_s
  • the encoding end then rounds the feature value of each feature point in the feature map y_s to obtain the feature map y_s', that is, the first image feature, whose value range is within the interval [0,50].
  • the first quantization step size for quantizing the image features of each feature point may be the same or different.
  • feature points in the same channel use the same first quantization step size, or feature values of different channels at the same spatial location use the same first quantization step size.
  • the size of the second image feature to be quantized is W × H × M.
  • the first quantization step size of the feature point with coordinates (k, j, l) in the second image feature is q_i(k, j, l); q_i(k, j, l) can be obtained through gain network learning or through stored mapping relationships, where k ∈ [1, W], j ∈ [1, H], l ∈ [1, M].
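  • as an illustration of position-dependent step sizes, the sketch below quantizes a W × H × M feature with one step size per channel. The nested-list layout, the per-channel choice and all values are assumptions made for the example; a step size per feature point would index the step table by (k, j, l) instead.

```python
def quantize_per_channel(feature, channel_steps):
    """Quantize a W x H x M feature (nested lists) with one step size per channel."""
    return [
        [
            [round(value / channel_steps[l]) for l, value in enumerate(point)]
            for point in column
        ]
        for column in feature
    ]


# W = 1, H = 2, M = 3: two spatial positions with three channels each.
y = [[[0.9, 4.2, -3.1], [2.2, 0.4, 6.3]]]
q = [1.0, 2.0, 0.5]  # hypothetical per-channel step sizes
y_s = quantize_per_channel(y, q)
```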
  • different quantization parameters QP correspond to different quantization step sizes q
  • the quantization parameters QP correspond to the quantization step sizes q one-to-one.
  • other functions can also be designed to represent the mapping relationship between QP and q.
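  • one possible one-to-one mapping is exponential, in the style of block-based video coding standards, where the step size roughly doubles every 6 QP. The constants below are an assumption borrowed from that convention and are not taken from this application; `qp_to_step` is a hypothetical name.

```python
def qp_to_step(qp):
    """Hypothetical exponential QP-to-step mapping: the step doubles every 6 QP."""
    return 2.0 ** ((qp - 4) / 6.0)


steps = [qp_to_step(qp) for qp in range(0, 52)]
```

Because the function is strictly increasing, each QP yields a distinct step size, which keeps the mapping one-to-one.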
  • the quantization processing described below is similar to that described here and can refer to the method here; it is not described again in detail below.
  • Step 402 Determine the first super a priori feature of the second image feature.
  • the encoding end determines, before step 404, the first super-prior feature of the unquantized image features (such as the first super-prior feature of the second image feature).
  • the encoding end determines the first super-a priori feature of the second image feature.
  • a first implementation method for the encoding end to determine the first super-priori feature of the second image feature is to input the above-mentioned second image feature into the super-encoding network to obtain the first super-priori feature. That is, the encoding end inputs the unquantized image features into the super-encoding network to obtain the first super-prior features of the unquantized image features.
  • the second implementation method for the encoding end to determine the first super-prior feature of the second image feature is: inversely quantizing the first image feature according to the first quantization step size to obtain the third image feature of the image, and inputting the third image feature into the super-encoding network to obtain the first super-prior feature.
  • the first super-prior feature obtained in this way can be considered the first super-prior feature of the third image feature, and can also be considered the first super-prior feature of the second image feature. The second image feature is the image feature before quantization and the third image feature is the image feature after inverse quantization; therefore, although the second image feature and the third image feature differ somewhat numerically, the image information they represent is basically equivalent.
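  • the numerical difference between the image feature before quantization and the image feature after inverse quantization can be illustrated as follows. The sketch assumes uniform quantization that divides by the step size and rounds, so the difference is bounded by half a step; the values are made up.

```python
def quantize(values, step):
    """Divide by the step size and round to the nearest integer."""
    return [round(v / step) for v in values]


def dequantize(symbols, step):
    """Multiply the integer symbols back by the step size."""
    return [s * step for s in symbols]


q1 = 2.0
second_feature = [0.6, 3.9, 7.1, 10.4]            # before quantization
first_feature = quantize(second_feature, q1)      # quantized and rounded
third_feature = dequantize(first_feature, q1)     # after inverse quantization

max_error = max(abs(a - b) for a, b in zip(second_feature, third_feature))
```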
  • the above-mentioned super-encoding network outputs the first super-prior feature.
  • the above super-encoding network outputs the second super-prior feature, and the encoding end quantizes the second super-prior feature according to the third quantization step to obtain the first super-prior feature.
  • the first super-prior feature is a quantized super-prior feature.
  • the third quantization step size is the same as or different from the first quantization step size. That is to say, quantization operations can also be performed on super-prior features to compress super-prior features.
  • super-prior features can also be called side information, and side information can be understood as further feature extraction of image features.
  • the above-mentioned first super a priori feature and the second super a priori feature are both super a priori features of multiple feature points.
  • the image features of each feature point in the first image feature are input into the super-encoding network to obtain the super-a priori features of each feature point in the first super-prior feature.
  • the super-coding network in the embodiment of the present application is pre-trained, and the embodiment of the present application does not limit the network structure and training method of the super-coding network.
  • the super-encoding network can be a convolutional neural network or a fully connected network.
  • the super-encoding network in this article can also be called a super-prior network.
  • Step 403 Encode the first super-a priori feature into the code stream.
  • the encoding end encodes the first super a priori feature into the code stream, so that the subsequent decoding end can decode based on the first super a priori feature.
  • the encoding end encodes the first super-a priori feature into the code stream through entropy coding.
  • the encoding end encodes the first super-a priori feature into the code stream through entropy coding according to the specified probability distribution parameters.
  • the specified probability distribution parameters are probability distribution parameters determined in advance through a certain probability distribution estimation network. The embodiments of the present application do not limit the network structure and training method of the probability distribution estimation network.
  • Step 404 Based on the first super-prior features, determine the first probability distribution parameters through the probability distribution estimation network.
  • the probability distribution estimation network is used to determine the probability distribution parameters of unquantized image features. Based on this, the encoding end determines the first probability distribution parameters through the probability distribution estimation network based on the first super-prior features.
  • the first probability distribution parameter characterizes the probability distribution of unquantized image features (such as the second image feature or the third image feature).
  • the probability distribution parameters in this document can be any parameters used to characterize the probability distribution of image features, such as the mean and variance (or standard deviation) of a Gaussian distribution, the location and scale parameters of a Laplace distribution, the mean and scale parameters of a logistic distribution, or other model parameters.
  • the encoding end parses the first super-prior feature from the code stream, and determines the first probability distribution parameter through the probability distribution estimation network based on the parsed first super-prior feature.
  • the first super a priori feature is a quantized super a priori feature, or it may be an unquantized super a priori feature.
  • the encoding end performs inverse quantization on the first super-prior feature according to the above-mentioned third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the probability distribution estimation network to obtain the first probability distribution parameter.
  • the encoding end inputs the first super-prior feature into the probability distribution estimation network to obtain the first probability distribution parameter.
  • the probability distribution estimation network can also be considered as a super-decoding network, which is used to determine probability distribution parameters based on super-prior features.
  • the encoding end may also determine the first probability distribution parameters based on contextual features to improve the accuracy of the first probability distribution parameters. This will be introduced next.
  • the encoding end inputs the third image feature of the image into the context network to obtain the context feature of the third image feature.
  • the third image feature is an image feature obtained by inverse quantizing the first image feature according to the first quantization step size.
  • the encoding end determines the first a priori feature based on the first super a priori feature, and inputs the first a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameter. That is, the encoding end extracts context features from the dequantized image features, and then combines the context features and the first prior features to determine the first probability distribution parameters.
  • the encoding end parses the first super-prior feature from the code stream, inversely quantizes the parsed first super-prior feature according to the third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the super-decoding network to obtain the first a priori feature.
  • the encoding end parses the first super-prior feature from the code stream, and inputs the parsed first super-prior feature into the super-decoding network to obtain the first a priori feature.
  • the encoding end can also input the second image feature into the context network to obtain the context feature of the second image feature; the context feature of the second image feature is then used as the context feature of the third image feature.
  • the context feature of the third image feature includes the context feature of each feature point among the multiple feature points, and the first probability distribution parameter is the probability distribution parameter of the multiple feature points. That is, the encoding end can determine, in parallel, the context feature of each feature point among the multiple feature points and the probability distribution parameter of each feature point.
  • the encoding end inputs the first image feature into the context network to obtain the context feature of the first image feature, determines the first a priori feature based on the first super-prior feature, quantizes the first a priori feature according to the second quantization step size to obtain the second a priori feature, and inputs the second a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameter. That is to say, the encoding end can also extract context features from the quantized image features and add a quantization operation on the first a priori feature to obtain the second a priori feature, and then determine the first probability distribution parameter by combining the second a priori feature and the context feature.
  • the second quantization step size is the same as or different from the first quantization step size.
  • the implementation of the encoding end to determine the first a priori feature is consistent with the relevant process in the previous implementation, and will not be described again here.
  • the super decoding network is used to determine a priori features based on super a priori features
  • the probability distribution estimation network is used to determine probability distribution parameters based on a priori features and contextual features.
  • the super-decoding network and probability distribution estimation network in the embodiments of this application are both pre-trained.
  • the embodiments of this application do not limit the network structure and training methods of the super-decoding network and probability distribution estimation network.
  • both the super decoding network and the probability distribution estimation network can be a convolutional neural network, a recurrent neural network or a fully connected network, etc.
  • the probability distribution estimation network in the embodiment of the present application is modeled using a Gaussian model (such as a Gaussian single model (GSM) or a Gaussian mixture model (GMM)); that is, the feature values of the unquantized image features are assumed to follow a Gaussian model.
  • the first probability distribution parameters obtained by the probability distribution estimation network include the mean μ and the standard deviation σ.
  • the probability distribution estimation network may also use a Laplace distribution model.
  • the first probability distribution parameters include the location parameter μ and the scale parameter b.
  • the probability distribution estimation network can also use a logistic distribution model.
  • the first probability distribution parameters include the mean μ and the scale parameter s.
  • the probability distribution function corresponding to the probability distribution of any feature point in the first probability distribution parameter is shown in formula (1), where x is the second feature value of the feature point.
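  • for reference, the textbook density functions of the Gaussian, Laplace and logistic models are given below. That the Gaussian form is what formula (1) contains is an assumption based on the Gaussian model and parameter names used in this description.

```latex
% Gaussian model with mean \mu and standard deviation \sigma
f_{\mathrm{Gauss}}(x) = \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Laplace model with location \mu and scale b
f_{\mathrm{Laplace}}(x) = \frac{1}{2b}
    \exp\!\left(-\frac{|x-\mu|}{b}\right)

% Logistic model with mean \mu and scale s
f_{\mathrm{logistic}}(x) = \frac{e^{-(x-\mu)/s}}
    {s\left(1 + e^{-(x-\mu)/s}\right)^{2}}
```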
  • Step 405 Quantize the first probability distribution parameters according to the first quantization step size to obtain the second probability distribution parameters.
  • the encoding end quantizes the first probability distribution parameter according to the first quantization step size to obtain the second probability distribution parameter, and the second probability distribution parameter represents the probability distribution of the quantized image features (i.e., the first image feature).
  • the probability distribution parameters of each feature point in the first probability distribution parameter are quantized to obtain the probability distribution parameters of the corresponding feature point in the second probability distribution parameter.
  • the first quantization step size of the feature point with coordinates (k, j, l) is q_i(k, j, l), and the probability distribution parameters of the feature point in the first probability distribution parameter are μ(k, j, l) and σ(k, j, l).
  • μ(k, j, l) and σ(k, j, l) are quantized according to the quantization step size q_i(k, j, l) to obtain the probability distribution parameters of the feature point in the second probability distribution parameter, μ_s(k, j, l) and σ_s(k, j, l).
  • μ_s = μ/q
  • σ_s = σ/q
  • the probability distribution function corresponding to the probability distribution parameter of any feature point in the second probability distribution parameter is as shown in formula (2), where x is the first feature value of the feature point.
  • The principle of step 405 is explained here. Assume that the quantization operation is uniform quantization and take the Gaussian model as an example. If the probability distribution function of the variable x is as shown in formula (1) above, then the probability P_1 that x falls in the interval [a_2*q, a_1*q] is as shown in formula (3), where q is the quantization step size.
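  • the argument can be sketched as a change of variable. The derivation below assumes the standard Gaussian density with mean μ and standard deviation σ and the substitution x = q t; it is a reconstruction under those assumptions, not a reproduction of the application's formulas.

```latex
P_1 = \int_{a_2 q}^{a_1 q} \frac{1}{\sigma\sqrt{2\pi}}
        \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\,dx
    % substitute x = q t, dx = q\,dt
    = \int_{a_2}^{a_1} \frac{1}{(\sigma/q)\sqrt{2\pi}}
        \exp\!\left(-\frac{(t-\mu/q)^2}{2(\sigma/q)^2}\right)\,dt
```

The right-hand side is the probability that a Gaussian variable with mean μ/q and standard deviation σ/q falls in [a_2, a_1], which matches the scaled parameters μ_s = μ/q and σ_s = σ/q.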
  • the second probability distribution parameters can be obtained from the first probability distribution parameters by scaling (that is, quantizing) the parameters of the corresponding model.
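  • the scaling can be checked numerically for the Gaussian case. The sketch below compares the probability of an interval in the unquantized domain with the probability of the corresponding interval under the scaled parameters μ_s = μ/q and σ_s = σ/q; the parameter values and helper names are made up for the check.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def interval_probability(low, high, mean, std):
    """Probability that a Gaussian variable falls in [low, high]."""
    return gaussian_cdf(high, mean, std) - gaussian_cdf(low, mean, std)


mu, sigma = 3.7, 1.4                 # first probability distribution parameters (made up)
q = 0.8                              # first quantization step size (made up)
mu_s, sigma_s = mu / q, sigma / q    # second probability distribution parameters

a2, a1 = 4.5, 5.5                    # interval bounds in the quantized domain
p_unquantized = interval_probability(a2 * q, a1 * q, mu, sigma)
p_quantized = interval_probability(a2, a1, mu_s, sigma_s)
```

The two probabilities agree up to floating-point error, so one set of unquantized parameters can serve any step size.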
  • Step 406 Encode the first image feature into the code stream based on the second probability distribution parameter.
  • the encoding end encodes the first image feature into the code stream based on the second probability distribution parameter.
  • the encoding end encodes the image features of each feature point in the first image feature into the code stream based on the probability distribution parameters of each feature point in the second probability distribution parameter.
  • the encoding end encodes the first image feature into the code stream through entropy encoding.
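  • an entropy coder needs a probability for each symbol it codes. The sketch below estimates the ideal cost in bits of coding integer feature values under a Gaussian model with the second probability distribution parameters. Taking the probability of a symbol as the Gaussian mass on a unit-width interval around it is a common modelling choice in learned compression and is an assumption here; the code is a cost estimate, not an arithmetic coder.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def symbol_probability(symbol, mean_s, std_s):
    """Gaussian mass on the interval [symbol - 0.5, symbol + 0.5]."""
    upper = gaussian_cdf(symbol + 0.5, mean_s, std_s)
    lower = gaussian_cdf(symbol - 0.5, mean_s, std_s)
    return upper - lower


def estimated_bits(symbols, means_s, stds_s):
    """Ideal entropy-coding cost: sum of -log2(p) over all feature points."""
    total = 0.0
    for symbol, mean_s, std_s in zip(symbols, means_s, stds_s):
        p = max(symbol_probability(symbol, mean_s, std_s), 1e-9)  # avoid log(0)
        total += -math.log2(p)
    return total


bits = estimated_bits([2, 0, -1], [2.1, 0.3, -0.8], [0.5, 1.0, 2.0])
```

Symbols close to the predicted mean cost few bits and unlikely symbols cost many, which is why accurate probability distribution parameters lower the code rate.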
  • Figure 6 is a flow chart of a coding and decoding method provided by an embodiment of the present application.
  • the image to be encoded is input to the encoding network (encoder, Enc) to obtain the feature map y to be quantized (i.e., the second image feature).
  • the encoding network is an image feature extraction network.
  • Each feature value in the feature map y is quantized (Q) according to the quantization step size q1 (i.e., the first quantization step size) to obtain the feature map y_s.
  • Each feature value in the feature map y_s is rounded (R) to obtain the rounded feature map (i.e., the first image feature).
  • the feature map y is input into the super-encoding network (hyper encoder, HyEnc) to obtain the super-prior feature z.
  • the quantization step size q2 may be the same as or different from the quantization step size q1.
  • the quantized super-prior feature is encoded into the code stream through entropy coding.
  • the quantized super-prior feature is parsed from the code stream through entropy decoding (AD 2), and inverse quantization (IQ) is performed on it to obtain the super-prior feature z.
  • the rounded feature map is encoded into the code stream through entropy coding (AE 1).
  • Figure 7 is a flow chart of another encoding and decoding method provided by an embodiment of the present application.
  • the image to be encoded is input into the encoding network (Enc) to obtain the feature map y to be quantized.
  • the encoding network is an image feature extraction network.
  • Each feature value in the feature map y is quantized (Q) according to the quantization step size q1 to obtain the feature map y_s.
  • Each feature value in the feature map y_s is rounded (R) to obtain the rounded feature map (i.e., the first image feature).
  • the feature map y is input into the super-encoding network (HyEnc) to obtain the super-prior feature z.
  • the super-prior feature z is quantized according to the quantization step size q2 to obtain the quantized super-prior feature (i.e., the first super-prior feature).
  • the quantized super-prior feature is encoded into the code stream through entropy coding.
  • the quantized super-prior feature is parsed from the code stream through entropy decoding (AD 2), and inverse quantization (IQ) is performed on it to obtain the super-prior feature z.
  • inverse quantization is performed on the rounded feature map to obtain the dequantized feature map (i.e., the third image feature).
  • the dequantized feature map is input into the context (Ctx) network to obtain the context feature of each feature point in that feature map.
  • the context features and the a priori features are input into the probability distribution estimation network to obtain the probability distribution parameters μ and σ (i.e., the first probability distribution parameters) of each feature value in the feature map y.
  • the probability distribution parameters μ and σ are quantized according to the quantization step size q1 to obtain the probability distribution parameters μ_s and σ_s of each feature value in the rounded feature map (i.e., the second probability distribution parameters).
  • the rounded feature map is encoded into the code stream through entropy coding (AE 1).
  • Figure 8 is a flow chart of yet another encoding and decoding method provided by an embodiment of the present application.
  • the difference between Figure 8 and Figure 6 above is that, during the encoding process, the feature map of the first image feature is inversely quantized according to the quantization step size q1 to obtain the feature map of the third image feature, and that feature map is input into the super-encoding network to obtain the super-prior feature z.
  • Figure 9 is a flow chart of yet another encoding and decoding method provided by an embodiment of the present application.
  • the difference between Figure 9 and Figure 7 above is that, during the encoding process, the feature map of the first image feature is inversely quantized according to the quantization step size q1 to obtain the feature map of the third image feature, and that feature map is input into the super-encoding network to obtain the super-prior feature z.
  • Figure 10 is a flow chart of yet another encoding and decoding method provided by an embodiment of the present application.
  • the difference between Figure 10 and Figure 7 and Figure 9 above is that during the encoding process, the feature map Input super-encoding network to get super-prior features
  • a quantization operation is added, that is, the a priori features are quantized according to the quantization step size q3 (i.e., the second quantization step size).
  • the quantization step size q3 is the same as or different from the quantization step size q1.
  • The context feature of each feature point and the a priori features are input into the probability distribution estimation network to obtain the first probability distribution parameters μ and σ.
  • the first probability distribution parameter is determined through the probability distribution estimation network based on the super-prior features of the unquantized image features during the encoding process.
  • the first probability distribution parameter represents the probability distribution of unquantized image features.
  • the first probability distribution parameters are then quantized according to the first quantization step size (i.e., the quantization step size used to quantize the image features), thereby obtaining second probability distribution parameters that characterize the probability distribution of the quantized image features. It can be seen that in this scheme it is enough to train one probability distribution estimation network, which determines the probability distribution parameters of unquantized image features.
  • the probability distribution estimation network is less difficult to train, the network training is stable, and a probability distribution estimation network with good performance can be trained, which is beneficial to improving the encoding and decoding performance.
  • Figure 11 is a flow chart of another encoding method provided by an embodiment of the present application. This method is applied on the encoding side.
  • the probability distribution estimation network in the above-mentioned embodiment of FIG. 4 is called the first probability distribution estimation network.
  • the difference between the embodiment of FIG. 11 and the above-mentioned embodiment of FIG. 4 is that, in the encoding method shown in FIG. 11, the second probability distribution parameters are obtained directly through the second probability distribution estimation network; that is, the second probability distribution estimation network directly outputs the probability distribution parameters of the quantized image features.
  • the network parameters of the second probability distribution estimation network are obtained based on the network parameters of the first probability distribution estimation network and the first quantization step size. In this way, the network that is trained is still the first probability distribution estimation network.
  • the method includes the following steps.
  • Step 1101 Determine the first image feature and the second image feature of the image to be encoded.
  • the first image feature is the image feature obtained by quantizing the second image feature according to the first quantization step size.
  • the encoding end inputs the image to be encoded into the image feature extraction network to obtain the second image feature of the image.
  • the encoding end quantizes the second image feature according to the first quantization step size to obtain the first image feature.
  • the specific implementation process is the same as the specific implementation process of step 401 in the above-mentioned embodiment of FIG. 4. Please refer to the relevant introduction in the above-mentioned step 401, which will not be described again here.
  • Step 1102 Determine the first super a priori feature of the second image feature.
  • the encoding end inputs the second image feature into the super-encoding network to obtain the first super-prior feature.
  • the encoding end performs inverse quantization on the first image feature according to the first quantization step size to obtain the third image feature of the image, and inputs the third image feature into the super-encoding network to obtain the first super-prior feature.
  • Step 1103 Encode the first super-a priori feature into the code stream.
  • the encoding end encodes the first super a priori feature into the code stream, so that the subsequent decoding end can decode based on the first super a priori feature.
  • the encoding end encodes the first super-a priori feature into the code stream through entropy coding.
  • the specific implementation process is the same as the specific implementation process of step 403 in the above-mentioned embodiment of FIG. 4. Please refer to the relevant introduction in the above-mentioned step 403, which will not be described again here.
  • Step 1104 Based on the first super-prior features, determine the second probability distribution parameters through the second probability distribution estimation network.
  • the network parameters of the second probability distribution estimation network are obtained based on the network parameters of the first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features.
  • the encoding end parses the first super-prior feature from the code stream, and determines the second probability distribution parameters through the second probability distribution estimation network based on the parsed first super-prior feature.
  • the last layer of the first probability distribution estimation network is a convolutional layer
  • the network parameters of the convolutional layer include weights and biases.
  • the weights and biases of the last convolutional layer of the second probability distribution estimation network are obtained based on the weights and biases of the last convolutional layer of the first probability distribution estimation network and the first quantization step size.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • the second probability distribution estimation network is obtained by adjusting the network parameters of the last layer of the first probability distribution estimation network through binary left or right shifts, so that the adjusted network parameters are equal to the network parameters before adjustment multiplied by the first quantization step size.
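  • the shift-based adjustment can be illustrated as follows, assuming the last-layer parameters are stored as fixed-point integers and the multiplier is a power of two; both assumptions and the name `scale_by_shift` are introduced only for this sketch.

```python
def scale_by_shift(fixed_point_params, shift):
    """Multiply integer (fixed-point) parameters by 2**shift using bit shifts.

    A positive shift moves bits left (multiply); a negative shift moves bits
    right (divide, rounding toward negative infinity).
    """
    if shift >= 0:
        return [p << shift for p in fixed_point_params]
    return [p >> -shift for p in fixed_point_params]


doubled = scale_by_shift([3, -8, 40], 1)    # multiplier 2
halved = scale_by_shift([3, -8, 40], -1)    # multiplier 1/2
```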
  • the last layer of the first probability distribution estimation network is a convolution layer, and the weight w and bias b of that convolution layer are multiplied by the first quantization step size q1 to obtain the weight w*q1 and bias b*q1 of the last layer of the second probability distribution estimation network.
  • the layers of the second probability distribution estimation network other than the last layer are the same as those of the first probability distribution estimation network; the two networks differ only in the network parameters of the last layer. In this way, it is enough to train the first probability distribution estimation network with unquantized image features.
  • multiplying the network parameters of the last layer of the first probability distribution estimation network by the first quantization step size then yields the second probability distribution estimation network.
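  • only the last layer needs to change because that layer is linear in its weights and bias: multiplying both by a common factor multiplies its output by the same factor. The sketch below shows this for a single output of a 1×1 convolution, assuming no activation follows the last layer. The values are made up, and `factor` is a placeholder for the multiplier derived from the first quantization step size.

```python
def last_layer(inputs, weights, bias):
    """A 1x1 convolution at one spatial position: weighted sum plus bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def scale_last_layer(weights, bias, factor):
    """Return the last-layer parameters multiplied by a common factor."""
    return [w * factor for w in weights], bias * factor


hidden = [0.5, -1.25, 2.0]       # made-up activations entering the last layer
w, b = [0.4, 0.1, -0.3], 0.7     # made-up trained parameters
factor = 0.5                     # placeholder for the step-size-derived multiplier

w2, b2 = scale_last_layer(w, b, factor)
first_output = last_layer(hidden, w, b)      # first network's output
second_output = last_layer(hidden, w2, b2)   # second network's output
```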
  • the first super-prior feature is a quantized super-prior feature, or it may be an unquantized super-prior feature.
  • the encoding end performs inverse quantization on the first super-prior feature according to the above-mentioned third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the second probability distribution estimation network to obtain the second probability distribution parameters.
  • the encoding end inputs the first super a priori feature into the second probability distribution estimation network to obtain the second probability distribution parameter.
  • the encoding end can also determine the second probability distribution parameters based on contextual features to improve the accuracy of the second probability distribution parameters. This will be introduced next.
  • the encoding end inputs the third image feature of the image into the context network to obtain the context feature of the third image feature.
  • the third image feature is an image feature obtained by inverse quantizing the first image feature according to the first quantization step size.
  • the encoding end determines the first a priori feature based on the first super a priori feature, and inputs the first a priori feature and the context feature into the second probability distribution estimation network to obtain the second probability distribution parameters. That is, the encoding end extracts context features from the dequantized image features, and then combines the context features and the first prior features to determine the second probability distribution parameters.
  • the encoding end inputs the first image feature into the context network to obtain the context feature of the first image feature.
  • the encoding end determines the first a priori feature based on the first super a priori feature, and quantizes the first a priori feature according to the second quantization step to obtain the second a priori feature.
  • the encoding end inputs the second a priori feature and the context feature into the second probability distribution estimation network to obtain the second probability distribution parameters. That is, the encoding end extracts context features from the quantized image features, adds a quantization operation on the first a priori feature to obtain the second a priori feature, and then combines the second a priori feature and the context features to determine the second probability distribution parameters.
  • the second quantization step size is the same as or different from the first quantization step size.
  • step 1104 is similar to the specific implementation process of step 404 in the embodiment of FIG. 4. Please refer to the relevant introduction in step 404 above, which will not be described again here.
  • Step 1105 Encode the first image feature into the code stream based on the second probability distribution parameter.
  • the second probability distribution parameter is a probability distribution parameter of multiple feature points.
  • the encoding end encodes the image feature of each feature point in the first image feature into the code stream based on the probability distribution parameter of each feature point in the second probability distribution parameter. Exemplarily, the encoding end encodes the first image feature into the code stream through entropy encoding.
  • the super-prior features of the unquantized image features are also determined, but the second probability distribution parameters are directly obtained through the second probability distribution estimation network later, and the second probability distribution estimation The network is obtained by processing the network parameters in the first probability distribution estimation network based on the first quantization step size. It can be seen that in this solution, it is enough to train the first probability distribution estimation network (used to determine the probability distribution parameters of unquantized image features). Even in a multi-bitrate scenario, since the numerical range of unquantized image features is relatively stable and is not affected by the quantization step, that is, the input numerical range of the first probability distribution estimation network will not change with different bitrates. Therefore, the training difficulty of the first probability distribution estimation network is less, the network training is stable, and the first probability distribution estimation network with good performance can be trained, which is beneficial to improving the encoding and decoding performance.
  • the decoding method shown in FIG. 12 below matches the encoding method shown in FIG. 4 above, and the decoding method shown in FIG. 13 below matches the encoding method shown in FIG. 11 above.
  • Figure 12 is a flow chart of a decoding method provided by an embodiment of the present application. This method is applied to the decoder side. Please refer to Figure 12. The method includes the following steps.
  • Step 1201 Parse the first super-priori feature of the image to be decoded from the code stream.
  • the decoding end first parses the first super-a priori feature of the image to be decoded from the code stream.
  • the decoder parses the first super-a priori feature from the code stream through entropy decoding.
  • the decoding end parses the first super-prior feature from the code stream through entropy decoding according to the specified probability distribution parameters.
  • the specified probability distribution parameters are probability distribution parameters determined in advance through a certain probability distribution estimation network.
  • the embodiments of the present application do not limit the network structure and training method of the probability distribution estimation network.
  • the first super a priori feature is a super a priori feature of multiple feature points.
  • the first super-prior feature parsed by the decoding end is consistent with the first super-prior feature determined by the encoding end. That is, the first super-prior feature obtained by the decoding end is the super-prior feature of the second image feature, or of the third image feature, described in the embodiment of Figure 4, where the second image feature is an unquantized image feature and the third image feature is an inverse-quantized image feature.
  • Step 1202 Based on the first super-prior features, determine a first probability distribution parameter through a probability distribution estimation network, where the first probability distribution parameter represents a probability distribution of unquantized image features of the image.
  • the first super-prior feature is a super-prior feature of multiple feature points, and the first probability distribution parameter is a probability distribution parameter of multiple feature points. Based on the super-prior feature of each feature point in the first super-prior feature, the decoding end determines the probability distribution parameter of each feature point in the first probability distribution parameter through the probability distribution estimation network.
  • In an implementation that does not use the context network for encoding and decoding, the decoding end can decode the multiple feature points in parallel. In an implementation that uses the context network, the decoding end cannot decode the multiple feature points at the same time. For example, the decoding end decodes the multiple feature points in sequence, or decodes each feature point of a channel in sequence, or decodes multiple sets of feature points in sequence, where the number of feature points in each set may be different, or decodes the multiple feature points in some other order.
  • the first super-prior feature may be a quantized super-prior feature or an unquantized super-prior feature.
  • In an implementation where the first super-prior feature is a quantized super-prior feature, the decoder inverse-quantizes the first super-prior feature according to the above-mentioned third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the probability distribution estimation network to obtain the first probability distribution parameter.
  • In an implementation where the first super-prior feature is an unquantized super-prior feature, the decoder inputs the first super-prior feature into the probability distribution estimation network to obtain the first probability distribution parameter.
  • the decoder performs the following operations on the first feature point to determine the probability distribution parameter of the first feature point: determine the context feature of the first feature point based on the image features of the decoded feature points in the first image feature; determine the first prior feature of the first feature point based on the super-prior feature of the first feature point; and, based on the first prior feature and the context feature of the first feature point, determine the probability distribution parameter of the first feature point in the first probability distribution parameter through the probability distribution estimation network.
  • In an implementation where the first super-prior feature is quantized, the decoding end inverse-quantizes the parsed first super-prior feature according to the third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the super decoding network to obtain the first prior feature of each of the multiple feature points.
  • Equivalently, for a single feature point, the decoder inverse-quantizes the super-prior feature of the first feature point in the first super-prior feature to obtain the super-prior feature of the first feature point in the second super-prior feature, and inputs the latter into the super decoding network to obtain the first prior feature of the first feature point.
  • In an implementation where the first super-prior feature is unquantized, the decoding end inputs the parsed first super-prior feature into the super decoding network to obtain the first prior feature of each of the multiple feature points.
  • the decoding end inputs the super prior feature of the first feature point in the first super prior feature into the super decoding network to obtain the first prior feature of the first feature point.
  • One implementation in which the decoder determines the context feature of the first feature point based on the image features of the decoded feature points in the first image feature is as follows: the decoder determines the peripheral feature points of the first feature point from the decoded feature points, and inverse-quantizes the image features of the peripheral feature points in the first image feature according to the first quantization step size to obtain the peripheral features of the first feature point. The decoder then inputs the peripheral features of the first feature point into the context network to obtain the context feature of the first feature point.
  • Correspondingly, the decoding end determines the probability distribution parameter of the first feature point in the first probability distribution parameter as follows: the decoding end inputs the first prior feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point in the first probability distribution parameter. That is, the decoder extracts context features from the inverse-quantized image features, and then combines the context features with the first prior features to determine the first probability distribution parameters.
  • the surrounding feature points of the first feature point include one or more feature points in the neighborhood of the first feature point.
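  • The gathering of peripheral features described above can be sketched as follows. This is only an illustration under stated assumptions: the raster-scan decoding order, the four-point causal neighborhood, the zero padding, and the names `surrounding_features` and `decoded` are hypothetical and are not specified by this application.

```python
def surrounding_features(decoded, row, col, q1):
    """Collect inverse-quantized values of already decoded neighbours.

    `decoded` is a 2-D list holding quantized values for decoded feature
    points and None for feature points that are not decoded yet.
    The neighbourhood (left, up-left, up, up-right) is an assumption.
    """
    offsets = [(0, -1), (-1, -1), (-1, 0), (-1, 1)]
    feats = []
    for dr, dc in offsets:
        r, c = row + dr, col + dc
        inside = 0 <= r < len(decoded) and 0 <= c < len(decoded[0])
        if inside and decoded[r][c] is not None:
            # Inverse quantization with the first quantization step size q1.
            feats.append(decoded[r][c] * q1)
        else:
            # Zero padding for missing neighbours (an assumption).
            feats.append(0.0)
    return feats


# Row 0 is fully decoded; in row 1 only the first point is decoded.
decoded = [[2, -1, 3],
           [1, None, None]]
neighbours = surrounding_features(decoded, 1, 1, q1=0.5)
print(neighbours)
```

  • The returned list would then be the input of the context network for the feature point at row 1, column 1.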
  • In another implementation, the decoder determines the peripheral feature points of the first feature point from the decoded feature points, and inputs the image features of the peripheral feature points in the first image feature into the context network to obtain the context feature of the first feature point.
  • Correspondingly, the decoding end determines the probability distribution parameter of the first feature point in the first probability distribution parameter as follows: the decoding end quantizes the first prior feature of the first feature point according to the second quantization step size to obtain the second prior feature of the first feature point, and inputs the second prior feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point in the first probability distribution parameter. That is to say, the decoder extracts context features from the quantized image features, and adds a quantization operation on the first prior features to obtain the second prior features, so that the first probability distribution parameters are determined by combining the second prior features and the context features.
  • the second quantization step size is the same as or different from the first quantization step size.
  • the implementation of the decoder determining the first a priori feature of the first feature point is similar to the relevant content in the previous embodiments, and will not be described again here.
  • the image features of the decoded feature points in the first image feature are decoded from the code stream according to step 1202 to step 1204 described below. That is to say, in the implementation of encoding and decoding using the context network, the decoding end sequentially parses the image features of each feature point in the first image feature from the code stream according to steps 1202 to 1204. The decoding end can decode at least one feature point each time.
  • the above-mentioned first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size. The quantization operation is performed during the encoding process, and the second image feature is the image feature obtained during the encoding process.
  • Step 1203 Quantize the first probability distribution parameters according to the first quantization step size to obtain the second probability distribution parameters.
  • the second probability distribution parameter is a probability distribution parameter of multiple feature points.
  • After obtaining the probability distribution parameter of the first feature point in the first probability distribution parameter, the decoding end quantizes it according to the first quantization step size to obtain the probability distribution parameter of the first feature point in the second probability distribution parameter.
  • In an implementation that does not use the context network for encoding and decoding, the decoding end can quantize the probability distribution parameters of multiple feature points in the first probability distribution parameters in parallel. In an implementation that uses the context network, each time the decoder obtains the probability distribution parameter of at least one feature point in the first probability distribution parameter, it quantizes that parameter to obtain the probability distribution parameter of the at least one feature point in the second probability distribution parameter.
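  • As a minimal sketch of step 1203, the snippet below assumes that the probability distribution parameters are a mean and a standard deviation per feature point and that image features are quantized as round(y / q1), so that both parameters are divided by q1. These conventions and the function name `quantize_distribution_params` are assumptions made for illustration; the exact quantization formula is defined elsewhere in this application.

```python
def quantize_distribution_params(mu, sigma, q1):
    """Map Gaussian parameters of unquantized features to the quantized domain.

    Assumption: features are quantized as round(y / q1), so a mean and a
    standard deviation expressed in units of y are both divided by q1.
    """
    if q1 <= 0:
        raise ValueError("quantization step size must be positive")
    mu_s = [m / q1 for m in mu]
    sigma_s = [s / q1 for s in sigma]
    return mu_s, sigma_s


# Two feature points, first quantization step size q1 = 2.0.
mu_s, sigma_s = quantize_distribution_params([4.0, -2.0], [2.0, 1.0], q1=2.0)
print(mu_s, sigma_s)
```

  • Because each feature point is handled independently, the same function can be applied to all feature points at once (no context network) or to one feature point at a time (with a context network).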
  • Step 1204 Based on the second probability distribution parameter, parse the first image feature of the image from the code stream.
  • the decoding end parses the image feature of the first feature point in the first image feature from the code stream based on the probability distribution parameter of the first feature point in the second probability distribution parameter. Exemplarily, the decoder parses the image features of each feature point in the first image feature from the code stream through entropy decoding. It should be noted that in an implementation that does not use the context network for encoding and decoding, the decoder can parse multiple feature points in parallel to obtain the first image feature.
  • In an implementation that uses the context network, each time the decoder obtains the probability distribution parameter of at least one feature point in the second probability distribution parameter, it parses the image feature of the at least one feature point in the first image feature from the code stream, until the image features of all feature points in the first image feature have been parsed and the first image feature of the image is obtained.
  • Step 1205 Perform inverse quantization on the first image feature according to the first quantization step size to reconstruct the image.
  • After the decoder parses the first image feature from the code stream, it performs inverse quantization on the first image feature according to the first quantization step size to obtain the third image feature of the image, and reconstructs the image based on the third image feature.
  • the decoding end inputs the third image feature into the decoding network to reconstruct the image.
  • the decoding process performed within the decoding network is the inverse process of feature extraction performed by the above-mentioned image feature extraction network. It should be noted that the third image feature is consistent with the third image feature in the encoding process, and both are inverse quantized image features.
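  • The relationship between the second image feature (unquantized), the first image feature (quantized) and the third image feature (inverse quantized) can be sketched as below. Uniform scalar quantization with rounding is an assumption made for illustration, and the helper names are hypothetical.

```python
def quantize(y, q1):
    """Second image feature -> first image feature (assumed: round(y / q1))."""
    return [round(v / q1) for v in y]


def dequantize(y_s, q1):
    """First image feature -> third image feature (multiply back by q1)."""
    return [v * q1 for v in y_s]


q1 = 0.5
y = [3.2, -1.7, 0.4]          # unquantized feature values
y_s = quantize(y, q1)         # integers written to / parsed from the code stream
y_hat = dequantize(y_s, q1)   # values fed to the decoding network
print(y_s, y_hat)
```

  • Under this assumption the reconstruction error of each feature value is at most half of the quantization step size.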
  • the decoder performs inverse quantization on the first image feature according to the fourth quantization step size to obtain the fourth image feature of the image, and reconstructs the image based on the fourth image feature.
  • the fourth quantization step size may deviate from the first quantization step size.
  • the quantized super-prior feature is first parsed from the code stream and inverse-quantized according to the quantization step size q2 to obtain the super-prior feature z. The super-prior feature z is then input into the probability distribution estimation network to obtain the probability distribution parameters μ and σ of each feature value in the feature map y. The probability distribution parameters μ and σ are quantized according to the quantization step size q1 to obtain the probability distribution parameters μs and σs of each feature value in the quantized feature map. Based on the probability distribution parameters μs and σs of each feature value, the quantized feature map is parsed from the code stream and then inverse-quantized according to the quantization step size q1 to obtain the inverse-quantized feature map. Finally, the inverse-quantized feature map is input into the decoder network (decoder, Dec) to reconstruct the image.
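  • The order of operations in this decoding flow (without a context network) can be sketched as a small pipeline. Everything here is illustrative: the callables `parse_hyper`, `parse_features`, `prob_net` and `decoder_net` are hypothetical stand-ins for entropy decoding and for the trained networks, and dividing μ and σ by q1 is an assumed form of the parameter quantization.

```python
def decode_image(parse_hyper, parse_features, prob_net, decoder_net, q1, q2):
    """Toy decoding pipeline following steps 1201 to 1205."""
    z_s = parse_hyper()                      # step 1201: parse super-prior feature
    z = [v * q2 for v in z_s]                # inverse quantization with q2
    mu, sigma = prob_net(z)                  # step 1202: parameters of unquantized y
    mu_s = [m / q1 for m in mu]              # step 1203: assumed scaling by 1/q1
    sigma_s = [s / q1 for s in sigma]
    y_s = parse_features(mu_s, sigma_s)      # step 1204: entropy-decode features
    y_hat = [v * q1 for v in y_s]            # step 1205: inverse quantization with q1
    return decoder_net(y_hat)                # reconstruct the image


# Stand-ins so that the sketch runs; none of them is a real codec component.
result = decode_image(
    parse_hyper=lambda: [2, -4],
    parse_features=lambda mu_s, sigma_s: [round(m) for m in mu_s],
    prob_net=lambda z: ([2.0 * v for v in z], [1.0 for _ in z]),
    decoder_net=lambda y_hat: y_hat,
    q1=2.0,
    q2=0.5,
)
print(result)
```

  • Note that `prob_net` only ever sees and predicts values in the unquantized domain; q1 enters only through the scaling of its outputs and the final inverse quantization.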
  • the quantized super-prior features of each of the multiple feature points are first parsed from the code stream.
  • the quantized super-prior features of each feature point are inverse-quantized according to the quantization step size q2 to obtain the super-prior feature z of each feature point.
  • the super-prior features z of each feature point are input into the super decoding network to obtain the prior features of each feature point.
  • surrounding feature points of the first feature point are determined from the decoded feature points.
  • according to the quantization step size q1, the feature values of the surrounding feature points in the quantized feature map are inverse-quantized to obtain the feature values of the surrounding feature points in the inverse-quantized feature map, that is, the surrounding features of the first feature point.
  • the feature values of the surrounding feature points in the inverse-quantized feature map are input into the context network to obtain the context features of the first feature point.
  • the context features of the first feature point and the prior features of the first feature point are input into the probability distribution estimation network to obtain the probability distribution parameters μ and σ of the first feature point.
  • the probability distribution parameters μ and σ of the first feature point are quantized according to the quantization step size q1 to obtain the probability distribution parameters μs and σs of the first feature point.
  • based on the probability distribution parameters μs and σs of the first feature point, the image feature of the first feature point in the quantized feature map is parsed from the code stream.
  • the decoding process of the encoding and decoding method shown in Figure 8 is the same as the decoding process of the encoding and decoding method shown in Figure 6.
  • the decoding process of the encoding and decoding method shown in Figure 9 is the same as the decoding process of the encoding and decoding method shown in Figure 7.
  • the decoding process of the encoding and decoding method shown in Figure 10 differs from those of Figures 7 and 9 in that a quantization operation is added after the prior features are obtained through the super decoding network: the prior features are quantized according to the quantization step size q3 to obtain quantized prior features, and the context features and quantized prior features of each feature point are input into the probability distribution estimation network to obtain the first probability distribution parameters μ and σ.
  • In summary, the first probability distribution parameter is determined through the probability distribution estimation network during the decoding process, and the first probability distribution parameter represents the probability distribution of unquantized image features. The first probability distribution parameters are then quantized according to the first quantization step size (that is, the quantization step size used to quantize the image features), thereby obtaining second probability distribution parameters that characterize the probability distribution of the quantized image features. It can be seen that in this scheme it is enough to train a single probability distribution estimation network for determining the probability distribution parameters of unquantized image features.
  • As a result, the probability distribution estimation network is easier to train, the network training is stable, and a probability distribution estimation network with good performance can be trained, which helps improve the encoding and decoding performance.
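  • One way to see why scaled parameters can describe the quantized features is the following numerical check. It assumes a Gaussian model, quantization as round(y / q1), and the common practice of assigning to each integer symbol the Gaussian mass of its rounding interval; none of these details is stated in this passage, so the snippet is an illustration only.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def symbol_probability(k, mu_s, sigma_s):
    """Probability of integer symbol k as Gaussian mass on [k - 0.5, k + 0.5]."""
    return gaussian_cdf(k + 0.5, mu_s, sigma_s) - gaussian_cdf(k - 0.5, mu_s, sigma_s)


mu, sigma = 3.0, 1.2   # parameters of the unquantized feature value
q1, k = 0.5, 5         # quantization step size and a candidate symbol

# Quantized domain: scale the parameters by 1/q1, integrate over a unit bin.
p_quantized_domain = symbol_probability(k, mu / q1, sigma / q1)

# Original domain: keep the parameters, integrate over a bin of width q1.
p_original_domain = (gaussian_cdf((k + 0.5) * q1, mu, sigma)
                     - gaussian_cdf((k - 0.5) * q1, mu, sigma))

print(p_quantized_domain, p_original_domain)
```

  • The two probabilities agree up to floating-point error, so under these assumptions a network trained only on unquantized features can serve any step size through a rescaling of its outputs.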
  • Figure 13 is a flow chart of another decoding method provided by an embodiment of the present application. This method is applied to the decoder side. Please refer to Figure 13. The method includes the following steps.
  • Step 1301 Parse the first super-priori feature of the image to be decoded from the code stream.
  • step 1301 is the same as the specific implementation process of step 1201 in the above-mentioned embodiment of FIG. 12. Please refer to the relevant introduction in the above-mentioned step 1201, which will not be described again here.
  • Step 1302 Based on the first super-prior features, determine the second probability distribution parameters through the second probability distribution estimation network.
  • the network parameters of the second probability distribution estimation network are obtained based on the network parameters of the first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features.
  • the last layer of the first probability distribution estimation network is a convolutional layer
  • the network parameters of the convolutional layer include weights and biases.
  • the weights and biases of the last convolutional layer of the second probability distribution estimation network are obtained based on the weights and biases of the last convolutional layer of the first probability distribution estimation network and the first quantization step size.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • Alternatively, the second probability distribution estimation network is obtained by adjusting the network parameters of the last layer in the first probability distribution estimation network through a binary left shift or right shift, so that the adjusted network parameters are equal to the network parameters before adjustment multiplied by the first quantization step size.
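  • The effect of adjusting the last layer can be illustrated with a toy example. Assumptions: the last convolutional layer is reduced to a 1x1 convolution at a single feature point (a matrix-vector product plus bias), and `scale` stands in for whatever factor the first quantization step size implies (the text above multiplies by the step size; under a divide-by-step convention the factor would be its reciprocal). The helper names are hypothetical.

```python
def last_layer(x, weights, bias):
    """1x1 convolution at one feature point: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def fold_scale(weights, bias, scale):
    """Build the last layer of the second network by scaling weights and biases."""
    return [[w * scale for w in row] for row in weights], [b * scale for b in bias]


def shift_scale(w_int, k):
    """Multiply a fixed-point (integer) parameter by 2**k using a bit shift."""
    return w_int << k if k >= 0 else w_int >> -k


x = [1.0, -2.0, 0.5]
W = [[0.5, 1.0, -1.0], [2.0, 0.25, 4.0]]
b = [0.5, -1.0]
scale = 0.25  # a power of two, so the floating-point arithmetic below is exact

first_out = last_layer(x, W, b)        # output of the first network's last layer
W2, b2 = fold_scale(W, b, scale)
second_out = last_layer(x, W2, b2)     # output of the second network's last layer
print(first_out, second_out)
print(shift_scale(12, 2), shift_scale(12, -2))
```

  • Scaling both weights and biases makes the second network output exactly the scaled output of the first network, so no separate quantization of the probability distribution parameters is needed at run time. The shift variant only applies when the factor is a power of two, and a right shift on integers rounds toward negative infinity.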
  • the first super-prior feature is a super-prior feature of multiple feature points, and the second probability distribution parameter is a probability distribution parameter of multiple feature points. Based on the super-prior feature of each feature point in the first super-prior feature, the decoding end determines the probability distribution parameter of each feature point in the second probability distribution parameter through the second probability distribution estimation network.
  • In an implementation that does not use the context network for encoding and decoding, the decoding end can decode the multiple feature points in parallel. In an implementation that uses the context network, the decoding end cannot decode the multiple feature points at the same time.
  • the first super-prior feature may be a quantized super-prior feature or an unquantized super-prior feature.
  • In an implementation where the first super-prior feature is a quantized super-prior feature, the decoder inverse-quantizes the first super-prior feature according to the above-mentioned third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the second probability distribution estimation network to obtain the second probability distribution parameter.
  • In an implementation where the first super-prior feature is an unquantized super-prior feature, the decoder inputs the first super-prior feature into the second probability distribution estimation network to obtain the second probability distribution parameter.
  • the decoder performs the following operations on the first feature point to determine the probability distribution parameter of the first feature point in the second probability distribution parameter: based on the image features of the decoded feature points in the first image feature, determine the context feature of the first feature point; based on the super-prior feature of the first feature point in the first super-prior feature, determine the first prior feature of the first feature point; and, based on the first prior feature and the context feature of the first feature point, determine the probability distribution parameter of the first feature point in the second probability distribution parameter through the second probability distribution estimation network.
  • In an implementation where the first super-prior feature is quantized, the decoding end inverse-quantizes the parsed first super-prior feature according to the third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the super decoding network to obtain the first prior feature of each of the multiple feature points.
  • In an implementation where the first super-prior feature is unquantized, the decoding end inputs the parsed first super-prior feature into the super decoding network to obtain the first prior feature of each of the multiple feature points.
  • One implementation in which the decoder determines the context feature of the first feature point based on the image features of the decoded feature points in the first image feature is as follows: the decoder determines the peripheral feature points of the first feature point from the decoded feature points, and inverse-quantizes the image features of the peripheral feature points in the first image feature according to the first quantization step size to obtain the peripheral features of the first feature point. The decoder then inputs the peripheral features of the first feature point into the context network to obtain the context feature of the first feature point.
  • Correspondingly, the decoding end determines the probability distribution parameter of the first feature point in the second probability distribution parameter as follows: the decoding end inputs the first prior feature of the first feature point and the context feature of the first feature point into the second probability distribution estimation network to obtain the probability distribution parameter of the first feature point in the second probability distribution parameter. That is, the decoder extracts context features from the inverse-quantized image features, and then combines the context features with the first prior features to determine the second probability distribution parameters.
  • the surrounding feature points of the first feature point include one or more feature points in the neighborhood of the first feature point.
  • In another implementation, the decoder determines the peripheral feature points of the first feature point from the decoded feature points, and inputs the image features of the peripheral feature points in the first image feature into the context network to obtain the context feature of the first feature point.
  • Correspondingly, the decoding end determines the probability distribution parameter of the first feature point in the second probability distribution parameter as follows: the decoder quantizes the first prior feature of the first feature point according to the second quantization step size to obtain the second prior feature of the first feature point, and inputs the second prior feature of the first feature point and the context feature of the first feature point into the second probability distribution estimation network to obtain the probability distribution parameter of the first feature point in the second probability distribution parameter. That is to say, the decoder extracts context features from the quantized image features, and adds a quantization operation on the first prior features to obtain the second prior features, so that the second probability distribution parameters are determined by combining the second prior features and the context features.
  • the second quantization step size is the same as or different from the first quantization step size.
  • the implementation of the decoder determining the first a priori feature of the first feature point is similar to the relevant content in the previous embodiments, and will not be described again here.
  • the image features of the decoded feature points in the first image feature are decoded from the code stream according to step 1302 to step 1304 described below. That is to say, in the implementation of encoding and decoding using the context network, the decoding end sequentially parses the image features of each feature point in the first image feature from the code stream according to steps 1302 to 1304. The decoding end can decode at least one feature point each time.
  • the above-mentioned first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size. The quantization operation is performed during the encoding process, and the second image feature is the image feature obtained during the encoding process.
  • Step 1303 Based on the second probability distribution parameter, parse the first image feature of the image from the code stream.
  • the decoding end parses the image feature of the first feature point in the first image feature from the code stream based on the probability distribution parameter of the first feature point in the second probability distribution parameter. Exemplarily, the decoder parses the image features of each feature point in the first image feature from the code stream through entropy decoding. It should be noted that in an implementation that does not use the context network for encoding and decoding, the decoder can parse multiple feature points in parallel to obtain the first image feature.
  • In an implementation that uses the context network, each time the decoder obtains the probability distribution parameter of at least one feature point in the second probability distribution parameter, it parses the image feature of the at least one feature point in the first image feature from the code stream, until the image features of all feature points in the first image feature have been parsed and the first image feature of the image is obtained.
  • the first image feature obtained by the decoding end is consistent with the first image feature obtained by the encoding end. The first image feature obtained by the encoding end is the image feature obtained by quantizing the second image feature of the image according to the first quantization step size.
  • Step 1304 Perform inverse quantization on the first image feature according to the first quantization step size to reconstruct the image.
  • After the decoder parses the first image feature from the code stream, it performs inverse quantization on the first image feature according to the first quantization step size to obtain the third image feature of the image, and reconstructs the image based on the third image feature.
  • the decoding end inputs the third image feature into the decoding network to reconstruct the image.
  • the decoding process performed within the decoding network is the inverse process of feature extraction performed by the above-mentioned image feature extraction network. It should be noted that the third image feature is consistent with the third image feature in the encoding process, and both are image features after inverse quantization.
  • In summary, in this decoding process the second probability distribution parameters are obtained directly through the second probability distribution estimation network. The second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution parameters of unquantized image features. It can be seen that in this solution it is enough to train the first probability distribution estimation network. Even in a multi-bitrate scenario, the numerical range of unquantized image features is relatively stable and is not affected by the quantization step size, that is, the input numerical range of the first probability distribution estimation network does not change with the bitrate. Therefore, the first probability distribution estimation network is easier to train, the network training is stable, and a first probability distribution estimation network with good performance can be trained, which helps improve the encoding and decoding performance.
  • For images in YUV format, this solution can achieve an improvement in encoding and decoding performance on all three components of the YUV format.
  • In addition, after the encoding and decoding method provided by this solution performs probability distribution estimation once on the unquantized image features to obtain the first probability distribution parameters, the probability distribution parameters for different code rates (corresponding to different first quantization step sizes) can be derived from them, so there is no need to perform a separate probability estimation for each code rate. It can be seen that this solution reduces the computational complexity of probability estimation, facilitates rate distortion optimization (RDO) of feature maps, unifies probability estimation across code rates, and benefits the training of the probability distribution estimation network.
  • Figure 14 is a schematic structural diagram of an encoding device 1400 provided by an embodiment of the present application.
  • the encoding device 1400 can be implemented by software, hardware, or a combination of the two to become part or all of the encoding end device.
  • the encoding end device can be any encoding end shown in Figure 1 to Figure 3.
  • the device 1400 includes: a first determination module 1401, a second determination module 1402, a first encoding module 1403, a probability estimation module 1404, a quantization module 1405 and a second encoding module 1406.
  • the first determination module 1401 is used to determine the first image feature and the second image feature of the image to be encoded.
  • the first image feature is the image feature obtained by quantizing the second image feature according to the first quantization step size;
  • the second determination module 1402 is used to determine the first super-prior feature of the second image feature;
  • the first encoding module 1403 is used to encode the first super-prior feature into the code stream;
  • the probability estimation module 1404 is used to determine the first probability distribution parameters through the probability distribution estimation network based on the first super-prior feature;
  • the quantization module 1405 is used to quantize the first probability distribution parameters according to the first quantization step size to obtain the second probability distribution parameters;
  • the second encoding module 1406 is used to encode the first image feature into the code stream based on the second probability distribution parameter.
  • the second determination module 1402 includes:
  • the first super-encoding sub-module is used to input the second image feature into the super-encoding network to obtain the first super-prior feature.
  • the second determination module 1402 includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the second super-encoding submodule is used to input the third image feature into the super-encoding network to obtain the first super-prior feature.
  • the probability estimation module 1404 includes:
  • the context submodule is used to input the third image feature of the image into the context network to obtain the context feature of the third image feature, where the third image feature is the image feature obtained by inverse quantizing the first image feature according to the first quantization step size;
  • the first determination sub-module is used to determine the first a priori feature based on the first super a priori feature
  • the first probability estimation submodule is used to input the first a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters.
  • the probability estimation module 1404 includes:
  • the context submodule is used to input the first image feature into the context network to obtain the context feature of the first image feature;
  • the second determination sub-module is used to determine the first a priori feature based on the first super a priori feature
  • the quantization submodule is used to quantize the first a priori feature according to the second quantization step size to obtain the second a priori feature;
  • the second probability estimation submodule is used to input the second a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters.
  • during encoding, the first probability distribution parameters are determined through the probability distribution estimation network, and they represent the probability distribution of the unquantized image features.
  • the first probability distribution parameters are then quantized according to the first quantization step size (that is, the quantization step size used to quantize the image features), thereby obtaining second probability distribution parameters that characterize the probability distribution of the quantized image features. In this solution, therefore, it is enough to train a probability distribution estimation network that determines the probability distribution parameters of unquantized image features.
  • this probability distribution estimation network is easier to train, training is stable, and a well-performing network can be obtained, which helps improve encoding and decoding performance.
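The data flow through modules 1401 to 1406 can be sketched with stand-in callables. This is only an illustration under stated assumptions: `hyper_encoder` and `prob_net` are hypothetical placeholders for the super-encoding network and the probability distribution estimation network, quantization is assumed to divide by the step size and round, the distribution parameters are assumed to be a mean and a standard deviation that are scaled by the same division, and entropy coding is omitted.

```python
def encode_sketch(y, q, hyper_encoder, prob_net):
    """Mirror the data flow of modules 1401-1406 with stand-in callables."""
    # 1401: first image feature = quantized second image feature (assumed: divide, round).
    # Python's round() resolves ties to even; a real codec would fix its own rule.
    symbols = [round(v / q) for v in y]
    # 1402: super-prior feature of the unquantized feature.
    super_prior = hyper_encoder(y)
    # 1404: first probability distribution parameters (computed without using q).
    means, stds = prob_net(super_prior)
    # 1405: second probability distribution parameters (assumed: divide by q).
    means_s = [m / q for m in means]
    stds_s = [s / q for s in stds]
    # 1403 and 1406 would entropy-code super_prior and symbols; omitted here.
    return {"super_prior": super_prior, "symbols": symbols,
            "means": means_s, "stds": stds_s}


def toy_hyper_encoder(feature):
    """Hypothetical stand-in: summarizes the feature by its average."""
    return [sum(feature) / len(feature)]


def toy_prob_net(super_prior):
    """Hypothetical stand-in: same mean everywhere, unit standard deviation."""
    return [super_prior[0]] * 3, [1.0] * 3


y = [0.4, 2.6, -1.2]  # made-up unquantized feature values
for q in (0.5, 2.0):
    out = encode_sketch(y, q, toy_hyper_encoder, toy_prob_net)
    print(q, out["symbols"], out["stds"])
```

The point of the sketch is that the stand-in estimation network receives the same input regardless of q; only the final scaling step depends on the step size.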
  • when the encoding device provided in the above embodiment performs encoding, the division into the functional modules described above is only an example. In practice, the functions can be allocated to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the encoding device provided by the above embodiments and the encoding method embodiments belong to the same concept. Please refer to the method embodiments for the specific implementation process, which will not be described again here.
  • Figure 15 is a schematic structural diagram of a decoding device 1500 provided by an embodiment of the present application.
  • the decoding device 1500 can be implemented by software, hardware, or a combination of the two to become part or all of the decoding end device.
  • the decoding end device can be any decoding end shown in Figure 1 to Figure 3.
  • the device 1500 includes: a first parsing module 1501, a probability estimation module 1502, a quantization module 1503, a second parsing module 1504 and a reconstruction module 1505.
  • the first parsing module 1501 is used to parse the first super-prior feature of the image to be decoded from the code stream;
  • the probability estimation module 1502 is configured to determine the first probability distribution parameter through the probability distribution estimation network based on the first super-prior feature, and the first probability distribution parameter represents the probability distribution of the unquantized image feature of the image;
  • the quantization module 1503 is used to quantize the first probability distribution parameters according to the first quantization step size to obtain the second probability distribution parameters;
  • the second parsing module 1504 is used to parse the first image feature of the image from the code stream based on the second probability distribution parameter;
  • the reconstruction module 1505 is configured to inversely quantize the first image features according to the first quantization step size to reconstruct the image.
  • the first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size.
  • the reconstruction module 1505 includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the reconstruction submodule is used to reconstruct the image based on the third image feature.
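The relation between the second image feature (unquantized), the first image feature (quantized), and the third image feature (inverse quantized) can be sketched as follows. The sketch assumes that quantization divides by the step size and rounds, and that inverse quantization multiplies by the step size; this section does not spell out either operation, and the function names and values are hypothetical.

```python
def quantize(feature, q):
    """Assumed quantization: divide by the step size and round to an integer."""
    return [round(v / q) for v in feature]


def dequantize(symbols, q):
    """Assumed inverse quantization: multiply the integer symbols by the step size."""
    return [k * q for k in symbols]


second_feature = [0.37, -1.92, 4.05]  # made-up unquantized values
q = 0.5                               # hypothetical first quantization step size

first_feature = quantize(second_feature, q)   # what the code stream carries
third_feature = dequantize(first_feature, q)  # what the decoder reconstructs from

# Under these assumptions the per-element error is bounded by half a step.
max_error = max(abs(a - b) for a, b in zip(second_feature, third_feature))
print(first_feature, third_feature, max_error)
```

A larger step size lowers the code rate at the cost of a larger bound on this error, which is why the step size has to match the target code rate.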
  • the first probability distribution parameters are probability distribution parameters of multiple feature points
  • the first super-prior features are super-prior features of the multiple feature points.
  • the probability estimation module 1502 includes: context sub-module, first determination sub-module and probability estimation sub-module;
  • the probability distribution parameters of the first feature point are determined through the context sub-module, the first determination sub-module and the probability estimation sub-module, and the first feature point is any one of the plurality of feature points; wherein,
  • the context submodule is used to determine the context feature of the first feature point based on the image feature of the decoded feature point in the first image feature;
  • the first determination sub-module is used to determine the first a priori feature of the first feature point based on the super a priori feature of the first feature point;
  • the probability estimation submodule is used to determine the probability distribution parameters of the first feature point through a probability distribution estimation network based on the first a priori feature of the first feature point and the context feature of the first feature point.
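  • in one implementation, the context submodule is used to: determine the peripheral feature points of the first feature point from the decoded feature points; inverse quantize, according to the first quantization step size, the image features of the peripheral feature points in the first image feature to obtain the peripheral features of the first feature point; and input the peripheral features of the first feature point into the context network to obtain the context feature of the first feature point;
  • in this implementation, the probability estimation submodule is used to input the first a priori feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameters of the first feature point.
  • in another implementation, the context submodule is used to: determine the peripheral feature points of the first feature point from the decoded feature points; and input the image features of the peripheral feature points in the first image feature into the context network to obtain the context feature of the first feature point;
  • in this implementation, the probability estimation submodule is used to: quantize the first a priori feature of the first feature point according to the second quantization step size to obtain the second a priori feature of the first feature point; and input the second a priori feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameters of the first feature point.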
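Selecting peripheral feature points from the already decoded feature points can be sketched as below. The source only says that peripheral points are chosen among decoded points; the raster-scan decoding order, the square window, and the `radius` parameter are assumptions made for illustration, and the function names are hypothetical. The second function corresponds to the variant that inverse quantizes the peripheral features before they enter the context network, with inverse quantization assumed to be multiplication by the step size.

```python
def peripheral_points(row, col, width, radius=1):
    """Positions near (row, col) that are already decoded in raster-scan order."""
    points = []
    for r in range(row - radius, row + 1):
        for c in range(col - radius, col + radius + 1):
            if r < 0 or c < 0 or c >= width:
                continue  # outside the feature map
            if r == row and c >= col:
                break  # the current point and later points are not decoded yet
            points.append((r, c))
    return points


def peripheral_features(symbols, row, col, q, radius=1):
    """Inverse-quantize the decoded neighbours (assumed: multiply by q)."""
    width = len(symbols[0])
    return [symbols[r][c] * q for r, c in peripheral_points(row, col, width, radius)]


decoded = [[1, 2, 3],
           [4, 5, 6]]  # made-up quantized feature map; (1, 1) is being decoded
print(peripheral_points(1, 1, width=3))
print(peripheral_features(decoded, 1, 1, q=0.5))
```

For the other variant, the quantized neighbour values would be passed to the context network directly, and the a priori feature would be quantized with the second quantization step size instead.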
  • the first probability distribution parameters are determined through the probability distribution estimation network during the decoding process, and they represent the probability distribution of the unquantized image features.
  • the first probability distribution parameters are then quantized according to the first quantization step size (that is, the quantization step size used to quantize the image features), thereby obtaining second probability distribution parameters that characterize the probability distribution of the quantized image features. In this solution, therefore, it is enough to train a probability distribution estimation network that determines the probability distribution parameters of unquantized image features.
  • training of this network is stable, and a well-performing probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.
  • when the decoding device provided in the above embodiment performs decoding, the division into the functional modules described above is only an example. In practice, the functions can be allocated to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the decoding device provided by the above embodiments and the decoding method embodiments belong to the same concept. Please refer to the method embodiments for the specific implementation process, which will not be described again here.
  • Figure 16 is a schematic structural diagram of an encoding device 1600 provided by an embodiment of the present application.
  • the encoding device 1600 can be implemented by software, hardware, or a combination of the two to become part or all of the encoding end device.
  • the encoding end device can be any encoding end shown in Figure 1 to Figure 3.
  • the device 1600 includes: a first determination module 1601, a second determination module 1602, a first encoding module 1603, a probability estimation module 1604 and a second encoding module 1605.
  • the first determination module 1601 is used to determine the first image feature and the second image feature of the image to be encoded.
  • the first image feature is the image feature obtained by quantizing the second image feature according to the first quantization step size;
  • the second determination module 1602 is used to determine the first super-prior feature of the second image feature;
  • the first encoding module 1603 is used to encode the first super-prior feature into the code stream;
  • the probability estimation module 1604 is used to determine the second probability distribution parameters through the second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of the first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
  • the second encoding module 1605 is used to encode the first image feature into the code stream based on the second probability distribution parameter.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • the last layer of the first probability distribution estimation network is a convolutional layer
  • the network parameters of the convolutional layer include weights and biases.
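Why scaling only the last layer is sufficient can be seen from linearity: multiplying both the weights and the bias of a final linear layer by a factor multiplies its output by the same factor. The sketch below uses a small dense layer, which is how a 1x1 convolution acts at each position. It assumes that no nonlinearity follows the last layer, and the parameter values and the `factor` are made up; how the factor relates to the first quantization step size follows the source statement above and is not modelled here.

```python
def linear_layer(weights, bias, x):
    """Compute W x + b for a small dense layer."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def scale_last_layer(weights, bias, factor):
    """Scale both the weights and the bias of the last layer by the same factor."""
    scaled_weights = [[w * factor for w in row] for row in weights]
    scaled_bias = [b * factor for b in bias]
    return scaled_weights, scaled_bias


weights = [[0.2, -0.5], [1.0, 0.3]]  # made-up last-layer weights
bias = [0.1, -0.4]                   # made-up last-layer bias
hidden = [1.5, 2.0]                  # activations entering the last layer
factor = 0.5                         # hypothetical scaling factor

first_out = linear_layer(weights, bias, hidden)         # first network output
w2, b2 = scale_last_layer(weights, bias, factor)
second_out = linear_layer(w2, b2, hidden)               # second network output

print(first_out, second_out)
```

Because the earlier layers are untouched, the second network can be produced for any code rate from one trained first network, without retraining.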
  • the second determination module 1602 includes:
  • the first super-encoding sub-module is used to input the second image feature into the super-encoding network to obtain the first super-prior feature.
  • the second determination module 1602 includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the second super-encoding submodule is used to input the third image feature into the super-encoding network to obtain the first super-prior feature.
  • in this encoding process, the super-prior feature of the unquantized image features is likewise determined, but the second probability distribution parameters are then obtained directly through the second probability distribution estimation network.
  • the second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size. In this solution, therefore, only the first probability distribution estimation network (used to determine the probability distribution parameters of unquantized image features) needs to be trained.
  • the first probability distribution estimation network is easier to train, training is stable, and a well-performing first probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.
  • when the encoding device provided in the above embodiment performs encoding, the division into the functional modules described above is only an example. In practice, the functions can be allocated to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the encoding device provided by the above embodiments and the encoding method embodiments belong to the same concept. Please refer to the method embodiments for the specific implementation process, which will not be described again here.
  • Figure 17 is a schematic structural diagram of a decoding device 1700 provided by an embodiment of the present application.
  • the decoding device 1700 can be implemented by software, hardware, or a combination of the two to become part or all of the decoding end device.
  • the decoding end device can be any decoding end shown in Figure 1 to Figure 3.
  • the device 1700 includes: a first parsing module 1701, a probability estimation module 1702, a second parsing module 1703 and a reconstruction module 1704.
  • the first parsing module 1701 is used to parse the first super-prior feature of the image to be decoded from the code stream;
  • the probability estimation module 1702 is used to determine the second probability distribution parameters through the second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of the first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
  • the second parsing module 1703 is used to parse the first image feature of the image from the code stream based on the second probability distribution parameter;
  • the reconstruction module 1704 is configured to inversely quantize the first image features according to the first quantization step size to reconstruct the image.
  • the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer in the first probability distribution estimation network by the first quantization step size.
  • the last layer of the first probability distribution estimation network is a convolutional layer
  • the network parameters of the convolutional layer include weights and biases.
  • the first image feature is an image feature obtained by quantizing the second image feature of the image according to the first quantization step size.
  • the reconstruction module 1704 includes:
  • the inverse quantization submodule is used to inverse quantize the first image feature according to the first quantization step size to obtain the third image feature of the image;
  • the reconstruction submodule is used to reconstruct the image based on the third image feature.
  • in this decoding process, the second probability distribution parameters are obtained directly through the second probability distribution estimation network.
  • the second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size.
  • the first probability distribution estimation network is used to determine the probability distribution parameters of unquantized image features. In this solution, therefore, only the first probability distribution estimation network needs to be trained.
  • the first probability distribution estimation network is easier to train, training is stable, and a well-performing first probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.
  • when the decoding device provided in the above embodiment performs decoding, the division into the functional modules described above is only an example. In practice, the functions can be allocated to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the decoding device provided by the above embodiments and the decoding method embodiments belong to the same concept. Please refer to the method embodiments for the specific implementation process, which will not be described again here.
  • Figure 18 is a schematic block diagram of a coding and decoding device 1800 used in an embodiment of the present application.
  • the encoding and decoding device 1800 may include a processor 1801, a memory 1802, and a bus system 1803.
  • the processor 1801 and the memory 1802 are connected through a bus system 1803.
  • the memory 1802 is used to store instructions, and the processor 1801 is used to execute the instructions stored in the memory 1802 to perform the various encoding or decoding methods described in the embodiments of this application. To avoid repetition, they are not described in detail here.
  • the processor 1801 can be a central processing unit (CPU).
  • the processor 1801 can also be another general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the memory 1802 may include a ROM device or a RAM device. Any other suitable type of storage device may also be used as memory 1802.
  • memory 1802 may include code and data 18021 accessed by processor 1801 using bus 1803.
  • the memory 1802 may further include an operating system 18023 and an application program 18022, which includes at least one program that allows the processor 1801 to perform the encoding or decoding method described in the embodiment of the present application.
  • the application program 18022 may include applications 1 to N, which further include encoding or decoding applications (referred to as encoding and decoding applications) that perform the encoding or decoding methods described in the embodiments of this application.
  • bus system 1803 may also include a power bus, a control bus, a status signal bus, etc.
  • various buses are labeled as bus system 1803 in the figure.
  • the codec apparatus 1800 may also include one or more output devices, such as a display 1804.
  • display 1804 may be a tactile display that incorporates a display with a tactile unit operable to sense touch input.
  • Display 1804 may be connected to processor 1801 via bus 1803.
  • the encoding and decoding device 1800 can perform the encoding method in the embodiment of the present application, and can also perform the decoding method in the embodiment of the present application.
  • computer-readable media may include computer-readable storage media that correspond to tangible media, such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another (e.g., based on a communications protocol).
  • computer-readable media generally may correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media, such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in this application.
  • a computer program product may include computer-readable media.
  • such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage, flash memory, or any other medium that can be used to store the required program code in the form of instructions or data structures and that can be accessed by a computer.
  • any connection is properly termed a computer-readable medium.
  • for example, if coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave are used to transmit instructions from a website, server, or other remote source, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media.
  • disks and optical discs include compact discs (CDs), laser discs, optical discs, DVDs, and Blu-ray discs, where disks typically reproduce data magnetically, while discs reproduce data optically using lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • instructions may be executed by one or more processors, such as digital signal processors (DSPs), application-specific integrated circuits (ASICs), or field-programmable logic arrays (FPGAs). Accordingly, the term "processor" as used for executing instructions may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
  • the functionality described by the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
  • the techniques may be entirely implemented in one or more circuits or logic elements.
  • various illustrative logical blocks, units, and modules in the encoder 100 and the decoder 200 can be understood as corresponding circuit devices or logical elements.
  • the techniques of the embodiments of the present application may be implemented in a variety of apparatuses or devices, including wireless handsets, integrated circuits (ICs), or a set of ICs (e.g., chipsets).
  • various components, modules, or units are described in the embodiments of this application to emphasize the functional aspects of the apparatus for performing the disclosed technology, but they do not necessarily need to be implemented by different hardware units. Indeed, as described above, the various units may be combined in a codec hardware unit in conjunction with suitable software and/or firmware, or provided by interoperating hardware units (including one or more processors as described above).
  • the above embodiments can be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on the computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, wireless, or microwave) means.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrated with one or more available media.
  • the available media may be magnetic media (such as floppy disks, hard disks, tapes), optical media (such as digital versatile discs (DVD)), or semiconductor media (such as solid state disks (SSD)), etc.
  • the computer-readable storage media mentioned in the embodiments of this application may be non-volatile storage media, in other words, may be non-transitory storage media.
  • the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in the embodiments of this application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
  • the images and videos involved in the embodiments of this application were all obtained with full authorization.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

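This application discloses an encoding and decoding method, apparatus, device, storage medium, and computer program product, and belongs to the field of encoding and decoding technology. When probability estimation is performed during encoding and decoding, the probability distribution of the unquantized image features is estimated through a first probability distribution estimation network based on the super-prior feature of the unquantized image features, and the probability distribution of the quantized image features is then obtained through quantization. Alternatively, based on the super-prior feature of the unquantized image features, the probability distribution of the quantized image features is estimated directly through a second probability distribution estimation network (a network obtained by simple processing of the network parameters of the first probability distribution estimation network). Even in a multi-code-rate scenario, the numerical range of the unquantized image features is not affected by the quantization step size and is stable. Training the first probability distribution estimation network on unquantized image features is therefore less difficult and more stable, a well-performing network can be trained, and encoding and decoding performance is improved.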

Description

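Encoding and decoding method, apparatus, device, storage medium, and computer program product
This application claims priority to Chinese patent application No. 202210234190.8, filed on March 10, 2022 and entitled "Encoding and decoding method, apparatus, device, storage medium, and computer program product", the entire contents of which are incorporated herein by reference.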
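Technical Field
This application relates to the field of encoding and decoding technology, and in particular to an encoding and decoding method, apparatus, device, storage medium, and computer program product.
Background
Image compression technology enables effective transmission and storage of image information and plays an important role in the current media era, in which the types and volume of image information keep growing. Image compression technology includes encoding and decoding of images, and encoding and decoding performance, which is reflected in image quality, is a factor that image compression technology must take into account.
In the encoding process of the related art, the image feature y of an image is extracted through an image feature extraction network, and the image feature y is quantized according to a quantization step size q to obtain the image feature ys. The image feature ys is input into a super-encoding network to determine the super-prior feature zs, and the super-prior feature zs is encoded into the code stream through entropy coding. The super-prior feature zs in the code stream is entropy-decoded to obtain the super-prior feature zs’, and based on the super-prior feature zs’, the probability distribution parameters of the image feature ys are obtained through a probability distribution estimation network. Based on the probability distribution parameters of the image feature ys, the image feature ys is encoded into the code stream through entropy coding. The decoding process is symmetric to the encoding process. Image compression is achieved largely through the quantization operation, which has a great impact on encoding and decoding performance.
However, the quantization operation in the encoding and decoding process needs to match the code rate. In a multi-code-rate scenario, different quantization step sizes often have to be used during encoding and decoding to match different code rates, and different quantization step sizes cause significant differences in the numerical range of the quantized image feature ys. To train a probability distribution estimation network that estimates the probability distribution parameters of the image feature ys at different code rates, the network has to be trained on image features ys with different numerical ranges. Because the numerical range of the image feature ys varies greatly across code rates, the probability distribution estimation network is difficult to train, training is unstable, and it is hard to obtain a well-performing probability distribution estimation network, which affects encoding and decoding performance.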
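Summary
Embodiments of this application provide an encoding and decoding method, apparatus, device, storage medium, and computer program product that reduce the training difficulty of the probability distribution estimation network even in a multi-code-rate scenario, so that network training is stable and a well-performing network can be trained, thereby improving encoding and decoding performance.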
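In a first aspect, this application provides an encoding method, including:
determining a first image feature and a second image feature of an image to be encoded, where the first image feature is an image feature obtained by quantizing the second image feature according to a first quantization step size; determining a first super-prior feature of the second image feature; encoding the first super-prior feature into a code stream; determining first probability distribution parameters through a probability distribution estimation network based on the first super-prior feature; quantizing the first probability distribution parameters according to the first quantization step size to obtain second probability distribution parameters; and encoding the first image feature into the code stream based on the second probability distribution parameters.
It can be seen that, when probability estimation is performed during encoding, the probability distribution of the unquantized image features is estimated through the probability distribution estimation network based on the super-prior feature of the unquantized image features, and the probability distribution of the quantized image features is then obtained through quantization. Even in a multi-code-rate scenario, the numerical range of the unquantized image features is not affected by the quantization step size and is stable. Training the probability distribution estimation network on unquantized image features is therefore less difficult and more stable, a well-performing network can be trained, and encoding and decoding performance is improved.
Optionally, determining the first super-prior feature of the second image feature includes: inputting the second image feature into a super-encoding network to obtain the first super-prior feature. That is, the encoding end inputs the unquantized image feature into the super-encoding network to obtain the first super-prior feature of the unquantized image feature.
Optionally, determining the first super-prior feature of the second image feature includes: inverse quantizing the first image feature according to the first quantization step size to obtain a third image feature of the image; and inputting the third image feature into the super-encoding network to obtain the first super-prior feature. That is, the encoding end inputs the inverse-quantized image feature into the super-encoding network, and the resulting first super-prior feature of the inverse-quantized image feature is regarded as the first super-prior feature of the unquantized image feature.
Optionally, determining the first probability distribution parameters through the probability distribution estimation network based on the first super-prior feature includes: inputting the third image feature of the image into a context network to obtain a context feature of the third image feature, where the third image feature is an image feature obtained by inverse quantizing the first image feature according to the first quantization step size; determining a first a priori feature based on the first super-prior feature; and inputting the first a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters. That is, the encoding end extracts context features from the inverse-quantized image feature and then combines the context features with the first a priori feature to determine the first probability distribution parameters. This helps improve the accuracy of probability estimation.
Optionally, determining the first probability distribution parameters through the probability distribution estimation network based on the first super-prior feature includes: inputting the first image feature into the context network to obtain a context feature of the first image feature; determining a first a priori feature based on the first super-prior feature; quantizing the first a priori feature according to a second quantization step size to obtain a second a priori feature; and inputting the second a priori feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameters. That is, the encoding end can also extract context features from the quantized image feature and obtain the second a priori feature by applying an additional quantization operation to the first a priori feature, so that the second a priori feature and the context feature are combined to determine the first probability distribution parameters. This also improves the accuracy of probability estimation to some extent.
Optionally, the first quantization step size is obtained through a gain network based on the code rate of the image, and the gain network is used to determine the quantization step sizes corresponding to multiple code rates. That is, the quantization step size is obtained through network learning, so it matches the code rate better, which helps improve encoding and decoding performance.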
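In a second aspect, a decoding method is provided, the method including:
parsing a first super-prior feature of an image to be decoded from a code stream; determining first probability distribution parameters through a probability distribution estimation network based on the first super-prior feature, where the first probability distribution parameters represent the probability distribution of the unquantized image features of the image; quantizing the first probability distribution parameters according to a first quantization step size to obtain second probability distribution parameters; parsing a first image feature of the image from the code stream based on the second probability distribution parameters; and inverse quantizing the first image feature according to the first quantization step size to reconstruct the image.
It can be seen that, when probability estimation is performed during decoding, the probability distribution of the unquantized image features is estimated through the probability distribution estimation network based on the super-prior feature of the unquantized image features, and the probability distribution of the quantized image features is then obtained through quantization. Even in a multi-code-rate scenario, the numerical range of the unquantized image features is not affected by the quantization step size and is stable. Training the probability distribution estimation network on unquantized image features is therefore less difficult and more stable, a well-performing network can be trained, and encoding and decoding performance is improved.
Optionally, the first image feature is an image feature obtained by quantizing a second image feature of the image according to the first quantization step size.
Optionally, inverse quantizing the first image feature according to the first quantization step size to reconstruct the image includes: inverse quantizing the first image feature according to the first quantization step size to obtain a third image feature of the image; and reconstructing the image based on the third image feature.
Optionally, the first probability distribution parameters are probability distribution parameters of a plurality of feature points, and the first super-prior feature is the super-prior feature of the plurality of feature points. Determining the first probability distribution parameters through the probability distribution estimation network based on the first super-prior feature includes: performing the following operations for a first feature point to determine the probability distribution parameters of the first feature point, where the first feature point is any one of the plurality of feature points: determining a context feature of the first feature point based on the image features of decoded feature points in the first image feature; determining a first a priori feature of the first feature point based on the super-prior feature of the first feature point; and determining the probability distribution parameters of the first feature point through the probability distribution estimation network based on the first a priori feature of the first feature point and the context feature of the first feature point, that is, determining the probability distribution parameters of the first feature point within the first probability distribution parameters. In other words, estimating the probability distribution based on context features helps improve the accuracy of probability estimation.
Optionally, determining the context feature of the first feature point based on the image features of decoded feature points in the first image feature includes: determining peripheral feature points of the first feature point from the decoded feature points; inverse quantizing, according to the first quantization step size, the image features of the peripheral feature points in the first image feature to obtain peripheral features of the first feature point; and inputting the peripheral features of the first feature point into a context network to obtain the context feature of the first feature point. Determining the probability distribution parameters of the first feature point through the probability distribution estimation network based on the first a priori feature of the first feature point and the context feature of the first feature point includes: inputting the first a priori feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameters of the first feature point. That is, the decoding end extracts context features from the inverse-quantized image feature and then combines the context features with the first a priori feature to determine the first probability distribution parameters.
Optionally, determining the context feature of the first feature point based on the image features of decoded feature points in the first image feature includes: determining peripheral feature points of the first feature point from the decoded feature points; and inputting the image features of the peripheral feature points in the first image feature into the context network to obtain the context feature of the first feature point. Determining the probability distribution parameters of the first feature point through the probability distribution estimation network based on the first a priori feature of the first feature point and the context feature of the first feature point includes: quantizing the first a priori feature of the first feature point according to a second quantization step size to obtain a second a priori feature of the first feature point; and inputting the second a priori feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameters of the first feature point. That is, the decoding end extracts context features from the quantized image feature and obtains the second a priori feature by applying an additional quantization operation to the first a priori feature, so that the second a priori feature and the context feature are combined to determine the first probability distribution parameters.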
第三方面,提供了一种编码方法,该方法包括:
确定待编码的图像的第一图像特征和第二图像特征,第一图像特征为按照第一量化步长对第二图像特征进行量化得到的图像特征;确定第二图像特征的第一超先验特征;将第一超先验特征编入码流;基于第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数,第二概率分布估计网络的网络参数是基于第一概率分布估计网络的网络参数和第一量化步长得到的,第一概率分布估计网络用于确定未量化的图像特征的概率分布;基于第二概率 分布参数,将第一图像特征编入码流。
可以看出,在该编码过程中也是确定未量化的图像特征的超先验特征,只不过后续通过第二概率分布估计网络直接得到第二概率分布参数,第二概率分布估计网络是基于第一量化步长对第一概率分布估计网络中的网络参数进行处理后得到的。可见在这种方案中训练第一概率分布估计网络即可。即使在多码率场景下,由于未量化的图像特征的数值范围是较稳定的,不受量化步长的影响,因此,第一概率分布估计网络的训练难度较小,网络训练稳定,能够训练出性能良好的第一概率分布估计网络,从而有利于提升编解码性能。
可选地,第一概率分布估计网络为上述第一方面或第二方面中的概率分布估计网络。
可选地,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数乘以第一量化步长后得到的。
可选地,第一概率分布估计网络的最后一层为卷积层,卷积层的网络参数包括权重和偏置。
可选地,确定第二图像特征的第一超先验特征,包括:将第二图像特征输入超编码网络,以得到第一超先验特征。也即是,编码端将未量化的图像特征输入超编码网络,以得到未量化的图像特征的第一超先验特征。
可选地,确定第二图像特征的第一超先验特征,包括:按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征;将第三图像特征输入超编码网络,以得到第一超先验特征。也即是,编码端将反量化后的图像特征输入超编码网络,所得到的反量化后的图像特征的第一超先验特征即认为是未量化的图像特征的第一超先验特征。
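Optionally, determining the second probability distribution parameter through the second probability distribution estimation network based on the first super-prior feature includes: inputting the third image feature of the image into a context network to obtain a context feature of the third image feature, where the third image feature is the image feature obtained by dequantizing the first image feature according to the first quantization step size; determining a first prior feature based on the first super-prior feature; and inputting the first prior feature and the context feature into the second probability distribution estimation network to obtain the second probability distribution parameter. That is, the encoding end extracts the context feature from the dequantized image feature and then combines the context feature with the first prior feature to determine the second probability distribution parameter. This helps improve the accuracy of probability estimation.
Optionally, determining the second probability distribution parameter through the second probability distribution estimation network based on the first super-prior feature includes: inputting the first image feature into a context network to obtain a context feature of the first image feature; determining a first prior feature based on the first super-prior feature; quantizing the first prior feature according to a second quantization step size to obtain a second prior feature; and inputting the second prior feature and the context feature into the second probability distribution estimation network to obtain the second probability distribution parameter. That is, the encoding end may also extract the context feature from the quantized image feature and add a quantization operation on the first prior feature to obtain the second prior feature, so that the second prior feature and the context feature are combined to determine the second probability distribution parameter. This can also improve the accuracy of probability estimation to a certain extent.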
第四方面,提供了一种解码方法,该方法包括:
从码流中解析出待解码的图像的第一超先验特征;基于第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数,第二概率分布估计网络的网络参数是基于第一概率分布估计网络的网络参数和第一量化步长得到的,第一概率分布估计网络用于确定未量化的图像特征的概率分布;基于第二概率分布参数,从码流中解析出图像的第一图像特征;按照第一量化步长,对第一图像特征进行反量化,以重构图像。
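It can be seen that, in this decoding process, the second probability distribution parameter is obtained directly through the second probability distribution estimation network, which is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size. In this solution, only the first probability distribution estimation network needs to be trained. Even in a multi-bit-rate scenario, the numerical range of the unquantized image feature is relatively stable and is not affected by the quantization step size; that is, the input numerical range of the first probability distribution estimation network does not change with the bit rate. Therefore, the first probability distribution estimation network is less difficult to train, training is stable, and a well-performing first probability distribution estimation network can be trained, which helps improve encoding and decoding performance.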
可选地,第一概率分布估计网络为上述第一方面或第二方面中的概率分布估计网络。
可选地,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数乘以第一量化步长后得到的。
可选地,第一概率分布估计网络的最后一层为卷积层,卷积层的网络参数包括权重和偏置。
可选地,第一图像特征为按照第一量化步长对图像的第二图像特征进行量化得到的图像特征。
可选地,按照第一量化步长,对第一图像特征进行反量化,以重构图像,包括:按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征;基于第三图像特征,重构图像。
可选地,第二概率分布参数为多个特征点的概率分布参数,第一超先验特征为该多个特征点的超先验特征,基于第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数,包括:对于第一特征点执行如下操作来确定第一特征点的概率分布参数,第一特征点为该多个特征点中的任意一个:基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征;基于第一特征点的超先验特征,确定第一特征点的第一先验特征;基于第一特征点的第一先验特征和第一特征点的上下文特征,通过第二概率分布估计网络确定第一特征点的概率分布参数,即,确定第二概率分布参数中第一特征点的概率分布参数。也即是,基于上下文特征来估计概率分布,有利于提升概率估计的准确性。
可选地,基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征,包括:从已解码的特征点中,确定第一特征点的周边特征点;按照第一量化步长,对第一图像特征中周边特征点的图像特征进行反量化,以得到第一特征点的周边特征;将第一特征点的周边特征输入上下文网络,以得到第一特征点的上下文特征;基于第一特征点的第一先验特征和第一特征点的上下文特征,通过第二概率分布估计网络确定第一特征点的概率分布参数,包括:将第一特征点的第一先验特征和第一特征点的上下文特征输入第二概率分布估计网络,以得到第一特征点的概率分布参数。也即是,解码端对反量化后的图像特征提取上下文特征,进而结合上下文特征和第一先验特征来确定第二概率分布参数。
可选地,基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征,包括:从已解码的特征点中,确定第一特征点的周边特征点;将第一图像特征中周边特征点的图像特征输入上下文网络,以得到第一特征点的上下文特征;基于第一特征点的第一先验特征和第一特征点的上下文特征,通过第二概率分布估计网络确定第一特征点的概率分布参数,包括:按照第二量化步长,对第一特征点的第一先验特征进行量化,以得到第一特征点的第二先验特征;将第一特征点的第二先验特征和第一特征点的上下文特征输入第二概率分布估计网络,以得到第一特征点的概率分布参数。也即是,解码端对经量化的图像特征 提取上下文特征,再通过对第一先验特征增加一个量化操作来得到第二先验特征,从而结合第二先验特征和上下文特征来确定第二概率分布参数。
第五方面,提供了一种编码装置,所述编码装置具有实现上述第一方面中编码方法行为的功能。所述编码装置包括一个或多个模块,该一个或多个模块用于实现上述第一方面所提供的编码方法。
也即是,提供了一种编码装置,该装置包括:
第一确定模块,用于确定待编码的图像的第一图像特征和第二图像特征,第一图像特征为按照第一量化步长对第二图像特征进行量化得到的图像特征;
第二确定模块,用于确定第二图像特征的第一超先验特征;
第一编码模块,用于将第一超先验特征编入码流;
概率估计模块,用于基于第一超先验特征,通过概率分布估计网络确定第一概率分布参数;
量化模块,用于按照第一量化步长,对第一概率分布参数进行量化,以得到第二概率分布参数;
第二编码模块,用于基于第二概率分布参数,将第一图像特征编入码流。
可选地,第二确定模块包括:
第一超编码子模块,用于将第二图像特征输入超编码网络,以得到第一超先验特征。
可选地,第二确定模块包括:
反量化子模块,用于按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征;
第二超编码子模块,用于将第三图像特征输入超编码网络,以得到第一超先验特征。
可选地,概率估计模块包括:
上下文子模块,用于将图像的第三图像特征输入上下文网络,以得到第三图像特征的上下文特征,第三图像特征是按照第一量化步长对第一图像特征进行反量化得到的图像特征;
第一确定子模块,用于基于第一超先验特征,确定第一先验特征;
第一概率估计子模块,用于将第一先验特征和上下文特征输入概率分布估计网络,以得到第一概率分布参数。
可选地,概率估计模块包括:
上下文子模块,用于将第一图像特征输入上下文网络,以得到第一图像特征的上下文特征;
第二确定子模块,用于基于第一超先验特征,确定第一先验特征;
量化子模块,用于按照第二量化步长,对第一先验特征进行量化,以得到第二先验特征;
第二概率估计子模块,用于将第二先验特征和上下文特征输入概率分布估计网络,以得到第一概率分布参数。
第六方面,提供了一种解码装置,所述解码装置具有实现上述第二方面中解码方法行为的功能。所述解码装置包括一个或多个模块,该一个或多个模块用于实现上述第二方面所提供的解码方法。
也即是,提供了一种解码装置,该装置包括:
第一解析模块,用于从码流中解析出待解码的图像的第一超先验特征;
概率估计模块,用于基于第一超先验特征,通过概率分布估计网络确定第一概率分布参数,第一概率分布参数表征图像的未量化的图像特征的概率分布;
量化模块,用于按照第一量化步长,对第一概率分布参数进行量化,以得到第二概率分布参数;
第二解析模块,用于基于第二概率分布参数,从码流中解析出图像的第一图像特征;
重构模块,用于按照第一量化步长,对第一图像特征进行反量化,以重构图像。
可选地,第一图像特征为按照第一量化步长对图像的第二图像特征进行量化得到的图像特征。
可选地,重构模块包括:
反量化子模块,用于按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征;
重构子模块,用于基于第三图像特征,重构图像。
可选地,第一概率分布参数为多个特征点的概率分布参数,第一超先验特征为该多个特征点的超先验特征,概率估计模块包括:上下文子模块、第一确定子模块和概率估计子模块;
对于第一特征点通过上下文子模块、第一确定子模块和概率估计子模块来确定第一特征点的概率分布参数,第一特征点为该多个特征点中的任意一个;其中,
上下文子模块,用于基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征;
第一确定子模块,用于基于第一特征点的超先验特征,确定第一特征点的第一先验特征;
概率估计子模块,用于基于第一特征点的第一先验特征和第一特征点的上下文特征,通过概率分布估计网络确定第一特征点的概率分布参数。
可选地,上下文子模块用于:
从已解码的特征点中,确定第一特征点的周边特征点;
按照第一量化步长,对第一图像特征中周边特征点的图像特征进行反量化,以得到第一特征点的周边特征;
将第一特征点的周边特征输入上下文网络,以得到第一特征点的上下文特征;
基于第一特征点的第一先验特征和第一特征点的上下文特征,通过概率分布估计网络确定第一特征点的概率分布参数,包括:
将第一特征点的第一先验特征和第一特征点的上下文特征输入概率分布估计网络,以得到第一特征点的概率分布参数。
可选地,上下文子模块用于:
从已解码的特征点中,确定第一特征点的周边特征点;
将第一图像特征中周边特征点的图像特征输入上下文网络,以得到第一特征点的上下文特征;
基于第一特征点的第一先验特征和第一特征点的上下文特征,通过概率分布估计网络确定第一特征点的概率分布参数,包括:
按照第二量化步长,对第一特征点的第一先验特征进行量化,以得到第一特征点的第二 先验特征;
将第一特征点的第二先验特征和第一特征点的上下文特征输入概率分布估计网络,以得到第一特征点的概率分布参数。
第七方面,提供了一种编码装置,所述编码装置具有实现上述第三方面中编码方法行为的功能。所述编码装置包括一个或多个模块,该一个或多个模块用于实现上述第三方面所提供的编码方法。
也即是,提供了一种编码装置,该装置包括:
第一确定模块,用于确定待编码的图像的第一图像特征和第二图像特征,第一图像特征为按照第一量化步长对第二图像特征进行量化得到的图像特征;
第二确定模块,用于确定第二图像特征的第一超先验特征;
第一编码模块,用于将第一超先验特征编入码流;
概率估计模块,用于基于第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数,第二概率分布估计网络的网络参数是基于第一概率分布估计网络的网络参数和第一量化步长得到的,第一概率分布估计网络用于确定未量化的图像特征的概率分布;
第二编码模块,用于基于第二概率分布参数,将第一图像特征编入码流。
可选地,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数乘以第一量化步长后得到的。
可选地,第一概率分布估计网络的最后一层为卷积层,卷积层的网络参数包括权重和偏置。
可选地,第二确定模块包括:
第一超编码子模块,用于将第二图像特征输入超编码网络,以得到第一超先验特征。
可选地,第二确定模块包括:
反量化子模块,用于按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征;
第二超编码子模块,用于将第三图像特征输入超编码网络,以得到第一超先验特征。
第八方面,提供了一种解码装置,所述解码装置具有实现上述第四方面中解码方法行为的功能。所述解码装置包括一个或多个模块,该一个或多个模块用于实现上述第四方面所提供的解码方法。
也即是,提供了一种解码装置,该装置包括:
第一解析模块,用于从码流中解析出待解码的图像的第一超先验特征;
概率估计模块,用于基于第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数,第二概率分布估计网络的网络参数是基于第一概率分布估计网络的网络参数和第一量化步长得到的,第一概率分布估计网络用于确定未量化的图像特征的概率分布;
第二解析模块,用于基于第二概率分布参数,从码流中解析出图像的第一图像特征;
重构模块,用于按照第一量化步长,对第一图像特征进行反量化,以重构图像。
可选地,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数乘以第一量化步长后得到的。
可选地,第一概率分布估计网络的最后一层为卷积层,卷积层的网络参数包括权重和偏置。
可选地,第一图像特征为按照第一量化步长对图像的第二图像特征进行量化得到的图像特征。
可选地,重构模块包括:
反量化子模块,用于按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征;
重构子模块,用于基于第三图像特征,重构图像。
第九方面,提供了一种编码端设备,所述编码端设备包括处理器和存储器,所述存储器用于存储执行上述第一方面和/或第三方面所提供的编码方法的程序,以及存储用于实现上述第一方面和/或第三方面所提供的编码方法所涉及的数据。所述处理器被配置为用于执行所述存储器中存储的程序。所述编码端设备还可以包括通信总线,该通信总线用于该处理器与存储器之间建立连接。
第十方面,提供了一种解码端设备,所述解码端设备包括处理器和存储器,所述存储器用于存储执行上述第二方面和/或第四方面所提供的解码方法的程序,以及存储用于实现上述第二方面和/或第四方面所提供的解码方法所涉及的数据。所述处理器被配置为用于执行所述存储器中存储的程序。所述解码端设备还可以包括通信总线,该通信总线用于该处理器与存储器之间建立连接。
第十一方面,提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述第一方面或第三方面所述的编码方法,或者执行上述第二方面或第四方面所述的解码方法。
第十二方面,提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述第一方面或第三方面所述的编码方法,或者执行上述第二方面或第四方面所述的解码方法。
上述第五方面至第十二方面所获得的技术效果与第一方面至第四方面中对应的技术手段获得的技术效果近似,在这里不再赘述。
本申请提供的技术方案至少能够带来以下有益效果:
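To obtain the probability distribution parameter of the quantized image feature, in the encoding process of one solution, the first probability distribution parameter is determined through the probability distribution estimation network based on the super-prior feature of the unquantized image feature; the first probability distribution parameter represents the probability distribution of the unquantized image feature. The first probability distribution parameter is then quantized according to the first quantization step size (that is, the quantization step size used to quantize the image feature), which yields the second probability distribution parameter representing the probability distribution of the quantized image feature. In the encoding process of the other solution, the super-prior feature of the unquantized image feature is also determined, but the second probability distribution parameter is then obtained directly through the second probability distribution estimation network. The second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size, and the first probability distribution estimation network is the probability distribution estimation network of the first solution. The decoding process is symmetric to the encoding process. In both solutions, only the first probability distribution estimation network (used to determine the probability distribution parameter of the unquantized image feature) needs to be trained. Even in a multi-bit-rate scenario, the numerical range of the unquantized image feature is relatively stable and is not affected by the quantization step size; that is, the input numerical range of the first probability distribution estimation network does not change with the bit rate. Therefore, the first probability distribution estimation network is less difficult to train, training is stable, and a well-performing first probability distribution estimation network can be trained, which helps improve encoding and decoding performance.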
附图说明
图1是本申请实施例提供的一种实施环境的示意图;
图2是本申请实施例提供的另一种实施环境的示意图;
图3是本申请实施例提供的又一种实施环境的示意图;
图4是本申请实施例提供的一种编码方法的流程图;
图5是本申请实施例提供的一种图像特征提取网络的结构示意图;
图6是本申请实施例提供的一种编解码方法的流程图;
图7是本申请实施例提供的另一种编解码方法的流程图;
图8是本申请实施例提供的又一种编解码方法的流程图;
图9是本申请实施例提供的又一种编解码方法的流程图;
图10是本申请实施例提供的又一种编解码方法的流程图;
图11是本申请实施例提供的另一种编码方法的流程图;
图12是本申请实施例提供的一种解码方法的流程图;
图13是本申请实施例提供的另一种解码方法的流程图;
图14是本申请实施例提供的一种编码装置的结构示意图;
图15是本申请实施例提供的一种解码装置的结构示意图;
图16是本申请实施例提供的另一种编码装置的结构示意图;
图17是本申请实施例提供的另一种解码装置的结构示意图;
图18是本申请实施例提供的一种编解码装置的示意性框图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
本申请实施例描述的系统架构以及业务场景是为了更加清楚的说明本申请实施例的技术方案,并不构成对于本申请实施例提供的技术方案的限定,本领域普通技术人员可知,随着系统架构的演变和新业务场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
在对本申请实施例提供的编解码方法进行详细地解释说明之前,先对本申请实施例涉及的术语和实施环境进行介绍。
为了便于理解,首先对本申请实施例涉及的术语进行解释。
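Bit rate: in image compression, the coding length required to encode one pixel. The higher the bit rate, the better the quality of the reconstructed image.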
卷积神经网络(convolution neural network,CNN):是一种包含卷积计算且具有深度结构的前馈神经网络,是深度学习的代表算法之一。CNN包括卷积层,还可能包含激活层(如线性修正单元(rectified linear unit,ReLU)、带参数的ReLU(Parametric ReLU,PReLU)等)、池化层(pooling layer)、批量归一化(batch normalization,BN)层、全连接层(fully connected layer)等。典型的CNN如LeNet、AlexNet、VGGNet、ResNet等。基本的CNN可包括主干网络和头部网络。复杂的CNN可包括主干网络、脖子网络和头部网络。
特征图(feature map):卷积神经网络中卷积层、激活层、池化层、批量归一化层等输出的三维数据,三个维度分别称为宽(width)、高(height)、通道(channel)。一个特征图包括多个特征点的图像特征。
主干网络(backbone network):卷积神经网络的第一部分,功能为对输入图像提取多个尺度的特征图,通常由卷积层、池化层、激活层等构成,不含有全连接层。通常,主干网络中较靠近输入图像的层所输出的特征图的分辨率(宽、高)较大但通道数较少。典型的主干网络如VGG-16、ResNet-50、ResNeXt-101等。
头部网络(head network):卷积神经网络的最后部分,其功能为处理特征图得到神经网络输出的预测结果,常见的头部网络包含全连接层、softmax模块等。
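Neck network: the middle part of a convolutional neural network. Its function is to further integrate and process the feature maps produced by the backbone network to obtain new feature maps. A common example is the feature pyramid network (FPN) in the faster region-based convolutional neural network (faster region-CNN, Faster-RCNN).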
接下来对本申请实施例涉及的实施环境进行介绍。
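FIG. 1 is a schematic diagram of an implementation environment according to an embodiment of this application. Referring to FIG. 1, the implementation environment includes an encoding end 101 and a decoding end 102. The encoding end 101 compresses an image according to the encoding method provided in the embodiments of this application, and the decoding end 102 decodes the image according to the decoding method provided in the embodiments of this application. Optionally, the encoding end 101 includes an encoder used to compress the image, and the decoding end 102 includes a decoder used to decode the image. When the encoding end 101 and the decoding end 102 are located in the same device, they communicate through an internal connection of the device or through a network. When they are located in different devices, they communicate through an external connection or a wireless network. The encoding end 101 may also be called a source device, and the decoding end 102 may also be called a destination device.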
图2是本申请实施例提供的另一种实施环境的示意图。请参考图2,该实施环境包括源装置10、目的地装置20、链路30和存储装置40。其中,源装置10可以产生经编码的图像。因此,源装置10也可以被称为图像编码装置或编码端。目的地装置20可以对由源装置10所产生的经编码的图像进行解码。因此,目的地装置20也可以被称为图像解码装置或解码端。链路30可以接收源装置10所产生的经编码的图像,并可以将该经编码的图像传输给目的地装置20。存储装置40可以接收源装置10所产生的经编码的图像,并可以将该经编码的图像进行存储,这样的条件下,目的地装置20可以直接从存储装置40中获取经编码的图像。或者,存储装置40可以对应于文件服务器或可以保存由源装置10产生的经编码的图像的另一中间存储装置,这样的条件下,目的地装置20可以经由流式传输或下载存储装置40存储的 经编码的图像。
源装置10和目的地装置20均可以包括一个或多个处理器以及耦合到该一个或多个处理器的存储器,该存储器可以包括随机存取存储器(random access memory,RAM)、只读存储器(read-only memory,ROM)、带电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、快闪存储器、可用于以可由计算机存取的指令或数据结构的形式存储所要的程序代码的任何其它媒体等。例如,源装置10和目的地装置20均可以包括手机、智能手机、个人数字助手(personal digital assistant,PDA)、可穿戴设备、掌上电脑(pocket PC,PPC)、平板电脑、智能车机、智能电视、智能音箱、桌上型计算机、移动计算装置、笔记型(例如,膝上型)计算机、平板计算机、机顶盒、例如所谓的“智能”电话等电话手持机、电视机、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机或其类似者。
链路30可以包括能够将经编码的图像从源装置10传输到目的地装置20的一个或多个媒体或装置。在一种可能的实现方式中,链路30可以包括能够使源装置10实时地将经编码的图像直接发送到目的地装置20的一个或多个通信媒体。在本申请实施例中,源装置10可以基于通信标准来调制经编码的图像,该通信标准可以为无线通信协议等,并且可以将经调制的图像发送给目的地装置20。该一个或多个通信媒体可以包括无线和/或有线通信媒体,例如该一个或多个通信媒体可以包括射频(radio frequency,RF)频谱或一个或多个物理传输线。该一个或多个通信媒体可以形成基于分组的网络的一部分,基于分组的网络可以为局域网、广域网或全球网络(例如,因特网)等。该一个或多个通信媒体可以包括路由器、交换器、基站或促进从源装置10到目的地装置20的通信的其它设备等,本申请实施例对此不做具体限定。
在一种可能的实现方式中,存储装置40可以将接收到的由源装置10发送的经编码的图像进行存储,目的地装置20可以直接从存储装置40中获取经编码的图像。这样的条件下,存储装置40可以包括多种分布式或本地存取的数据存储媒体中的任一者,例如,该多种分布式或本地存取的数据存储媒体中的任一者可以为硬盘驱动器、蓝光光盘、数字多功能光盘(digital versatile disc,DVD)、只读光盘(compact disc read-only memory,CD-ROM)、快闪存储器、易失性或非易失性存储器,或用于存储经编码图像的任何其它合适的数字存储媒体等。
在一种可能的实现方式中,存储装置40可以对应于文件服务器或可以保存由源装置10产生的经编码图像的另一中间存储装置,目的地装置20可经由流式传输或下载存储装置40存储的图像。文件服务器可以为能够存储经编码的图像并且将经编码的图像发送给目的地装置20的任意类型的服务器。在一种可能的实现方式中,文件服务器可以包括网络服务器、文件传输协议(file transfer protocol,FTP)服务器、网络附属存储(network attached storage,NAS)装置或本地磁盘驱动器等。目的地装置20可以通过任意标准数据连接(包括因特网连接)来获取经编码图像。任意标准数据连接可以包括无线信道(例如,Wi-Fi连接)、有线连接(例如,数字用户线路(digital subscriber line,DSL)、电缆调制解调器等),或适合于获取存储在文件服务器上的经编码的图像的两者的组合。经编码的图像从存储装置40的传输可为流式传输、下载传输或两者的组合。
图2所示的实施环境仅为一种可能的实现方式,并且本申请实施例的技术不仅可以适用于图2所示的可以对图像进行编码的源装置10,以及可以对经编码的图像进行解码的目的地 装置20,还可以适用于其他可以对图像进行编码和对经编码的图像进行解码的装置,本申请实施例对此不做具体限定。
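In the implementation environment shown in FIG. 2, the source device 10 includes a data source 120, an encoder 100, and an output interface 140. In some embodiments, the output interface 140 may include a modulator/demodulator (modem) and/or a sender, where the sender may also be called a transmitter. The data source 120 may include an image capture device (for example, a camera), an archive containing previously captured images, a feed interface for receiving images from an image content provider, and/or a computer graphics system for generating images, or a combination of these image sources.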
数据源120可以向编码器100发送图像,编码器100可以对接收到由数据源120发送的图像进行编码,得到经编码的图像。编码器可以将经编码的图像发送给输出接口。在一些实施例中,源装置10经由输出接口140将经编码的图像直接发送到目的地装置20。在其它实施例中,经编码的图像还可存储到存储装置40上,供目的地装置20以后获取并用于解码和/或显示。
在图2所示的实施环境中,目的地装置20包括输入接口240、解码器200和显示装置220。在一些实施例中,输入接口240包括接收器和/或调制解调器。输入接口240可经由链路30和/或从存储装置40接收经编码的图像,然后再发送给解码器200,解码器200可以对接收到的经编码的图像进行解码,得到经解码的图像。解码器可以将经解码的图像发送给显示装置220。显示装置220可与目的地装置20集成或可在目的地装置20外部。一般来说,显示装置220显示经解码的图像。显示装置220可以为多种类型中的任一种类型的显示装置,例如,显示装置220可以为液晶显示器(liquid crystal display,LCD)、等离子显示器、有机发光二极管(organic light-emitting diode,OLED)显示器或其它类型的显示装置。
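Although not shown in FIG. 2, in some aspects the encoder 100 and the decoder 200 may each be integrated with an audio encoder and an audio decoder, and may include an appropriate multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software to handle the encoding of both audio and video in a common data stream or in separate data streams. In some embodiments, if applicable, the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol, or to other protocols such as the user datagram protocol (UDP).
The encoder 100 and the decoder 200 may each be any of the following circuits: one or more microprocessors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), discrete logic, hardware, or any combination thereof. If the techniques of the embodiments of this application are implemented partly in software, a device may store the software instructions in a suitable non-volatile computer-readable storage medium and may use one or more processors to execute the instructions in hardware to implement the techniques. Any of the foregoing (including hardware, software, and a combination of hardware and software) may be regarded as one or more processors. Each of the encoder 100 and the decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in the corresponding device.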
本申请实施例可大体上将编码器100称为将某些信息“发信号通知”或“发送”到例如解码器200的另一装置。术语“发信号通知”或“发送”可大体上指代用于对经压缩的图像进行解码的语法元素和/或其它数据的传送。此传送可实时或几乎实时地发生。替代地,此通信可经过一段时间后发生,例如可在编码时在经编码位流中将语法元素存储到计算机可读存储媒体时发生,解码装置接着可在所述语法元素存储到此媒体之后的任何时间检索所述语法 元素。
图3是本申请实施例提供的又一种实施环境的示意图。在该实施环境中将本申请实施例提供的编解码方法应用于虚拟现实流场景。请参考图3,该实施环境包括编码端和解码端,编码端包括视频的采集及预处理模块(也称为前处理模块)、视频编码模块和发送模块,解码端包括接收模块、码流解码模块和渲染显示模块。
其中,编码端的采集模块采集视频,视频包括待编码的多帧图像,然后通过预处理模块对各帧图像进行预处理操作。之后通过视频编码模块,利用本申请实施例提供的编码方法对各帧图像进行编码处理,得到码流。发送模块将码流经传输网络发送给解码端。解码端的接收模块首先接收码流,之后通过解码模块,利用本申请实施例提供的解码方法对码流进行解码,以得到图像信息,然后通过渲染显示模块对图像信息进行渲染显示。除此之外,编码端得到码流后也可以进行存储。
需要说明的是,本申请实施例提供的编解码方法可以应用于多种场景,在各种场景中编解码的图像均可以是图像文件包括的图像,也可以是视频文件包括的图像。编解码的图像可以为RGB、YUV444、YUV420等格式的图像。需要说明的是,结合图1、图2和图3所示的实施环境,下文中的任一种编码方法可以是编码端执行的。下文中的任一种解码方法可以是解码端执行的。
接下来对本申请实施例提供的编码方法进行介绍。
图4是本申请实施例提供的一种编码方法的流程图。该方法应用于编码端。请参考图4,该方法包括如下步骤。
步骤401:确定待编码的图像的第一图像特征和第二图像特征,第一图像特征为按照第一量化步长对第二图像特征进行量化得到的图像特征。
在本申请实施例中,编码端将待编码的图像输入图像特征提取网络,以得到该图像的第二图像特征,第二图像特征即未量化的图像特征。编码端按照第一量化步长,对第二图像特征进行量化,以得到第一图像特征,第一图像特征即经量化的图像特征。
需要说明的是,第一图像特征和第二图像特征均包括多个特征点的图像特征,第一图像特征中各个特征点的图像特征可称为相应特征点的第一特征值,第二图像特征中各个特征点的图像特征可称为相应特征点的第二特征值。
可选地,图像特征提取网络为卷积神经网络,第一图像特征由第一特征图进行表示,第二图像特征由第二特征图进行表示,第一特征图和第二特征图均具有多个特征点。需要说明的是,本申请实施例中的图像特征提取网络是预先训练得到的,本申请实施例中不限定图像特征提取网络的网络结构和训练方式等。例如,图像特征提取网络可以是全连接网络或上述卷积神经网络,卷积神经网络中的卷积可以是2D卷积或3D卷积。另外,本申请实施例对图像特征提取网络的所包括的网络层数和每一层的节点数也不作限定。
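FIG. 5 is a schematic structural diagram of an image feature extraction network according to an embodiment of this application. Referring to FIG. 5, the image feature extraction network is a convolutional neural network that includes four convolutional layers (Conv) interleaved and cascaded with three generalized divisive normalization (GDN) layers. The convolution kernel size of each convolutional layer is 5×5, the number of channels of the output feature map is M, and each convolutional layer downsamples the width and height by a factor of 2. For example, for an input image of size 16W×16H×3, the feature map output by the convolutional neural network has size W×H×M. It should be noted that the structure of the convolutional neural network shown in FIG. 5 does not limit the embodiments of this application; for example, the convolution kernel size, the number of feature map channels, the downsampling factor, the number of downsampling operations, and the number of convolutional layers can all be adjusted.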
可选地,在本申请实施例中,上述第一量化步长是基于该图像的码率通过增益网络得到的,该增益网络用于确定多种码率分别对应的量化步长。示例性地,编码端基于该图像的码率确定第一质量因子,将第一质量因子输入增益网络,以得到第一量化步长。需要说明的是,不同的码率对应不同的质量因子,通过增益网络即可得到不同的量化步长。或者,事先存储码率与量化步长的映射关系,基于该图像的码率从该映射关系中获取对应的量化步长作为第一量化步长。可选地,在另一些实施例中,第一量化步长是基于待编码的图像的码率确定第一质量因子后,从质量因子与量化步长的映射关系中得到第一质量因子对应的第一量化步长。
其中,质量因子也可以替换为量化参数。上述实现过程中所涉及的量化处理的方式可以有多种,例如均匀量化或标量量化。其中,标量量化还可以存在偏置量,即通过偏置量对待量化的数据(如第二图像特征)进行偏置处理后再按照量化步长进行标量量化。可选地,本申请实施例中对图像特征所做的量化处理包括量化及取整操作。示例性地,假设第二图像特征由特征图y进行表示,第二图像特征的数值范围在区间[0,100]内,第一量化步长由q1表示,q1为0.5,第一图像特征由特征图ys进行表示,那么,编码端对特征图y中各个特征点的特征值进行量化,以得到特征图ys,对特征图ys中各个特征点的特征值进行取整,以得到特征图ys’,即得到第一图像特征,第一图像特征的数值范围在区间[0,50]内。其中,以均匀量化为例,对任意的特征值x按照量化步长q进行量化得到的特征值为x'=x*q。
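Optionally, the first quantization step size used to quantize the image feature of each feature point may be the same or different. For example, feature points in the same channel use the same first quantization step size, or feature values at the same spatial position in different channels use the same first quantization step size. Assume that the second image feature to be quantized has size W×H×M. Under any quality factor i, the first quantization step size of the feature point at coordinates (k, j, l) in the second image feature is $q_i(k,j,l)$, which can be learned by the gain network or obtained from a stored mapping, where k∈[1,W], j∈[1,H], and l∈[1,M]. It should be understood that different quantization parameters QP correspond to different quantization step sizes q, and QP and q are in one-to-one correspondence. For example, in some standard schemes the mapping between the quantization parameter and the quantization step size can be expressed as $q = (2^{1/6})^{QP-4}$. Other functions may also be designed to express the mapping between QP and q.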
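The QP-to-step mapping mentioned above can be sketched in a few lines. This is only an illustration, assuming the mapping is read as q = (2^(1/6))^(QP-4), so that the step size doubles every 6 QP values; the function name `qstep_from_qp` is hypothetical and not part of the application.

```python
def qstep_from_qp(qp: float) -> float:
    """Map a quantization parameter QP to a step size q = (2 ** (1/6)) ** (QP - 4)."""
    return 2.0 ** ((qp - 4) / 6.0)


# One step size per QP; QP and q are in one-to-one correspondence.
steps = {qp: qstep_from_qp(qp) for qp in (4, 10, 16, 22)}
```

Under this reading, QP = 4 gives a step size of 1, and each increase of 6 in QP doubles the step size.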
需要说明的是,下文的量化处理的方式与此处的类似,下文的量化处理方式可以参考此处的方式,本申请实施例在后文不再赘述。
步骤402:确定第二图像特征的第一超先验特征。
在本申请实施例中,为了后续通过步骤404得到未量化的图像特征的概率分布参数(即第一概率分布参数),编码端在步骤404之前确定未量化的图像特征的第一超先验特征(如第二图像特征的第一超先验特征)。编码端确定第二图像特征的第一超先验特征的实现方式有多种。接下来介绍其中的两种实现方式。
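A first implementation in which the encoding end determines the first super-prior feature of the second image feature is as follows: the second image feature is input into the super-encoding network to obtain the first super-prior feature. That is, the encoding end inputs the unquantized image feature into the super-encoding network to obtain the first super-prior feature of the unquantized image feature. A second implementation is as follows: the first image feature is dequantized according to the first quantization step size to obtain the third image feature of the image, and the third image feature is input into the super-encoding network to obtain the first super-prior feature. The first super-prior feature can be regarded as the first super-prior feature of the third image feature, and also as the first super-prior feature of the second image feature. The second image feature is the image feature before quantization and the third image feature is the image feature after dequantization, so although the second image feature and the third image feature differ somewhat in value, the image information they represent is basically equivalent.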
可选地,上述超编码网络输出第一超先验特征。或者,上述超编码网络输出第二超先验特征,编码端按照第三量化步长,对第二超先验特征进行量化,以得到第一超先验特征,第一超先验特征即经量化的超先验特征。其中,第三量化步长与第一量化步长相同或不同。也即是,对超先验特征也可以进行量化操作,以压缩超先验特征。可选地,超先验特征也可称为边信息,边信息可以理解为对图像特征进一步提取特征。
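It should be noted that both the first super-prior feature and the second super-prior feature are super-prior features of multiple feature points. For example, in step 402, the image feature of each feature point in the second image feature (or the third image feature) is input into the super-encoding network to obtain the super-prior feature of each feature point in the first super-prior feature. In addition, the super-encoding network in the embodiments of this application is trained in advance, and the embodiments of this application do not limit its network structure or training method. For example, the super-encoding network may be a convolutional neural network or a fully connected network. Optionally, the super-encoding network in this document may also be called a super-prior network.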
步骤403:将第一超先验特征编入码流。
在本申请实施例中,编码端将第一超先验特征编入码流,以便于后续解码端基于第一超先验特征进行解码。
可选地,编码端通过熵编码将第一超先验特征编入码流。示例性地,编码端根据指定的概率分布参数,通过熵编码将第一超先验特征编入码流。其中,指定的概率分布参数为预先通过某概率分布估计网络确定的概率分布参数,本申请实施例不限定该概率分布估计网络的网络结构和训练方法等。
步骤404:基于第一超先验特征,通过概率分布估计网络确定第一概率分布参数。
在本申请实施例中,该概率分布估计网络用于确定未量化的图像特征的概率分布参数,基于此,编码端基于第一超先验特征,通过该概率分布估计网络确定第一概率分布参数,第一概率分布参数即表征未量化的图像特征(如第二图像特征、第三图像特征)的概率分布。需要说明的是,本文中的概率分布参数可以为任意一种用于表征图像特征的概率分布的参数,例如高斯分布的均值和方差(或标准差)、拉普拉斯分布的位置参数和尺度参数、逻辑斯谛分布的均值和尺度参数,又如其他的模型参数等。
可选地,为了与解码端的解码过程相一致,编码端从码流中解析出第一超先验特征,基于解析出的第一超先验特征,通过该概率分布估计网络确定第一概率分布参数。
由前述可知,第一超先验特征是经量化的超先验特征,也可以是未量化的超先验特征。基于此,在第一超先验特征是经量化的超先验特征的实现方式中,编码端按照上述第三量化步长,对第一超先验特征进行反量化,以得到第二超先验特征,将第二超先验特征输入该概率分布估计网络,以得到第一概率分布参数。在第一超先验特征是未量化的超先验特征的实现方式中,编码端将第一超先验特征输入该概率分布估计网络,以得到第一概率分布参数。其中,该概率分布估计网络也可认为是一种超解码网络,该超解码网络用于基于超先验特征确定概率分布参数。
除了上述确定第一概率分布参数的实现方式之外,编码端也可以基于上下文特征来确定第一概率分布参数,以提高第一概率分布参数的准确性。接下来将对此进行介绍。
可选地,在一种实现方式中,编码端将图像的第三图像特征输入上下文网络,以得到第三图像特征的上下文特征。其中,第三图像特征是按照第一量化步长对第一图像特征进行反量化得到的图像特征。编码端基于第一超先验特征,确定第一先验特征,将第一先验特征和上下文特征输入该概率分布估计网络,以得到第一概率分布参数。也即是,编码端对反量化后的图像特征提取上下文特征,进而结合上下文特征和第一先验特征来确定第一概率分布参数。
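In the implementation in which the first super-prior feature is a quantized super-prior feature, the encoding end parses the first super-prior feature from the code stream, dequantizes the parsed first super-prior feature according to the third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the super-decoding network to obtain the first prior feature. In the implementation in which the first super-prior feature is an unquantized super-prior feature, the encoding end parses the first super-prior feature from the code stream and inputs the parsed first super-prior feature into the super-decoding network to obtain the first prior feature.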
可选地,基于前述介绍由于第二图像特征和第三图像特征所表征的图像信息是基本等价的,因此,编码端也可以将第二图像特征输入上下文网络,以得到第二图像特征的上下文特征,第二图像特征的上下文特征即第三图像特征的上下文特征。
需要说明的是,第三图像特征的上下文特征包括多个特征点中各个特征点的上下文特征,第一概率分布参数为该多个特征点的概率分布参数,也即是,编码端可以并行地确定多个特征点中各个特征点的上下文特征以及各个特征点的概率分布参数。
在另一种实现方式中,编码端将第一图像特征输入上下文网络,以得到第一图像特征的上下文特征,基于第一超先验特征,确定第一先验特征,按照第二量化步长,对第一先验特征进行量化,以得到第二先验特征,将第二先验特征和上下文特征输入该概率分布估计网络,以得到第一概率分布参数。也即是,编码端也可以对经量化的图像特征提取上下文特征,再通过对第一先验特征增加一个量化操作来得到第二先验特征,从而结合第二先验特征和上下文特征来确定第一概率分布参数。其中,第二量化步长与第一量化步长相同或不同。
需要说明的是,在这种实现方式中,编码端确定第一先验特征的实现方式与上一实现方式中的相关过程一致,这里不再赘述。另外,在基于上下文特征确定概率分布的实现方式中,超解码网络用于基于超先验特征确定先验特征,概率分布估计网络用于基于先验特征和上下文特征确定概率分布参数。本申请实施例中的超解码网络和概率分布估计网络均是预先训练得到的,本申请实施例不限定超解码网络和概率分布估计网络的网络结构和训练方式等。例如,超解码网络和概率分布估计网络均可以是一种卷积神经网络、循环神经网络或全连接网络等。
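Optionally, the probability distribution estimation network in the embodiments of this application is modeled with a Gaussian model (such as a Gaussian single model (GSM) or a Gaussian mixture model (GMM)). That is, assuming the feature value of each feature point in the unquantized image feature (such as the second image feature or the third image feature) follows a single Gaussian model or a Gaussian mixture model, the first probability distribution parameter obtained by the probability distribution estimation network includes the mean μ and the standard deviation σ. Optionally, the probability distribution estimation network may also use a Laplace distribution model, in which case the first probability distribution parameter includes the location parameter λ and the scale parameter b. The probability distribution estimation network may also use a logistic distribution model, in which case the first probability distribution parameter includes the mean μ and the scale parameter s. Taking the Gaussian model as an example, the probability distribution function corresponding to the probability distribution of any feature point in the first probability distribution parameter is shown in formula (1), where x is the second feature value of that feature point.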
步骤405:按照第一量化步长,对第一概率分布参数进行量化,以得到第二概率分布参数。
在得到未量化的图像特征的第一概率分布参数之后,编码端按照第一量化步长,对第一概率分布参数进行量化,以得到第二概率分布参数,第二概率分布参数表征经量化的图像特征(即第一图像特征)的概率分布。其中,按照第一量化步长,对第一概率分布参数中各个特征点的概率分布参数进行量化,以得到第二概率分布参数中相应特征点的概率分布参数。
示例性地,以高斯模型为例,坐标为(k,j,l)的特征点的第一量化步长为qi(k,j,l),第一概率分布参数中该特征点的概率分布参数为μ(k,j,l)和σ(k,j,l),按照量化步长qi(k,j,l)对μ(k,j,l)和σ(k,j,l)进行量化,得到第二概率分布参数中该特征点的概率分布参数为μs(k,j,l)和σs(k,j,l)。其中,如果是均匀量化,则μs=μ/q,σs=σ/q,第二概率分布参数中任一特征点的概率分布参数所对应的概率分布函数如公式(2)所示,其中,x为该特征点的第一特征值。
在这里对步骤405的原理进行解释。假设量化操作为均匀量化,以高斯模型为例,假定变量x的概率分布函数如上述公式(1)所示,则变量x在区间[a2*q,a1*q]的概率P1如下述公式(3)所示。其中,q为量化步长。
按照量化步长q对变量进行x进行量化,得到量化后的变量x'=x/q,则变量x在区间[a2*q,a1*q]的概率P1与变量x'在区间[a2,a1]的概率P2相等。基于此,假设变量x'的概率分布函数为g(x),则对比上述公式(3)得到g(x)正如上述公式(2)一样。由此可见,在已知量化前的图像特征的概率分布参数的情况下,按照第一量化步长对该概率分布参数进行量化,即可得到经量化的图像特征的概率分布参数。
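The formulas (1) to (3) referred to above do not survive in this text. The block below is a reconstruction under stated assumptions, not the application's own typesetting: it assumes formula (1) is the standard Gaussian density with mean μ and standard deviation σ, and it follows the uniform-quantization convention x' = x/q used in this passage. The substitution x = q x' shows why the scaled parameters μ/q and σ/q describe the quantized variable.

```latex
\begin{align}
f(x) &= \frac{1}{\sigma\sqrt{2\pi}}
        \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \tag{1}\\
P_1 &= \int_{a_2 q}^{a_1 q} f(x)\,dx \tag{3}\\
    &= \int_{a_2}^{a_1} q\,f(q x')\,dx' \qquad (x = q x')\\
    &= \int_{a_2}^{a_1} \frac{1}{(\sigma/q)\sqrt{2\pi}}
       \exp\!\left(-\frac{(x'-\mu/q)^2}{2(\sigma/q)^2}\right) dx' = P_2\\
g(x') &= \frac{1}{\sigma_s\sqrt{2\pi}}
        \exp\!\left(-\frac{(x'-\mu_s)^2}{2\sigma_s^2}\right),
        \qquad \mu_s = \mu/q,\ \sigma_s = \sigma/q \tag{2}
\end{align}
```

The integrand in the fourth line is exactly a Gaussian density with mean μ/q and standard deviation σ/q, which is the form stated for g.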
对于拉普拉斯分布模型和逻辑斯谛模型也是类似地,通过缩放(即量化)相应模型的参数即可由第一概率分布参数得到第二概率分布参数。
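The scaling argument can be checked numerically. The sketch below is illustrative only: it assumes the convention x' = x/q, picks arbitrary example values, and compares the probability of x on [a2*q, a1*q] under the original parameters with the probability of x' on [a2, a1] under the scaled parameters, for both a Gaussian and a Laplace model. All function names are hypothetical.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def laplace_cdf(x, loc, b):
    """CDF of a Laplace distribution with location loc and scale b."""
    if x < loc:
        return 0.5 * math.exp((x - loc) / b)
    return 1.0 - 0.5 * math.exp(-(x - loc) / b)


def interval_prob(cdf, lo, hi, p1, p2):
    """Probability mass that a distribution with parameters (p1, p2) puts on [lo, hi]."""
    return cdf(hi, p1, p2) - cdf(lo, p1, p2)


q = 0.5                 # quantization step size (example value)
a2, a1 = 3.0, 4.0       # interval bounds in the quantized domain

# Gaussian: parameters (mu, sigma) before quantization, (mu/q, sigma/q) after.
mu, sigma = 1.7, 0.9
p1_gauss = interval_prob(gaussian_cdf, a2 * q, a1 * q, mu, sigma)
p2_gauss = interval_prob(gaussian_cdf, a2, a1, mu / q, sigma / q)

# Laplace: location and scale are scaled in the same way.
loc, b = 1.7, 0.6
p1_lap = interval_prob(laplace_cdf, a2 * q, a1 * q, loc, b)
p2_lap = interval_prob(laplace_cdf, a2, a1, loc / q, b / q)
```

In both cases the two probabilities agree up to floating-point rounding, which is the property that lets the second probability distribution parameter be derived from the first by scaling alone.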
步骤406:基于第二概率分布参数,将第一图像特征编入码流。
在本申请实施例中,在得到第二概率分布参数之后,编码端基于第二概率分布参数,将第一图像特征编入码流。其中,编码端基于第二概率分布参数中各个特征点的概率分布参数,将第一图像特征中各个特征点的图像特征编入码流。示例性地,编码端通过熵编码将第一图像特征编入码流。
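The text does not say how the entropy coder turns the second probability distribution parameter into symbol probabilities. A common approach in learned image compression, used here purely as an assumption, is to give each integer symbol v the Gaussian mass on [v - 0.5, v + 0.5] and to estimate the cost as -log2 of that mass. The names below are hypothetical.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def symbol_probability(v, mu_s, sigma_s):
    """Assumed model: mass of the Gaussian (mu_s, sigma_s) on [v - 0.5, v + 0.5]."""
    return gaussian_cdf(v + 0.5, mu_s, sigma_s) - gaussian_cdf(v - 0.5, mu_s, sigma_s)


def ideal_code_length_bits(symbols, mus, sigmas, floor=1e-9):
    """Sum of -log2 p over all feature points; an arithmetic coder approaches this cost."""
    total = 0.0
    for v, mu_s, sigma_s in zip(symbols, mus, sigmas):
        p = max(symbol_probability(v, mu_s, sigma_s), floor)  # avoid log2(0)
        total += -math.log2(p)
    return total
```

A symbol close to its predicted mean costs few bits, while a symbol far from it costs many, which is why a more accurate second probability distribution parameter lowers the bit rate.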
以上介绍了本申请实施例提供的一种编码方法,接下来结合图6至图10再次对以上内容再次进行解释说明。
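FIG. 6 is a flowchart of an encoding and decoding method according to an embodiment of this application. Referring to FIG. 6, in the encoding process, the image to be encoded is input into the encoding network (encoder, Enc) to obtain the feature map y to be quantized (that is, the second image feature). The encoding network is the image feature extraction network. Each feature value in the feature map y is quantized (Q) according to the quantization step size q1 (that is, the first quantization step size) to obtain the feature map ys. Each feature value in the feature map ys is rounded (R) to obtain the rounded feature map (that is, the first image feature). In addition, the feature map y is input into the super-encoding network (hyper encoder, HyEnc) to obtain the super-prior feature z. Optionally, the super-prior feature z is quantized according to the quantization step size q2 to obtain the quantized super-prior feature (that is, the first super-prior feature). The quantization step size q2 may be the same as or different from the quantization step size q1. The quantized super-prior feature is encoded into the code stream (bitstream) through entropy encoding (AE2). Then the quantized super-prior feature is parsed from the code stream through entropy decoding (AD2). Optionally, the parsed super-prior feature is dequantized (IQ) according to the quantization step size q2 to obtain the super-prior feature z. The super-prior feature z is input into the probability distribution estimation network to obtain the probability distribution parameters μ and σ of each feature value in the feature map y (that is, the first probability distribution parameter). The probability distribution parameters μ and σ are quantized according to the quantization step size q1 to obtain the probability distribution parameters μs and σs of each feature value in the rounded feature map (that is, the second probability distribution parameter). Based on the probability distribution parameters μs and σs of each feature value in the rounded feature map, the rounded feature map is encoded into the code stream through entropy encoding (AE1).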
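The order of operations on the encoding side of FIG. 6 can be sketched as follows. This is a toy illustration, not the networks of the application: `toy_hyper_encoder` and `toy_prob_net` are hypothetical stand-ins for HyEnc and the probability distribution estimation network, quantization is assumed to be x' = x/q1 followed by rounding, and the quantization of the super-prior feature and both entropy coding stages are omitted.

```python
def encode_flow(y, q1, hyper_encoder, prob_net):
    """Encoding-side order of operations; entropy coding is omitted."""
    y_hat = [round(v / q1) for v in y]    # Q then R: first image feature
    z = hyper_encoder(y)                   # HyEnc runs on the unquantized feature map
    mu, sigma = prob_net(z, len(y))        # first probability distribution parameters
    mu_s = [m / q1 for m in mu]            # quantize the parameters with the same step
    sigma_s = [s / q1 for s in sigma]      # -> second probability distribution parameters
    return y_hat, z, mu_s, sigma_s


def toy_hyper_encoder(y):
    """Hypothetical stand-in: summarizes the feature map by its mean."""
    return sum(y) / len(y)


def toy_prob_net(z, n):
    """Hypothetical stand-in: predicts mean z and unit standard deviation everywhere."""
    return [z] * n, [1.0] * n


y_hat, z, mu_s, sigma_s = encode_flow([1.0, 2.0, 3.0], 0.5, toy_hyper_encoder, toy_prob_net)
```

The point of the sketch is that the estimation network only ever sees quantities derived from the unquantized feature map; the step size q1 enters only in the final scaling of μ and σ.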
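FIG. 7 is a flowchart of another encoding and decoding method according to an embodiment of this application. Referring to FIG. 7, in the encoding process, the image to be encoded is input into the encoding network (Enc) to obtain the feature map y to be quantized. The encoding network is the image feature extraction network. Each feature value in the feature map y is quantized (Q) according to the quantization step size q1 to obtain the feature map ys. Each feature value in the feature map ys is rounded (R) to obtain the rounded feature map. In addition, the feature map y is input into the super-encoding network (HyEnc) to obtain the super-prior feature z. Optionally, the super-prior feature z is quantized according to the quantization step size q2 to obtain the quantized super-prior feature (that is, the first super-prior feature). The quantized super-prior feature is encoded into the code stream (bitstream) through entropy encoding (AE2). Then the quantized super-prior feature is parsed from the code stream through entropy decoding (AD2). Optionally, the parsed super-prior feature is dequantized (IQ) according to the quantization step size q2 to obtain the super-prior feature z. The super-prior feature z is input into the super-decoding network (hyper decoder, HyDec) (that is, the probability distribution estimation network) to obtain the prior feature (that is, the first prior feature). In addition, the rounded feature map is dequantized according to the first quantization step size q1 to obtain the dequantized feature map (that is, the third image feature). The dequantized feature map is input into the context (Ctx) network to obtain the context feature of each feature point in the dequantized feature map. The context feature of each feature point and the prior feature are input into the probability distribution estimation network to obtain the probability distribution parameters μ and σ of each feature value in the feature map y (that is, the first probability distribution parameter). The probability distribution parameters μ and σ are quantized according to the quantization step size q1 to obtain the probability distribution parameters μs and σs of each feature value in the rounded feature map (that is, the second probability distribution parameter). Based on the probability distribution parameters μs and σs of each feature value in the rounded feature map, the rounded feature map is encoded into the code stream through entropy encoding (AE1).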
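FIG. 8 is a flowchart of yet another encoding and decoding method according to an embodiment of this application. FIG. 8 differs from FIG. 6 in that, in the encoding process, the rounded feature map is dequantized according to the quantization step size q1 to obtain the dequantized feature map, and the dequantized feature map is input into the super-encoding network to obtain the super-prior feature z.
FIG. 9 is a flowchart of yet another encoding and decoding method according to an embodiment of this application. FIG. 9 differs from FIG. 7 in that, in the encoding process, the rounded feature map is dequantized according to the quantization step size q1 to obtain the dequantized feature map, and the dequantized feature map is input into the super-encoding network to obtain the super-prior feature z.
FIG. 10 is a flowchart of yet another encoding and decoding method according to an embodiment of this application. FIG. 10 differs from FIG. 7 and FIG. 9 in that, in the encoding process, the feature map is input into the super-encoding network to obtain the super-prior feature. In addition, after the prior feature is obtained through the super-decoding network, a quantization operation is added: the prior feature is quantized according to the quantization step size q3 (that is, the second quantization step size) to obtain the quantized prior feature. The quantization step size q3 may be the same as or different from the quantization step size q1. The context feature of each feature point and the quantized prior feature are input into the probability distribution estimation network to obtain the first probability distribution parameters μ and σ.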
综上所述,在本申请实施例中,为了得到经量化的图像特征的概率分布参数,在编码过程中基于未量化的图像特征的超先验特征,通过概率分布估计网络确定第一概率分布参数,第一概率分布参数即表征了未量化的图像特征的概率分布。然后按照第一量化步长(即量化图像特征所用的量化步长)对第一概率分布参数进行量化,从而得到用于表征经量化的图像特征的概率分布的第二概率分布参数。可见在这种方案中训练用于确定未量化的图像特征的概率分布参数的概率分布估计网络即可。即使在多码率场景下,由于未量化的图像特征的数值范围是较稳定的,不受量化步长的影响,即,概率分布估计网络的输入数值范围不会随着码率的不同而变化,因此,概率分布估计网络的训练难度较小,网络训练稳定,能够训练出性能良好的概率分布估计网络,从而有利于提升编解码性能。
图11是本申请实施例提供的另一种编码方法的流程图。该方法应用于编码端。需要说明的是,假设将上述图4实施例中的概率分布估计网络称为第一概率分布估计网络,那么,图11实施例与上述图4实施例的区别在于,在图11所示的编码方法中,通过第二概率分布估计网络直接得到第二概率分布参数,即第二概率分布估计网络直接输出经量化的图像特征的概率分布参数。其中,第二概率分布估计网络的网络参数是基于第一概率分布估计网络的网络参数和第一量化步长得到的,这样,也是训练第一概率分布估计网络即可。请参考图11,该方法包括如下步骤。
步骤1101:确定待编码的图像的第一图像特征和第二图像特征,第一图像特征为按照第一量化步长对第二图像特征进行量化得到的图像特征。
在本申请实施例中,编码端将待编码的图像输入图像特征提取网络,以得到该图像的第二图像特征。编码端按照第一量化步长,对第二图像特征进行量化,以得到第一图像特征。具体实现过程与上述图4实施例中步骤401的具体实现过程相同,请参照上述步骤401中的相关介绍,这里不再赘述。
步骤1102:确定第二图像特征的第一超先验特征。
可选地,在一种实现方式中,编码端将第二图像特征输入超编码网络,以得到第一超先验特征。在另一种实现方式中,编码端按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征,将第三图像特征输入超编码网络,以得到第一超先验特征。具体实现过程与上述图4实施例中步骤402的具体实现过程相同,请参照上述步骤402中的相关介绍,这里不再赘述。
步骤1103:将第一超先验特征编入码流。
在本申请实施例中,编码端将第一超先验特征编入码流,以便于后续解码端基于第一超先验特征进行解码。可选地,编码端通过熵编码将第一超先验特征编入码流。具体实现过程与上述图4实施例中步骤403的具体实现过程相同,请参照上述步骤403中的相关介绍,这里不再赘述。
步骤1104:基于第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数,第二概率分布估计网络的网络参数是基于第一概率分布估计网络的网络参数和第一量化步长得到的,第一概率分布估计网络用于确定未量化的图像特征的概率分布。
可选地,为了与解码端的解码过程相一致,编码端从码流中解析出第一超先验特征,基于解析出的第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数。
可选地,第一概率分布估计网络的最后一层为卷积层,卷积层的网络参数包括权重和偏置。基于此,第二概率分布估计网络的最后一层卷积层的权重和偏置是基于第一概率分布估计网络的最后一层卷积层的权重和偏置,以及第一量化步长得到的。可选地,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数乘以第一量化步长后得到的。或者,在一些实施例中,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数按照二进制向左位移或者向右位移的方式进行调整,使调整后的网络参数等于调整前的网络参数乘以第一量化步长。
示例性地,第一概率分布估计网络的最后一层为卷积层,将该卷积层的权重w和偏置b均乘以第一量化步长q1,以得到第二概率分布估计网络的最后一层的权重w*q1和偏置b*q1。需要说明的是,第二概率分布估计网络中除最后一层之外的网络层与第一概率分布估计网络的网络层相同,换句话说,第二概率分布估计网络与第一概率分布估计网络的区别在于最后一层的网络参数不同。这样,通过未量化的图像特征来训练得到第一概率分布估计网络即可,对第一概率分布估计网络训练完成后,将第一概率分布估计网络的最后一层的网络参数乘以第一量化步长即可得到第二概率分布估计网络。
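The effect of scaling the last layer can be illustrated with a tiny hand-written 1x1 convolution at a single spatial position. This is a sketch under assumptions: the last layer is taken to output the distribution parameters directly, with no nonlinearity after it, and `factor` stands for the multiplier applied to the weights and bias. The text multiplies by q1; the sketch leaves the multiplier as a parameter because the value that matches a given quantizer depends on whether quantization is written as x*q or x/q. All names are hypothetical.

```python
def conv1x1(features, weight, bias):
    """1x1 convolution at one position: out[o] = sum_i weight[o][i] * features[i] + bias[o]."""
    return [sum(w * f for w, f in zip(row, features)) + b for row, b in zip(weight, bias)]


def scale_last_layer(weight, bias, factor):
    """Build the last layer of the second network by scaling both weights and bias."""
    scaled_weight = [[w * factor for w in row] for row in weight]
    scaled_bias = [b * factor for b in bias]
    return scaled_weight, scaled_bias


hidden = [0.2, -1.0, 0.5]                       # output of the shared earlier layers
weight = [[1.0, 0.5, -2.0], [0.3, 0.1, 0.4]]    # last layer of the first network
bias = [0.1, 0.2]
factor = 2.0                                    # example multiplier

first_out = conv1x1(hidden, weight, bias)       # first-network output (e.g. mu, sigma)
w2, b2 = scale_last_layer(weight, bias, factor)
second_out = conv1x1(hidden, w2, b2)            # equals factor * first_out
```

Because the layer is affine, scaling both the weights and the bias scales the output by the same factor, so the second network produces the scaled parameters in one pass with no separate quantization step on μ and σ.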
与图4实施例的步骤404中的第一超先验特征类似地,在本申请实施例中,第一超先验特征是经量化的超先验特征,也可以是未量化的超先验特征。基于此,在第一超先验特征是经量化的超先验特征的实现方式中,编码端按照上述第三量化步长,对第一超先验特征进行反量化,以得到第二超先验特征,将第二超先验特征输入第二概率分布估计网络,以得到第二概率分布参数。在第一超先验特征是未量化的超先验特征的情况下,编码端将第一超先验特征输入第二概率分布估计网络,以得到第二概率分布参数。
除了上述确定第二概率分布参数的实现方式之外,编码端也可以基于上下文特征来确定第二概率分布参数,以提高第二概率分布参数的准确性。接下来将对此进行介绍。
在一种实现方式中,编码端将图像的第三图像特征输入上下文网络,以得到第三图像特征的上下文特征。其中,第三图像特征是按照第一量化步长对第一图像特征进行反量化得到的图像特征。编码端基于第一超先验特征,确定第一先验特征,将第一先验特征和上下文特征输入第二概率分布估计网络,以得到第二概率分布参数。也即是,编码端对反量化后的图像特征提取上下文特征,进而结合上下文特征和第一先验特征来确定第二概率分布参数。
在另一种实现方式中,编码端将第一图像特征输入上下文网络,以得到第一图像特征的上下文特征。编码端基于第一超先验特征,确定第一先验特征,按照第二量化步长,对第一先验特征进行量化,以得到第二先验特征。编码端将第二先验特征和上下文特征输入第二概率分布估计网络,以得到第二概率分布参数。也即是,编码端对经量化的图像特征提取上下文特征,再通过对第一先验特征增加一个量化操作来得到第二先验特征,从而结合第二先验特征和上下文特征来确定第二概率分布参数。其中,第二量化步长与第一量化步长相同或不同。
需要说明的是,步骤1104的具体实现过程与图4实施例中步骤404的具体实现过程类似,请参照上述步骤404中的相关介绍,这里不再赘述。
步骤1105:基于第二概率分布参数,将第一图像特征编入码流。
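In the embodiments of this application, the second probability distribution parameter consists of the probability distribution parameters of multiple feature points. Based on the probability distribution parameter of each feature point in the second probability distribution parameter, the encoding end encodes the image feature of each feature point in the first image feature into the code stream. For example, the encoding end encodes the first image feature into the code stream through entropy encoding.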
需要说明的是,假设将上述图6至图10所示的编解码流程中的概率分布估计网络称为第一概率分布估计网络,那么将图6至图10中的第一概率分布估计网络替换为第二概率分布估计网络,并去掉对概率分布参数的量化操作后,即得到与图11实施例一致的编解码方法的流程图。
综上所述,在本申请实施例的编码过程中也是确定未量化的图像特征的超先验特征,只不过后续通过第二概率分布估计网络直接得到第二概率分布参数,第二概率分布估计网络是基于第一量化步长对第一概率分布估计网络中的网络参数进行处理后得到的。可见在这种方案中训练第一概率分布估计网络(用于确定未量化的图像特征的概率分布参数)即可。即使在多码率场景下,由于未量化的图像特征的数值范围是较稳定的,不受量化步长的影响,即,第一概率分布估计网络的输入数值范围不会随着码率的不同而变化,因此,第一概率分布估计网络的训练难度较小,网络训练稳定,能够训练出性能良好的第一概率分布估计网络,从而有利于提升编解码性能。
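Next, the decoding method provided in the embodiments of this application is described. It should be noted that the decoding method shown in FIG. 12 below matches the encoding method shown in FIG. 4 above, and the decoding method shown in FIG. 13 below matches the encoding method shown in FIG. 11 above.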
图12是本申请实施例提供的一种解码方法的流程图。该方法应用于解码端。请参考图12,该方法包括如下步骤。
步骤1201:从码流中解析出待解码的图像的第一超先验特征。
在本申请实施例中,解码端先从码流中解析出待解码的图像的第一超先验特征。可选地,解码端通过熵解码从码流中解析出第一超先验特征。示例性地,解码端根据指定的概率分布参数,通过熵解码从码流中解析出第一超先验特征。其中,指定的概率分布参数为预先通过某概率分布估计网络确定的概率分布参数,本申请实施例不限定该概率分布估计网络的网络结构和训练方法等。需要说明的是,第一超先验特征为多个特征点的超先验特征。解码端解析出的第一超先验特征,与编码端所确定的第一超先验特征是一致的,也即是,解码端得到的第一超先验特征正是图4实施例中所讲的第二图像特征的第一超先验特征,或者说是第三图像特征的第一超先验特征。其中,第二图像特征为未量化的图像特征,第三图像特征为反量化后的图像特征。
步骤1202:基于第一超先验特征,通过概率分布估计网络确定第一概率分布参数,第一概率分布参数表征图像的未量化的图像特征的概率分布。
在本申请实施例中,第一超先验特征为多个特征点的超先验特征,第一概率分布参数为多个特征点的概率分布参数,解码端基于第一超先验特征中各个特征点的超先验特征,通过概率分布估计网络确定第一概率分布参数中各个特征点的概率分布参数。
需要说明的是,在未利用上下文网络进行编解码的实现方式中,解码端可以并行解码该多个特征点。而在利用上下文网络进行编解码的实现方式中,解码端不能够同时解码该多个特征点,例如,解码端依次解码该多个特征点,或者,解码端依次解码一个通道的各个特征点,或者,解码端依次解码多组特征点,每组特征点的数量可以不同,或者,解码端按照其他顺序解码该多个特征点。
另外,由前述可知,第一超先验特征是经量化的超先验特征,也可以是未量化的超先验特征。基于此,在未利用上下文网络进行编解码,且第一超先验特征是经量化的超先验特征的实现方式中,解码端按照上述第三量化步长,对第一超先验特征进行反量化,以得到第二超先验特征,将第二超先验特征输入概率分布估计网络,以得到第一概率分布参数。在未利用上下文网络进行编解码,且第一超先验特征是未量化的超先验特征的实现方式中,解码端将第一超先验特征输入概率分布估计网络,以得到第一概率分布参数。
而在利用上下文网络进行编解码的一种实现方式中,假设第一特征点为该多个特征点中的任意一个,解码端对于第一特征点执行如下操作来确定第一特征点的概率分布参数:基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征;基于第一特征点的超先验特征,确定第一特征点的第一先验特征;基于第一特征点的第一先验特征和第一特征点的上下文特征,通过概率分布估计网络确定第一特征点的概率分布参数,即确定第一概率分布参数中第一特征点的概率分布参数。
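In the implementation in which the first super-prior feature is a quantized super-prior feature, the decoding end dequantizes the parsed first super-prior feature according to the third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the super-decoding network to obtain the first prior feature of each of the multiple feature points. For the first feature point, the decoding end dequantizes the super-prior feature of the first feature point in the first super-prior feature to obtain the super-prior feature of the first feature point in the second super-prior feature, and inputs that super-prior feature into the super-decoding network to obtain the first prior feature of the first feature point. In the implementation in which the first super-prior feature is an unquantized super-prior feature, the decoding end inputs the parsed first super-prior feature into the super-decoding network to obtain the first prior feature of each of the multiple feature points. For the first feature point, the decoding end inputs the super-prior feature of the first feature point in the first super-prior feature into the super-decoding network to obtain the first prior feature of the first feature point.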
可选地,解码端基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征的一种实现过程为:解码端从已解码的特征点中,确定第一特征点的周边特征点,按照第一量化步长,对第一图像特征中周边特征点的图像特征进行反量化,以得到第一特征点的周边特征。然后,解码端将第一特征点的周边特征输入上下文网络,以得到第一特征点的上下文特征。相应地,解码端基于第一特征点的第一先验特征和第一特征点的上下文特征,通过概率分布估计网络确定第一概率分布参数中第一特征点的概率分布参数的实现过程为:解码端将第一特征点的第一先验特征和第一特征点的上下文特征输入概率分布估计网络,以得到第一概率分布参数中第一特征点的概率分布参数。也即是,解码端对反量化后的图像特征提取上下文特征,进而结合上下文特征和第一先验特征来确定第一概率分布参数。其中,第一特征点的周边特征点包括第一特征点邻域的一个或多个特征点。
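How the peripheral feature points are chosen is not fixed by the text. The sketch below assumes a raster-scan decoding order and a 3x3 neighborhood, so the already decoded neighbors of a point are its left, upper-left, upper, and upper-right points; it also assumes dequantization is x = x' * q1. Function names and the dictionary layout are hypothetical.

```python
def causal_neighbors(row, col, width):
    """Positions decoded before (row, col) in raster-scan order within a 3x3 window."""
    candidates = [(row, col - 1), (row - 1, col - 1), (row - 1, col), (row - 1, col + 1)]
    return [(r, c) for r, c in candidates if r >= 0 and 0 <= c < width]


def peripheral_features(decoded, row, col, width, q1):
    """Dequantize the decoded neighbors (x = x' * q1) to form the context-network input."""
    return [decoded[(r, c)] * q1 for r, c in causal_neighbors(row, col, width)]


# Quantized values decoded so far on a 2-column feature map.
decoded = {(0, 0): 4, (0, 1): 6, (1, 0): 2}
context_input = peripheral_features(decoded, 1, 1, 2, 0.5)
```

The first point in scan order has no decoded neighbors, which is one reason the points cannot all be decoded in parallel when a context network is used.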
可选地,解码端基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征的另一种实现过程为:解码端从已解码的特征点中,确定第一特征点的周边特征点,将第一图像特征中该周边特征点的图像特征输入上下文网络,以得到第一特征点的上下文特征。相应地,解码端基于第一特征点的第一先验特征和第一特征点的上下文特征,通过概率分布估计网络确定第一概率分布参数中第一特征点的概率分布参数的实现过程为:解码端按照第二量化步长,对第一特征点的第一先验特征进行量化,以得到第一特征点的第二先验特征,将第一特征点的第二先验特征和第一特征点的上下文特征输入概率分布估计网络,以得到第一概率分布参数中第一特征点的概率分布参数。也即是,解码端对经量化的图像特征提 取上下文特征,再通过对第一先验特征增加一个量化操作来得到第二先验特征,从而结合第二先验特征和上下文特征来确定第一概率分布参数。其中,第二量化步长与第一量化步长相同或不同。
需要说明的是,在利用上下文网络进行编解码的实现方式中,解码端确定第一特征点的第一先验特征的实现方式与前述实施例中相关内容类似,这里不再赘述。另外,第一图像特征中已解码的特征点的图像特征是按照步骤1202至下述步骤1204从码流中解码得到的。也即是,在利用上下文网络进行编解码的实现方式中,解码端按照步骤1202至步骤1204依次从码流中解析出第一图像特征中各个特征点的图像特征,解码端每次可以解码至少一个特征点。另外,上述第一图像特征为按照第一量化步长对图像的第二图像特征进行量化得到的图像特征,该量化操作是在编码过程中执行的,第二图像特征是编码过程中得到的图像特征。
步骤1203:按照第一量化步长,对第一概率分布参数进行量化,以得到第二概率分布参数。
其中,第二概率分布参数为多个特征点的概率分布,解码端在解码出第一概率分布参数中第一特征点的概率分布参数之后,按照第一量化步长,对第一概率分布参数中第一特征点的概率分布参数进行量化,以得到第二概率分布参数中第一特征点的概率分布参数。需要说明的是,在未利用上下文网络进行编解码的实现方式中,解码端可以并行量化第一概率分布参数中多个特征点的概率分布参数。在利用上下文网络进行编解码的实现方式中,解码端每得到第一概率分布参数中至少一个特征点的概率分布参数,对第一概率分布参数中该至少一个特征点的概率分布参数进行量化。
步骤1204:基于第二概率分布参数,从码流中解析出图像的第一图像特征。
其中,解码端在得到第二概率分布参数中第一特征点的概率分布参数之后,基于第二概率分布参数中第一特征点的概率分布参数,从码流中解析出第一图像特征中第一特征点的图像特征。可选地,解码端通过熵解码从码流中解析第一图像特征中各个特征点的图像特征。需要说明的是,在未利用上下文网络进行编解码的实现方式中,解码端可以并行解析多个特征点,得到第一图像特征。在利用上下文网络进行编解码的实现方式中,解码端每得到第二概率分布参数中至少一个特征点的概率分布参数,从码流中解析出第一图像特征中该至少一个特征点的图像特征,直至解析出第一图像特征中所有特征点的图像特征时,得到图像的第一图像特征。
步骤1205:按照第一量化步长,对第一图像特征进行反量化,以重构图像。
在本申请实施例中,解码端从码流中解析出第一图像特征之后,按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征,基于第三图像特征,重构该图像。可选地,解码端将第三图像特征输入解码网络,以重构该图像。其中,解码网络内所执行的解码过程是上述图像特征提取网络所执行的特征提取的逆过程。需要说明的是,第三图像特征与编码过程中的第三图像特征是一致的,均是反量化后的图像特征。
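The relationship between the second and third image features can be made concrete with a round-trip sketch. It assumes uniform quantization x' = round(x / q) on the encoding side and dequantization x' * q on the decoding side; under that assumption the dequantized value differs from the original by at most half a step. The function names are hypothetical.

```python
def quantize(x, q):
    """Uniform quantization followed by rounding (encoding side)."""
    return round(x / q)


def dequantize(v, q):
    """Dequantization with the same step size (decoding side)."""
    return v * q


q1 = 0.5
second_feature = [0.12, 3.3, 47.9, 99.74]                    # unquantized values
first_feature = [quantize(x, q1) for x in second_feature]    # what the code stream carries
third_feature = [dequantize(v, q1) for v in first_feature]   # what the decoder reconstructs
errors = [abs(a - b) for a, b in zip(second_feature, third_feature)]
```

This bounded error is why the third image feature can be treated as carrying basically the same image information as the second image feature.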
可选地,在另一些实施例中,解码端按照第四量化步长,对第一图像特征进行反量化,以得到图像的第四图像特征,基于第四图像特征,重构该图像。其中,第四量化步长可与第一量化步长有所偏差。
接下来结合图6至图10对以上解码过程再次进行解释说明。
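In the decoding process of the encoding and decoding method shown in FIG. 6, the quantized super-prior feature is first parsed from the code stream and dequantized according to the quantization step size q2 to obtain the super-prior feature z. The super-prior feature z is then input into the probability distribution estimation network to obtain the probability distribution parameters μ and σ of each feature value in the feature map y. The probability distribution parameters μ and σ are quantized according to the quantization step size q1 to obtain the probability distribution parameters μs and σs of each feature value in the rounded feature map. Based on the probability distribution parameters μs and σs of each feature value, the rounded feature map is parsed from the code stream. The rounded feature map is dequantized according to the quantization step size q1 to obtain the dequantized feature map. Finally, the dequantized feature map is input into the decoding network (decoder, Dec) to reconstruct the image.
In the decoding process of the encoding and decoding method shown in FIG. 7, the quantized super-prior feature of each of the multiple feature points is first parsed from the code stream and dequantized according to the quantization step size q2 to obtain the super-prior feature z of each feature point. The super-prior feature z of each feature point is then input into the super-decoding network to obtain the prior feature of each feature point. Next, for the first feature point to be decoded, the peripheral feature points of the first feature point are determined from the decoded feature points. The feature values of the peripheral feature points in the rounded feature map are dequantized according to the quantization step size q1 to obtain the feature values of the peripheral feature points in the dequantized feature map, that is, the peripheral feature of the first feature point. These dequantized feature values are input into the context network to obtain the context feature of the first feature point. The context feature and the prior feature of the first feature point are input into the probability distribution estimation network to obtain the probability distribution parameters μ and σ of the first feature point. The probability distribution parameters μ and σ of the first feature point are quantized according to the first quantization step size to obtain the probability distribution parameters μs and σs of the first feature point. Based on μs and σs of the first feature point, the image feature of the first feature point in the rounded feature map is parsed from the code stream. When the image features of all feature points have been parsed from the code stream, the rounded feature map is obtained. Each feature value in the rounded feature map is then dequantized according to the quantization step size q1 to obtain the dequantized feature map. Finally, the dequantized feature map is input into the decoding network to reconstruct the image.
The decoding process of the encoding and decoding method shown in FIG. 8 is the same as that shown in FIG. 6. The decoding process of the method shown in FIG. 9 is the same as that shown in FIG. 7. The decoding process of the method shown in FIG. 10 differs from FIG. 7 and FIG. 9 in that, after the prior feature is obtained through the super-decoding network, a quantization operation is added: the prior feature is quantized according to the quantization step size q3 to obtain the quantized prior feature. The context feature of each feature point and the quantized prior feature are input into the probability distribution estimation network to obtain the first probability distribution parameters μ and σ.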
综上所述,在本申请实施例中,为了得到经量化的图像特征的概率分布参数,在解码过程中通过概率分布估计网络确定第一概率分布参数,第一概率分布参数表征未量化的图像特征的概率分布。然后按照第一量化步长(即量化图像特征所用的量化步长)对第一概率分布参数进行量化,从而得到用于表征经量化的图像特征的概率分布的第二概率分布参数。可见在这种方案中训练用于确定未量化的图像特征的概率分布参数的概率分布估计网络即可。即使在多码率场景下,由于未量化的图像特征的数值范围是较稳定的,不受量化步长的影响,即,概率分布估计网络的输入数值范围不会随着码率的不同而变化,因此,概率分布估计网络的训练难度较小,网络训练稳定,能够训练出性能良好的概率分布估计网络,从而有利于提升编解码性能。
图13是本申请实施例提供的另一种解码方法的流程图。该方法应用于解码端。请参考图13,该方法包括如下步骤。
步骤1301:从码流中解析出待解码的图像的第一超先验特征。
需要说明的是,步骤1301的具体实现过程与上述图12实施例中步骤1201的具体实现过程相同,请参照上述步骤1201中的相关介绍,这里不再赘述。
步骤1302:基于第一超先验特征,通过第二概率分布估计网络确定第二概率分布参数,第二概率分布估计网络的网络参数是基于第一概率分布估计网络的网络参数和第一量化步长得到的,第一概率分布估计网络用于确定未量化的图像特征的概率分布。
可选地,第一概率分布估计网络的最后一层为卷积层,卷积层的网络参数包括权重和偏置。基于此,第二概率分布估计网络的最后一层卷积层的权重和偏置是基于第一概率分布估计网络的最后一层卷积层的权重和偏置,以及第一量化步长得到的。可选地,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数乘以第一量化步长后得到的。或者,在一些实施例中,第二概率分布估计网络是将第一概率分布估计网络中最后一层的网络参数按照二进制向左位移或者向右位移的方式进行调整,使调整后的网络参数等于调整前的网络参数乘以第一量化步长。
在本申请实施例中,第一超先验特征为多个特征点的超先验特征,第二概率分布参数为多个特征点的概率分布参数,解码端基于第一超先验特征中各个特征点的超先验特征,通过第二概率分布估计网络确定第二概率分布参数中各个特征点的概率分布参数。
需要说明的是,在未利用上下文网络进行编解码的实现方式中,解码端可以并行解码该多个特征点。而在利用上下文网络进行编解码的实现方式中,解码端不能够同时解码该多个特征点。
另外,由前述可知,第一超先验特征是经量化的超先验特征,也可以是未量化的超先验特征。基于此,在未利用上下文网络进行编解码,且第一超先验特征是经量化的超先验特征的实现方式中,解码端按照上述第三量化步长,对第一超先验特征进行反量化,以得到第二超先验特征,将第二超先验特征输入第二概率分布估计网络,以得到第二概率分布参数。在未利用上下文网络进行编解码,且第一超先验特征是未量化的超先验特征的实现方式中,解码端将第一超先验特征输入第二概率分布估计网络,以得到第二概率分布参数。
而在利用上下文网络进行编解码的一种实现方式中,假设第一特征点为该多个特征点中的任意一个,解码端对于第一特征点执行如下操作来确定第二概率分布参数中第一特征点的概率分布参数:基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征;基于第一超先验特征中第一特征点的超先验特征,确定第一特征点的第一先验特征;基于第一特征点的第一先验特征和第一特征点的上下文特征,通过第二概率分布估计网络确定第二概率分布参数中第一特征点的概率分布参数。
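In the implementation in which the first super-prior feature is a quantized super-prior feature, the decoding end dequantizes the parsed first super-prior feature according to the third quantization step size to obtain the second super-prior feature, and inputs the second super-prior feature into the super-decoding network to obtain the first prior feature of each of the multiple feature points. In the implementation in which the first super-prior feature is an unquantized super-prior feature, the decoding end inputs the parsed first super-prior feature into the super-decoding network to obtain the first prior feature of each of the multiple feature points.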
可选地,解码端基于第一图像特征中已解码的特征点的图像特征,确定第一特征点的上下文特征的一种实现过程为:解码端从已解码的特征点中,确定第一特征点的周边特征点,按照第一量化步长,对第一图像特征中周边特征点的图像特征进行反量化,以得到第一特征点的周边特征。然后,解码端将第一特征点的周边特征输入上下文网络,以得到第一特征点的上下文特征。相应地,解码端基于第一特征点的第一先验特征和第一特征点的上下文特征,通过第二概率分布估计网络确定第二概率分布参数中第一特征点的概率分布参数的实现过程 为:解码端将第一特征点的第一先验特征和第一特征点的上下文特征输入第二概率分布估计网络,以得到第二概率分布参数中第一特征点的概率分布参数。也即是,解码端对反量化后的图像特征提取上下文特征,进而结合上下文特征和第一先验特征来确定第二概率分布参数。其中,第一特征点的周边特征点包括第一特征点邻域的一个或多个特征点。
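Optionally, another implementation in which the decoding end determines the context feature of the first feature point based on the image features of the decoded feature points in the first image feature is as follows: the decoding end determines the peripheral feature points of the first feature point from the decoded feature points, and inputs the image features of these peripheral feature points in the first image feature into the context network to obtain the context feature of the first feature point. Correspondingly, the decoding end determines the probability distribution parameter of the first feature point in the second probability distribution parameter through the second probability distribution estimation network, based on the first prior feature and the context feature of the first feature point, as follows: the decoding end quantizes the first prior feature of the first feature point according to the second quantization step size to obtain the second prior feature of the first feature point, and inputs the second prior feature and the context feature of the first feature point into the second probability distribution estimation network to obtain the probability distribution parameter of the first feature point in the second probability distribution parameter. That is, the decoding end extracts the context feature from the quantized image feature and adds a quantization operation on the first prior feature to obtain the second prior feature, so that the second prior feature and the context feature are combined to determine the second probability distribution parameter. The second quantization step size may be the same as or different from the first quantization step size.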
需要说明的是,在利用上下文网络进行编解码的实现方式中,解码端确定第一特征点的第一先验特征的实现方式与前述实施例中相关内容类似,这里不再赘述。另外,第一图像特征中已解码的特征点的图像特征是按照步骤1302至下述步骤1304从码流中解码得到的。也即是,在利用上下文网络进行编解码的实现方式中,解码端按照步骤1302至步骤1304依次从码流中解析出第一图像特征中各个特征点的图像特征,解码端每次可以解码至少一个特征点。另外,上述第一图像特征为按照第一量化步长对图像的第二图像特征进行量化得到的图像特征,该量化操作是在编码过程中执行的,第二图像特征是编码过程中得到的图像特征。
步骤1303:基于第二概率分布参数,从码流中解析出图像的第一图像特征。
其中,解码端在得到第二概率分布参数中第一特征点的概率分布参数之后,基于第二概率分布参数中第一特征点的概率分布参数,从码流中解析出第一图像特征中第一特征点的图像特征。可选地,解码端通过熵解码从码流中解析第一图像特征中各个特征点的图像特征。需要说明的是,在未利用上下文网络进行编解码的实现方式中,解码端可以并行解析多个特征点,得到第一图像特征。在利用上下文网络进行编解码的实现方式中,解码端每得到第二概率分布参数中至少一个特征点的概率分布参数,从码流中解析出第一图像特征中该至少一个特征点的图像特征,直至解析出第一图像特征中所有特征点的图像特征时,得到图像的第一图像特征。
需要说明的是,解码端得到的第一图像特征与编码端得到的第一图像特征是一致的,编码端得到的第一图像特征为按照第一量化步长对图像的第二图像特征进行量化得到的图像特征。
步骤1304:按照第一量化步长,对第一图像特征进行反量化,以重构图像。
在本申请实施例中,解码端从码流中解析出第一图像特征之后,按照第一量化步长,对第一图像特征进行反量化,以得到图像的第三图像特征,基于第三图像特征,重构该图像。可选地,解码端将第三图像特征输入解码网络,以重构该图像。其中,解码网络内所执行的解码过程是上述图像特征提取网络所执行的特征提取的逆过程。需要说明的是,第三图像特 征与编码过程中的第三图像特征是一致的,均是反量化后的图像特征。
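It should be noted that, if the probability distribution estimation network in the encoding and decoding flows shown in Figures 6 to 10 above is called the first probability distribution estimation network, then replacing it with the second probability distribution estimation network and removing the quantization operation on the probability distribution parameters yields flowcharts of the encoding and decoding method consistent with the embodiment of Figure 13.
In summary, in the decoding process of this embodiment of the present application, the second probability distribution parameter is obtained directly through the second probability distribution estimation network. The second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution parameters of unquantized image features. In this solution it is therefore sufficient to train only the first probability distribution estimation network. Even in a multi-bitrate scenario, the numerical range of unquantized image features is relatively stable and is not affected by the quantization step size; that is, the input numerical range of the first probability distribution estimation network does not change with the bitrate. As a result, the first probability distribution estimation network is easier to train, training is stable, and a well-performing first probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.
In addition, verifying the encoding and decoding method provided by the embodiments of the present application through multiple experiments led to the conclusion that, for images in YUV format, this solution improves encoding and decoding performance on all three components Y, U, and V. As can be seen from the embodiment of Figure 4 above, after a single probability distribution estimation on the unquantized image features yields the first probability distribution parameter, the method can derive the second probability distribution parameters at different bitrates (corresponding to different first quantization step sizes), without performing a separate probability estimation for each bitrate. This solution therefore reduces the computational complexity of probability estimation, facilitates rate-distortion optimization (RDO) of the feature map, and unifies probability estimation across bitrates, which facilitates training of the probability distribution estimation network.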
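Figure 14 is a schematic structural diagram of an encoding apparatus 1400 provided by an embodiment of the present application. The encoding apparatus 1400 may be implemented as part or all of an encoding-end device by software, hardware, or a combination of the two, and the encoding-end device may be any encoding end shown in Figures 1 to 3. Referring to Figure 14, the apparatus 1400 includes: a first determination module 1401, a second determination module 1402, a first encoding module 1403, a probability estimation module 1404, a quantization module 1405, and a second encoding module 1406.
The first determination module 1401 is configured to determine a first image feature and a second image feature of an image to be encoded, the first image feature being an image feature obtained by quantizing the second image feature according to a first quantization step size;
The second determination module 1402 is configured to determine a first super-prior feature of the second image feature;
The first encoding module 1403 is configured to encode the first super-prior feature into a code stream;
The probability estimation module 1404 is configured to determine a first probability distribution parameter through a probability distribution estimation network based on the first super-prior feature;
The quantization module 1405 is configured to quantize the first probability distribution parameter according to the first quantization step size to obtain a second probability distribution parameter;
The second encoding module 1406 is configured to encode the first image feature into the code stream based on the second probability distribution parameter.
Optionally, the second determination module 1402 includes:
a first super-encoding submodule, configured to input the second image feature into a super-encoding network to obtain the first super-prior feature.
Optionally, the second determination module 1402 includes:
a dequantization submodule, configured to dequantize the first image feature according to the first quantization step size to obtain a third image feature of the image;
a second super-encoding submodule, configured to input the third image feature into the super-encoding network to obtain the first super-prior feature.
Optionally, the probability estimation module 1404 includes:
a context submodule, configured to input the third image feature of the image into a context network to obtain a context feature of the third image feature, the third image feature being an image feature obtained by dequantizing the first image feature according to the first quantization step size;
a first determination submodule, configured to determine a first prior feature based on the first super-prior feature;
a first probability estimation submodule, configured to input the first prior feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameter.
Optionally, the probability estimation module 1404 includes:
a context submodule, configured to input the first image feature into a context network to obtain a context feature of the first image feature;
a second determination submodule, configured to determine a first prior feature based on the first super-prior feature;
a quantization submodule, configured to quantize the first prior feature according to a second quantization step size to obtain a second prior feature;
a second probability estimation submodule, configured to input the second prior feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameter.
In this embodiment of the present application, in order to obtain the probability distribution parameters of the quantized image feature, the encoding process determines the first probability distribution parameter through the probability distribution estimation network based on the super-prior feature of the unquantized image feature; the first probability distribution parameter characterizes the probability distribution of the unquantized image feature. The first probability distribution parameter is then quantized according to the first quantization step size (that is, the step size used to quantize the image feature), yielding the second probability distribution parameter, which characterizes the probability distribution of the quantized image feature. In this solution it is therefore sufficient to train a probability distribution estimation network that determines the probability distribution parameters of unquantized image features. Even in a multi-bitrate scenario, the numerical range of unquantized image features is relatively stable and is not affected by the quantization step size; that is, the input numerical range of the probability distribution estimation network does not change with the bitrate. As a result, the probability distribution estimation network is easier to train, training is stable, and a well-performing probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.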
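To illustrate why quantizing the distribution parameters yields a valid model for the quantized feature, the sketch below assumes a Gaussian model whose parameters are a mean and a standard deviation, and assumes that quantizing by a step size `q` divides both by `q`. Neither the Gaussian model nor the division convention is restated in this passage; both are assumptions of the example, as are the function names. Under those assumptions, the probability of an integer symbol `k` under the second (scaled) parameters equals the probability that the unquantized feature falls in the interval of width `q` centred on `k * q`, so one estimate of the first parameters serves every step size.

```python
import math
from typing import Tuple


def gaussian_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def quantize_params(mu: float, sigma: float, q: float) -> Tuple[float, float]:
    """First parameters -> second parameters (assumed convention: divide by q)."""
    return mu / q, sigma / q


def symbol_probability(k: int, mu_s: float, sigma_s: float) -> float:
    """Probability mass of the unit-width bin centred on integer symbol k."""
    return gaussian_cdf(k + 0.5, mu_s, sigma_s) - gaussian_cdf(k - 0.5, mu_s, sigma_s)


if __name__ == "__main__":
    mu, sigma = 1.3, 2.0                      # first parameters, estimated once
    for q in (0.5, 1.0, 2.0):                 # one step size per target bitrate
        mu_s, sigma_s = quantize_params(mu, sigma, q)
        print(q, symbol_probability(1, mu_s, sigma_s))
```

An entropy coder would consume these per-symbol probabilities; that part is outside the scope of the sketch.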
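It should be noted that, when the encoding apparatus provided in the foregoing embodiment performs encoding, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. In addition, the encoding apparatus provided in the foregoing embodiment and the encoding method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.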
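Figure 15 is a schematic structural diagram of a decoding apparatus 1500 provided by an embodiment of the present application. The decoding apparatus 1500 may be implemented as part or all of a decoding-end device by software, hardware, or a combination of the two, and the decoding-end device may be any decoding end shown in Figures 1 to 3. Referring to Figure 15, the apparatus 1500 includes: a first parsing module 1501, a probability estimation module 1502, a quantization module 1503, a second parsing module 1504, and a reconstruction module 1505.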
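The first parsing module 1501 is configured to parse a first super-prior feature of an image to be decoded from a code stream;
The probability estimation module 1502 is configured to determine a first probability distribution parameter through a probability distribution estimation network based on the first super-prior feature, the first probability distribution parameter characterizing the probability distribution of the unquantized image feature of the image;
The quantization module 1503 is configured to quantize the first probability distribution parameter according to a first quantization step size to obtain a second probability distribution parameter;
The second parsing module 1504 is configured to parse a first image feature of the image from the code stream based on the second probability distribution parameter;
The reconstruction module 1505 is configured to dequantize the first image feature according to the first quantization step size to reconstruct the image.
Optionally, the first image feature is an image feature obtained by quantizing a second image feature of the image according to the first quantization step size.
Optionally, the reconstruction module 1505 includes:
a dequantization submodule, configured to dequantize the first image feature according to the first quantization step size to obtain a third image feature of the image;
a reconstruction submodule, configured to reconstruct the image based on the third image feature.
Optionally, the first probability distribution parameter is the probability distribution parameters of a plurality of feature points, the first super-prior feature is the super-prior features of the plurality of feature points, and the probability estimation module 1502 includes a context submodule, a first determination submodule, and a probability estimation submodule;
For a first feature point, the probability distribution parameter of the first feature point is determined through the context submodule, the first determination submodule, and the probability estimation submodule, the first feature point being any one of the plurality of feature points; wherein
the context submodule is configured to determine a context feature of the first feature point based on the image features of the decoded feature points in the first image feature;
the first determination submodule is configured to determine a first prior feature of the first feature point based on the super-prior feature of the first feature point;
the probability estimation submodule is configured to determine the probability distribution parameter of the first feature point through the probability distribution estimation network based on the first prior feature of the first feature point and the context feature of the first feature point.
Optionally, the context submodule is configured to:
determine the peripheral feature points of the first feature point from among the decoded feature points;
dequantize the image features of the peripheral feature points in the first image feature according to the first quantization step size to obtain the peripheral feature of the first feature point;
input the peripheral feature of the first feature point into a context network to obtain the context feature of the first feature point;
Determining the probability distribution parameter of the first feature point through the probability distribution estimation network based on the first prior feature of the first feature point and the context feature of the first feature point includes:
inputting the first prior feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point.
Optionally, the context submodule is configured to:
determine the peripheral feature points of the first feature point from among the decoded feature points;
input the image features of the peripheral feature points in the first image feature into a context network to obtain the context feature of the first feature point;
Determining the probability distribution parameter of the first feature point through the probability distribution estimation network based on the first prior feature of the first feature point and the context feature of the first feature point includes:
quantizing the first prior feature of the first feature point according to a second quantization step size to obtain a second prior feature of the first feature point;
inputting the second prior feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point.
In this embodiment of the present application, in order to obtain the probability distribution parameters of the quantized image feature, the decoding process determines the first probability distribution parameter through the probability distribution estimation network; the first probability distribution parameter characterizes the probability distribution of the unquantized image feature. The first probability distribution parameter is then quantized according to the first quantization step size (that is, the step size used to quantize the image feature), yielding the second probability distribution parameter, which characterizes the probability distribution of the quantized image feature. In this solution it is therefore sufficient to train a probability distribution estimation network that determines the probability distribution parameters of unquantized image features. Even in a multi-bitrate scenario, the numerical range of unquantized image features is relatively stable and is not affected by the quantization step size. Training the probability distribution estimation network on unquantized image features is therefore less difficult, training is stable, and a well-performing probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.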
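The order in which the modules of the decoding apparatus 1500 act can be sketched as a single function that wires together five stages. This is only an outline under stated assumptions: the callables stand in for the entropy decoder and the neural networks, which are not specified here, the function and parameter names are hypothetical, and dequantization is assumed to multiply by the step size.

```python
from typing import Any, Callable, List, Sequence


def decode_image(code_stream: Any,
                 q: float,
                 parse_hyperprior: Callable[[Any], Any],
                 estimate_params: Callable[[Any], Any],
                 quantize_params: Callable[[Any, float], Any],
                 parse_features: Callable[[Any, Any], Sequence[float]],
                 reconstruct: Callable[[List[float]], Any]) -> Any:
    """Run the five decoding stages in order; every callable is a placeholder."""
    z = parse_hyperprior(code_stream)            # first parsing module 1501
    params1 = estimate_params(z)                 # probability estimation module 1502
    params2 = quantize_params(params1, q)        # quantization module 1503
    ys = parse_features(code_stream, params2)    # second parsing module 1504
    y3 = [v * q for v in ys]                     # dequantization (assumed: multiply by q)
    return reconstruct(y3)                       # reconstruction module 1505
```

Passing the stages in as callables keeps the sketch independent of any particular network architecture or entropy coder; a real decoder would also interleave stages 1502 to 1504 per feature point when a context network is used.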
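It should be noted that, when the decoding apparatus provided in the foregoing embodiment performs decoding, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. In addition, the decoding apparatus provided in the foregoing embodiment and the decoding method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
Figure 16 is a schematic structural diagram of an encoding apparatus 1600 provided by an embodiment of the present application. The encoding apparatus 1600 may be implemented as part or all of an encoding-end device by software, hardware, or a combination of the two, and the encoding-end device may be any encoding end shown in Figures 1 to 3. Referring to Figure 16, the apparatus 1600 includes: a first determination module 1601, a second determination module 1602, a first encoding module 1603, a probability estimation module 1604, and a second encoding module 1605.
The first determination module 1601 is configured to determine a first image feature and a second image feature of an image to be encoded, the first image feature being an image feature obtained by quantizing the second image feature according to a first quantization step size;
The second determination module 1602 is configured to determine a first super-prior feature of the second image feature;
The first encoding module 1603 is configured to encode the first super-prior feature into a code stream;
The probability estimation module 1604 is configured to determine a second probability distribution parameter through a second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
The second encoding module 1605 is configured to encode the first image feature into the code stream based on the second probability distribution parameter.
Optionally, the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer of the first probability distribution estimation network by the first quantization step size.
Optionally, the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.
Optionally, the second determination module 1602 includes:
a first super-encoding submodule, configured to input the second image feature into a super-encoding network to obtain the first super-prior feature.
Optionally, the second determination module 1602 includes:
a dequantization submodule, configured to dequantize the first image feature according to the first quantization step size to obtain a third image feature of the image;
a second super-encoding submodule, configured to input the third image feature into the super-encoding network to obtain the first super-prior feature.
The encoding process of this embodiment of the present application likewise determines the super-prior feature of the unquantized image feature, but subsequently obtains the second probability distribution parameter directly through the second probability distribution estimation network, which is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size. In this solution it is therefore sufficient to train only the first probability distribution estimation network (used to determine the probability distribution parameters of unquantized image features). Even in a multi-bitrate scenario, the numerical range of unquantized image features is relatively stable and is not affected by the quantization step size. As a result, the first probability distribution estimation network is easier to train, training is stable, and a well-performing first probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.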
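The reason scaling only the last layer is enough can be shown with a small numerical sketch. It models the last layer as a plain linear map (a 1x1 convolution acts this way at each spatial position, which is an assumption about the layer shape) and checks that multiplying both weights and biases by a factor multiplies the layer output by the same factor. The factor is just a number here; how it maps to the first quantization step size follows the application's own definition of quantization, which this passage does not restate. The function names are invented for the example.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def linear(weights: Matrix, bias: Sequence[float], x: Sequence[float]) -> List[float]:
    """Last layer modelled as y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def scale_last_layer(weights: Matrix, bias: Sequence[float],
                     factor: float) -> Tuple[List[List[float]], List[float]]:
    """Derive the second network's last layer by scaling weights and biases."""
    return [[w * factor for w in row] for row in weights], [b * factor for b in bias]


if __name__ == "__main__":
    W = [[1.0, 2.0], [0.5, -1.0]]
    b = [0.5, -0.25]
    x = [3.0, 4.0]                       # output of the preceding (unchanged) layers
    W2, b2 = scale_last_layer(W, b, 0.5)
    print(linear(W, b, x), linear(W2, b2, x))
```

Because (sW)x + sb = s(Wx + b), the second network produces scaled parameters in a single forward pass, and the scaling can be applied once per step size without retraining. Note that the bias must be scaled together with the weights, otherwise the identity does not hold.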
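It should be noted that, when the encoding apparatus provided in the foregoing embodiment performs encoding, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. In addition, the encoding apparatus provided in the foregoing embodiment and the encoding method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
Figure 17 is a schematic structural diagram of a decoding apparatus 1700 provided by an embodiment of the present application. The decoding apparatus 1700 may be implemented as part or all of a decoding-end device by software, hardware, or a combination of the two, and the decoding-end device may be any decoding end shown in Figures 1 to 3. Referring to Figure 17, the apparatus 1700 includes: a first parsing module 1701, a probability estimation module 1702, a second parsing module 1703, and a reconstruction module 1704.
The first parsing module 1701 is configured to parse a first super-prior feature of an image to be decoded from a code stream;
The probability estimation module 1702 is configured to determine a second probability distribution parameter through a second probability distribution estimation network based on the first super-prior feature, where the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and a first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
The second parsing module 1703 is configured to parse a first image feature of the image from the code stream based on the second probability distribution parameter;
The reconstruction module 1704 is configured to dequantize the first image feature according to the first quantization step size to reconstruct the image.
Optionally, the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer of the first probability distribution estimation network by the first quantization step size.
Optionally, the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.
Optionally, the first image feature is an image feature obtained by quantizing a second image feature of the image according to the first quantization step size.
Optionally, the reconstruction module 1704 includes:
a dequantization submodule, configured to dequantize the first image feature according to the first quantization step size to obtain a third image feature of the image;
a reconstruction submodule, configured to reconstruct the image based on the third image feature.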
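In the decoding process of this embodiment of the present application, the second probability distribution parameter is obtained directly through the second probability distribution estimation network. The second probability distribution estimation network is obtained by processing the network parameters of the first probability distribution estimation network based on the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution parameters of unquantized image features. In this solution it is therefore sufficient to train only the first probability distribution estimation network. Even in a multi-bitrate scenario, the numerical range of unquantized image features is relatively stable and is not affected by the quantization step size; that is, the input numerical range of the first probability distribution estimation network does not change with the bitrate. As a result, the first probability distribution estimation network is easier to train, training is stable, and a well-performing first probability distribution estimation network can be obtained, which helps improve encoding and decoding performance.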
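It should be noted that, when the decoding apparatus provided in the foregoing embodiment performs decoding, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. In addition, the decoding apparatus provided in the foregoing embodiment and the decoding method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
Figure 18 is a schematic block diagram of an encoding and decoding apparatus 1800 used in an embodiment of the present application. The encoding and decoding apparatus 1800 may include a processor 1801, a memory 1802, and a bus system 1803. The processor 1801 and the memory 1802 are connected through the bus system 1803; the memory 1802 is used to store instructions, and the processor 1801 is used to execute the instructions stored in the memory 1802 to perform the various encoding or decoding methods described in the embodiments of the present application. To avoid repetition, they are not described in detail here.
In this embodiment of the present application, the processor 1801 may be a central processing unit (CPU), or may be another general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 1802 may include a ROM device or a RAM device. Any other suitable type of storage device may also be used as the memory 1802. The memory 1802 may include code and data 18021 accessed by the processor 1801 using the bus 1803. The memory 1802 may further include an operating system 18023 and application programs 18022, the application programs 18022 including at least one program that allows the processor 1801 to perform the encoding or decoding methods described in the embodiments of the present application. For example, the application programs 18022 may include applications 1 to N, which further include an encoding or decoding application (codec application for short) that performs the encoding or decoding methods described in the embodiments of the present application.
In addition to a data bus, the bus system 1803 may include a power bus, a control bus, a status signal bus, and the like. For clarity of illustration, however, the various buses are all labeled as the bus system 1803 in the figure.
Optionally, the encoding and decoding apparatus 1800 may further include one or more output devices, such as a display 1804. In one example, the display 1804 may be a touch-sensitive display that combines a display with a touch-sensing unit operable to sense touch input. The display 1804 may be connected to the processor 1801 via the bus 1803.
It should be pointed out that the encoding and decoding apparatus 1800 can perform both the encoding method and the decoding method in the embodiments of the present application.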
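Those skilled in the art will appreciate that the functions described in connection with the various illustrative logical blocks, modules, and algorithm steps disclosed herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions described by the various illustrative logical blocks, modules, and steps may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any medium that facilitates transfer of a computer program from one place to another (for example, according to a communication protocol). In this manner, computer-readable media generally may correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media such as signals or carrier waves. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in this application. A computer program product may include a computer-readable medium.
By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. As used herein, disk and disc include compact disc (CD), laser disc, optical disc, DVD, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits. Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Moreover, the techniques may be fully implemented in one or more circuits or logic elements. In one example, the various illustrative logical blocks, units, and modules in the encoder 100 and the decoder 200 may be understood as corresponding circuit devices or logic elements.
The techniques of the embodiments of the present application may be implemented in a wide variety of apparatuses or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset). Various components, modules, or units are described in the embodiments of the present application to emphasize functional aspects of apparatuses for performing the disclosed techniques, but they do not necessarily need to be realized by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit in conjunction with suitable software and/or firmware, or provided by interoperating hardware units (including one or more processors as described above).
That is, the foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), or a semiconductor medium (for example, a solid-state drive (SSD)). It is worth noting that the computer-readable storage medium mentioned in the embodiments of the present application may be a non-volatile storage medium, in other words, a non-transitory storage medium.
It should be understood that "at least one" mentioned herein means one or more, and "a plurality of" means two or more. In the description of the embodiments of the present application, unless otherwise specified, "/" means "or"; for example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, to describe the technical solutions of the embodiments of the present application clearly, words such as "first" and "second" are used to distinguish between identical or similar items that have substantially the same function and effect. Those skilled in the art will understand that words such as "first" and "second" do not limit quantity or execution order, and do not require the items to be different.
It should be noted that the information (including but not limited to user equipment information and user personal information), data (including but not limited to data used for analysis, stored data, and displayed data), and signals involved in the embodiments of the present application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the images and videos involved in the embodiments of the present application were all obtained with full authorization.
The above are merely exemplary embodiments of the present application and are not intended to limit the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.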

Claims (37)

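  1. An encoding method, characterized in that the method comprises:
    determining a first image feature and a second image feature of an image to be encoded, wherein the first image feature is an image feature obtained by quantizing the second image feature according to a first quantization step size;
    determining a first super-prior feature of the second image feature;
    encoding the first super-prior feature into a code stream;
    determining a first probability distribution parameter through a probability distribution estimation network based on the first super-prior feature;
    quantizing the first probability distribution parameter according to the first quantization step size to obtain a second probability distribution parameter;
    encoding the first image feature into the code stream based on the second probability distribution parameter.
  2. The method according to claim 1, characterized in that determining the first super-prior feature of the second image feature comprises:
    inputting the second image feature into a super-encoding network to obtain the first super-prior feature.
  3. The method according to claim 1, characterized in that determining the first super-prior feature of the second image feature comprises:
    dequantizing the first image feature according to the first quantization step size to obtain a third image feature of the image;
    inputting the third image feature into a super-encoding network to obtain the first super-prior feature.
  4. The method according to any one of claims 1-3, characterized in that determining the first probability distribution parameter through the probability distribution estimation network based on the first super-prior feature comprises:
    inputting a third image feature of the image into a context network to obtain a context feature of the third image feature, wherein the third image feature is an image feature obtained by dequantizing the first image feature according to the first quantization step size;
    determining a first prior feature based on the first super-prior feature;
    inputting the first prior feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameter.
  5. The method according to any one of claims 1-3, characterized in that determining the first probability distribution parameter through the probability distribution estimation network based on the first super-prior feature comprises:
    inputting the first image feature into a context network to obtain a context feature of the first image feature;
    determining a first prior feature based on the first super-prior feature;
    quantizing the first prior feature according to a second quantization step size to obtain a second prior feature;
    inputting the second prior feature and the context feature into the probability distribution estimation network to obtain the first probability distribution parameter.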
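  6. A decoding method, characterized in that the method comprises:
    parsing a first super-prior feature of an image to be decoded from a code stream;
    determining a first probability distribution parameter through a probability distribution estimation network based on the first super-prior feature, wherein the first probability distribution parameter characterizes the probability distribution of the unquantized image feature of the image;
    quantizing the first probability distribution parameter according to a first quantization step size to obtain a second probability distribution parameter;
    parsing a first image feature of the image from the code stream based on the second probability distribution parameter;
    dequantizing the first image feature according to the first quantization step size to reconstruct the image.
  7. The method according to claim 6, characterized in that the first image feature is an image feature obtained by quantizing a second image feature of the image according to the first quantization step size.
  8. The method according to claim 6 or 7, characterized in that dequantizing the first image feature according to the first quantization step size to reconstruct the image comprises:
    dequantizing the first image feature according to the first quantization step size to obtain a third image feature of the image;
    reconstructing the image based on the third image feature.
  9. The method according to any one of claims 6-8, characterized in that the first probability distribution parameter is the probability distribution parameters of a plurality of feature points, the first super-prior feature is the super-prior features of the plurality of feature points, and determining the first probability distribution parameter through the probability distribution estimation network based on the first super-prior feature comprises:
    performing the following operations on a first feature point to determine the probability distribution parameter of the first feature point, wherein the first feature point is any one of the plurality of feature points:
    determining a context feature of the first feature point based on the image features of the decoded feature points in the first image feature;
    determining a first prior feature of the first feature point based on the super-prior feature of the first feature point;
    determining the probability distribution parameter of the first feature point through the probability distribution estimation network based on the first prior feature of the first feature point and the context feature of the first feature point.
  10. The method according to claim 9, characterized in that determining the context feature of the first feature point based on the image features of the decoded feature points in the first image feature comprises:
    determining the peripheral feature points of the first feature point from among the decoded feature points;
    dequantizing the image features of the peripheral feature points in the first image feature according to the first quantization step size to obtain the peripheral feature of the first feature point;
    inputting the peripheral feature of the first feature point into a context network to obtain the context feature of the first feature point;
    and determining the probability distribution parameter of the first feature point through the probability distribution estimation network based on the first prior feature of the first feature point and the context feature of the first feature point comprises:
    inputting the first prior feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point.
  11. The method according to claim 9, characterized in that determining the context feature of the first feature point based on the image features of the decoded feature points in the first image feature comprises:
    determining the peripheral feature points of the first feature point from among the decoded feature points;
    inputting the image features of the peripheral feature points in the first image feature into a context network to obtain the context feature of the first feature point;
    and determining the probability distribution parameter of the first feature point through the probability distribution estimation network based on the first prior feature of the first feature point and the context feature of the first feature point comprises:
    quantizing the first prior feature of the first feature point according to a second quantization step size to obtain a second prior feature of the first feature point;
    inputting the second prior feature of the first feature point and the context feature of the first feature point into the probability distribution estimation network to obtain the probability distribution parameter of the first feature point.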
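  12. An encoding method, characterized in that the method comprises:
    determining a first image feature and a second image feature of an image to be encoded, wherein the first image feature is an image feature obtained by quantizing the second image feature according to a first quantization step size;
    determining a first super-prior feature of the second image feature;
    encoding the first super-prior feature into a code stream;
    determining a second probability distribution parameter through a second probability distribution estimation network based on the first super-prior feature, wherein the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
    encoding the first image feature into the code stream based on the second probability distribution parameter.
  13. The method according to claim 12, characterized in that the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer of the first probability distribution estimation network by the first quantization step size.
  14. The method according to claim 12 or 13, characterized in that the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.
  15. The method according to any one of claims 12-14, characterized in that determining the first super-prior feature of the second image feature comprises:
    inputting the second image feature into a super-encoding network to obtain the first super-prior feature.
  16. The method according to any one of claims 12-14, characterized in that determining the first super-prior feature of the second image feature comprises:
    dequantizing the first image feature according to the first quantization step size to obtain a third image feature of the image;
    inputting the third image feature into a super-encoding network to obtain the first super-prior feature.
  17. A decoding method, characterized in that the method comprises:
    parsing a first super-prior feature of an image to be decoded from a code stream;
    determining a second probability distribution parameter through a second probability distribution estimation network based on the first super-prior feature, wherein the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and a first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
    parsing a first image feature of the image from the code stream based on the second probability distribution parameter;
    dequantizing the first image feature according to the first quantization step size to reconstruct the image.
  18. The method according to claim 17, characterized in that the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer of the first probability distribution estimation network by the first quantization step size.
  19. The method according to claim 17 or 18, characterized in that the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.
  20. The method according to any one of claims 17-19, characterized in that the first image feature is an image feature obtained by quantizing a second image feature of the image according to the first quantization step size.
  21. The method according to any one of claims 17-20, characterized in that dequantizing the first image feature according to the first quantization step size to reconstruct the image comprises:
    dequantizing the first image feature according to the first quantization step size to obtain a third image feature of the image;
    reconstructing the image based on the third image feature.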
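  22. An encoding apparatus, characterized in that the apparatus comprises:
    a first determination module, configured to determine a first image feature and a second image feature of an image to be encoded, wherein the first image feature is an image feature obtained by quantizing the second image feature according to a first quantization step size;
    a second determination module, configured to determine a first super-prior feature of the second image feature;
    a first encoding module, configured to encode the first super-prior feature into a code stream;
    a probability estimation module, configured to determine a first probability distribution parameter through a probability distribution estimation network based on the first super-prior feature;
    a quantization module, configured to quantize the first probability distribution parameter according to the first quantization step size to obtain a second probability distribution parameter;
    a second encoding module, configured to encode the first image feature into the code stream based on the second probability distribution parameter.
  23. The apparatus according to claim 22, characterized in that the second determination module comprises:
    a first super-encoding submodule, configured to input the second image feature into a super-encoding network to obtain the first super-prior feature.
  24. The apparatus according to claim 22, characterized in that the second determination module comprises:
    a dequantization submodule, configured to dequantize the first image feature according to the first quantization step size to obtain a third image feature of the image;
    a second super-encoding submodule, configured to input the third image feature into a super-encoding network to obtain the first super-prior feature.
  25. A decoding apparatus, characterized in that the apparatus comprises:
    a first parsing module, configured to parse a first super-prior feature of an image to be decoded from a code stream;
    a probability estimation module, configured to determine a first probability distribution parameter through a probability distribution estimation network based on the first super-prior feature, wherein the first probability distribution parameter characterizes the probability distribution of the unquantized image feature of the image;
    a quantization module, configured to quantize the first probability distribution parameter according to a first quantization step size to obtain a second probability distribution parameter;
    a second parsing module, configured to parse a first image feature of the image from the code stream based on the second probability distribution parameter;
    a reconstruction module, configured to dequantize the first image feature according to the first quantization step size to reconstruct the image.
  26. The apparatus according to claim 25, characterized in that the first image feature is an image feature obtained by quantizing a second image feature of the image according to the first quantization step size.
  27. The apparatus according to claim 25 or 26, characterized in that the reconstruction module comprises:
    a dequantization submodule, configured to dequantize the first image feature according to the first quantization step size to obtain a third image feature of the image;
    a reconstruction submodule, configured to reconstruct the image based on the third image feature.
  28. An encoding apparatus, characterized in that the apparatus comprises:
    a first determination module, configured to determine a first image feature and a second image feature of an image to be encoded, wherein the first image feature is an image feature obtained by quantizing the second image feature according to a first quantization step size;
    a second determination module, configured to determine a first super-prior feature of the second image feature;
    a first encoding module, configured to encode the first super-prior feature into a code stream;
    a probability estimation module, configured to determine a second probability distribution parameter through a second probability distribution estimation network based on the first super-prior feature, wherein the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and the first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
    a second encoding module, configured to encode the first image feature into the code stream based on the second probability distribution parameter.
  29. The apparatus according to claim 28, characterized in that the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer of the first probability distribution estimation network by the first quantization step size.
  30. The apparatus according to claim 28 or 29, characterized in that the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.
  31. A decoding apparatus, characterized in that the apparatus comprises:
    a first parsing module, configured to parse a first super-prior feature of an image to be decoded from a code stream;
    a probability estimation module, configured to determine a second probability distribution parameter through a second probability distribution estimation network based on the first super-prior feature, wherein the network parameters of the second probability distribution estimation network are obtained based on the network parameters of a first probability distribution estimation network and a first quantization step size, and the first probability distribution estimation network is used to determine the probability distribution of unquantized image features;
    a second parsing module, configured to parse a first image feature of the image from the code stream based on the second probability distribution parameter;
    a reconstruction module, configured to dequantize the first image feature according to the first quantization step size to reconstruct the image.
  32. The apparatus according to claim 31, characterized in that the second probability distribution estimation network is obtained by multiplying the network parameters of the last layer of the first probability distribution estimation network by the first quantization step size.
  33. The apparatus according to claim 31 or 32, characterized in that the last layer of the first probability distribution estimation network is a convolutional layer, and the network parameters of the convolutional layer include weights and biases.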
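  34. An encoding-end device, characterized in that the encoding-end device comprises a memory and a processor;
    the memory is used to store a computer program, and the processor is used to execute the computer program stored in the memory to implement the encoding method according to any one of claims 1-5, or to implement the encoding method according to any one of claims 12-16.
  35. A decoding-end device, characterized in that the decoding-end device comprises a memory and a processor;
    the memory is used to store a computer program, and the processor is used to execute the computer program stored in the memory to implement the decoding method according to any one of claims 6-11, or to implement the decoding method according to any one of claims 17-21.
  36. A computer-readable storage medium, characterized in that a computer program is stored in the storage medium, and when the computer program is executed by a processor, the steps of the method according to any one of claims 1-5, or the steps of the method according to any one of claims 12-16, or the steps of the method according to any one of claims 6-11, or the steps of the method according to any one of claims 17-21 are implemented.
  37. A computer program product, characterized in that computer instructions are stored in the computer program product, and when the computer instructions are executed by a processor, the steps of the method according to any one of claims 6-11, or the steps of the method according to any one of claims 17-21, or the steps of the method according to any one of claims 1-5, or the steps of the method according to any one of claims 12-16 are implemented.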
PCT/CN2023/079340 2022-03-10 2023-03-02 Encoding and decoding method, apparatus, device, storage medium, and computer program product WO2023169303A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210234190.8 2022-03-10
CN202210234190.8A CN116778002A (zh) 2022-03-10 2022-03-10 Encoding and decoding method, apparatus, device, storage medium, and computer program product

Publications (1)

Publication Number Publication Date
WO2023169303A1 true WO2023169303A1 (zh) 2023-09-14

Family

ID=87937219

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/079340 WO2023169303A1 (zh) 2022-03-10 2023-03-02 Encoding and decoding method, apparatus, device, storage medium, and computer program product

Country Status (3)

Country Link
CN (1) CN116778002A (zh)
TW (1) TW202337209A (zh)
WO (1) WO2023169303A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013000575A1 (en) * 2011-06-30 2013-01-03 Canon Kabushiki Kaisha Methods and devices for scalable video coding
CN110602494A (zh) * 2019-08-01 2019-12-20 杭州皮克皮克科技有限公司 Image encoding and decoding system and encoding and decoding method based on deep learning
CN111641832A (zh) * 2019-03-01 2020-09-08 杭州海康威视数字技术股份有限公司 Encoding method, decoding method, apparatus, electronic device, and storage medium
CN111641826A (zh) * 2019-03-01 2020-09-08 杭州海康威视数字技术股份有限公司 Method, apparatus, and system for encoding and decoding data
CN114071141A (zh) * 2020-08-06 2022-02-18 华为技术有限公司 Image processing method and device

Also Published As

Publication number Publication date
CN116778002A (zh) 2023-09-19
TW202337209A (zh) 2023-09-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23765881

Country of ref document: EP

Kind code of ref document: A1