WO2023279968A1 - Method and apparatus for encoding and decoding video image - Google Patents

Publication number
WO2023279968A1
Authority
WO
WIPO (PCT)
Prior art keywords
coefficient
probability estimation
estimation result
image
coefficients
Prior art date
Application number
PCT/CN2022/100578
Other languages
French (fr)
Chinese (zh)
Inventor
杨海涛
张恋
傅佳莉
毛珏
刘�东
马海川
李礼
吴枫
Original Assignee
华为技术有限公司
中国科学技术大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司, 中国科学技术大学
Publication of WO2023279968A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/63 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • the present application relates to the field of video encoding and decoding, and in particular to a method and device for encoding and decoding video images.
  • Digital images are image information recorded in the form of digital signals.
  • a digital image (hereinafter referred to as an image) can be regarded as a two-dimensional array of M rows and N columns, including M × N samples, the position of each sample is called a sampling position, and the value of each sample is called a sample value.
  • Image coding comprises two stages: encoding and decoding.
  • A typical encoding process generally includes three steps: transformation, quantization and entropy coding.
  • The first step decorrelates the image through a transform to obtain transform coefficients with a more concentrated energy distribution;
  • the second step quantizes the transform coefficients to obtain quantized coefficients;
  • the third step entropy-encodes the quantized coefficients to obtain the compressed code stream.
  • In a typical decoding process, after receiving the compressed code stream, the decoder performs entropy decoding, inverse quantization and inverse transformation in sequence to obtain the reconstructed image.
  • Entropy decoding, inverse quantization and inverse transformation are generally deterministic processes; that is, decoding a given compressed code stream yields a unique reconstructed image, and the quality of that reconstructed image is not high.
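The deterministic round trip described above can be sketched as follows. This is a minimal illustration, not the method of the application: it assumes a square block, an orthonormal DCT-II as the transform, and a uniform quantizer with a hypothetical step size `step`; entropy coding is omitted because it is lossless and does not change the reconstructed values. The function names are illustrative only.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)
    basis[1:, :] *= np.sqrt(2.0 / n)
    return basis

def encode_block(block: np.ndarray, step: float) -> np.ndarray:
    """Transform a square block, then quantize to integer levels."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T          # 2-D separable transform
    return np.round(coeffs / step).astype(np.int64)

def decode_block(levels: np.ndarray, step: float) -> np.ndarray:
    """Inverse-quantize, then inverse-transform (fully deterministic)."""
    d = dct_matrix(levels.shape[0])
    return d.T @ (levels * step) @ d
```

Decoding the same quantized levels twice gives exactly the same block, which is the determinism the text refers to; the only loss comes from rounding in the quantizer.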
  • Embodiments of the present application provide a video image encoding and decoding method and related equipment, which can improve image quality.
  • the present application relates to a video image encoding method.
  • the method is performed by an encoding device, and the method includes:
  • the first image is an image to be encoded or a decoded image; probability estimation is performed according to the first context information to obtain a first probability estimation result, where the first context information is obtained from the first image; and the first probability estimation result is written into the compressed code stream.
  • the first context information may be pixels in the first image or coefficients in the first transformed image obtained by transforming the first image.
  • the probability estimation is performed at the encoding end to obtain a probability estimation result, and the probability estimation result is transmitted to the decoding end, so that the decoding end performs sampling based on the probability estimation result to obtain a high-quality image.
  • the method of this embodiment also includes:
  • the second image is an image to be encoded or a decoded image, and the second image is different from the first image; perform probability estimation according to the first context information to obtain a first probability estimation result, including:
  • the probability estimation is performed according to the first context information to obtain the first probability estimation result, including:
  • the first context information includes the context information of the first data and the context information of the second data.
  • the encoding end calculates the probability estimation result of each piece of data in the first image one by one and transmits these results to the decoding end, so that the decoding end can sample accurately based on the probability estimation result of each piece of data, thereby obtaining a higher-quality reconstructed image.
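The per-datum sampling at the decoding end can be sketched as follows. This is a minimal illustration under stated assumptions: each probability estimation result is taken to be a Gaussian (mean, variance) pair, as in the Gaussian examples that appear elsewhere in this text, and the function name `sample_estimates` and the fixed `seed` are hypothetical.

```python
import numpy as np

def sample_estimates(means, variances, seed=0):
    """Draw one estimated value per datum from its own Gaussian.

    means, variances: array-likes of equal shape, one entry per datum.
    A variance of 0 makes the sample collapse to the mean.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    std = np.sqrt(np.asarray(variances, dtype=float))
    return rng.normal(loc=means, scale=std)
```

For example, `sample_estimates([10.0, 20.0], [0.0, 0.0])` returns the means unchanged, while non-zero variances add per-datum randomness around each mean.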
  • the first probability estimation result includes the probability estimation result of a first preset area
  • the first preset area includes the first data and the second data
  • the first preset area is located in the first image, or in the image obtained by transforming the first image; performing probability estimation according to the first context information to obtain a first probability estimation result includes:
  • the first preset area is an image block in the first image, or a subband obtained by performing wavelet transform on the first image, or a subband obtained by performing discrete cosine transform (DCT) on the first image
  • DCT discrete cosine transform
  • performing DCT transformation on the first image in units of one or more image blocks may obtain one or more transform blocks.
  • the encoder uses a single probability estimation result as the probability estimation result of all the data in the preset area, so that only one probability estimation result needs to be transmitted, thereby reducing the amount of code stream data transmitted and the resources required to transmit it.
  • the first probability estimation result includes the probability estimation result of the second preset area
  • the second preset area is located in the first image or in an image obtained by transforming the first image
  • the first context information includes context information of the second preset area
  • performing probability estimation according to the first context information to obtain the first probability estimation result includes: performing probability estimation according to the context information of the second preset area to obtain the probability estimation result of the second preset area
  • the first probability estimation result includes the probability estimation result of the second preset area.
  • the second preset area is an image block in the first image, or a subband obtained by performing wavelet transformation on the first image, or a frequency band obtained by performing DCT on the first image, or a transform block obtained by performing DCT on the first image, or a channel in a three-dimensional feature map obtained by performing feature extraction on the first image.
  • performing DCT transformation on the first image in units of one or more image blocks may obtain one or more transform blocks.
  • the encoder uses a single probability estimation result as the probability estimation result of all the data in the preset area, so that only one probability estimation result needs to be transmitted, thereby reducing the amount of code stream data transmitted and the resources required to transmit it.
  • this encoding method also includes:
  • the encoding end saves the probability estimation results of multiple preset areas in a probability estimation result set and records the position (namely, the index) of each preset area's probability estimation result in that set, so that the decoding end can use the index to accurately locate the probability estimation result of each preset area in the set decoded from the code stream, thereby ensuring decoding accuracy.
  • the size information is introduced to indicate how many times sampling must be performed based on the probability estimation result of the first preset area, so that all the estimated coefficients in the first preset area are obtained.
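The result set, index and size information described above can be sketched as follows. This is an illustrative data layout only, assuming Gaussian (mean, variance) estimates; the `AreaHeader` type and both function names are hypothetical and do not reflect any code stream syntax defined by the application.

```python
from dataclasses import dataclass
import numpy as np

@dataclass(frozen=True)
class AreaHeader:
    index: int  # position of this area's estimate in the result set
    size: int   # number of coefficients to sample for this area

def build_result_set(area_estimates, area_sizes):
    """Encoder side: store one (mean, variance) per area and record its index."""
    result_set, headers = [], []
    for estimate, size in zip(area_estimates, area_sizes):
        result_set.append(estimate)
        headers.append(AreaHeader(index=len(result_set) - 1, size=size))
    return result_set, headers

def sample_areas(result_set, headers, seed=0):
    """Decoder side: look up each estimate by index and sample `size` times."""
    rng = np.random.default_rng(seed)
    areas = []
    for header in headers:
        mean, variance = result_set[header.index]
        areas.append(rng.normal(mean, np.sqrt(variance), size=header.size))
    return areas
```

The index tells the decoder which estimate belongs to which area, and the size tells it how many samples to draw from that single shared estimate.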
  • this encoding method also includes:
  • the probability estimation result of the first preset area is preprocessed according to the scaling factor of the preset area to obtain the processed probability estimation result; the processed probability estimation result is saved into the probability estimation result set, and its index in the probability estimation result set is recorded; writing the probability estimation result into the compressed code stream includes: writing the probability estimation result set, the index, the size information of the first preset area and the first identifier into the compressed code stream.
  • the encoding end preprocesses the probability estimation result of the first preset area to obtain a processed probability estimation result; the decoding end performs sampling based on the processed probability estimation result to obtain a reconstructed image.
  • reconstructed images of different qualities can be obtained, such as images with high subjective quality or images with high objective quality.
  • this encoding method also includes:
  • the value of the first identifier of the first preset area is set to a first value to indicate that the probability estimation result of the first preset area is used when sampling the estimated coefficients in the first preset area; writing the probability estimation result into the compressed code stream includes: writing the probability estimation result of the first preset area, the size information of the first preset area and the first identifier into the code stream.
  • the decoding end uses the probability estimation result of the first preset area; the size information is introduced to indicate how many times sampling must be performed based on the probability estimation result of the first preset area when sampling to obtain the estimated coefficients in the first preset area, so as to obtain all the estimated coefficients in the first preset area.
  • this encoding method also includes:
  • the probability estimation result of the first data is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the first data includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first data is preprocessed to obtain the processed probability estimation result, including:
  • the variance of the Gaussian distribution is set to 0 as the processed variance, wherein the processed probability estimation result includes the mean value of the Gaussian distribution and the processed variance.
  • the probability estimation result of the first data includes the mean and variance of the Gaussian distribution, and preprocessing the probability estimation result of the first data to obtain the processed probability estimation result includes: preprocessing the variance of the Gaussian distribution according to the scaling factor of the first data to obtain the processed variance, where the processed probability estimation result includes the mean value of the Gaussian distribution and the processed variance; then
  • the scaling factor of the first data is the same as the scaling factor of the second data; or, the scaling factor of the first data is different from the scaling factor of the second data; or,
  • the content information of the preset area includes texture resolution level or texture complexity of the preset area.
  • the complexity of the texture can be calculated.
  • the resolution level is considered high, and for a preset area with smooth texture, the resolution level is considered low.
  • the scaling factor of the first data is different from the scaling factor of the second data; for the first data and the second data in a preset area with a low resolution level, the scaling factor of the first data is the same as the scaling factor of the second data.
  • the scaling factor of the first data is different from that of the second data, and for a preset area with low texture complexity, the scaling factor of the first data is the same as the scaling factor of the second data.
  • the aforementioned preset area may be an image block, a subband, a frequency band, or a channel as mentioned below.
  • the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different image blocks, then the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the image block to which the first data belongs;
  • the scaling factor of the first data is the same as that of the second data; or if the first data and the second data belong to different subbands, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the subband to which the first data belongs;
  • the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different frequency bands or transform blocks, then the scaling factor of the first data and the scaling factor of the second data are different; or the scaling factor of the first data is determined according to the texture complexity of the frequency band or transform block to which the first data belongs;
  • the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different channels, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the channel to which the first data belongs.
  • the texture complexity of the image block to which the first data belongs can be determined according to the content of the corresponding image block in the image to be encoded or the decoded image; the texture complexity of the subband to which the first data belongs can be determined according to the content of the corresponding part of the subband in the image to be encoded or in the decoded image; the texture complexity of the frequency band to which the first data belongs may be determined according to the content of the corresponding part of the frequency band in the image to be encoded or in the decoded image; the texture complexity of the channel to which the first data belongs may be determined according to the content of the corresponding part of the channel in the image to be encoded or the decoded image. In one example, the larger the texture complexity of the first data is, the larger the scaling factor of the first data is.
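One way to derive a scaling factor from texture complexity can be sketched as follows. Everything concrete here is an assumption for illustration: the text does not define how texture complexity is measured or how it maps to a scaling factor, only that a larger complexity gives a larger factor. The mean-absolute-gradient measure, the `max_complexity` normalizer, and the multiplicative use of the factor on the variance are all hypothetical choices.

```python
import numpy as np

def texture_complexity(block):
    """Hypothetical measure: mean absolute difference between neighbours."""
    block = np.asarray(block, dtype=float)
    vertical = np.abs(np.diff(block, axis=0)).mean()
    horizontal = np.abs(np.diff(block, axis=1)).mean()
    return float(vertical + horizontal)

def scaling_factor(complexity, max_complexity=64.0):
    """Monotone map to [0, 1]: larger complexity gives a larger factor."""
    return float(np.clip(complexity / max_complexity, 0.0, 1.0))

def scale_variance(variance, factor):
    """Assumed preprocessing: processed variance = factor * variance."""
    return variance * factor
```

Under these assumptions a flat block gets a factor of 0, so its processed variance is 0 and sampling returns the mean, while a highly textured block keeps its full variance.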
  • this encoding method also includes:
  • the probability estimation result of the second preset area is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the first data includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first data is preprocessed to obtain the processed probability estimation result, including:
  • the processed probability estimation result includes the mean value and the first variance of the Gaussian distribution, or, the variance of the Gaussian distribution is processed according to the scaling factor of the second preset area to obtain the second variance, wherein the processed probability estimation result includes the mean value of the Gaussian distribution and the second variance, and the scaling factor of the first preset area is the same as or different from the scaling factor of the second preset area.
  • the first context information includes some or all pixel values in the first image.
  • By preprocessing the probability estimation results, reconstructed images with different properties can be obtained according to the user's needs, which improves the quality of reconstructed images. For example, if the variance of the probability estimation result is set to 0 as the processed variance, the reconstructed image with the best signal quality (best objective quality) can be obtained, that is, the peak signal-to-noise ratio (PSNR) of the image can be increased.
  • PSNR peak signal to noise ratio
  • MSE mean-square error
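The two objective metrics named above can be computed as in the following sketch, which uses the standard definitions of MSE and PSNR. The `peak` default of 255 assumes 8-bit samples, which the text does not state.

```python
import numpy as np

def mse(reference, reconstructed):
    """Mean-square error between two images of the same shape."""
    a = np.asarray(reference, dtype=float)
    b = np.asarray(reconstructed, dtype=float)
    return float(np.mean((a - b) ** 2))

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, reconstructed)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))
```

Because PSNR falls as MSE rises, a decoder that outputs the Gaussian mean (processed variance 0) minimizes the expected squared error under that Gaussian and so favours a high PSNR.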
  • this encoding method also includes:
  • the probability estimation is performed according to the first context information to obtain the first probability estimation result, including:
  • the first context information is input into the first probability estimation network for processing to obtain the parameters of the first probability distribution model; the probability estimation result includes the parameters of the first probability distribution model;
  • the first context information is input into the second probability estimation network for processing to obtain the target probability distribution, and the probability estimation result includes the parameters of the target probability distribution; wherein the first probability estimation network and the second probability estimation network are implemented by neural networks.
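A probability estimation network that maps context information to distribution parameters can be sketched as follows. This is a toy stand-in, not the network of the application: the two-layer architecture, the random untrained weights, the softplus used to keep the variance positive, and the class name `TinyProbabilityEstimator` are all assumptions made for illustration.

```python
import numpy as np

def softplus(x):
    """Smooth positive mapping, log(1 + exp(x)), computed stably."""
    return np.logaddexp(0.0, x)

class TinyProbabilityEstimator:
    """Hypothetical two-layer network: context vector -> (mean, variance)."""

    def __init__(self, context_size, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # Untrained random weights, for illustration only.
        self.w1 = rng.normal(0.0, 0.1, (context_size, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, 2))
        self.b2 = np.zeros(2)

    def __call__(self, context):
        x = np.asarray(context, dtype=float)
        h = np.maximum(0.0, x @ self.w1 + self.b1)   # ReLU hidden layer
        mean, raw_variance = h @ self.w2 + self.b2   # two output units
        return float(mean), float(softplus(raw_variance))
```

A call such as `TinyProbabilityEstimator(4)([1, 2, 3, 4])` returns a (mean, variance) pair with a strictly positive variance; in practice the weights would be learned rather than random.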
  • the present application relates to a video image encoding method.
  • the method is performed by an encoding device, and the method includes:
  • a plurality of coefficients are obtained according to the image to be encoded, and the plurality of coefficients include a first coefficient; a first probability estimation result is obtained according to the context information of the first coefficient; and the first coefficient and the first probability estimation result are written into a compressed code stream.
  • the first coefficient may be a pixel in the image to be coded or a coefficient in a transformed image obtained by transforming the image to be coded.
  • the probability estimation is performed at the encoding end to obtain a probability estimation result, and the probability estimation result is transmitted to the decoding end, so that the decoding end performs sampling based on the probability estimation result to obtain a high-quality image.
  • the multiple coefficients also include a second coefficient
  • the encoding method also includes:
  • the encoding end calculates the probability estimation result of each coefficient in the image to be encoded one by one and transmits these results to the decoding end, so that the decoding end can sample accurately based on the probability estimation result of each coefficient, thereby obtaining a higher-quality reconstructed image.
  • the plurality of coefficients further includes a second coefficient, the first coefficient and the second coefficient belong to the same preset area, and the preset area is located in the image to be coded, or in an image obtained by transforming the image to be coded,
  • the first probability estimation result is obtained according to the context information of the first coefficient, including:
  • Writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient and the first probability estimation result into the compressed code stream.
  • the preset area is an image block in the image to be encoded, or a subband obtained by performing wavelet transformation on the image to be encoded, or a frequency band obtained by performing DCT on the image to be encoded, or a transform block obtained by performing DCT on the image to be encoded, or a channel in a three-dimensional feature map obtained by performing feature extraction on the image to be encoded.
  • DCT transformation is performed on the image to be coded in units of one or more image blocks to obtain one or more transform blocks.
  • the encoding end uses a single probability estimation result as the probability estimation result of all coefficients in the preset area, so that only one probability estimation result needs to be transmitted, thereby reducing the amount of code stream data transmitted and the resources required to transmit it.
  • the plurality of coefficients further includes a second coefficient, the first coefficient and the second coefficient belong to the same preset area, and the preset area is located in the image to be coded, or in an image obtained by transforming the image to be coded,
  • the first probability distribution is obtained according to the context information of the first coefficient, including:
  • Probability estimation is performed according to the context information of the preset area to obtain a first probability estimation result; the context information of the preset area includes context information of the first coefficient; writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient and the first probability estimation result into the compressed code stream.
  • the encoding end uses a single probability estimation result as the probability estimation result of all coefficients in the preset area, so that only one probability estimation result needs to be transmitted, thereby reducing the amount of code stream data transmitted and the resources required to transmit it.
  • this encoding method also includes:
  • the encoding end saves the probability estimation results of multiple preset areas in a probability estimation result set and records the position (namely, the index) of each preset area's probability estimation result in that set, so that the decoding end can use the index to accurately locate the probability estimation result of each preset area in the set decoded from the code stream, thereby ensuring decoding accuracy.
  • the size information is introduced to indicate how many times sampling must be performed based on the probability estimation result of the preset area, so that all the estimated coefficients in the preset area are obtained.
  • this encoding method also includes:
  • Writing the estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, the first probability estimation result, the size information of the preset area and the first identifier into the compressed code stream.
  • By setting the value of the first identifier of the preset area to the first value, the encoding end indicates that the decoding end uses the probability estimation result of the preset area when sampling to obtain the estimated coefficients in the preset area;
  • the size information is introduced to indicate how many times sampling must be performed based on the probability estimation result of the preset area, so that all the estimated coefficients in the preset area are obtained.
  • the first coefficient and the second coefficient belong to the same preset area, and the encoding method further includes:
  • this encoding method also includes:
  • the probability estimation result of the first coefficient is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the first coefficient includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first coefficient is preprocessed to obtain the processed probability estimation result, including:
  • the variance of the Gaussian distribution is set to 0 as the processed variance, wherein the processed probability estimation result includes the mean value of the Gaussian distribution and the processed variance.
  • the probability estimation result of the first coefficient includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first coefficient is preprocessed to obtain the processed probability estimation result, including:
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or, the scaling factor of the first coefficient and the scaling factor of the second coefficient are different; or,
  • the content information of the preset area includes texture resolution level or texture complexity of the preset area.
  • the complexity of the texture can be calculated.
  • the resolution level is considered to be high, and for a preset area with smooth texture, the resolution level is considered to be low.
  • the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; for the first coefficient and the second coefficient in a preset area with a low resolution level, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient.
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are different, and for a preset area with low texture complexity, the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same.
  • the aforementioned preset area may be an image block, a subband, a frequency band, a transform block or a channel as mentioned below.
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different image blocks, then the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the image block to which the first coefficient belongs; or,
  • the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different subbands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs;
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different frequency bands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the frequency band to which the first coefficient belongs;
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different channels, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the channel to which the first coefficient belongs.
  • the texture complexity of the image block to which the first coefficient belongs can be determined according to the content of the image block in the image to be encoded; the texture complexity of the subband to which the first coefficient belongs can be determined according to the content of the corresponding part of the subband in the image to be encoded; the texture complexity of the frequency band to which the first coefficient belongs can be determined according to the content of the corresponding part of the frequency band in the image to be encoded; the texture complexity of the channel to which the first coefficient belongs can be determined according to the content of the corresponding part of the channel in the image to be encoded.
  • the larger the texture complexity of the first coefficient is, the larger the scaling factor of the first coefficient is.
  • this encoding method also includes:
  • the probability estimation result of the preset area is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the preset area includes the mean and variance of the Gaussian distribution, and the probability estimation result of the preset area is preprocessed to obtain the processed probability estimation result, including:
  • the processed probability estimation result includes the mean value and the first variance of the Gaussian distribution; or, the variance of the Gaussian distribution is processed according to the scaling factor of the preset area to obtain the second variance, wherein the processed probability estimation result includes the mean value of the Gaussian distribution and the second variance.
  • reconstructed images with different properties can be obtained according to the user's needs, which improves the quality of the reconstructed images. For example, if the variance of the probability estimation result is set to 0 as the processed variance, the reconstructed image with the best signal quality (best objective quality) can be obtained, that is, the PSNR of the image is increased or, equivalently, its MSE is reduced; if the scaling factors of the multiple coefficients are set to the same value, the image with the best subjective quality can be obtained, at the cost of a lower PSNR or a higher MSE; and if the scaling factors of coefficients belonging to the same part of the image are set to the same value while the scaling factors of coefficients belonging to different parts are set to different values, an image whose properties lie between the best subjective quality and the best objective quality can be obtained.
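  • As a minimal sketch of the variance preprocessing described above, the following Python snippet either replaces the variance of a Gaussian probability estimation result with 0 or processes it with a scaling factor. The names `GaussianEstimate` and `preprocess_estimate` are hypothetical, and multiplying the variance by the scaling factor is an assumption: the text only states that the variance is processed according to the scaling factor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GaussianEstimate:
    """Probability estimation result: mean and variance of a Gaussian."""
    mean: float
    variance: float


def preprocess_estimate(est: GaussianEstimate,
                        scaling_factor: Optional[float] = None) -> GaussianEstimate:
    """Return the processed estimate; the mean is always kept unchanged.

    With no scaling factor the variance is set to 0 (objective-quality case).
    Otherwise the variance is multiplied by the factor (assumed operation).
    """
    if scaling_factor is None:
        return GaussianEstimate(est.mean, 0.0)
    if scaling_factor < 0:
        raise ValueError("scaling factor must be non-negative")
    return GaussianEstimate(est.mean, est.variance * scaling_factor)


estimate = GaussianEstimate(mean=1.5, variance=4.0)
zero_variance = preprocess_estimate(estimate)        # variance replaced by 0
scaled = preprocess_estimate(estimate, 0.5)          # variance scaled by 0.5
```

  • Under these assumptions, a scaling factor between 0 and 1 narrows the distribution that is later sampled, which moves the result toward the zero-variance case.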
  • the first context information includes some or all pixel values in the first image
  • the multiple coefficients are multiple wavelet coefficients
  • the first context information includes part or all of the multiple wavelet coefficients; or, if the image to be coded is subjected to wavelet transformation and quantization to obtain multiple coefficients, the multiple coefficients are multiple quantized wavelet coefficients, and the first context information includes part or all of the multiple quantized wavelet coefficients; or, if the image to be coded is subjected to DCT to obtain multiple coefficients, the multiple coefficients are multiple DCT coefficients, and the first context information includes part or all of the multiple DCT coefficients; or, if the image to be coded is subjected to DCT and quantization to obtain multiple coefficients, the multiple coefficients are multiple quantized DCT coefficients, and the first context information includes part or all of the multiple quantized DCT coefficients; or, if feature extraction is performed on the image to be coded to obtain multiple coefficients, the
  • the first probability estimation result is obtained according to the context information of the first coefficient, including:
  • Obtain the second probability distribution model; input the first context information into the third probability estimation network for processing to obtain the parameters of the second probability distribution model; and obtain the first probability estimation result according to the second probability distribution model and its parameters;
  • the first context information is input into the fourth probability estimation network for processing to obtain the first probability estimation result; wherein the third probability estimation network and the fourth probability estimation network are implemented by neural networks.
  • the present application relates to a method for decoding video images.
  • the method is performed by a decoding device, and the method includes:
  • the decoding method also includes:
  • the first probability estimation result is obtained from decoding the compressed code stream, including:
  • the preset area includes the first estimated coefficient, the preset area is an area in the first reconstructed image, and the probability estimation result of the preset area is determined from the probability estimation result set according to the index; the first probability estimation result is the probability estimation result of the preset area; wherein the first identifier taking the first value indicates that all estimated coefficients in the preset area are sampled using the probability estimation result of the preset area.
  • the decoding method also includes:
  • the first estimated coefficient and the second estimated coefficient belong to the same preset area, and the preset area is an area in the first reconstructed image, and the decoding method further includes:
  • the first probability estimation result includes the mean and variance of the Gaussian distribution, and sampling is performed according to the first probability estimation result to obtain the first estimated coefficient, including:
  • the decoding method also includes:
  • determining the first estimated coefficient according to the first reference value and the mean value and variance of the first probability estimation result, including:
  • the first estimation coefficient is determined according to the first reference value, the mean value of the first probability estimation result and the processed variance.
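  • The following Python sketch illustrates one way an estimated coefficient could be determined from a reference value, the mean and the processed variance. The passage does not spell out how the first reference value is produced or combined; the sketch assumes it is drawn from a standard normal distribution and combined as mean + sqrt(variance) × reference. The function names are hypothetical.

```python
import math
import random


def estimate_coefficient(reference: float, mean: float, variance: float) -> float:
    """Combine a reference value with a Gaussian mean and processed variance.

    Assumed rule: coefficient = mean + sqrt(variance) * reference.
    """
    if variance < 0:
        raise ValueError("variance must be non-negative")
    return mean + math.sqrt(variance) * reference


def sample_coefficient(mean: float, variance: float, rng: random.Random) -> float:
    """Draw the reference value from N(0, 1) (an assumption) and combine it."""
    reference = rng.gauss(0.0, 1.0)
    return estimate_coefficient(reference, mean, variance)


rng = random.Random(0)                      # fixed seed, for repeatability only
coefficient = sample_coefficient(3.0, 4.0, rng)
collapsed = sample_coefficient(3.0, 0.0, rng)   # zero variance returns the mean
```

  • With a processed variance of 0 the reference value has no effect and the estimated coefficient equals the mean.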
  • the variance of the first probability estimation result is preprocessed to obtain the processed variance, including:
  • the first estimated coefficient is a quantized wavelet coefficient, or, a wavelet coefficient, or a quantized DCT coefficient, or a DCT coefficient, or a feature coefficient, or a quantized feature coefficient
  • the variance of the first probability distribution is preprocessed to obtain the processed variance, including:
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or,
  • preprocessing the probability estimation result includes: determining the scaling factor of the first estimated coefficient according to the content information of the preset area to which the first estimated coefficient belongs, and preprocessing the variance of the Gaussian distribution according to the scaling factor to obtain the processed variance.
  • the content information of the preset area includes texture resolution level or texture complexity of the preset area.
  • the complexity of the texture can be calculated.
  • the resolution level is considered to be high, and a preset area with smooth texture is considered to have a low resolution level.
  • the first estimated coefficient and the second estimated coefficient, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; for the first estimated coefficient and the second estimated coefficient belonging to a preset area with a low resolution level, the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient.
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; and for the first estimated coefficient and the second estimated coefficient in the same preset area with low texture complexity, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same.
  • the aforementioned preset area may be an image block, a subband, a frequency band, a transform block or a channel as mentioned below.
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or, if the first estimated coefficient and the second estimated coefficient belong to different subbands, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs;
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or, if the first estimated coefficient and the second estimated coefficient belong to different frequency bands or transform blocks, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the frequency band or transform block to which the first estimated coefficient belongs;
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or, if the first estimated coefficient and the second estimated coefficient belong to different channels, the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
  • the first estimated coefficient and the second estimated coefficient are pixel values
  • the variance of the first probability estimation result is preprocessed to obtain the processed variance, including:
  • the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs.
  • the texture complexity of the image block to which the first estimated coefficient belongs can be determined according to the content of the image block in the first reconstructed image or the second reconstructed image; the texture complexity of the subband to which the first estimated coefficient belongs can be determined according to the content of the part of the first reconstructed image or the second reconstructed image that corresponds to the subband; the texture complexity of the frequency band to which the first estimated coefficient belongs can be determined according to the content of the part of the first reconstructed image or the second reconstructed image that corresponds to the frequency band; the texture complexity of the channel to which the first estimated coefficient belongs can be determined according to the content of the part of the first reconstructed image or the second reconstructed image that corresponds to the channel. The larger the texture complexity of the first estimated coefficient is, the larger the scaling factor of the first estimated coefficient is.
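  • The relationship between texture complexity and scaling factor can be sketched as follows. The text does not define how texture complexity is calculated, so the measure used here (mean absolute difference between neighbouring pixels) and the saturating linear mapping with a hypothetical `saturation` parameter are assumptions chosen only so that a larger complexity yields a larger scaling factor.

```python
from typing import List, Sequence


def texture_complexity(block: Sequence[Sequence[float]]) -> float:
    """Mean absolute difference between horizontal and vertical neighbours.

    This measure is an illustrative assumption, not taken from the source.
    """
    diffs: List[float] = []
    for r, row in enumerate(block):
        for c, value in enumerate(row):
            if c + 1 < len(row):
                diffs.append(abs(value - row[c + 1]))
            if r + 1 < len(block):
                diffs.append(abs(value - block[r + 1][c]))
    return sum(diffs) / len(diffs) if diffs else 0.0


def scaling_factor(complexity: float, saturation: float = 32.0) -> float:
    """Monotonically non-decreasing map from complexity to a factor in [0, 1]."""
    return min(complexity / saturation, 1.0)


smooth_block = [[10, 10], [10, 10]]
gentle_block = [[0, 8], [8, 16]]
busy_block = [[0, 64], [64, 0]]
factors = [scaling_factor(texture_complexity(b))
           for b in (smooth_block, gentle_block, busy_block)]
```

  • In this sketch all coefficients of one block share the factor computed for that block, while blocks with different content receive different factors.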
  • the first reconstructed image is obtained according to the first estimated coefficient and the second estimated coefficient, including:
  • if the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients, inverse quantization and inverse wavelet transform are performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are wavelet coefficients, inverse wavelet transform is performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients, inverse quantization and inverse DCT are performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are DCT coefficients, inverse DCT is performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image.
  • reconstructed images with different properties can be obtained according to the user's needs, which improves the quality of the reconstructed images. For example, if the variance of the probability estimation result is set to 0 as the processed variance, the reconstructed image with the best signal quality (best objective quality) can be obtained, that is, the PSNR of the image is increased or, equivalently, its MSE is reduced; if the scaling factors are set to the same value, the image with the best subjective quality can be obtained, at the cost of a lower PSNR or a higher MSE; and if the scaling factors of data belonging to the same part of the image are set to the same value while the scaling factors of data belonging to different parts are set to different values, an image whose properties lie between the best subjective quality and the best objective quality can be obtained.
  • the decoding method also includes:
  • a plurality of reconstruction coefficients are obtained by decoding the compressed code stream; and a second reconstruction image is obtained according to the plurality of reconstruction coefficients.
  • the second reconstructed image is obtained according to the plurality of reconstruction coefficients, including:
  • if the multiple reconstruction coefficients are quantized wavelet coefficients, inverse quantization and inverse wavelet transform are performed on the multiple reconstruction coefficients to obtain the second reconstructed image; or, if the multiple reconstruction coefficients are wavelet coefficients, inverse wavelet transform is performed on the multiple reconstruction coefficients to obtain the second reconstructed image; or, if the multiple reconstruction coefficients are quantized DCT coefficients, inverse quantization and inverse DCT are performed on the multiple reconstruction coefficients to obtain the second reconstructed image; or, if the multiple reconstruction coefficients are DCT coefficients, inverse DCT is performed on the multiple reconstruction coefficients to obtain the second reconstructed image.
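  • The "inverse quantization followed by inverse DCT" path can be illustrated with a one-dimensional Python sketch. It is a simplification under stated assumptions: a single row of samples instead of a 2-D image, an orthonormal DCT-II with its matching inverse, and a uniform quantization step. The function names are hypothetical and the codec's actual transform and quantizer may differ.

```python
import math
from typing import List, Sequence


def dct_1d(x: Sequence[float]) -> List[float]:
    """Orthonormal DCT-II of one row of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_1d(coeffs: Sequence[float]) -> List[float]:
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1.0 / n)
        s += sum(math.sqrt(2.0 / n) * coeffs[k]
                 * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out


def dequantize(levels: Sequence[int], step: float) -> List[float]:
    """Uniform inverse quantization (assumed quantizer)."""
    return [level * step for level in levels]


def reconstruct_row(levels: Sequence[int], step: float) -> List[float]:
    """Inverse quantization followed by inverse DCT."""
    return idct_1d(dequantize(levels, step))


row = [52.0, 55.0, 61.0, 66.0]
round_trip = idct_1d(dct_1d(row))            # unquantized DCT coefficients
flat_row = reconstruct_row([4, 0, 0, 0], 2.0)  # quantized DCT coefficients
```

  • When the coefficients are not quantized, only the inverse transform is applied, which corresponds to the branches of the text that omit inverse quantization.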
  • the sampling step can be repeated in the present application to obtain multiple reconstructed images.
  • the multiple reconstructed images may be the reconstructed images with the best subjective quality, or the reconstructed images with the best objective quality.
  • the reconstructed image can be used in the codec loop as a reference for intra-frame or inter-frame prediction; it can also be used outside the codec loop to optimize image quality as a post-processing method.
  • the reconstructed image with the best subjective quality is put into the decoded picture buffer (DPB) or the reference frame set, which is used to encode and decode the frame in the loop
  • the present application relates to a video image-based encoding device, and the beneficial effects may refer to the description of the first aspect or the second aspect, which will not be repeated here.
  • the coding device has the function of realizing the behavior in the method example of the first aspect or the second aspect above.
  • the functions described above may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the present application relates to a video image-based decoding device, and the beneficial effects may refer to the description of the third aspect and will not be repeated here.
  • the decoding device has the function of realizing the behavior in the method example of the third aspect above.
  • the functions described above may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the method described in the first aspect or the second aspect of the present application may be executed by the device described in the fourth aspect of the present application.
  • Other features and implementations of the method described in the first or second aspect of the present application directly depend on the functionality and implementation of the device described in the fourth aspect of the present application.
  • the method described in the third aspect of the present application can be executed by the device described in the fifth aspect of the present application.
  • Other features and implementations of the method described in the third aspect of the application depend directly on the functionality and implementations of the device described in the fifth aspect of the application.
  • the present application relates to an apparatus for encoding a video stream, including a processor and a memory.
  • the memory stores instructions, and the instructions cause the processor to execute the method described in the first aspect or the second aspect.
  • the present application relates to an apparatus for decoding a video stream, including a processor and a memory.
  • the memory stores instructions, and the instructions cause the processor to execute the method described in the third aspect.
  • a computer readable storage medium having stored thereon instructions which, when executed, cause one or more processors to encode video data.
  • the instructions cause the one or more processors to execute the method in the first, second, or third aspect, or any possible embodiment of the first, second, or third aspect.
  • the present application relates to a computer program product including program code which, when run, executes the method in the first, second, or third aspect or in any possible embodiment of the first, second, or third aspect.
  • FIG. 1 is a block diagram of an example of a video decoding system for implementing an embodiment of the present application
  • FIG. 2 is a block diagram of another example of a video decoding system for implementing an embodiment of the present application
  • FIG. 3 is a schematic block diagram of a video decoding device for implementing an embodiment of the present application
  • FIG. 4 is a schematic block diagram of a video decoding device for implementing an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a video encoding and decoding device provided in an embodiment of the present application.
  • FIG. 6a is a schematic diagram of the results after a wavelet transformation
  • FIG. 6b is a schematic diagram of the first context information and the second context information of the first data
  • FIG. 6c is a schematic diagram of the first context information and the second context information of the first preset area
  • FIG. 6d is a schematic structural diagram of a probability estimation network provided by an embodiment of the present application.
  • FIG. 6e is a schematic structural diagram of a residual network provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a video codec provided in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an encoding process provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of another encoding process provided by the embodiment of the present application.
  • FIG. 10 is a schematic diagram of a decoding process provided by an embodiment of the present application.
  • the embodiment of the present application provides an AI-based video image compression technology, in particular a neural network-based video compression technology, and specifically provides a decoding method based on probability distribution and sampling to improve the traditional hybrid video codec system.
  • Video coding generally refers to the processing of sequences of images that form a video or video sequence.
  • the terms "picture", "frame" or "image" may be used as synonyms.
  • Video coding (or coding in general) includes two parts: video encoding and video decoding.
  • Video encoding is performed on the source side and typically involves processing (e.g., compressing) raw video images to reduce the amount of data needed to represent the video images (and thus to store and/or transmit them more efficiently).
  • Video decoding is performed at the destination and typically involves inverse processing relative to the encoder to reconstruct the video image.
  • the "coding" of video images (or images in general) involved in the embodiments should be understood as "encoding" or "decoding" of video images or video sequences.
  • the encoding part and the decoding part are also collectively referred to as codec (encoding and decoding, CODEC).
  • In the case of lossless video coding, the original video image can be reconstructed, i.e., the reconstructed video image has the same quality as the original video image (assuming no transmission loss or other data loss during storage or transmission).
  • In the case of lossy video coding, further compression is performed by quantization, etc., to reduce the amount of data required to represent the video image, and the decoder side cannot completely reconstruct the video image, that is, the quality of the reconstructed video image is lower or worse than that of the original video image.
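  • The loss introduced by quantization can be shown with a small Python sketch. The uniform quantizer with a single step size is an illustrative assumption, and the function names are hypothetical.

```python
from typing import List, Sequence


def quantize(values: Sequence[float], step: float) -> List[int]:
    """Map each value to the index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]


def dequantize(levels: Sequence[int], step: float) -> List[float]:
    """Map each index back to a representative value."""
    return [level * step for level in levels]


samples = [3, 7, 12, 18]

# A coarse step needs fewer distinct levels but cannot restore the input.
coarse = dequantize(quantize(samples, 4), 4)
errors = [abs(a - b) for a, b in zip(samples, coarse)]

# A step of 1 on integer samples is lossless in this toy example.
exact = dequantize(quantize(samples, 1), 1)
```

  • A larger step reduces the amount of data needed to represent the samples and increases the reconstruction error, which is the trade-off the paragraph above describes.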
  • the neural network can be composed of neural units, and the neural unit can refer to an operation unit that takes $x_s$ and an intercept of 1 as input, and the output of the operation unit can be: $$f\left(\sum_{s=1}^{n} W_s x_s + b\right)$$
  • where n is the number of inputs, $W_s$ is the weight of $x_s$,
  • b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
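  • A single neural unit as described above (weights Ws, bias b, activation function f) can be sketched in Python as follows. The function names are hypothetical, and the sigmoid is used because the text names it as one possible activation function.

```python
import math
from typing import Callable, Sequence


def sigmoid(z: float) -> float:
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(xs: Sequence[float], ws: Sequence[float], b: float,
                f: Callable[[float], float] = sigmoid) -> float:
    """Output f(sum_s Ws * xs + b) of one neural unit."""
    if len(xs) != len(ws):
        raise ValueError("inputs and weights must have the same length")
    return f(sum(w * x for w, x in zip(ws, xs)) + b)


# The weighted sum is 0.5 * 1.0 - 0.25 * 2.0 + 0.0 = 0.0, and sigmoid(0) = 0.5.
output = neural_unit([1.0, 2.0], [0.5, -0.25], 0.0)
```

  • The output of such a unit can then serve as one of the inputs of a unit in the next layer.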
  • a neural network is a network formed by connecting multiple above-mentioned single neural units, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field.
  • the local receptive field can be an area composed of several neural units.
  • A deep neural network (DNN), also known as a multi-layer neural network, can be understood as a neural network with multiple hidden layers.
  • According to their positions, the layers inside a DNN can be divided into three categories: input layer, hidden layer, and output layer. Generally, the first layer is the input layer, the last layer is the output layer, and the layers in between are all hidden layers.
  • the layers are fully connected, that is, any neuron in the i-th layer must be connected to any neuron in the i+1-th layer.
  • Although a DNN looks complicated, the work of each layer is actually not complicated.
  • In short, each layer is the following linear relationship expression: $\vec{y} = a(W\vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, W is the weight matrix (also called the coefficients), and a() is the activation function.
  • Each layer simply applies this operation to the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has many layers, the number of coefficients W and offset vectors $\vec{b}$ is also large.
  • These parameters are defined in a DNN as follows, taking the coefficient W as an example: assume that in a three-layer DNN, the linear coefficient from the fourth neuron of the second layer to the second neuron of the third layer is defined as $W^{3}_{24}$. The superscript 3 represents the layer number of the coefficient W, and the subscripts correspond to the output index 2 in the third layer and the input index 4 in the second layer.
  • In general, the coefficient from the k-th neuron of layer L-1 to the j-th neuron of layer L is defined as $W^{L}_{jk}$.
  • the input layer has no W parameter.
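  • The per-layer operation (weight matrix W, offset vector, activation function a()) and the index convention for W can be sketched in Python as follows. The helper names are hypothetical, and ReLU is used only as an example activation: the text does not fix a particular one for DNN layers.

```python
from typing import Callable, List, Sequence


def relu(z: float) -> float:
    """Example activation function (an illustrative choice)."""
    return max(0.0, z)


def layer_forward(W: Sequence[Sequence[float]], b: Sequence[float],
                  x: Sequence[float],
                  a: Callable[[float], float] = relu) -> List[float]:
    """Compute y = a(W x + b) for one fully connected layer.

    W[j][k] is the coefficient from input neuron k of the previous layer
    to output neuron j of this layer, matching the jk subscript order.
    """
    return [a(sum(W[j][k] * x[k] for k in range(len(x))) + b[j])
            for j in range(len(W))]


W = [[1.0, -1.0],
     [0.5, 0.5]]
b = [0.0, 1.0]
x = [2.0, 3.0]
y = layer_forward(W, b, x)   # two outputs, one per row of W
```

  • Stacking several such calls, each with its own W and offset vector, gives the forward pass of a multi-layer network; the input layer itself has no W, as noted above.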
  • more hidden layers make the network more capable of describing complex situations in the real world. Theoretically speaking, a model with more parameters has a higher complexity and a greater "capacity", which means that it can complete more complex learning tasks.
  • Training the deep neural network is the process of learning the weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (the weight matrix formed by the vector W of many layers).
  • Convolutional neural network is a deep neural network with a convolutional structure.
  • the convolutional neural network contains a feature extractor composed of a convolutional layer and a subsampling layer, which can be regarded as a filter.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • In a convolutional layer of a convolutional neural network, a neuron may be connected to only some of the neurons in the adjacent layer.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units of the same feature plane share weights, and the shared weights here are convolution kernels.
  • Shared weights can be understood as a way to extract image information that is independent of location.
  • the convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training process of the convolutional neural network.
  • the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
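  • Weight sharing can be illustrated with a small Python sketch of a "valid" 2-D convolution, in which the same kernel weights are applied at every image position. The function name is hypothetical, and, as is common in CNN implementations, the kernel is applied without flipping (cross-correlation).

```python
from typing import List, Sequence


def conv2d_valid(image: Sequence[Sequence[float]],
                 kernel: Sequence[Sequence[float]]) -> List[List[float]]:
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]          # 4 shared weights
feature_map = conv2d_valid(image, kernel)
```

  • Here four shared weights produce all four outputs, whereas a fully connected mapping from the 9 inputs to the 4 outputs would need 36 weights, which shows how sharing reduces the number of connections.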
  • Recurrent neural networks (RNN) are used to process sequence data.
  • the layers are fully connected, while the nodes within each layer are not connected to one another.
  • Although this ordinary neural network solves many problems, it is still powerless against many others. For example, to predict the next word in a sentence, the previous words are generally needed, because the words in a sentence are not independent of one another. An RNN is called a recurrent neural network because the current output of a sequence is also related to the previous outputs.
  • RNN can process sequence data of any length.
  • the training of RNN is the same as that of traditional CNN or DNN.
  • RNN is designed to allow machines to have the ability to remember like humans. Therefore, the output of RNN needs to depend on the current input information and historical memory information.
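  • The dependence of an RNN output on both the current input and the stored memory can be sketched in Python with a scalar hidden state. The simple tanh recurrence and the parameter names `w_x`, `w_h` and `b` are assumptions made for illustration; the text does not specify a particular RNN cell.

```python
import math
from typing import List, Sequence


def rnn_step(x: float, h_prev: float, w_x: float, w_h: float, b: float) -> float:
    """New state from the current input x and the previous state h_prev."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs: Sequence[float], w_x: float, w_h: float, b: float,
            h0: float = 0.0) -> List[float]:
    """Process a sequence of any length, carrying the state forward."""
    h = h0
    outputs = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        outputs.append(h)
    return outputs


# The second input is 0, yet the second output is non-zero with memory.
with_memory = run_rnn([1.0, 0.0], w_x=1.0, w_h=1.0, b=0.0)
# Setting w_h to 0 removes the memory, so the second output becomes 0.
without_memory = run_rnn([1.0, 0.0], w_x=1.0, w_h=0.0, b=0.0)
```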
  • the neural network can use the error back propagation (back propagation, BP) algorithm to correct the size of the parameters in the initial neural network model during the training process, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, passing the input signal forward until the output will generate an error loss, and updating the parameters in the initial neural network model by backpropagating the error loss information, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation movement dominated by error loss, aiming to obtain the optimal parameters of the neural network model, such as the weight matrix.
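  • The forward pass, error loss and parameter update described above can be sketched in Python for the smallest possible model, a single linear unit trained with a squared-error loss. The learning rate, epoch count and training data are illustrative assumptions, and a real network would propagate the error through many layers.

```python
from typing import List, Tuple


def train_linear_unit(samples: List[Tuple[float, float]],
                      lr: float = 0.1, epochs: int = 200) -> Tuple[float, float]:
    """Fit y = w * x + b by propagating the error back to w and b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = w * x + b          # forward pass
            err = y - target       # d(loss)/dy for loss = 0.5 * err ** 2
            w -= lr * err * x      # d(loss)/dw = err * x
            b -= lr * err          # d(loss)/db = err
    return w, b


# Training data generated from y = 2 * x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_linear_unit(data)
```

  • With these settings the parameters approach w = 2 and b = 1, so the error loss converges as the paragraph above describes.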
  • the encoder 20 and the decoder 30 are described with reference to FIGS. 1-3.
  • FIG. 1 is a schematic block diagram of an exemplary coding system 10, such as a video coding system 10 (or simply the coding system 10), which may utilize the techniques of the present application.
  • Video encoder 20 (or simply encoder 20) and video decoder 30 (or simply decoder 30) in the video coding system 10 represent devices that may be used to perform techniques according to various examples described in this application.
  • the coding system 10 includes a source device 12 for providing coded image data 21, such as coded images, to a destination device 14 for decoding the coded image data 21.
  • the source device 12 includes an encoder 20 , and optionally, an image source 16 , a preprocessor (or a preprocessing unit) 18 such as an image preprocessor, and a communication interface (or a communication unit) 22 .
  • Image source 16 may include or be any type of image capture device for capturing real-world images, and/or any type of image generation device, such as a computer graphics processor, or any type of device for acquiring and/or providing real-world images, computer-generated images (e.g., screen content or virtual reality (VR) images), and/or any combination thereof (e.g., augmented reality (AR) images).
  • The image source may be any type of memory or storage that stores any of the above images.
  • the image (or image data) 17 may also be referred to as an original image (or original image data) 17 .
  • the preprocessor 18 is used to receive (original) image data 17 and perform preprocessing on the image data 17 to obtain a preprocessed image (or preprocessed image data) 19 .
  • preprocessing performed by preprocessor 18 may include cropping, color format conversion (e.g., from RGB to YCbCr), color grading, or denoising. It can be understood that the preprocessing unit 18 can be an optional component.
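  • As an example of the color format conversion mentioned above, the following Python sketch converts one RGB pixel to YCbCr. The text does not say which conversion matrix the preprocessor uses; the full-range BT.601 coefficients (as used in JFIF) are assumed here, and the function name is hypothetical.

```python
from typing import Tuple


def rgb_to_ycbcr(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Full-range BT.601 conversion for 8-bit samples in [0, 255]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


white = rgb_to_ycbcr(255, 255, 255)   # neutral colour: chroma stays at 128
black = rgb_to_ycbcr(0, 0, 0)
```

  • Neutral colours map to the mid-point chroma value 128, so only the luma channel carries information for grey content.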
  • a video encoder (or encoder) 20 is used to receive preprocessed image data 19 and provide encoded image data 21 (to be further described below with reference to FIG. 2 etc.).
  • the communication interface 22 in the source device 12 may be used to receive the encoded image data 21 and send the encoded image data 21 (or any other processed version) via the communication channel 13 to another device, such as the destination device 14, or to any other device, for storage or direct reconstruction.
  • the destination device 14 includes a decoder 30 , and may also optionally include a communication interface (or communication unit) 28 , a post-processor (or post-processing unit) 32 and a display device 34 .
  • the communication interface 28 in the destination device 14 is used to receive the coded image data 21 (or any other processed version) directly from the source device 12 or from any other source device such as a storage device, for example a coded image data storage device, and to provide the coded image data 21 to the decoder 30.
  • the communication interface 22 and the communication interface 28 can be used to send or receive coded image data (or coded data) 21 through a direct communication link between the source device 12 and the destination device 14, such as a direct wired or wireless connection, or through any type of network, such as a wired network, a wireless network, or any combination thereof, or any type of private network and public network or any combination thereof.
  • the communication interface 22 can be used to encapsulate the encoded image data 21 into a suitable format such as a message, and/or to process the encoded image data using any type of transmission encoding or processing, for transmission over a communication link or communication network.
  • the communication interface 28 corresponds to the communication interface 22 and, e.g., can be used to receive the transmitted data and process it using any type of corresponding transmission decoding or processing and/or decapsulation to obtain the encoded image data 21.
  • Both the communication interface 22 and the communication interface 28 can be configured as a one-way communication interface, as indicated by the arrow of the communication channel 13 pointing from the source device 12 to the destination device 14 in FIG. 1, or as a two-way communication interface, and can be used to send and receive messages, etc., to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or data transmission, such as the transmission of encoded image data.
  • the video decoder (or decoder) 30 is used to receive encoded image data 21 and provide decoded image data 31 (which will be further described below with reference to FIG. 3 , etc.).
  • the post-processor 32 is used to post-process the decoded image data 31 (also referred to as reconstructed image data) such as the decoded image to obtain post-processed image data 33 such as the post-processed image.
  • Post-processing performed by post-processing unit 32 may include, for example, color format conversion (e.g., from YCbCr to RGB), color grading, cropping, or resampling, or any other processing for preparing the decoded image data 31 for display by a display device 34 or the like.
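  • As a hedged illustration of the color format conversion mentioned above, the following sketch converts one full-range 8-bit YCbCr sample to RGB using the JFIF (BT.601-derived) constants. The function name `ycbcr_to_rgb` and the choice of full-range JFIF constants are assumptions; the application does not specify a conversion matrix.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range 8-bit YCbCr sample to an (R, G, B) tuple."""

    def clamp(value):
        # Round to the nearest integer and clip to the 8-bit range.
        return max(0, min(255, int(round(value))))

    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)
```

  • With neutral chroma (Cb = Cr = 128) the three output channels equal the luma value, which gives a quick sanity check for the constants.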
  • a display device 34 is used to receive the post-processed image data 33 to display the image to a user or viewer or the like.
  • Display device 34 may be or include any type of display for representing the reconstructed image, e.g., an integrated or external display screen or display.
  • the display screen may include a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display.
  • the decoding system 10 also includes a training engine 25.
  • the specific training process implemented by the training engine 25 can be found in the subsequent description and will not be described here.
  • Although FIG. 1 shows the source device 12 and the destination device 14 as independent devices, a device embodiment may also include both the source device 12 and the destination device 14, or the functions of both, that is, the source device 12 or corresponding function and the destination device 14 or corresponding function. In such embodiments, the source device 12 or corresponding function and the destination device 14 or corresponding function may be implemented using the same hardware and/or software, by separate hardware and/or software, or by any combination thereof.
  • Encoder 20 (e.g., video encoder 20), decoder 30 (e.g., video decoder 30), or both may be implemented by processing circuitry such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, dedicated video encoding processors, or any combination thereof.
  • Encoder 20 may be implemented by processing circuitry 46 to include the various modules discussed with reference to encoder 20 of FIG. 2 and/or any other encoder system or subsystem described herein.
  • Decoder 30 may be implemented by processing circuitry 46 to include the various modules discussed with reference to decoder 30 of FIG.
  • the processing circuitry 46 may be used to perform various operations discussed below. As shown in Figure 4, if part of the technology is implemented in software, the device can store the instructions of the software in a suitable non-transitory computer-readable storage medium and use one or more processors to execute the instructions in hardware, thereby performing the techniques of the present invention.
  • One of the video encoder 20 and the video decoder 30 may be integrated in a single device as part of a combined codec (encoder/decoder, CODEC), as shown in FIG. 2 .
  • Source device 12 and destination device 14 may comprise any of a variety of devices, including any type of handheld or stationary device, such as a notebook or laptop computer, cell phone, smartphone, tablet or tablet computer, camera, desktop computer, set-top box, television, display device, digital media player, video game console, video streaming device (such as a content service server or content distribution server), broadcast receiving device, or broadcast transmitting device, and may use no operating system or any type of operating system.
  • source device 12 and destination device 14 may be equipped with components for wireless communication. Accordingly, source device 12 and destination device 14 may be wireless communication devices.
  • the video coding system 10 shown in FIG. 1 is merely exemplary, and the techniques provided herein are applicable to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device.
  • data is retrieved from local storage, sent over a network, and so on.
  • a video encoding device may encode and store data into memory, and/or a video decoding device may retrieve and decode data from memory.
  • encoding and decoding are performed by devices that do not communicate with each other but simply encode data to memory and/or retrieve and decode data from memory.
  • FIG. 2 is an illustrative diagram of an example of a video coding system 40 including video encoder 20 of FIG. 2 and/or video decoder 30 of FIG. 3, according to an example embodiment.
  • the video decoding system 40 may include an imaging device 41, a video encoder 20, a video decoder 30 (and/or a video encoder/decoder implemented by a processing circuit 46), an antenna 42, one or more processors 43, one or more memory stores 44 and/or a display device 45 .
  • imaging device 41 , antenna 42 , processing circuit 46 , video encoder 20 , video decoder 30 , processor 43 , memory storage 44 and/or display device 45 are capable of communicating with each other.
  • the video coding system 40 may include only the video encoder 20 or only the video decoder 30 .
  • antenna 42 may be used to transmit or receive an encoded bitstream of video data.
  • display device 45 may be used to present video data.
  • the processing circuit 46 may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like.
  • the video decoding system 40 may also include an optional processor 43, and the optional processor 43 may similarly include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like.
  • the memory storage 44 can be any type of memory, such as volatile memory (for example, static random access memory (SRAM), dynamic random access memory (DRAM), etc.) or non-volatile memory (for example, flash memory, etc.).
  • memory storage 44 may be implemented by cache memory.
  • processing circuitry 46 may include memory (e.g., cache, etc.) for implementing an image buffer or the like.
  • video encoder 20 implemented by logic circuitry may include an image buffer (e.g., implemented by processing circuitry 46 or memory storage 44 ) and a graphics processing unit (e.g., implemented by processing circuitry 46 ).
  • a graphics processing unit may be communicatively coupled to the image buffer.
  • Graphics processing unit may include video encoder 20 implemented by processing circuitry 46 to implement the various modules discussed with reference to FIG. 2 and/or any other encoder system or subsystem described herein.
  • Logic circuits may be used to perform the various operations discussed herein.
  • video decoder 30 may be implemented by processing circuitry 46 in a similar manner to implement the various modules discussed with reference to video decoder 30 of FIG. 3 and/or any other decoder system or subsystem described herein.
  • video decoder 30 implemented by logic circuitry may include an image buffer (e.g., implemented by processing circuit 46 or memory storage 44 ) and a graphics processing unit (e.g., implemented by processing circuit 46 ).
  • a graphics processing unit may be communicatively coupled to the image buffer.
  • Graphics processing unit may include video decoder 30 implemented by processing circuitry 46 to implement the various modules discussed with reference to FIG. 3 and/or any other decoder system or subsystem described herein.
  • antenna 42 may be used to receive an encoded bitstream of video data.
  • an encoded bitstream may contain data related to encoded video frames, indicators, index values, mode selection data, etc., as discussed herein, such as data related to encoding partitions (e.g., transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the encoding partitions).
  • Video coding system 40 may also include video decoder 30 coupled to antenna 42 and used to decode the encoded bitstream.
  • a display device 45 is used to present video frames.
  • the video decoder 30 may be used to perform a reverse process.
  • the video decoder 30 may be configured to receive and parse such syntax elements and decode the associated video data accordingly.
  • video encoder 20 may entropy encode the syntax elements into an encoded video bitstream.
  • video decoder 30 may parse such syntax elements and decode the related video data accordingly.
  • VVC: Versatile Video Coding
  • VCEG: Video Coding Experts Group
  • MPEG: Moving Picture Experts Group
  • HEVC: High-Efficiency Video Coding
  • JCT-VC: Joint Collaborative Team on Video Coding
  • FIG. 3 is a schematic diagram of a video decoding device 300 provided by an embodiment of the present invention.
  • the video coding apparatus 300 is suitable for implementing the disclosed embodiments described herein.
  • the video decoding device 300 may be a decoder, such as the video decoder 30 in FIG. 1 , or an encoder, such as the video encoder 20 in FIG. 1 .
  • the video decoding device 300 includes: an ingress port 310 (or input port 310) and a receiving unit (receiver unit, Rx) 320 for receiving data; a processor, logic unit, or central processing unit (CPU) 330 for processing data, where the processor 330 may be, for example, a neural network processor 330; a sending unit (transmitter unit, Tx) 340 and an egress port 350 (or output port 350) for transmitting data; and a memory 360.
  • the video decoding device 300 may also include an optical-to-electrical (OE) component and an electrical-to-optical (EO) component coupled to the input port 310, the receiving unit 320, the transmitting unit 340 and the output port 350, for the egress or ingress of optical or electrical signals.
  • the processor 330 is realized by hardware and software.
  • Processor 330 may be implemented as one or more processor chips, cores (e.g., multi-core processors), FPGAs, ASICs, and DSPs.
  • Processor 330 is in communication with ingress port 310 , receiving unit 320 , transmitting unit 340 , egress port 350 and memory 360 .
  • the processor 330 includes a decoding module 370 (e.g., a neural network (NN) based decoding module 370 ).
  • the decoding module 370 implements the embodiments disclosed above. For example, the decode module 370 performs, processes, prepares, or provides for various encoding operations.
  • decoding module 370 is implemented as instructions stored in memory 360 and executed by processor 330 .
  • Memory 360 includes one or more magnetic disks, tape drives, and solid-state drives, and may be used as an overflow data storage device for storing programs when such programs are selected for execution, and for storing instructions and data that are read during program execution.
  • the memory 360 can be volatile and/or nonvolatile, and can be a read-only memory (ROM), a random access memory (RAM), a ternary content-addressable memory (TCAM) and/or a static random-access memory (SRAM).
  • FIG. 4 is a simplified block diagram of an apparatus 400 provided by an exemplary embodiment.
  • the apparatus 400 may be used as either or both of the source device 12 and the destination device 14 in FIG. 1 .
  • Processor 402 in apparatus 400 may be a central processing unit.
  • processor 402 may be any other type of device or devices, existing or to be developed in the future, capable of manipulating or processing information. While the disclosed implementations can be implemented using a single processor, such as processor 402 as shown, using more than one processor can be faster and more efficient.
  • memory 404 in apparatus 400 may be a read only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device may be used as memory 404 .
  • Memory 404 may include code and data 406 accessed by processor 402 via bus 412 .
  • Memory 404 may also include an operating system 408 and application programs 410, including at least one program that allows processor 402 to perform the methods described herein.
  • application programs 410 may include applications 1 through N, and also include a video coding application that performs the methods described herein.
  • Apparatus 400 may also include one or more output devices, such as display 418 .
  • display 418 may be a touch-sensitive display that combines the display with touch-sensitive elements that may be used to sense touch input.
  • Display 418 may be coupled to processor 402 via bus 412 .
  • Although bus 412 in apparatus 400 is described herein as a single bus, bus 412 may include multiple buses. Additionally, secondary storage may be directly coupled to other components of apparatus 400 or accessed over a network, and may include a single integrated unit such as a memory card or multiple units such as multiple memory cards. Accordingly, apparatus 400 may have a wide variety of configurations.
  • FIG. 5 is a schematic block diagram of an example of a video codec for implementing the technology of the present application.
  • the video encoder 20 includes an encoding unit 501 , a forward transform unit 502 and a probability estimation unit 503 ;
  • the video decoder 30 includes a decoding unit 504 , a sampling unit 505 and an inverse transform unit 506 .
  • the video codec shown in FIG. 5 may also be referred to as an end-to-end video codec or a video codec based on end-to-end video coding.
  • the encoding unit 501 performs image encoding on the image to be encoded to obtain a compressed code stream.
  • the above-mentioned image encoding may be a joint photographic experts group (JPEG) encoding method, a JPEG2000 encoding method, an H.264 intra-frame encoding method, an H.265 intra-frame encoding method, an H.266 intra-frame encoding method, or another image encoding method.
  • the forward transformation unit 502 is used to transform the first image to obtain a first transformed image.
  • the first image is an image to be encoded or an image that has been decoded.
  • the forward transformation unit 502 is further configured to transform the second image to obtain a second transformed image.
  • the second image is an image to be encoded or a decoded image
  • the first image is different from the second image
  • N wavelet transforms are performed on the first image to obtain 3N+1 subbands; each subband includes one or more wavelet coefficients, and N is an integer greater than 0.
  • the wavelet transform method may be a traditional wavelet transform or a deep network-based wavelet transform or other similar transform methods, which are not limited here.
  • the difference between the deep network-based wavelet transform method and the traditional wavelet transform lies in that the transformation and prediction are implemented using the deep network-based method, and the specific implementation method of the deep network is not limited here.
  • the image composed of subbands obtained by performing wavelet transform on the first image is the above-mentioned first transformed image.
  • the image composed of subbands obtained by performing wavelet transform on the second image is the above-mentioned second transformed image.
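  • To make the 3N+1 subband count concrete, the following minimal sketch applies N levels of a plain averaging Haar transform and collects the resulting subbands. The Haar filter, the function names `haar_level` and `wavelet_decompose`, and the subband labels (level 1 is the first decomposition) are assumptions for illustration; the application allows traditional or deep-network-based wavelet transforms and does not prescribe this one.

```python
def haar_level(img):
    """One 2-D Haar analysis step on a list-of-lists image with even dimensions."""
    h, w = len(img), len(img[0])
    ll = [[0.0] * (w // 2) for _ in range(h // 2)]
    hl = [[0.0] * (w // 2) for _ in range(h // 2)]
    lh = [[0.0] * (w // 2) for _ in range(h // 2)]
    hh = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 4.0  # local average
            hl[i // 2][j // 2] = (a - b + c - d) / 4.0  # horizontal detail
            lh[i // 2][j // 2] = (a + b - c - d) / 4.0  # vertical detail
            hh[i // 2][j // 2] = (a - b - c + d) / 4.0  # diagonal detail
    return ll, hl, lh, hh


def wavelet_decompose(img, n):
    """Apply n Haar steps; return a dict of 3*n + 1 named subbands."""
    subbands = {}
    ll = img
    for level in range(1, n + 1):
        # Each step splits the current low-pass band into four new bands.
        ll, hl, lh, hh = haar_level(ll)
        subbands[f"HL{level}"] = hl
        subbands[f"LH{level}"] = lh
        subbands[f"HH{level}"] = hh
    subbands[f"LL{n}"] = ll  # only the final low-pass band is kept
    return subbands
```

  • Each level contributes three detail subbands and only the last low-pass subband is retained, which is where the count 3N+1 comes from.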
  • each wavelet coefficient is quantized to obtain multiple quantized wavelet coefficients.
  • each subband can be processed according to a first preset order, and then the wavelet coefficients in the current subband can be quantized according to a second preset order to obtain quantized wavelet coefficients.
  • the first preset order can be an existing zigzag scanning order, for example: LL1 → HL1 → LH1 → HH1.
  • the second preset order can be an existing zigzag scanning order, horizontal scanning order or vertical scanning order.
  • before each wavelet coefficient is quantized, the wavelet coefficients can be preprocessed, and the preprocessed wavelet coefficients can then be quantized; for example, feature extraction is performed on the obtained wavelet coefficients by a neural network, and the feature extraction results are then quantized. Processing the wavelet coefficients before quantization can enable the decoder to decode and obtain a high-quality first reconstructed image.
  • the image composed of quantized wavelet coefficients obtained by quantizing the wavelet coefficients obtained by performing wavelet transformation on the first image is the above-mentioned first transformed image.
  • the second image is processed in the same manner as the first image: the image formed by the quantized wavelet coefficients obtained by quantizing the wavelet coefficients obtained by performing wavelet transformation on the second image is the second transformed image.
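  • The two-level ordering described above (subbands in one preset order, coefficients inside each subband in another) can be sketched as follows. This is a minimal sketch assuming uniform scalar quantization with a single step size and only the horizontal and vertical scanning orders; the names `scan_positions` and `quantize_subbands` and the `step` parameter are hypothetical.

```python
import math


def scan_positions(height, width, order="horizontal"):
    """Return (row, col) positions of a subband in the chosen scanning order."""
    if order == "horizontal":
        return [(r, c) for r in range(height) for c in range(width)]
    if order == "vertical":
        return [(r, c) for c in range(width) for r in range(height)]
    raise ValueError("unsupported scanning order: " + order)


def quantize_subbands(subbands, subband_order, step, order="horizontal"):
    """Visit subbands in subband_order and quantize each coefficient uniformly."""
    result = []
    for name in subband_order:
        band = subbands[name]
        for r, c in scan_positions(len(band), len(band[0]), order):
            # Round half up to the nearest multiple of the step size.
            level = math.floor(band[r][c] / step + 0.5)
            result.append((name, r, c, level))
    return result
```

  • The returned list preserves the visiting order, so a later probability estimation stage could consume the quantized coefficients in exactly the order in which they were produced.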
  • DCT is performed on the first image to obtain a DCT image; the DCT image includes a plurality of frequency bands, and each frequency band includes one or more DCT coefficients. After the first image is transformed, its low-frequency components are concentrated in the upper left corner and its high-frequency components are distributed in the lower right corner. The coefficient in the first row and first column is the direct current (DC) coefficient, which represents the average value of the first image; the other coefficients are alternating current (AC) coefficients. The DC coefficient and the AC coefficients are collectively referred to as DCT coefficients.
  • block division is performed on the first image to obtain multiple image blocks, and then DCT is performed in units of image blocks to obtain transform blocks.
  • 2) dividing the first image to obtain one or more image blocks, and the size of the image blocks is not limited.
  • the first image may be divided using a quadtree, binary tree or ternary tree division method in an existing encoding standard (H.266, H.265, H.264, AVS2 or AVS3) to obtain one or more image blocks.
  • a frequency band can be understood either as a coefficient block (that is, the block of coefficients obtained by performing DCT on one image block, since DCT is block-based) or as the set of coefficients located at the same position in each coefficient block.
  • the image formed based on the DCT coefficients obtained from the first image is the above-mentioned first transformed image.
  • the second image may also be processed in the above manner to obtain DCT coefficients of the second image, and the DCT coefficients of the second image may constitute the second transformed image.
  • the obtained DCT coefficients are quantized, such as uniformly quantized, to obtain quantized DCT coefficients.
  • the image formed based on the quantized DCT coefficients obtained for the first image is the above-mentioned first transformed image.
  • the image formed based on the quantized DCT coefficients obtained for the second image is the above-mentioned second transformed image.
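  • The block-based DCT and the second reading of "frequency band" (coefficients at the same position in every coefficient block) can be sketched as follows. This is a minimal sketch assuming square blocks, an image whose dimensions are multiples of the block size, and the orthonormal DCT-II; the names `dct_2d`, `split_blocks` and `frequency_band` are hypothetical. With this scaling the DC coefficient of an n×n block equals n times the block mean, so it is proportional to the average value.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of lists."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def split_blocks(img, n):
    """Split an image into a 2-D grid of n x n blocks."""
    return [[[row[c:c + n] for row in img[r:r + n]]
             for c in range(0, len(img[0]), n)]
            for r in range(0, len(img), n)]


def frequency_band(coeff_blocks, u, v):
    """Collect the coefficient at position (u, v) from every coefficient block."""
    return [[blk[u][v] for blk in row] for row in coeff_blocks]
```

  • For example, `frequency_band(coeffs, 0, 0)` gathers the DC coefficients of all blocks into one band, where `coeffs` is the grid obtained by applying `dct_2d` to each block returned by `split_blocks`.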
  • feature extraction is performed on the first image to obtain a three-dimensional feature map, and the three-dimensional feature map is the above-mentioned first transformed image.
  • the feature coefficients in the three-dimensional feature map are quantized to obtain quantized feature coefficients, and the three-dimensional feature map formed by the quantized feature coefficients is the first transformed image.
  • the above-mentioned processing can be performed on the second image, and the obtained three-dimensional feature map is the above-mentioned second transformed image; or the three-dimensional feature map composed of quantized feature coefficients obtained by quantizing the feature coefficients in the three-dimensional feature map is the above-mentioned second transformed image.
  • the forward transformation unit 502 is optional, so it is represented by a dotted box in FIG. 5 . That is to say, when the forward transformation unit 502 does not exist, the image in the pixel domain is input to the probability estimation unit 503 .
  • the probability estimation unit 503 performs probability estimation according to the first context information of the first data to obtain a probability estimation result of the first data.
  • the first data is a pixel of the first image
  • the first context information of the pixel includes all or part of the pixels in the first image.
  • the first context information of the pixel includes the pixels surrounding the pixel in the first image, or includes some or all of the pixels in the image block adjacent to the pixel, or includes some or all of the pixels in the image block where the pixel is located.
  • surrounding pixels refer to pixels whose distance from the first data is smaller than a preset threshold, and the unit of the preset threshold is "pixel".
  • the first data is a coefficient in the first transformed image. If the first data is a wavelet coefficient or a quantized wavelet coefficient, the first context information of the first data includes some or all coefficients in the first transformed image, and the coefficients are wavelet coefficients or quantized wavelet coefficients.
  • further, the first context information of the first data includes wavelet coefficients or quantized wavelet coefficients around the first data in the first transformed image; or the first context information includes some or all of the coefficients in the subband adjacent to the first data, the coefficients being wavelet coefficients or quantized wavelet coefficients; or the first context information includes some or all of the coefficients in the subband where the first data is located, the coefficients being wavelet coefficients or quantized wavelet coefficients.
  • the first context information of the first data includes part or all of the coefficients in the first transformed image, and the coefficients are DCT coefficients or quantized DCT coefficients.
  • further, the first context information of the first data includes DCT coefficients or quantized DCT coefficients around the first data in the first transformed image; or the first context information includes some or all of the coefficients in the subband adjacent to the first data, the coefficients being DCT coefficients or quantized DCT coefficients; or the first context information includes some or all of the coefficients in the subband where the first data is located, the coefficients being DCT coefficients or quantized DCT coefficients.
  • the first context information of the first data includes some or all of the coefficients in the first transformed image, the coefficients being feature coefficients or quantized feature coefficients; further, the first context information of the first data includes feature coefficients or quantized feature coefficients around the first data in the first transformed image, or the first context information includes some or all of the coefficients in the channel where the first data is located, the coefficients being feature coefficients or quantized feature coefficients.
  • the aforementioned "surrounding wavelet coefficients or quantized wavelet coefficients" refer to wavelet coefficients or quantized wavelet coefficients whose distance from the first data is smaller than a preset threshold, and the unit of the preset threshold is "wavelet coefficient or quantized wavelet coefficient";
  • the aforementioned "surrounding DCT coefficients or quantized DCT coefficients" refer to DCT coefficients or quantized DCT coefficients whose distance from the first data is smaller than a preset threshold, and the unit of the preset threshold is "DCT coefficient or quantized DCT coefficient";
  • the aforementioned "surrounding feature coefficients or quantized feature coefficients" refer to feature coefficients or quantized feature coefficients whose distance from the first data is smaller than a preset threshold, and the unit of the preset threshold is "feature coefficient or quantized feature coefficient".
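  • The notion of "surrounding" data, that is, data whose distance from the first data is smaller than a preset threshold, can be sketched as follows. This is a minimal sketch assuming a two-dimensional grid and the Chebyshev distance (maximum of the row and column offsets); the application does not state which distance measure is used, and the name `surrounding_positions` is hypothetical.

```python
def surrounding_positions(row, col, height, width, threshold):
    """Positions whose Chebyshev distance from (row, col) is below threshold."""
    positions = []
    for r in range(max(0, row - threshold + 1), min(height, row + threshold)):
        for c in range(max(0, col - threshold + 1), min(width, col + threshold)):
            if (r, c) != (row, col):  # the current data item is not its own context
                positions.append((r, c))
    return positions
```

  • With a threshold of 2, an interior position has eight surrounding positions, while a corner position has only three because positions outside the image or subband are skipped.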
  • the probability estimation unit 503 performs probability estimation according to the first context information of the first data to obtain a probability estimation result of the first data, including:
  • the probability estimation unit 503 performs probability estimation according to the first context information and the second context information of the first data to obtain a probability estimation result of the first data; wherein the first context information and the second context information are obtained from the first image and the second image, respectively.
  • the first context information of the first data includes the pixels around the pixel at position P in the image to be encoded (shown as the gray block in Figure 6b), or includes some or all of the pixels in the image block adjacent to the pixel at position P, or includes some or all of the pixels in the image block where the pixel at position P is located; the second context information includes the pixels around the pixel at position P in the decoded image, or includes some or all of the pixels in the image block adjacent to the pixel at position P, or includes some or all of the pixels in the image block where the pixel at position P is located.
  • the first context information of the first data includes coefficients around the coefficient at position P in the first transformed image, the coefficients being wavelet coefficients or quantized wavelet coefficients; or the first context information includes some or all of the coefficients in the subband adjacent to the coefficient at position P, the coefficients being wavelet coefficients or quantized wavelet coefficients; or the first context information includes some or all of the coefficients in the subband where the coefficient at position P is located, the coefficients being wavelet coefficients or quantized wavelet coefficients.
  • the second context information includes coefficients around the coefficient at position P in the second transformed image, the coefficients being wavelet coefficients or quantized wavelet coefficients; or the second context information includes some or all of the coefficients in the subband adjacent to the coefficient at position P, the coefficients being wavelet coefficients or quantized wavelet coefficients; or the second context information includes some or all of the coefficients in the subband where the coefficient at position P is located.
  • the first context information of the first data includes coefficients around the coefficient at position P in the first transformed image, the coefficients being DCT coefficients or quantized DCT coefficients; or the first context information includes some or all of the coefficients in the frequency band adjacent to the coefficient at position P, the coefficients being DCT coefficients or quantized DCT coefficients; or the first context information includes some or all of the coefficients in the frequency band where the coefficient at position P is located, the coefficients being DCT coefficients or quantized DCT coefficients. The second context information includes coefficients around the coefficient at position P in the second transformed image, the coefficients being DCT coefficients or quantized DCT coefficients; or the second context information includes some or all of the coefficients in the frequency band adjacent to the coefficient at position P, the coefficients being DCT coefficients or quantized DCT coefficients; or the second context information includes some or all of the coefficients in the frequency band where the coefficient at position P is located.
  • the first context information of the first data includes feature coefficients or quantized feature coefficients around the coefficient at position P in the first transformed image; or the first context information includes some or all of the coefficients in the channels adjacent to the coefficient at position P, the coefficients being feature coefficients or quantized feature coefficients; or the first context information includes some or all of the coefficients in the channel where the coefficient at position P is located, the coefficients being feature coefficients or quantized feature coefficients.
  • the second context information includes feature coefficients or quantized feature coefficients around the coefficient at position P in the second transformed image; or the second context information includes some or all of the coefficients in the channel adjacent to the coefficient at position P, the coefficients being feature coefficients or quantized feature coefficients; or the second context information includes some or all of the coefficients in the channel where the coefficient at position P is located.
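  • To illustrate how two contexts might jointly drive a probability estimate, the following toy sketch builds a discrete distribution over candidate symbol values that peaks near a weighted mean of the two contexts. It is a stand-in for illustration only: the estimator, the Laplace-like shape, and the names `estimate_probabilities`, `scale` and `weight` are all assumptions, and this passage of the application does not specify how the probability estimation unit 503 computes its result.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def estimate_probabilities(ctx1, ctx2, symbols, scale=1.0, weight=0.5):
    """Toy estimator: normalized Laplace-like weights centred on the context means."""
    symbols = list(symbols)
    if ctx1 and ctx2:
        # Blend the context from the first image with that from the second image.
        center = weight * _mean(ctx1) + (1.0 - weight) * _mean(ctx2)
    elif ctx1 or ctx2:
        center = _mean(ctx1 or ctx2)
    else:
        center = 0.0
    weights = [math.exp(-abs(s - center) / scale) for s in symbols]
    total = sum(weights)
    return {s: w / total for s, w in zip(symbols, weights)}
```

  • The returned dictionary sums to 1, so it has the form an entropy coder expects; a learned estimator would replace the hand-written weighting with a model driven by the same two contexts.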
  • the probability estimation unit 503 further performs probability estimation according to the first context information of the second data to obtain a probability estimation result of the second data.
  • the second data and the first data are data at different positions of the same image (such as the first image or the first transformed image obtained by transforming the first image). For the process of performing probability estimation according to the first context information of the second data to obtain the probability estimation result of the second data, refer to the related description of performing probability estimation according to the first context information of the first data to obtain the probability estimation result of the first data; details are not repeated here.
  • when the first data and the second data belong to the same preset area, where the preset area may be an image block in the first image, a subband obtained by performing wavelet transform on the first image, a frequency band obtained by performing DCT on the first image, or a channel of the three-dimensional feature map obtained by performing feature extraction on the first image, only one probability estimation result may be obtained during probability estimation; this result can be called the probability estimation result of the preset area.
  • the following describes how to obtain the probability estimation result of the first preset area.
  • Method 1: For each data item in the first preset area, a probability estimation result can be obtained according to the above method of obtaining the probability estimation result of the first data, so that probability estimation results are obtained for all data in the first preset area; for example, if the first preset area contains 5 data items, 5 probability estimation results are obtained. A target probability estimation result is then selected from these results as the probability estimation result of the first preset area. For example, the probability estimation result of the data located in the middle, the upper left corner, the upper right corner, the lower left corner, or the lower right corner of the first preset area is used as the probability estimation result of the first preset area.
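  • The selection step of Method 1 can be sketched as follows. This is a minimal sketch assuming a rectangular preset area whose per-data results are stored in a dictionary keyed by (row, column); the name `area_probability` and the `anchor` labels are hypothetical.

```python
def area_probability(results, height, width, anchor="center"):
    """Pick one per-data probability estimation result to represent the area."""
    positions = {
        "center": (height // 2, width // 2),
        "top_left": (0, 0),
        "top_right": (0, width - 1),
        "bottom_left": (height - 1, 0),
        "bottom_right": (height - 1, width - 1),
    }
    # The result stored at the chosen position stands in for the whole area.
    return results[positions[anchor]]
```

  • Because every data item in the area then shares one result, the encoder and decoder only need to agree on the anchor position rather than on a per-data estimate.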
  • Method 2: Perform probability estimation according to the first context information of the first preset area to obtain the probability estimation result of the first preset area; or perform probability estimation according to the first context information and the second context information of the first preset area to obtain the probability estimation result of the first preset area.
  • the first context information of the first preset area includes some or all pixels in the first image; further, the first context information of the first preset area includes some or all pixels in the image blocks around the first preset area in the first image;
  • the first context information of the first preset area includes some or all coefficients in the first transformed image; further, the first context information of the first preset area includes some or all coefficients in the subbands around the first preset area in the first image, the coefficients being wavelet coefficients or quantized wavelet coefficients;
  • the first context information of the first preset area includes some or all coefficients in the first transformed image; further, the first context information of the first preset area includes some or all coefficients in the frequency bands around the first preset area in the first image, the coefficients being DCT coefficients or quantized DCT coefficients;
  • the first context information of the first preset area includes some or all coefficients in the first transformed image, the coefficients being feature coefficients or quantized feature coefficients. Further, the first context information of the first preset area includes some or all coefficients in the channel to which the first preset area belongs in the first image, the coefficients being feature coefficients or quantized feature coefficients.
  • the probability estimation unit 503 performs probability estimation according to the first context information and the second context information of the first preset area to obtain the probability estimation result of the first preset area, wherein the first context information and the second context information are obtained based on the first image and the second image respectively, the second image is an image to be encoded or a decoded image, and the first image is different from the second image.
  • the first context information of the first preset area includes part or all of the pixels in the surrounding area of area B in the first image (the gray block shown in the left figure in FIG. 6c); the second context information includes part or all of the pixels in the surrounding area of area B in the second image (the gray block shown in the right figure in FIG. 6c).
  • Region B in the first image is an image block in the first image.
  • the first context information of the first preset area includes all or part of the coefficients in the subbands around subband B in the first transformed image, and the second context information includes all or part of the coefficients in the subbands around subband B in the second transformed image; the coefficients are wavelet coefficients or quantized wavelet coefficients.
  • the first context information of the first preset area includes all or part of the coefficients in the frequency bands around frequency band B in the first transformed image, and the second context information includes all or part of the coefficients in the frequency bands around frequency band B in the second transformed image; the coefficients are DCT coefficients or quantized DCT coefficients.
  • the first context information of the first preset area includes all or part of the coefficients in the channels around channel B in the first transformed image, and the second context information includes all or part of the coefficients in the channels around channel B in the second transformed image;
  • the coefficients are feature coefficients or quantized feature coefficients.
  • the probability estimation unit 503 obtains the probability distribution model of the first data; processes the first context information and/or the second context information of the first data through the first probability estimation network to obtain the parameters of the probability distribution model; and obtains the probability distribution of the first data according to the probability distribution model of the first data and the parameters of the probability distribution model; the probability estimation result of the first data includes the probability distribution of the first data, or the parameters of the probability distribution model of the first data;
  • the probability estimation result of the first data includes the probability distribution of the first data, or the parameters of the probability distribution model corresponding to the probability distribution, wherein the first probability estimation network and the second probability estimation network are implemented based on a neural network.
  • the probability estimation result of the first preset area can be obtained in the following manner:
  • the probability estimation unit 503 obtains the probability distribution model of the first preset area; processes the first context information and/or the second context information of the first preset area through a third probability estimation network to obtain the parameters of the probability distribution model; and obtains the probability distribution of the first preset area according to the probability distribution model of the first preset area and the parameters of the probability distribution model; wherein the probability estimation result of the first preset area includes the probability distribution of the first preset area, or the parameters of the probability distribution model of the first preset area;
  • the probability estimation result of the first preset area includes the probability distribution of the first preset area, or the parameters of the probability distribution model corresponding to the probability distribution; wherein the third probability estimation network and the fourth probability estimation network are implemented based on a neural network.
  • the above probability distribution model may be: a single Gaussian model (Gaussian single model, GSM), an asymmetric Gaussian model, a mixed Gaussian model (Gaussian mixture model, GMM) or a Laplace distribution model (Laplace distribution).
  • the probability estimation network can be implemented based on a deep learning network, such as a recurrent neural network (recurrent neural network, RNN) and a pixel convolutional neural network (Pixel convolutional neural network, PixelCNN), etc., which are not limited here.
  • the parameters of the probability distribution model are parameters of the Gaussian model, including the mean μ and the variance σ.
  • the parameters of the probability distribution model are parameters of the Laplace distribution model, including a location parameter μ and a scale parameter b.
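  • To illustrate how the parameters of a probability distribution model yield a probability distribution over data values, the following sketch evaluates the probability mass of an integer-valued coefficient under a Gaussian model and under a Laplace model. The function names are hypothetical, and integrating the density over a unit-width bin around each integer is an assumption of this example rather than something stated in the text.

```python
import math

def gaussian_cdf(x: float, mean: float, variance: float) -> float:
    """CDF of a Gaussian given its mean and variance (variance > 0)."""
    std = math.sqrt(variance)
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def laplace_cdf(x: float, loc: float, scale: float) -> float:
    """CDF of a Laplace distribution with location loc and scale b > 0."""
    z = (x - loc) / scale
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)

def bin_probability(cdf, value: int, *params: float) -> float:
    """Probability mass of the bin [value - 0.5, value + 0.5] (assumed bin width 1)."""
    return cdf(value + 0.5, *params) - cdf(value - 0.5, *params)

# The masses over a wide range of integers should sum to (almost) 1.
gauss_mass = sum(bin_probability(gaussian_cdf, k, 0.3, 2.0) for k in range(-60, 61))
laplace_mass = sum(bin_probability(laplace_cdf, k, 0.3, 1.5) for k in range(-60, 61))
```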
  • a typical probability estimation network based on PixelCNN (including the first probability estimation network, the second probability estimation network, the third probability estimation network and the fourth probability estimation network) is shown in Fig. 6d. In Fig. 6d, "h×w" indicates that the current convolutional layer uses a convolution kernel of size h×w; "ResB" indicates the residual module, whose structure is shown in Fig. 6e; and "*/relu" indicates that the current layer is followed by a ReLU activation function.
  • the probability estimation unit 503 preprocesses the probability estimation result of the first data to obtain a processed probability estimation result. Specifically, if the probability estimation result of the first data includes the mean and variance of a Gaussian distribution, the variance of the Gaussian distribution is processed to obtain the processed variance, and the mean of the Gaussian distribution and the processed variance are used as the processed probability estimation result of the first data; or,
  • the mean of the Gaussian distribution is processed to obtain the processed mean, and the variance of the Gaussian distribution and the processed mean are used as the processed probability estimation result of the first data.
  • the variance of the Gaussian distribution is processed to obtain the processed variance, including:
  • the variance of the Gaussian distribution is set to 0 as the variance after processing.
  • the variance of the Gaussian distribution is processed to obtain the processed variance, including:
  • the scaling factor of the first data is the same as the scaling factor of the second data; or,
  • the scaling factor of the first data and the scaling factor of the second data are different; or,
  • if the first data and the second data belong to the same image block, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different image blocks, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the image block to which the first data belongs;
  • if the first data and the second data belong to the same subband, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different subbands, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the subband to which the first data belongs;
  • if the first data and the second data belong to the same frequency band, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different frequency bands, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the frequency band to which the first data belongs;
  • if the first data and the second data belong to the same channel, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different channels, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the channel to which the first data belongs.
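  • The two ways of processing the variance of a Gaussian probability estimation result can be sketched as follows. The text does not state how the scaling factor is applied, so multiplying the variance by the factor is an assumption of this example; the function name `process_variance`, the mode labels and the per-block factor table are likewise hypothetical.

```python
def process_variance(mean: float, variance: float, mode: str, scaling_factor: float = 1.0):
    """Return the processed (mean, variance) pair of a Gaussian estimate."""
    if mode == "zero":
        # Setting the variance to 0 makes later sampling return the mean.
        return mean, 0.0
    if mode == "scale":
        # Assumption: "processing according to the scaling factor" is a multiplication.
        return mean, variance * scaling_factor
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical per-block factors: data in the same image block share one factor,
# data in different blocks may use different factors.
block_factors = {"block_0": 0.5, "block_1": 2.0}
same_block = [process_variance(0.0, 4.0, "scale", block_factors["block_0"]) for _ in range(2)]
other_block = process_variance(0.0, 4.0, "scale", block_factors["block_1"])
deterministic = process_variance(1.5, 4.0, "zero")
```

  • The same lookup would apply unchanged if the table were keyed by subband, frequency band or channel instead of image block.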
  • the scale parameter of the Laplace distribution is processed according to the scaling factor of the first data, and the processed probability estimation result of the first data includes the processed scale parameter and the location parameter of the Laplace distribution.
  • the location parameter of the Laplace distribution is processed according to the scaling factor of the first data, and the processed probability estimation result of the first data includes the processed location parameter and the scale parameter of the Laplace distribution.
  • the probability estimation unit 503 preprocesses the probability estimation result of the first preset area to obtain the processed probability estimation result. Specifically, if the probability estimation result of the first preset area includes the mean and variance of a Gaussian distribution, the variance of the Gaussian distribution is processed to obtain the processed variance, and the mean of the Gaussian distribution and the processed variance are used as the processed probability estimation result of the first preset area; or,
  • processing the variance of the Gaussian distribution to obtain the processed variance includes: setting the variance of the Gaussian distribution to 0 as the processed variance.
  • the variance of the Gaussian distribution is processed to obtain the processed variance, including:
  • the scaling factor of the first preset area is the same as that of other preset areas; or,
  • the scaling factor of the first preset area is different from that of other preset areas.
  • the scale parameter of the Laplace distribution is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed scale parameter and the location parameter of the Laplace distribution.
  • the location parameter of the Laplace distribution is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed location parameter and the scale parameter of the Laplace distribution.
  • the encoding unit 501 directly writes the probability estimation results of the first data and the probability estimation results of the second data into the compressed code stream.
  • the probability estimation result of the first data and the probability estimation result of the second data can be carried in the sequence header, the picture header, the slice header or the supplemental enhancement information (SEI) and transmitted to the decoder 30.
  • the first flag enable_flag of the first preset area is set to a first value (for example, 1 or true) to indicate that the decoding end uses the same probability distribution, that is, the probability estimation result of the first preset area, when obtaining the estimated coefficients in the first preset area; the probability estimation result of the first preset area is saved in the probability estimation result set, and the index of that probability estimation result in the probability estimation result set and the size information of the first preset area are recorded; the encoding unit 501 writes the probability estimation result set, the enable_flag, the index and the size information of the first preset area into the compressed code stream.
  • the probability estimation result set may be transmitted to the decoder 30 through an adaptation parameter set (APS).
  • the enable_flag of the first preset area is set to the first value (such as 1 or true) to indicate that the same probability estimation result is used when the decoding end samples to obtain the estimated coefficients in the first preset area.
  • the encoding unit 501 writes the probability estimation result of the first preset area, the enable_flag and the size information of the first preset area into the compressed code stream.
  • the enable_flag of the first preset area is set to the second value (such as 0 or false), and the encoding unit 501 writes the respective probability estimation results of all the data in the first preset area and the enable_flag of the first preset area into the compressed code stream.
  • the coding unit 501 also writes the size information of the first preset area into the compressed code stream.
  • the encoding unit does not write the size information of the preset area into the code stream.
  • the encoding end and the decoding end can negotiate the size of the preset area, and save the size of the preset area in the encoding end and the decoding end in advance.
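  • The signalling choices described above can be sketched as a small data structure. This is only an illustration under stated assumptions: `AreaSignal` and `signal_area` are hypothetical names, the fields are not actual code stream syntax, and reusing an existing entry of the probability estimation result set when two areas share the same result is an assumption of the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Estimate = Tuple[float, float]  # hypothetical (mean, variance) pair

@dataclass
class AreaSignal:
    enable_flag: bool                          # first value: one shared estimate for the area
    size: Tuple[int, int]                      # size information (H, W) of the preset area
    index: Optional[int] = None                # index into the estimate set when shared
    per_data: Optional[List[Estimate]] = None  # one estimate per data item otherwise

def signal_area(size, shared, estimate_set, area_estimate=None, per_data=None) -> AreaSignal:
    """Build the information an encoder would write for one preset area."""
    if shared:
        # Assumption: identical estimates are stored once and referenced by index.
        if area_estimate not in estimate_set:
            estimate_set.append(area_estimate)
        return AreaSignal(True, size, index=estimate_set.index(area_estimate))
    return AreaSignal(False, size, per_data=list(per_data))

estimate_set: List[Estimate] = []
first = signal_area((4, 4), True, estimate_set, area_estimate=(0.0, 1.0))
second = signal_area((8, 8), True, estimate_set, area_estimate=(0.0, 1.0))
third = signal_area((2, 2), False, estimate_set, per_data=[(0.0, 1.0)] * 4)
```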
  • the decoding unit 504 decodes the compressed code stream to obtain a first probability estimation result.
  • the decoding unit 504 further decodes the compressed code stream to obtain the second probability estimation result.
  • the first probability estimation result includes the first probability distribution or the parameters of the first probability distribution model.
  • the second probability estimation result includes the second probability distribution or the parameters of the second probability distribution model.
  • the decoding unit 504 also decodes the first identifier from the compressed code stream. If the first identifier is the first value, it indicates that the same probability estimation result (that is, the probability estimation result of the first preset area) is used when sampling to obtain all estimated coefficients in the first preset area, where the first preset area is an area in the enhanced image; the decoding unit 504 also decodes the probability estimation result set and the index of the first preset area from the compressed code stream, where the probability estimation result set includes the probability estimation results of multiple preset areas, and the decoding unit 504 obtains the probability estimation result of the first preset area from the probability estimation result set according to the index of the first preset area;
  • the decoding unit 504 decodes the size information H1*W1 of the first preset area from the code stream, indicating that the decoding unit 504 decodes H1*W1 probability estimation results from the compressed code stream, and the sampling unit 505 can obtain all estimated coefficients in the first preset area by sampling the H1*W1 probability estimation results, where H1 and W1 are both integers greater than 1.
  • the decoding unit 504 also decodes the first identifier from the compressed code stream, indicating that the same probability estimation result (that is, the probability estimation result of the first preset area) is used when sampling to obtain all estimated coefficients in the first preset area, where the first preset area is an area in the enhanced image; the decoding unit 504 also decodes the probability estimation result and H1*W1 of the first preset area from the code stream, and the sampling unit 505 samples the probability estimation result H1*W1 times to obtain H1*W1 estimated coefficients, that is, the first preset area includes H1*W1 estimated coefficients.
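  • The two decoder-side cases (one shared probability estimation result sampled H1*W1 times, or H1*W1 separate results sampled once each) can be sketched as follows. The function name `sample_area`, the (mean, variance) representation and the use of Python's `random.Random.gauss` as the sampler are assumptions of this example, not the sampling formula of the application.

```python
import math
import random

def sample_area(h1, w1, estimates, enable_flag, rng):
    """Return an h1 x w1 grid of estimated coefficients.

    estimates is a single (mean, variance) pair when enable_flag is true,
    otherwise a flat list of h1*w1 such pairs in raster order.
    """
    count = h1 * w1
    per_item = [estimates] * count if enable_flag else list(estimates)
    if len(per_item) != count:
        raise ValueError("expected one probability estimation result per coefficient")
    flat = [rng.gauss(mean, math.sqrt(var)) for mean, var in per_item]
    return [flat[r * w1:(r + 1) * w1] for r in range(h1)]

rng = random.Random(0)
# Shared case: the same estimate is sampled 2*3 times.
shared = sample_area(2, 3, (10.0, 4.0), True, rng)
# Per-data case with zero variance: every sample equals its mean.
fixed = sample_area(2, 2, [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)], False, rng)
```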
  • the sampling unit 505 performs sampling according to the first probability estimation result to obtain the first estimated coefficient, and performs sampling according to the second probability estimation result to obtain the second estimated coefficient. Since the two sampling processes are the same, the following takes sampling according to the first probability estimation result to obtain the first estimated coefficient as an example.
  • the first probability estimation result includes the mean and variance of the Gaussian distribution
  • the sampling unit 505 performs sampling according to the first probability estimation result to obtain the first estimation coefficient, including:
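  • erf() is the Gaussian error function, which is related to the cumulative distribution function $\Phi$ of the standard normal distribution by $\Phi(x) = \tfrac{1}{2}\left[1 + \operatorname{erf}\left(x/\sqrt{2}\right)\right]$, and is defined as follows: $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^{2}}\,dt$.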
  • the specific processing process includes: setting the variance of the first probability estimation result to 0 as the processed variance; and then according to the processed variance and The mean value of the first probability estimation result is sampled according to the above sampling manner to obtain the first estimated coefficient.
  • the mean value of the first probability estimation result is processed according to the scaling factor of the first estimation coefficient, and then according to the processed mean value and the variance of the first probability estimation result, sampling is performed according to the above sampling method to obtain The first estimated coefficient.
  • sampling may be performed according to the second probability estimation result in the above manner to obtain the second estimated coefficient.
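  • One common way to sample from a Gaussian probability estimation result is inverse transform sampling, in which a uniform random number is mapped through the inverse of the Gaussian cumulative distribution function. The sketch below assumes this technique; the text does not confirm that it is the sampling formula of the application. `sample_gaussian` is a hypothetical name, and the zero-variance branch mirrors the case where the variance is set to 0 before sampling.

```python
import math
import random
from statistics import NormalDist

def sample_gaussian(mean, variance, u=None, rng=None):
    """Draw one estimated coefficient from N(mean, variance) by inverse transform sampling."""
    if variance == 0:
        # Processed variance of 0: the sample is always the mean.
        return mean
    if u is None:
        u = (rng or random).random()
    # Keep u strictly inside (0, 1), where the inverse CDF is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return NormalDist(mean, math.sqrt(variance)).inv_cdf(u)

at_mean = sample_gaussian(3.0, 4.0, u=0.5)                    # median of the distribution
one_sigma = sample_gaussian(3.0, 4.0, u=0.8413447460685429)   # about mean + 1 standard deviation
deterministic = sample_gaussian(3.0, 0.0)
```

  • Passing `u` explicitly makes the mapping reproducible; leaving it unset draws a fresh uniform number, which is what makes repeated decoding yield different images.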
  • the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient;
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or,
  • the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients or wavelet coefficients. If the first estimated coefficient and the second estimated coefficient belong to the same subband, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or if the first estimated coefficient and the second estimated coefficient belong to different subbands, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the subband to which the first estimated coefficient belongs;
  • the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients or DCT coefficients. If the first estimated coefficient and the second estimated coefficient belong to the same frequency band, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or if the first estimated coefficient and the second estimated coefficient belong to different frequency bands, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the frequency band to which the first estimated coefficient belongs;
  • if the first estimated coefficient and the second estimated coefficient belong to the same channel, the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or if the first estimated coefficient and the second estimated coefficient belong to different channels, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
  • the first estimated coefficient and the second estimated coefficient are pixel values
  • the variance of the first probability estimation result is preprocessed to obtain the processed variance, including:
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
  • reconstructed images with different properties can be obtained according to user requirements. For example, if the variance of the first probability distribution is set to 0 as the processed variance, the reconstructed image with the best signal quality (best objective quality) can be obtained, that is, the PSNR of the image is increased or the MSE is reduced; if the scaling factors of the multiple coefficients are set to be the same, the image with the best subjective quality can be obtained, at the cost of a lower PSNR or a higher MSE; and if the scaling factors of the multiple coefficients are set to be different, images whose properties lie between the best subjective quality and the best objective quality can be obtained.
  • sampling is performed according to the first probability estimation result to obtain the first estimated coefficient, including:
  • the scale parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and sampling is then performed in the above sampling manner according to the processed scale parameter and the location parameter of the Laplace distribution to obtain the first estimated coefficient.
  • the location parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and sampling is then performed in the above sampling manner according to the processed location parameter and the scale parameter of the Laplace distribution to obtain the first estimated coefficient.
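  • For the Laplace case, a sample can be drawn by inverting the Laplace cumulative distribution function. The sketch below assumes inverse transform sampling and assumes that the scaling factor multiplies the scale parameter b; neither detail is specified in the text, and `sample_laplace` is a hypothetical name.

```python
import math
import random

def sample_laplace(loc, scale, u=None, scaling_factor=1.0, rng=None):
    """Draw one estimated coefficient from a Laplace distribution with a processed scale."""
    b = scale * scaling_factor  # assumption: the scaling factor multiplies the scale parameter
    if b == 0:
        return loc
    if u is None:
        u = (rng or random).random()
    # Keep u strictly inside (0, 1) so the logarithm below stays finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    v = u - 0.5
    # Inverse CDF of the Laplace distribution.
    return loc - b * math.copysign(1.0, v) * math.log(1.0 - 2.0 * abs(v))

median = sample_laplace(2.0, 1.0, u=0.5)
upper = sample_laplace(0.0, 1.0, u=0.75)                       # equals ln 2 for b = 1
narrower = sample_laplace(0.0, 1.0, u=0.75, scaling_factor=0.5)
```

  • A scaling factor below 1 pulls the samples towards the location parameter, and a factor of 0 makes the output deterministic.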
  • a plurality of estimated coefficients can be obtained, and the plurality of estimated coefficients include a first estimated coefficient and a second estimated coefficient.
  • the inverse transformation unit 506 obtains the enhanced image according to a plurality of estimated coefficients
  • the inverse transform unit 506 performs inverse quantization and wavelet inverse transform on the multiple estimated coefficients to obtain an enhanced image, or,
  • the inverse transform unit 506 performs wavelet inverse transform on the multiple estimated coefficients to obtain an enhanced image, or,
  • the inverse transform unit 506 performs inverse quantization and inverse DCT on the multiple estimated coefficients to obtain an enhanced image, or,
  • the inverse transform unit 506 performs inverse DCT on the multiple estimated coefficients to obtain an enhanced image.
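  • As a concrete instance of "inverse quantization followed by inverse wavelet transform", the sketch below dequantizes four subbands with a uniform step and applies one level of the inverse 2-D Haar transform. The choice of the Haar wavelet, the uniform quantization step and the function names are assumptions made for illustration; the text does not fix a particular wavelet or quantizer.

```python
def dequantize(levels, step):
    """Inverse uniform quantization: scale each quantized level by the step size."""
    return [[v * step for v in row] for row in levels]

def inverse_haar_2d(ll, hl, lh, hh):
    """Rebuild a 2H x 2W image from four H x W Haar subbands (one level)."""
    h, w = len(ll), len(ll[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], hl[i][j], lh[i][j], hh[i][j]
            out[2 * i][2 * j] = (s + x + y + z) / 2
            out[2 * i][2 * j + 1] = (s - x + y - z) / 2
            out[2 * i + 1][2 * j] = (s + x - y - z) / 2
            out[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return out

# Quantized subbands of the 2x2 image [[1, 2], [3, 4]] with step 0.5.
step = 0.5
subbands = [dequantize(band, step) for band in ([[10]], [[-2]], [[-4]], [[0]])]
image = inverse_haar_2d(*subbands)
```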
  • the enhanced image is obtained based on the multiple estimated coefficients.
  • the feature map may be passed through a neural network to output the enhanced image.
  • the neural network can adopt any structure, such as a fully connected network, a convolutional neural network, a recurrent neural network, and the like.
  • the neural network can adopt a deep neural network structure with a multi-layer structure to obtain the first reconstructed image or the second reconstructed image with better quality.
  • the feature map can be input into a machine vision task module to perform the corresponding machine vision tasks, for example, object classification, recognition and segmentation.
  • the encoding end scheme of this embodiment is to obtain a compressed bit stream after encoding the image to be encoded, and then refer to the encoded information (such as the compressed bit stream or the coefficient information transformed during the encoding process) to perform probability estimation.
  • the solution at the decoding end of this embodiment is performed on the premise that the compressed code stream is decoded to obtain a decoded image. It can also be said that the solution at the decoding end of this embodiment is a post-processing process.
  • the probability estimation is performed at the encoding end, the probability estimation result is obtained, and the probability estimation result is transmitted to the decoding end.
  • the decoding end performs sampling based on the probability estimation result to obtain the estimated coefficients, and obtains an enhanced image from the estimated coefficients obtained by sampling. Since the sampling process is random and therefore non-deterministic, multiple high-quality images with different properties, for example, the image with the best subjective quality and the image with the best objective quality, can be obtained by decoding the same compressed code stream multiple times in the above manner.
  • in the entropy encoding process, the encoding unit 501 first performs probability estimation on the first data to obtain the probability estimation result of the first data, which is called probability estimation result A, and then performs entropy encoding on the first data according to probability estimation result A; in the entropy decoding process, the decoding unit 504 first performs probability estimation on the first data to obtain the probability estimation result of the first data, which may also be called probability estimation result A, and then performs entropy decoding according to probability estimation result A.
  • the probability estimation result mentioned in the above embodiment is called the probability estimation result B.
  • entropy encoding is performed on the first data at the encoding end according to the probability estimation result A; the decoding end performs probability estimation on the first data in the same manner as the encoding end to obtain the probability estimation result (which can also be called the probability estimation result A), performs entropy decoding according to the probability estimation result A, and may also perform sampling according to the probability estimation result A, where the sampling method is consistent with the above embodiment.
  • entropy encoding is performed on the first data at the encoding end according to the probability estimation result A, and the probability estimation result A is transmitted to the decoding end; the decoding end performs entropy decoding according to the probability estimation result A, and may also perform sampling according to the probability estimation result A, where the sampling method is consistent with the above embodiment.
  • entropy encoding is performed on the first data at the encoding end according to the probability estimation result B, and the encoding end sends the probability estimation result B to the decoding end; the decoding end performs entropy decoding according to the probability estimation result B, and may also perform sampling according to the probability estimation result B, where the sampling method is consistent with the above embodiment.
  • entropy encoding is performed on the first data at the encoding end according to the probability estimation result B; the decoding end performs probability estimation on the first data to obtain the probability estimation result B, performs entropy decoding according to the probability estimation result B, and may also perform sampling according to the probability estimation result B, where the sampling method is consistent with the above embodiment.
  • FIG. 7 is a schematic block diagram of an example of another video codec for implementing the technology of the present application.
  • the video encoder 20 includes a coefficient acquisition unit 701, a probability estimation unit 702, and an entropy encoding unit 703;
  • the video decoder 30 includes an entropy decoding unit 704, a sampling unit 705, a first reconstruction unit 706, and a second reconstruction unit.
  • the video codec shown in FIG. 5 may also be referred to as an end-to-end video codec or a video codec based on an end-to-end video codec.
  • the coefficient obtaining unit 701 obtains a plurality of coefficients from the image to be encoded, and the plurality of coefficients include a first coefficient.
  • the multiple coefficients can be multiple pixels.
  • 1) the coefficient acquiring unit 701 divides the image to be coded into image blocks of a preset size, where the preset size may be 4x4, 8x8, 16x16, 32x32, 64x64, 128x128 or 256x256; or
  • 2) the coefficient acquiring unit 701 divides the image to be coded to obtain one or more image blocks, where the size of the image blocks is not limited,
  • for example, by using the quadtree, binary tree or ternary tree division method in the existing coding standards H.266, H.265, H.264, AVS2 or AVS3.
  • Each image block includes one or more pixels.
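  • Division into image blocks of a preset size can be sketched as follows. The function name `split_into_blocks` is hypothetical, and keeping smaller blocks at the right and bottom borders when the image size is not a multiple of the block size is an assumption of this example.

```python
def split_into_blocks(image, block_size):
    """Split a 2-D list of pixels into block_size x block_size blocks in raster order.

    Border blocks are smaller when the image size is not a multiple of block_size.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            blocks.append([row[left:left + block_size] for row in image[top:top + block_size]])
    return blocks

# 4x4 toy image whose pixel value encodes its raster position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = split_into_blocks(image, 2)
```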
  • the image to be coded is subjected to wavelet transformation N times to obtain 3N+1 subbands, where each subband includes one or more wavelet coefficients and N is an integer greater than 0.
  • the wavelet transform method may be a traditional wavelet transform or a deep network-based wavelet transform or other similar transform methods, which are not limited here.
  • the difference between the deep network-based wavelet transform method and the traditional wavelet transform lies in that the transformation and prediction are implemented using the deep network-based method, and the specific implementation method of the deep network is not limited here.
  • the image composed of subbands obtained by performing wavelet transformation on the image to be coded is the first transformed image.
  • the image composed of the subbands obtained by performing wavelet transformation on the decoded image is the above-mentioned second transformed image.
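  • The count of 3N+1 subbands follows from each decomposition level splitting the current low-frequency subband into one new low-frequency subband and three high-frequency subbands. The sketch below only enumerates subband labels; the function name `subband_names` and the LL/HL/LH/HH naming with the level as a suffix are conventions assumed for the example.

```python
def subband_names(n):
    """List the subbands left after n levels of 2-D wavelet decomposition."""
    if n < 1:
        raise ValueError("n must be an integer greater than 0")
    # Only the deepest low-frequency subband survives; each level adds three detail subbands.
    names = [f"LL{n}"]
    for level in range(n, 0, -1):
        names += [f"HL{level}", f"LH{level}", f"HH{level}"]
    return names

one_level = subband_names(1)
three_levels = subband_names(3)
```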
  • each wavelet coefficient is quantized to obtain multiple quantized wavelet coefficients.
  • each subband can be processed according to a first preset order, and the wavelet coefficients in the current subband can then be quantized according to a second preset order to obtain quantized wavelet coefficients, wherein the first preset order can be the existing zigzag scanning order, for example: LL1 → HL1 → LH1 → HH1.
  • the second preset order can be an existing zigzag scanning order, horizontal scanning order or vertical scanning order.
  • before each wavelet coefficient is quantized, the wavelet coefficient can be preprocessed to obtain a processed wavelet coefficient, and the processed wavelet coefficient is then quantized; for example, feature extraction is performed on the obtained wavelet coefficients through a neural network, and the feature extraction result is then quantized. Processing the wavelet coefficients before quantization can enable the decoder to decode and obtain a high-quality first reconstructed image.
  • the above multiple coefficients may be multiple wavelet coefficients or quantized wavelet coefficients.
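  • As a rough illustration of why N wavelet transformations yield 3N+1 subbands, the sketch below applies a simple unnormalised Haar transform repeatedly to the low-frequency band and then quantizes a subband uniformly. The Haar filter, the subband key names and the rounding-based quantizer are assumptions chosen for brevity; the text does not restrict the transform or the quantizer.

```python
def haar_level(img):
    """One level of an unnormalised 2-D Haar transform on an even-sized 2-D list.

    Returns the four subbands (LL, HL, LH, HH), each half the input size.
    """
    h, w = len(img), len(img[0])
    ll, hl, lh, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for y in range(0, h, 2):
        for x in range(0, w, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            ll[y // 2][x // 2] = (a + b + c + d) / 4  # local average
            hl[y // 2][x // 2] = (a - b + c - d) / 4  # horizontal detail
            lh[y // 2][x // 2] = (a + b - c - d) / 4  # vertical detail
            hh[y // 2][x // 2] = (a - b - c + d) / 4  # diagonal detail
    return ll, hl, lh, hh


def wavelet_decompose(img, n):
    """Transform n times, always re-transforming LL; returns 3n+1 named subbands."""
    subbands = {}
    current = img
    for level in range(1, n + 1):
        current, hl, lh, hh = haar_level(current)
        subbands[f"HL{level}"] = hl
        subbands[f"LH{level}"] = lh
        subbands[f"HH{level}"] = hh
    subbands[f"LL{n}"] = current  # the single remaining low-frequency band
    return subbands


def quantize_subband(subband, step):
    """Uniform quantization of every wavelet coefficient in one subband."""
    return [[round(c / step) for c in row] for row in subband]


bands = wavelet_decompose([[5.0] * 8 for _ in range(8)], n=2)
print(len(bands))  # 3 * 2 + 1 subbands
```

  • Each level contributes three detail subbands and only the last level keeps its low-frequency band, which gives the 3N+1 count.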
  • the coefficient acquisition unit 701 performs DCT on the image to be encoded to obtain a DCT image
  • the DCT image includes multiple frequency bands, and each frequency band includes one or more DCT coefficients; after the image to be encoded is transformed, its low-frequency components are concentrated in the upper left corner, and the high-frequency components are distributed in the lower right corner.
  • the coefficient in the first row and first column is the direct current (DC) coefficient, which reflects the average value of the image to be encoded; the other coefficients are alternating current (AC) coefficients. DC coefficients and AC coefficients are collectively referred to as DCT coefficients.
  • the coefficient acquisition unit 701 divides the image to be coded into blocks to obtain multiple image blocks, and then performs DCT in units of image blocks to obtain transform blocks. For example: 1) the image to be coded is divided into image blocks of a preset size, and the preset size may be 4x4, 8x8, 16x16, 32x32, 64x64, 128x128 or 256x256; or 2) the image to be coded is divided to obtain one or more image blocks, and the size of the image blocks is not limited.
  • for example, the quadtree, binary tree or ternary tree division methods in existing coding standards such as H.266, H.265, H.264, AVS2 or AVS3 can be used to divide the image to be encoded to obtain one or more image blocks.
  • the image formed based on the DCT coefficients obtained from the image to be encoded is the above-mentioned first transformed image.
  • the obtained DCT coefficients are quantized, such as uniformly quantized, to obtain quantized DCT coefficients.
  • the image formed based on the quantized DCT coefficients obtained from the image to be encoded is the first transformed image.
  • the above multiple coefficients may be multiple DCT coefficients or quantized DCT coefficients.
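  • The block DCT and uniform quantization described above can be sketched as follows. The orthonormal DCT-II scaling and the rounding-based quantizer with step size `step` are assumptions; with this scaling the DC coefficient of an n×n block equals n times the block average.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round to an integer."""
    return [[round(c / step) for c in row] for row in coeffs]


flat_block = [[10] * 4 for _ in range(4)]   # a block with no texture
coeffs = dct_2d(flat_block)
quantized = quantize(coeffs, step=8)
print(quantized[0][0])  # only the DC coefficient (first row, first column) is non-zero
```

  • For a flat block all energy lands in the DC coefficient at the upper left corner, matching the description of low-frequency components concentrating there.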
  • feature extraction is performed on the image to be coded to obtain a three-dimensional feature map, and the three-dimensional feature map is the above-mentioned first transformed image.
  • the probability estimation unit 702 obtains a first probability estimation result according to the context information of the first coefficient.
  • the first coefficient is a pixel of the image to be encoded
  • the first context information of the pixel includes all or part of the pixels in the image to be encoded.
  • the first context information of the pixel includes the pixels surrounding the pixel in the image to be encoded, or includes part or all of the pixels in the image block adjacent to the pixel, or includes some or all of the pixels in the image block where the pixel is located.
  • surrounding pixels refer to pixels whose distance from the first coefficient is smaller than a preset threshold, and the unit of the preset threshold is "pixel".
  • the first coefficient is a coefficient in the first transformed image
  • the first context information of the first coefficient includes part or all of the coefficients in the first transformed image
  • the coefficient is a wavelet coefficient or a quantized wavelet coefficient.
  • the first context information of the first coefficient includes wavelet coefficients or quantized wavelet coefficients around the first coefficient in the first transformed image; or the first context information includes part or all of the coefficients in a subband adjacent to the first coefficient, the coefficients being wavelet coefficients or quantized wavelet coefficients; or the first context information includes part or all of the coefficients in the subband where the first coefficient is located, the coefficients being wavelet coefficients or quantized wavelet coefficients;
  • the first context information of the first coefficient includes part or all of the coefficients in the first transformed image, and the coefficient is a DCT coefficient or a quantized DCT coefficient.
  • the first context information of the first coefficient includes DCT coefficients or quantized DCT coefficients around the first coefficient in the first transformed image; or the first context information includes some or all of the coefficients in a frequency band adjacent to the first coefficient, the coefficients being DCT coefficients or quantized DCT coefficients; or the first context information includes some or all of the coefficients in the frequency band where the first coefficient is located, the coefficients being DCT coefficients or quantized DCT coefficients;
  • the first context information of the first coefficient includes part or all of the coefficients in the first transformed image, and the coefficients are feature coefficients or quantized feature coefficients; further, the first context information of the first coefficient includes feature coefficients or quantized feature coefficients around the first coefficient in the first transformed image, or the first context information includes some or all of the coefficients in the channel where the first coefficient is located, the coefficients being feature coefficients or quantized feature coefficients.
  • the aforementioned "surrounding wavelet coefficients or quantized wavelet coefficients" refer to wavelet coefficients or quantized wavelet coefficients whose distance from the first coefficient is smaller than a preset threshold, and the unit of the preset threshold is "wavelet coefficients or quantized wavelet coefficients";
  • "surrounding DCT coefficients or quantized DCT coefficients" refer to DCT coefficients or quantized DCT coefficients whose distance from the first coefficient is smaller than a preset threshold, and the unit of the preset threshold is "DCT coefficients or quantized DCT coefficients";
  • the above "surrounding feature coefficients or quantized feature coefficients" refer to feature coefficients or quantized feature coefficients whose distance from the first coefficient is smaller than a preset threshold, and the unit of the preset threshold is "feature coefficients or quantized feature coefficients".
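  • The "surrounding" context defined by a distance threshold can be sketched as follows. The text does not name the distance measure, so the Chebyshev (square-window) distance, the integer threshold and the function name `surrounding_coefficients` are assumptions of this illustration.

```python
def surrounding_coefficients(plane, row, col, threshold):
    """Collect the coefficients whose distance to (row, col) is below threshold.

    plane is a 2-D list (pixels, a subband, a frequency band or a feature channel).
    Distance is measured in whole coefficients using the Chebyshev distance, and
    the centre coefficient itself is excluded from its own context.
    """
    height, width = len(plane), len(plane[0])
    reach = threshold - 1  # largest integer distance strictly below the threshold
    context = []
    for r in range(max(0, row - reach), min(height, row + reach + 1)):
        for c in range(max(0, col - reach), min(width, col + reach + 1)):
            if (r, c) != (row, col):
                context.append(plane[r][c])
    return context


plane = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(surrounding_coefficients(plane, 1, 1, threshold=2))  # the 8 neighbours of the centre
```

  • The returned list would be the input handed to a probability estimation network as the context information of the coefficient at (row, col).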
  • the plurality of coefficients further include a second coefficient
  • the probability estimation unit 702 is further configured to perform probability estimation according to context information of the second coefficient to obtain a second probability estimation result.
  • the second coefficient and the first coefficient are located at different positions in the same image (such as the image to be encoded or the first transformed image obtained by transforming the image to be encoded), and the probability estimation is performed according to the context information of the second coefficient to obtain the second probability estimation result
  • if the first coefficient and the second coefficient belong to the same preset area, only one probability estimation result is obtained for that area during probability estimation, and this probability estimation result can be called the probability estimation result of the preset area; the preset area can be an image block in the image to be coded, a subband obtained by wavelet transform of the image to be coded, a frequency band or transform block obtained by performing DCT on the image to be coded, or a channel of the three-dimensional feature map obtained by performing feature extraction on the image to be coded.
  • For data in a preset area only one probability estimation result is obtained, and only one probability estimation result (that is, the probability estimation result of the preset area) needs to be transmitted during transmission, which can save code streams.
  • the following describes how to obtain the probability estimation result of the first preset area.
  • Method 1: each coefficient in the first preset area is processed according to the method of obtaining the probability estimation result of the first coefficient, so that the probability estimation results of all coefficients in the first preset area are obtained; for example, if there are 5 coefficients in the first preset area, 5 probability estimation results are obtained. A target probability estimation result is then selected from the probability estimation results of all the coefficients in the first preset area as the probability estimation result of the first preset area. For example, the probability estimation result of the coefficient located in the middle, the upper left corner, the upper right corner, the lower left corner or the lower right corner of the first preset area is used as the probability estimation result of the first preset area.
  • Method 2: probability estimation is performed according to the context information of the first preset area to obtain the probability estimation result of the first preset area.
  • the context information of the first preset area includes some or all pixels in the first image; further, the context information of the first preset area includes part or all of the pixels in the image blocks around the first preset area in the first image;
  • the context information of the first preset area includes some or all coefficients in the first transformed image; further, the context information of the first preset area includes some or all of the coefficients in the subbands around the first preset area in the first image, and the coefficients are wavelet coefficients or quantized wavelet coefficients;
  • the context information of the first preset area includes some or all coefficients in the first transformed image; further, the context information of the first preset area includes some or all coefficients in the frequency bands around the first preset area in the first image, and the coefficients are DCT coefficients or quantized DCT coefficients;
  • the first preset area is a transform block of the first transformed image (obtained by performing DCT on the first image)
  • performing DCT on the first image in units of one or more image blocks can obtain one or more transform blocks.
  • the context information of the first preset area includes some or all coefficients in the first transformed image, and the coefficients are feature coefficients or quantized feature coefficients. Further, the context information of the first preset area includes some or all coefficients in the channel to which the first preset area belongs in the first image, and the coefficients are feature coefficients or quantized feature coefficients.
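  • Method 1 above (selecting one per-coefficient result as the result of the whole preset area) can be sketched as follows. The function name, the row-major list layout and the representation of a probability estimation result as a `(mean, variance)` pair are assumptions of this illustration.

```python
def region_estimate_method1(per_coeff_estimates, height, width, position="middle"):
    """Pick one per-coefficient probability estimation result for the whole area.

    per_coeff_estimates is a row-major list with one entry per coefficient of a
    height x width preset area; each entry is here a (mean, variance) pair.
    """
    anchors = {
        "middle": (height // 2, width // 2),
        "upper_left": (0, 0),
        "upper_right": (0, width - 1),
        "lower_left": (height - 1, 0),
        "lower_right": (height - 1, width - 1),
    }
    row, col = anchors[position]
    return per_coeff_estimates[row * width + col]


# A 3x3 preset area whose per-coefficient estimates differ only in the mean.
estimates = [(float(i), 1.0) for i in range(9)]
print(region_estimate_method1(estimates, 3, 3, "middle"))
```

  • Only the selected result then needs to be signalled for the area, which is the code-stream saving the text refers to.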
  • the probability estimation unit 702 obtains the probability distribution model of the first coefficient; processes the context information of the first coefficient through the fifth probability estimation network to obtain the parameters of the probability distribution model;
  • the first probability distribution is obtained according to the probability distribution model of the first coefficient and the parameters of the probability distribution model;
  • the above-mentioned first probability estimation result includes the above-mentioned first probability distribution, or the parameters of the above-mentioned first probability distribution model;
  • alternatively, the context information of the first coefficient is processed through the sixth probability estimation network to obtain the first probability distribution; the above-mentioned first probability estimation result includes the first probability distribution, or includes the parameters of the probability distribution model corresponding to the first probability distribution;
  • wherein the fifth probability estimation network and the sixth probability estimation network are implemented based on neural networks.
  • the probability estimation result of the first preset area can be obtained in the following manner:
  • the probability estimation unit 702 obtains the probability distribution model of the first preset area; processes the context information of the first preset area through the seventh probability estimation network to obtain the parameters of the probability distribution model; and obtains the probability distribution of the first preset area according to the probability distribution model of the first preset area and the parameters of the probability distribution model; wherein the probability estimation result of the first preset area includes the probability distribution of the first preset area, or the parameters of the probability distribution model of the first preset area;
  • the probability estimation result of the first preset area includes the probability distribution of the first preset area, or includes the parameters of the probability distribution model corresponding to the probability distribution; wherein the seventh probability estimation network and the eighth probability estimation network are implemented based on neural networks.
  • the above probability distribution model may be a single Gaussian model (GSM), an asymmetric Gaussian model, a Gaussian mixture model (GMM) or a Laplace distribution model.
  • the probability estimation network can be implemented based on a deep learning network, such as RNN and PixelCNN, etc., which is not limited here.
  • the parameters of the probability distribution model are parameters of the Gaussian model, including mean μ and variance σ.
  • the parameters of the probability distribution model are parameters of the Laplace distribution model, including a location parameter μ and a scale parameter b.
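  • To illustrate how a probability distribution follows from a model and its parameters, the sketch below turns Gaussian parameters (mean, variance) or Laplace parameters (location, scale) into the probability of an integer-valued coefficient. Integrating the density over a unit-width bin around each integer is an assumption borrowed from common practice, not something the text prescribes; the function names are likewise illustrative.

```python
import math


def gaussian_cdf(x, mean, variance):
    """Cumulative distribution of a Gaussian with the given mean and variance (> 0)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))


def laplace_cdf(x, location, scale):
    """Cumulative distribution of a Laplace distribution with scale b > 0."""
    z = (x - location) / scale
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


def integer_probability(k, cdf, *params):
    """Probability mass assigned to integer k: the density integrated over [k-0.5, k+0.5]."""
    return cdf(k + 0.5, *params) - cdf(k - 0.5, *params)


# Probability of a coefficient value of 0 and of 3 under each model.
print(integer_probability(0, gaussian_cdf, 0.0, 4.0),
      integer_probability(3, gaussian_cdf, 0.0, 4.0))
print(integer_probability(0, laplace_cdf, 0.0, 1.5),
      integer_probability(3, laplace_cdf, 0.0, 1.5))
```

  • Either the resulting distribution or just the parameters can serve as the probability estimation result, since the former is fully determined by the latter once the model is fixed.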
  • a typical probability estimation network based on PixelCNN (including the fifth probability estimation network, the sixth probability estimation network, the seventh probability estimation network and the eighth probability estimation network) is shown in Fig. 6d.
  • "h×w" indicates that the current convolutional layer uses a convolution kernel with a size of "h×w"; "ResB" indicates the residual module, whose structure is shown in Figure 6e; "*/relu" indicates that relu is used as the activation function after the current layer.
  • the probability estimation unit 702 preprocesses the first probability estimation result to obtain a processed probability estimation result. Specifically, if the first probability estimation result includes the mean and variance of a Gaussian distribution, the variance of the Gaussian distribution is processed to obtain a processed variance, and the mean of the Gaussian distribution and the processed variance are used as the processed probability estimation result; or,
  • the mean value of the Gaussian distribution is processed to obtain the processed mean value, and the variance of the Gaussian distribution and the processed mean value are used as the probability estimation result after processing.
  • processing the variance of the Gaussian distribution to obtain the processed variance includes: setting the variance of the Gaussian distribution to 0 as the processed variance.
  • alternatively, processing the variance of the Gaussian distribution to obtain the processed variance includes: processing the variance according to the scaling factor of the first coefficient, wherein:
  • the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or,
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are different; or,
  • if the first coefficient and the second coefficient belong to the same image block, the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different image blocks, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs; or,
  • if the first coefficient and the second coefficient belong to the same subband, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different subbands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs;
  • if the first coefficient and the second coefficient belong to the same frequency band transform block, the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different frequency band transform blocks, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the frequency band transform block to which the first coefficient belongs;
  • if the first coefficient and the second coefficient belong to the same channel, the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different channels, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the channel to which the first coefficient belongs.
  • the probability estimation result is implemented based on a Gaussian distribution model, and the probability estimation result includes a Gaussian distribution or a mean and/or a variance of the Gaussian distribution.
  • the scale parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and the processed probability estimation result of the first coefficient includes the processed scale parameter and the location parameter of the Laplace distribution.
  • the location parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and the processed probability estimation result of the first coefficient includes the processed location parameter and the scale parameter of the Laplace distribution.
  • the probability estimation unit 702 preprocesses the probability estimation result of the first preset area to obtain the processed probability estimation result. Specifically, if the probability estimation result of the first preset area includes the mean and variance of a Gaussian distribution, the variance of the Gaussian distribution is processed to obtain a processed variance, and the mean of the Gaussian distribution and the processed variance are used as the processed probability estimation result of the first preset area; or,
  • processing the variance of the Gaussian distribution to obtain the processed variance includes: setting the variance of the Gaussian distribution to 0 as the processed variance.
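  • alternatively, processing the variance of the Gaussian distribution to obtain the processed variance includes: processing the variance according to the scaling factor of the first preset area, wherein: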
  • the scaling factor of the first preset area is the same as that of other preset areas; or,
  • the scaling factor of the first preset area is different from that of other preset areas.
  • the scale parameter of the Laplace distribution is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed scale parameter and the location parameter of the Laplace distribution.
  • the location parameter of the Laplace distribution is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed location parameter and the scale parameter of the Laplace distribution.
  • the probability estimation result includes the Laplace distribution, or the scale parameter and/or the location parameter of the Laplace distribution.
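  • The preprocessing of a probability estimation result can be sketched as follows. The text says the variance or scale parameter is processed "according to the scaling factor" without stating the operation, so plain multiplication is an assumption of this sketch, as are the function names.

```python
def process_gaussian(mean, variance, scaling_factor=None):
    """Return the processed (mean, variance) pair of a Gaussian estimate.

    With no scaling factor the variance is set to 0, as described in the text.
    Otherwise the variance is multiplied by the factor (assumed operation).
    """
    if scaling_factor is None:
        return mean, 0.0
    return mean, variance * scaling_factor


def process_laplace(location, scale, scaling_factor):
    """Return (location, processed scale) of a Laplace estimate (assumed multiplication)."""
    return location, scale * scaling_factor


print(process_gaussian(1.5, 4.0))                      # variance forced to 0
print(process_gaussian(1.5, 4.0, scaling_factor=0.5))  # variance shrunk by the factor
print(process_laplace(0.0, 2.0, scaling_factor=0.5))
```

  • Under this reading, a smaller processed variance or scale concentrates later samples around the mean or location, and a variance of 0 makes every sample equal to the mean.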
  • the entropy encoding unit 703 writes the first coefficient, the second coefficient, the first probability estimation result and the second probability estimation result into the compressed code stream.
  • the first probability estimation result and the second probability estimation result may be stored in a sequence header, image header, Slice or SEI and transmitted to the decoder 30.
  • the first flag enable_flag of the first preset area is set to a first value (for example, 1 or true) to indicate that the decoding end uses the same probability estimation result, that is, the probability estimation result of the first preset area, when obtaining the estimated coefficients in the first preset area; the probability estimation result of the first preset area is saved in the probability estimation result set, and the index of the probability estimation result of the first preset area in the probability estimation result set and the size information of the first preset area are recorded.
  • the entropy encoding unit 703 writes all the coefficients in the first preset area, the probability estimation result set, and the enable_flag, index and size information of the first preset area into the compressed code stream.
  • the set of probability estimation results may be transmitted to decoder 30 via APS.
  • the enable_flag of the first preset area is set to the first value (such as 1 or true) to indicate that the same probability estimation result is used when the estimated coefficients in the first preset area are obtained by sampling at the decoding end
  • the entropy coding unit 703 writes all the coefficients in the first preset area, the probability estimation result of the first preset area, the enable_flag and the size information of the first preset area into the compressed code stream.
  • the entropy coding unit 703 writes all the coefficients in the first preset area, the respective probability estimation results of all the coefficients in the first preset area, and the enable_flag of the first preset area into the compressed code stream.
  • the entropy coding unit 703 also writes the size information of the first preset area into the compressed code stream.
  • the entropy coding unit 703 writes the above data into the compressed code stream, specifically refers to performing entropy coding on the above data to obtain the compressed code stream.
  • entropy coding methods such as Huffman coding, CABAC coding, or the entropy coding methods of H.264/H.265/H.266 can be used.
  • the entropy decoding unit 704 decodes the compressed code stream to obtain a first probability estimation result.
  • the entropy decoding unit 704 also decodes the compressed code stream to obtain the second probability estimation result.
  • the first probability estimation result includes parameters of the first probability distribution or the first probability distribution model.
  • the second probability estimation result includes parameters of the second probability distribution or the second probability distribution model.
  • the entropy decoding unit 704 also decodes the first identifier from the compressed code stream. If the first identifier is the first value, it indicates that the same probability estimation result (that is, the probability estimation result of the first preset area) is used when all the estimated coefficients in the first preset area are obtained by sampling, the first preset area being an area in the enhanced image; the entropy decoding unit 704 also decodes the probability estimation result set and the index of the first preset area from the compressed code stream, the probability estimation result set including probability estimation results of multiple preset areas, and the entropy decoding unit 704 obtains the probability estimation result of the first preset area from the probability estimation result set according to the index of the first preset area;
  • the entropy decoding unit 704 decodes the size information H1*W1 of the first preset area from the code stream, indicating that the entropy decoding unit 704 decodes H1*W1 probability estimation results from the compressed code stream, and the sampling unit 705 can obtain all estimated coefficients in the first preset area by sampling the H1*W1 probability estimation results; both H1 and W1 are integers greater than 1.
  • the entropy decoding unit 704 also decodes the first flag from the compressed code stream, indicating that the same probability estimation result (that is, the probability estimation result of the first preset area) is used when sampling all estimated coefficients in the first preset area, the first preset area being an area in the enhanced image; the entropy decoding unit 704 also decodes the probability estimation result of the first preset area and H1*W1 from the code stream, and the sampling unit 705 samples the probability estimation result of the first preset area H1*W1 times to obtain H1*W1 estimated coefficients, that is, the first preset area includes H1*W1 estimated coefficients.
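  • The decoder-side sampling of one shared probability estimation result H1*W1 times can be sketched as follows, assuming a Gaussian result given as (mean, variance). The function name `sample_region` and the optional `seed` argument are illustrative assumptions.

```python
import math
import random


def sample_region(mean, variance, h1, w1, seed=None):
    """Draw h1 * w1 estimated coefficients from one shared Gaussian estimate.

    The same (mean, variance) pair, i.e. the probability estimation result of the
    preset area, is sampled once per coefficient position.
    """
    rng = random.Random(seed)
    std = math.sqrt(variance)
    return [[rng.gauss(mean, std) for _ in range(w1)] for _ in range(h1)]


area = sample_region(mean=1.5, variance=4.0, h1=2, w1=3, seed=0)
print(len(area), len(area[0]))  # H1 rows of W1 estimated coefficients

# With the variance processed to 0, every estimated coefficient equals the mean.
print(sample_region(mean=1.5, variance=0.0, h1=2, w1=3))
```

  • Repeating the call with a different seed gives a different set of estimated coefficients, which is how repeated sampling can yield multiple first reconstructed images.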
  • the entropy decoding unit 704 is further configured to decode the compressed code stream to obtain multiple reconstruction coefficients.
  • the decoding method used by the entropy decoding unit 704 to decode the compressed code stream corresponds to the entropy coding method used by the entropy coding unit 703.
  • the first reconstruction unit 706 obtains a first reconstructed image according to a plurality of estimated coefficients.
  • the first reconstructed image can be obtained based on the multiple pixel values.
  • the first reconstruction unit 706 performs inverse quantization and inverse wavelet transform on the multiple estimated coefficients to obtain the first reconstructed image, or,
  • the first reconstruction unit 706 performs wavelet inverse transform on the multiple estimated coefficients to obtain the first reconstructed image, or,
  • the first reconstruction unit 706 performs inverse quantization and inverse DCT on the multiple estimated coefficients to obtain a reconstructed image, or,
  • the first reconstruction unit 706 performs inverse DCT on the multiple estimated coefficients to obtain the first reconstructed image.
  • the first reconstruction unit 706 processes the feature map composed of multiple feature coefficients to obtain the first reconstructed image; or,
  • the first reconstruction unit 706 dequantizes the multiple estimated coefficients to obtain multiple feature coefficients; the feature map composed of multiple feature coefficients is processed to obtain the first reconstructed image.
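  • For the DCT case, the reconstruction path (inverse quantization followed by inverse DCT) can be sketched as follows. The orthonormal inverse DCT and the multiply-by-step dequantizer are assumptions that mirror a uniform quantizer; the text does not fix either choice.

```python
import math


def dequantize(quantized, step):
    """Inverse of uniform quantization: scale each integer level by the step size."""
    return [[q * step for q in row] for row in quantized]


def idct_2d(coeffs):
    """Orthonormal 2-D inverse DCT (DCT-III) of a square coefficient block."""
    n = len(coeffs)

    def alpha(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    block = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (alpha(u) * alpha(v) * coeffs[u][v]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            block[y][x] = total
    return block


# Estimated (sampled) quantized coefficients: only the DC level is non-zero.
estimated = [[5, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
pixels = idct_2d(dequantize(estimated, step=8))
print(pixels[0])  # a flat row of reconstructed pixel values
```

  • When the estimated coefficients are not quantized, the `dequantize` step is skipped and the inverse transform is applied directly, which corresponds to the second alternative in the text.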
  • a plurality of estimated coefficients may be input into the second reconstruction unit for processing to obtain a reconstructed image, and the reconstructed image may be used as a reference image for subsequent image prediction.
  • the second reconstruction unit 707 obtains a second reconstructed image according to the plurality of reconstruction coefficients.
  • the second reconstructed image can be obtained based on the multiple pixel values.
  • the second reconstruction unit 707 performs inverse quantization and wavelet inverse transform on the multiple reconstruction coefficients to obtain the second reconstructed image, or,
  • the second reconstruction unit 707 performs wavelet inverse transform on the multiple reconstruction coefficients to obtain the second reconstructed image, or,
  • the second reconstruction unit 707 performs inverse quantization and inverse DCT on the multiple reconstruction coefficients to obtain the second reconstructed image, or,
  • the second reconstruction unit 707 performs inverse DCT on the multiple reconstruction coefficients to obtain the second reconstructed image.
  • the second reconstruction unit 707 processes the feature map composed of multiple feature coefficients to obtain the second reconstructed image; or,
  • the second reconstruction unit 707 dequantizes the multiple reconstruction coefficients to obtain multiple feature coefficients; the feature map composed of the multiple feature coefficients is processed to obtain the second reconstructed image.
  • the implementation manner of the second reconstruction unit 707 may be the same as that of the first reconstruction unit 706, or may be different, which is not limited here.
  • the feature map may be passed through a neural network to output the above-mentioned first reconstructed image or the second reconstructed image.
  • the neural network can adopt any structure, such as a fully connected network, a convolutional neural network, a recurrent neural network, and the like.
  • the neural network can adopt a deep neural network structure with a multi-layer structure to obtain the first reconstructed image or the second reconstructed image with better quality.
  • the feature map can be input into a machine vision task module to perform corresponding machine tasks.
  • complete machine vision tasks such as object classification, recognition, and segmentation.
  • the plurality of estimated coefficients obtained by the sampling unit 705 can be input to the second reconstruction unit 707 together with the plurality of reconstruction coefficients; specifically, when the plurality of estimated coefficients and the plurality of reconstruction coefficients are feature coefficients, the second reconstruction unit 707 processes the multiple estimated coefficients according to the method of the first reconstruction unit 706 to obtain the first feature map, the second reconstruction unit 707 obtains the second feature map based on the multiple reconstruction coefficients, and the first feature map and the second feature map are then processed by the neural network to obtain the second reconstructed image.
  • the sampling step can be repeated in the present application to obtain multiple first reconstructed images.
  • the multiple first reconstructed images may be the reconstructed images with the best subjective quality, or the reconstructed images with the best objective quality.
  • the first reconstructed image can be used in the encoding and decoding loop as a reference for intra-frame or inter-frame prediction; it can also be used outside the encoding and decoding loop to optimize image quality in a post-processing manner.
  • the reconstructed image with the best subjective quality is put into the decoded picture buffer (DPB) or the reference frame set for use in the codec loop
  • the second reconstructed image is obtained according to the reconstruction coefficient, which can be used as a reference frame when predicting the next frame in video compression.
  • the entropy coding unit 703 first performs probability estimation on the first coefficient to obtain the probability estimation result of the first coefficient, which is called the probability estimation result C, and then performs entropy encoding on the first coefficient according to the probability estimation result C; during the entropy decoding process, the entropy decoding unit 704 first performs probability estimation on the first coefficient to obtain a probability estimation result of the first coefficient, which may also be called the probability estimation result C, and then performs entropy decoding according to the probability estimation result C.
  • the probability estimation result mentioned in the above embodiments is called the probability estimation result D.
  • entropy encoding is performed on the first coefficient at the encoding end according to the probability estimation result C; the decoding end performs probability estimation on the first coefficient in the same manner as the encoding end to obtain a probability estimation result (which may also be called the probability estimation result C), performs entropy decoding according to the probability estimation result C, and may also perform sampling according to the probability estimation result C; the sampling method is consistent with the above-mentioned embodiment.
  • entropy encoding is performed on the first coefficient at the encoding end according to the probability estimation result C, and the probability estimation result C is transmitted to the decoding end; the decoding end performs entropy decoding according to the probability estimation result C, and may also perform sampling according to the probability estimation result C; the sampling method is consistent with the above-mentioned embodiment.
  • entropy encoding is performed on the first coefficient at the encoding end according to the probability estimation result D, and the encoding end sends the probability estimation result D to the decoding end; the decoding end performs entropy decoding according to the probability estimation result D, and may also perform sampling according to the probability estimation result D; the sampling method is consistent with the above-mentioned embodiment.
  • entropy encoding is performed on the first coefficient at the encoding end according to the probability estimation result D; the decoding end performs probability estimation on the first coefficient to obtain the probability estimation result D, performs entropy decoding according to the probability estimation result D, and may also perform sampling according to the probability estimation result D; the sampling method is consistent with the above-mentioned embodiment.
  • FIG. 8 is a flowchart showing a process 800 of an encoding method based on an embodiment of the present application.
  • Process 800 may be performed by video encoder 20 .
  • the process 800 is described as a series of steps or operations. It should be understood that the process 800 may be performed in various orders and/or concurrently, and is not limited to the order of execution shown in FIG. 8.
  • the encoding method includes:
  • the first context information may be pixels in the first image or coefficients in the first transformed image obtained by transforming the first image.
  • the method of this embodiment also includes:
  • the second image is an image to be encoded or a decoded image, and the second image is different from the first image; perform probability estimation according to the first context information to obtain a first probability estimation result, including:
  • Probability estimation is performed according to the first context information and the second context information to obtain the first probability estimation result; the second context information is obtained from the second image.
  • the probability estimation is performed according to the first context information to obtain the first probability estimation result, including:
  • the first context information includes the context information of the first data and the context information of the second data.
  • the first probability estimation result includes the probability estimation result of a first preset area
  • the first preset area includes the first data and the second data
  • the first preset area is located in the first image, or located in the image obtained by transforming the first image; performing probability estimation according to the first context information to obtain a first probability estimation result includes:
  • the first probability estimation result includes the probability estimation result of the second preset area
  • the second preset area is located in the first image or in an image obtained by transforming the first image
  • the first context information includes context information of the second preset area
  • performing probability estimation according to the first context information to obtain the first probability estimation result includes: performing probability estimation according to the context information of the second preset area to obtain the probability estimation result of the second preset area
  • the first probability estimation result includes the probability estimation result of the second preset area.
  • the encoding method further includes: setting the value of the first flag of the first preset area to the first value, which indicates that the probability estimation result of the first preset area is used when the estimated coefficients in the first preset area are obtained by sampling; the probability estimation result of the first preset area is saved in the probability estimation result set, and the index of the probability estimation result of the first preset area in the probability estimation result set is recorded; writing the probability estimation result into the compressed code stream includes: writing the probability estimation result set, the index, the size information of the first preset area and the first flag into the compressed code stream.
  • this encoding method also includes:
  • the scaling factor of the preset area preprocesses the probability estimation result of the first preset area to obtain the processed probability estimation result, saves the processed probability estimation result into the probability estimation result set, and records the index of the processed probability estimation result in the probability estimation result set; writing the probability estimation result into the compressed code stream includes: writing the probability estimation result set, the index, the size information of the first preset area and the first identification into the compressed code stream.
  • this encoding method also includes:
  • Set the value of the first identifier of the first preset area to the first value to indicate that the probability estimation result of the first preset area is used when sampling the estimated coefficients in the first preset area; writing the probability estimation result into the compressed code stream includes: writing the probability estimation result of the first preset area, the size information of the first preset area and the first identifier into the compressed code stream.
  • this encoding method also includes:
  • the probability estimation result of the first data is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the first data includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first data is preprocessed to obtain the processed probability estimation result, including:
  • the variance of the Gaussian distribution is set to 0 as the processed variance, wherein the processed probability estimation result includes the mean value of the Gaussian distribution and the processed variance.
  • the probability estimation result of the first data includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first data is preprocessed to obtain the processed probability estimation result, including:
  • the scaling factor of the first data is the same as the scaling factor of the second data; or, the scaling factor of the first data is different from the scaling factor of the second data; or,
  • the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different image blocks, then the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the image block to which the first data belongs;
  • the scaling factor of the first data is the same as that of the second data; or if the first data and the second data belong to different subbands, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the subband to which the first data belongs;
  • the scaling factor of the first data is the same as that of the second data; or if the first data and the second data belong to different frequency bands or transform blocks, then the scaling factor of the first data and the scaling factor of the second data are different; or the scaling factor of the first data is determined according to the texture complexity of the frequency band or transform block to which the first data belongs;
  • the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different channels, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the channel to which the first data belongs.
  • this encoding method also includes:
  • the probability estimation result of the second preset area is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the first data includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first data is preprocessed to obtain the processed probability estimation result, including:
  • the processed probability estimation result includes the mean value and the first variance of the Gaussian distribution, or, the variance of the Gaussian distribution is processed according to the scaling factor of the second preset area to obtain the second variance, wherein the processed probability estimation result includes the mean value of the Gaussian distribution and the second variance, and the scaling factor of the first preset area is the same as or different from the scaling factor of the second preset area.
  • the first context information includes some or all pixel values in the first image.
  • this encoding method also includes:
  • the probability estimation is performed according to the first context information to obtain the first probability estimation result, including:
  • the first context information is input into the first probability estimation network for processing to obtain the parameters of the first probability distribution model; the probability estimation result includes the parameters of the first probability distribution model;
  • the first context information is input into the second probability estimation network for processing to obtain the target probability distribution, and the probability estimation result includes the parameters of the target probability distribution; wherein, the first probability estimation network and the second probability estimation network are realized by a neural network.
  • FIG. 9 is a flow chart showing a process 900 of an encoding method based on an embodiment of the present application.
  • Process 900 may be performed by video encoder 20 .
  • the process 900 is described as a series of steps or operations. It should be understood that the process 900 may be performed in various orders and/or concurrently, and is not limited to the order of execution shown in FIG. 9 .
  • the encoding method includes:
  • the multiple coefficients also include a second coefficient
  • the encoding method also includes:
  • the plurality of coefficients further includes a second coefficient, the first coefficient and the second coefficient belong to the same preset area, and the preset area is located in the image to be coded, or in an image obtained by transforming the image to be coded,
  • the first probability estimation result is obtained according to the context information of the first coefficient, including:
  • Writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient and the first probability estimation result into the compressed code stream.
  • the plurality of coefficients further includes a second coefficient, the first coefficient and the second coefficient belong to the same preset area, and the preset area is located in the image to be coded, or in an image obtained by transforming the image to be coded,
  • the first probability distribution is obtained according to the context information of the first coefficient, including:
  • Probability estimation is performed according to the context information of the preset area to obtain a first probability estimation result; the context information of the preset area includes context information of the first coefficient; writing the first coefficient and the first probability estimation result into the compressed code stream includes: The first coefficient, the second coefficient and the first probability estimation result are written into the compressed code stream.
  • this encoding method also includes:
  • this encoding method also includes:
  • Writing the estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, the first probability estimation result, the size information of the preset area and the first identification into the compressed code stream.
  • the first coefficient and the second coefficient belong to the same preset area, and the encoding method further includes:
  • this encoding method also includes:
  • the probability estimation result of the first coefficient is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the first coefficient includes the mean and variance of the Gaussian distribution, and the probability estimation result of the first coefficient is preprocessed to obtain the processed probability estimation result, including:
  • the variance of the Gaussian distribution is set to 0 as the processed variance, wherein the processed probability estimation result includes the mean value of the Gaussian distribution and the processed variance.
  • the probability estimation result of the first coefficient includes the mean and variance of the Gaussian distribution, the probability estimation result of the first coefficient is preprocessed, and the probability estimation result after processing is obtained, including:
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or, the scaling factor of the first coefficient and the scaling factor of the second coefficient are different; or,
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different image blocks, then the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the image block to which the first coefficient belongs; or,
  • the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different subbands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs;
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different frequency bands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the frequency band to which the first coefficient belongs;
  • the scaling factor of the first coefficient and the scaling factor of the second coefficient are the same; or if the first coefficient and the second coefficient belong to different channels, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the channel to which the first coefficient belongs.
  • this encoding method also includes:
  • the probability estimation result of the preset area is preprocessed to obtain the probability estimation result after processing.
  • the probability estimation result of the preset area includes the mean and variance of the Gaussian distribution, and the probability estimation result of the preset area is preprocessed to obtain the processed probability estimation result, including:
  • the processed probability estimation result includes the mean value and the first variance of the Gaussian distribution, or process the variance of the Gaussian distribution according to the scaling factor of the preset area, to obtain the second variance, wherein the processed probability estimation result includes the mean value and the second variance of the Gaussian distribution.
  • the first context information includes some or all pixel values in the image to be encoded;
  • the multiple coefficients are multiple wavelet coefficients
  • the first context information includes part or all of the multiple wavelet coefficients; or, if the image to be coded is subjected to wavelet transformation and quantization to obtain multiple coefficients, the plurality of coefficients are a plurality of quantized wavelet coefficients, and the first context information includes part or all of the plurality of quantized wavelet coefficients; or, if the image to be coded is subjected to DCT to obtain a plurality of coefficients, the plurality of coefficients are a plurality of DCT coefficients,
  • the first context information includes some or all of the multiple DCT coefficients; or, if the image to be coded is subjected to DCT and quantization to obtain multiple coefficients, the multiple coefficients are multiple quantized DCT coefficients, and the first context information includes multiple quantized DCT coefficients Part or all of them; or, if the feature extraction of the image to be coded obtains multiple coefficients, the
  • the first probability estimation result is obtained according to the context information of the first coefficient, including:
  • Obtain the second probability distribution model; input the first context information into the third probability estimation network for processing to obtain the parameters of the second probability distribution model; obtain the first probability estimation result according to the second probability distribution model and the parameters of the second probability distribution model;
  • the first context information is input into the fourth probability estimation network for processing to obtain a probability estimation result; wherein, the third probability estimation network and the fourth probability estimation network are realized by a neural network.
  • Fig. 10 is a flowchart showing a process 1000 of a decoding method based on an embodiment of the present application.
  • Process 1000 may be performed by video decoder 30 .
  • the process 1000 is described as a series of steps or operations. It should be understood that the process 1000 may be performed in various orders and/or concurrently, and is not limited to the order of execution shown in FIG. 10 .
  • the decoding method includes:
  • the decoding method also includes:
  • the first probability estimation result is obtained from decoding the compressed code stream, including:
  • the preset area includes the first estimated coefficient, the preset area is an area in the first reconstructed image, and the probability estimation result of the preset area is determined from the probability estimation result set according to the index; the first probability estimation result is the probability estimation result of the preset area; wherein the value of the first identifier being the first value indicates that all estimated coefficients in the preset area are sampled using the probability estimation result of the preset area.
  • the decoding method also includes:
  • the first estimated coefficient and the second estimated coefficient belong to the same preset area, and the preset area is an area in the first reconstructed image, and the decoding method further includes:
  • the first probability estimation result includes the mean and variance of the Gaussian distribution
  • the first estimation coefficient is obtained by sampling according to the first probability estimation result, including:
  • the decoding method also includes:
  • Determining the first estimated coefficient according to the first reference value and the mean value and variance of the first probability estimation result including:
  • the first estimation coefficient is determined according to the first reference value, the mean value of the first probability estimation result and the processed variance.
  • the variance of the first probability estimation result is preprocessed to obtain the processed variance, including:
  • the first estimated coefficient is a quantized wavelet coefficient, or, a wavelet coefficient, or a quantized DCT coefficient, or a DCT coefficient, or a feature coefficient, or a quantized feature coefficient
  • the variance of the first probability distribution is preprocessed to obtain the processed variance, including:
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or if the first estimated coefficient and the second estimated coefficient belong to different subbands, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs;
  • if the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients or DCT coefficients: if the first estimated coefficient and the second estimated coefficient belong to the same frequency band, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or if the first estimated coefficient and the second estimated coefficient belong to different frequency bands, the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the frequency band to which the first estimated coefficient belongs;
  • the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or if the first estimated coefficient and the second estimated coefficient belong to different channels, the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
  • the first estimated coefficient and the second estimated coefficient are pixel values
  • the variance of the first probability estimation result is preprocessed to obtain the processed variance, including:
  • the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient, or the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or, the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs.
  • the first reconstructed image is obtained according to the first estimated coefficient and the second estimated coefficient, including:
  • if the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients, inverse quantization and inverse wavelet transform are performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are wavelet coefficients, inverse wavelet transform is performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients, inverse quantization and inverse DCT are performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are DCT coefficients, inverse DCT is performed on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image.
  • the decoding method also includes:
  • a plurality of reconstruction coefficients are obtained by decoding the compressed code stream; and a second reconstruction image is obtained according to the plurality of reconstruction coefficients.
  • the second reconstructed image is obtained according to the plurality of reconstruction coefficients, including:
  • if the multiple reconstruction coefficients are quantized wavelet coefficients, perform inverse quantization and inverse wavelet transform on the multiple reconstruction coefficients to obtain the second reconstructed image; or, if the multiple reconstruction coefficients are wavelet coefficients, perform inverse wavelet transform on the multiple reconstruction coefficients to obtain the second reconstructed image; or, if the multiple reconstruction coefficients are quantized DCT coefficients, perform inverse quantization and inverse DCT on the multiple reconstruction coefficients to obtain the second reconstructed image; or, if the multiple reconstruction coefficients are DCT coefficients, perform inverse DCT on the multiple reconstruction coefficients to obtain the second reconstructed image.
  • Computer-readable media may include computer-readable storage media, which correspond to tangible media, such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another (e.g., based on a communication protocol).
  • a computer-readable medium may generally correspond to (1) a non-transitory tangible computer-readable storage medium, or (2) a communication medium, such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this application.
  • a computer program product may include a computer readable medium.
  • such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage, flash memory, or any other medium that can contain the desired program code in the form of a computer and can be accessed by a computer.
  • any connection is properly termed a computer-readable medium.
  • coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave
  • coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of media.
  • Disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD) and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • DSPs digital signal processors
  • ASICs application specific integrated circuits
  • FPGAs field programmable logic arrays
  • the techniques of the present application may be implemented in a wide variety of apparatuses or devices, including wireless handsets, an integrated circuit (IC), or a group of ICs (e.g., a chipset).
  • Various components, modules, or units are described in this application to emphasize functional aspects of means for performing the disclosed techniques, but do not necessarily require realization by different hardware units. Indeed, as described above, the various units may be combined in a codec hardware unit in conjunction with suitable software and/or firmware, or provided by interoperating hardware units (comprising one or more processors as described above).

Abstract

The present application relates to the technical field of video or image compression based on artificial intelligence (AI), and in particular to the technical field of video compression based on a neural network. Provided are a method and apparatus for encoding and decoding a video image. The encoding method comprises: acquiring a first image, the first image being an image to be encoded or a decoded image (S801); performing probability estimation according to first context information to obtain a first probability estimation result, wherein the first context information is obtained from the first image (S802); and writing the first probability estimation result into a compressed bitstream (S803). A decoding end performs sampling according to the probability estimation result, so as to obtain an estimation coefficient, and obtains a reconstructed image on the basis of the estimation coefficient obtained by means of sampling. By means of the present application, a high-quality image can be obtained.
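The decoder-side sampling summarized in the abstract can be illustrated with a minimal Python sketch. It assumes that the probability estimation result for each coefficient is a Gaussian (mean, variance) pair, that the variance is preprocessed by multiplying it with a scaling factor, and that the estimated coefficient is the mean plus the standard deviation times a reference value drawn from a standard normal distribution. The function names and the exact sampling formula are illustrative assumptions and are not taken from the application.

```python
import math
import random


def preprocess_variance(variance, scaling_factor):
    """Scale the Gaussian variance (assumed form of the preprocessing).

    A scaling factor of 0 corresponds to setting the variance to 0.
    """
    return variance * scaling_factor


def sample_coefficient(mean, processed_variance, reference):
    """Assumed sampling rule: mean + sqrt(processed variance) * reference."""
    return mean + math.sqrt(processed_variance) * reference


def sample_coefficients(params, scaling_factor, rng):
    """Draw one estimated coefficient per (mean, variance) pair.

    `rng` is a random.Random instance; the reference value is assumed
    to be a standard normal draw.
    """
    estimated = []
    for mean, variance in params:
        reference = rng.gauss(0.0, 1.0)
        processed = preprocess_variance(variance, scaling_factor)
        estimated.append(sample_coefficient(mean, processed, reference))
    return estimated


if __name__ == "__main__":
    params = [(1.0, 4.0), (-2.0, 0.25)]  # hypothetical (mean, variance) pairs
    rng = random.Random(0)
    print(sample_coefficients(params, 1.0, rng))  # one sampled candidate
    print(sample_coefficients(params, 1.0, rng))  # another candidate
    print(sample_coefficients(params, 0.0, rng))  # variance 0: the means
```

Under these assumptions, repeating the sampling step yields several candidate coefficient sets, and a zero variance collapses every candidate onto the mean.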

Description

Method and device for encoding and decoding video images
This application claims priority to Chinese Patent Application No. 202110781903.8, filed with the China National Intellectual Property Administration on July 9, 2021 and entitled "Method and device for encoding and decoding video images", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of video encoding and decoding, and in particular to a method and an apparatus for encoding and decoding a video image.
Background
A digital image is image information recorded in the form of a digital signal. A digital image (hereinafter referred to as an image) can be regarded as a two-dimensional array of M rows and N columns containing M×N samples. The position of each sample is called a sample position, and the value of each sample is called a sample value.
In applications such as image storage and transmission, an image usually needs to be encoded to reduce the required storage capacity and transmission bandwidth. Image coding comprises two stages, encoding and decoding. A typical encoding process includes three steps: transform, quantization, and entropy encoding. For an image to be encoded, the first step decorrelates the image through a transform to obtain transform coefficients whose energy is more concentrated; the second step quantizes the transform coefficients to obtain quantized coefficients; and the third step entropy-encodes the quantized coefficients to obtain a compressed bitstream. Correspondingly, in a typical decoding process, the decoder receives the compressed bitstream and performs entropy decoding, inverse quantization, and inverse transform in sequence to obtain a reconstructed image.
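The transform, quantization, inverse quantization, and inverse transform steps described above can be illustrated with a minimal sketch. It assumes a one-dimensional orthonormal DCT and a uniform quantizer with an illustrative step size; the function names and sample values are hypothetical and are not taken from this application, and the entropy coding stage is omitted.

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II (the transform step)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() (the inverse transform step)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def quantize(coeffs, step):
    """Uniform quantization: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Inverse quantization: map each integer level back to a coefficient."""
    return [q * step for q in levels]

samples = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical sample values
step = 4.0                                   # hypothetical quantization step
levels = quantize(dct(samples), step)        # integers an entropy coder would encode
recon = idct(dequantize(levels, step))       # the single, deterministic reconstruction
```

Because every step in this sketch is deterministic, decoding the same `levels` always produces the same `recon`, which is the property of conventional decoding that the following paragraph refers to.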
In prior-art decoding methods, entropy decoding, inverse quantization, and inverse transform are generally deterministic processes; that is, decoding a compressed bitstream yields a unique reconstructed image, and the quality of that reconstructed image is not high under some evaluation metrics.
Summary
Embodiments of this application provide a method for encoding and decoding a video image and a related device, which can improve image quality.
The foregoing and other objects are achieved by the subject matter of the independent claims. Other implementations are apparent from the dependent claims, the detailed description, and the drawings.
Particular embodiments are outlined in the appended independent claims, and other embodiments are outlined in the dependent claims.
According to a first aspect, this application relates to a method for encoding a video image. The method is performed by an encoding apparatus and includes:
Acquiring a first image, the first image being an image to be encoded or a decoded image; performing probability estimation according to first context information to obtain a first probability estimation result, wherein the first context information is obtained from the first image; and writing the first probability estimation result into a compressed bitstream.
The first context information may be pixels in the first image, or coefficients in a first transformed image obtained by transforming the first image.
Probability estimation is performed at the encoding end to obtain a probability estimation result, and the result is transmitted to the decoding end, so that the decoding end can obtain a high-quality image by sampling based on the probability estimation result.
In a possible design, the method of this embodiment further includes:
Acquiring a second image, the second image being an image to be encoded or a decoded image and being different from the first image. Performing probability estimation according to the first context information to obtain the first probability estimation result includes:
Performing probability estimation according to the first context information and second context information to obtain the first probability estimation result, wherein the second context information is obtained from the second image.
Introducing the second context information yields a more accurate probability estimation result, so that the decoding end can obtain an image of better quality by sampling based on that result.
In a possible design, performing probability estimation according to the first context information to obtain the first probability estimation result includes:
Performing probability estimation according to context information of first data to obtain a probability estimation result of the first data; and performing probability estimation according to context information of second data to obtain a probability estimation result of the second data, wherein the first data and the second data are obtained from the first image, and the first context information includes the context information of the first data and the context information of the second data.
The encoding end computes the probability estimation result of each data item in the first image one by one and transmits each result to the decoding end, so that the decoding end can sample accurately based on the probability estimation result of each data item and thereby obtain a higher-quality reconstructed image.
In a possible design, the first probability estimation result includes a probability estimation result of a first preset area, the first preset area includes the first data and the second data, and the first preset area is located in the first image or in an image obtained by transforming the first image. Performing probability estimation according to the first context information to obtain the first probability estimation result includes:
Performing probability estimation according to the context information of the first data to obtain a probability estimation result of the first data; performing probability estimation according to the context information of the second data to obtain a probability estimation result of the second data, wherein the first context information includes the context information of the first data and the context information of the second data; and selecting the probability estimation result of the first preset area from the probability estimation result of the first data and the probability estimation result of the second data, wherein the first probability estimation result includes the probability estimation result of the first preset area.
The first preset area is an image block in the first image, a subband obtained by performing a wavelet transform on the first image, a frequency band obtained by performing a discrete cosine transform (DCT) on the first image, a transform block obtained by performing a DCT on the first image, or a channel in a three-dimensional feature map obtained by performing feature extraction on the first image.
Performing a DCT on the first image in units of one or more image blocks yields one or more transform blocks.
For the data in a preset area, the encoding end uses one probability estimation result as the probability estimation result of all the data in that area, so that only one probability estimation result needs to be transmitted, which reduces the size of the transmitted bitstream and the resources required to transmit it.
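The idea of sharing one probability estimation result across a preset area can be sketched as follows. The text does not specify how the shared result is selected from the per-data results, so the rule used here (keep the estimate whose variance is the lower median) is purely an assumption for illustration, as are the function name and the example values.

```python
from statistics import median_low

def select_area_estimate(estimates):
    """Pick one (mean, variance) pair to represent a whole preset area.

    estimates: list of (mean, variance) pairs, one per data item in the area.
    Hypothetical rule: keep the pair whose variance is the lower median.
    """
    target = median_low(variance for _, variance in estimates)
    return next(e for e in estimates if e[1] == target)

# Hypothetical per-data estimates for a four-item preset area.
per_data = [(0.0, 4.0), (1.0, 1.0), (2.0, 9.0), (3.0, 2.0)]
shared = select_area_estimate(per_data)

# Number of distribution parameters signalled in each case.
params_per_data = 2 * len(per_data)   # one mean and one variance per data item
params_per_area = 2                   # one mean and one variance for the area
```

In this toy case the per-area option signals 2 parameters instead of 8, which is the kind of saving the paragraph above describes.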
In a possible design, the first probability estimation result includes a probability estimation result of a second preset area, the second preset area is located in the first image or in an image obtained by transforming the first image, and the first context information includes context information of the second preset area. Performing probability estimation according to the first context information to obtain the first probability estimation result includes: performing probability estimation according to the context information of the second preset area to obtain the probability estimation result of the second preset area, wherein the first probability estimation result includes the probability estimation result of the second preset area.
The second preset area is an image block in the first image, a subband obtained by performing a wavelet transform on the first image, a frequency band obtained by performing a DCT on the first image, a transform block obtained by performing a DCT on the first image, or a channel in a three-dimensional feature map obtained by performing feature extraction on the first image.
Performing a DCT on the first image in units of one or more image blocks yields one or more transform blocks.
For the data in a preset area, the encoding end uses one probability estimation result as the probability estimation result of all the data in that area, so that only one probability estimation result needs to be transmitted, which reduces the size of the transmitted bitstream and the resources required to transmit it.
In a possible design, the encoding method further includes:
Setting the value of a first identifier of the first preset area to a first value, to indicate that the probability estimation result of the first preset area is used whenever the estimated coefficients in the first preset area are obtained by sampling; and storing the probability estimation result of the first preset area in a probability estimation result set and recording the index of that result in the set. Writing the probability estimation result into the compressed bitstream includes: writing the probability estimation result set, the index, size information of the first preset area, and the first identifier into the compressed bitstream.
When there are multiple probability estimation results for multiple preset areas, the encoding end stores them in the probability estimation result set and records the position (that is, the index) of each preset area's result in the set, so that the decoding end can use the index to accurately determine the probability estimation result of each preset area from the set decoded from the bitstream, which ensures decoding accuracy. The size information indicates how many times sampling must be performed based on the probability estimation result of the first preset area in order to obtain all the estimated coefficients in the first preset area.
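The roles of the result set, the index, the size information, and the first identifier can be sketched as below. This is an illustration under stated assumptions, not the bitstream syntax of this application: the `AreaHeader` record, its field names, the use of 1 as the first value, and the Gaussian (mean, variance) form of each result are all hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class AreaHeader:
    flag: int     # 1 (assumed "first value"): all coefficients share one estimate
    index: int    # position of the area's estimate in the result set
    height: int   # size information of the preset area
    width: int

def encode_areas(areas):
    """areas: list of ((mean, variance), (height, width)) per preset area."""
    result_set, headers = [], []
    for estimate, (height, width) in areas:
        headers.append(AreaHeader(1, len(result_set), height, width))
        result_set.append(estimate)
    return result_set, headers

def decode_area(result_set, header, rng):
    """Sample every estimated coefficient of one area from its shared estimate."""
    mean, variance = result_set[header.index]
    count = header.height * header.width   # number of sampling operations
    return [rng.gauss(mean, variance ** 0.5) for _ in range(count)]

result_set, headers = encode_areas([((0.0, 1.0), (2, 3)), ((5.0, 0.0), (4, 4))])
coeffs_a = decode_area(result_set, headers[0], random.Random(0))
coeffs_b = decode_area(result_set, headers[1], random.Random(0))
```

Here the size information (2×3 and 4×4) determines that 6 and 16 samples are drawn, and the index tells the decoder which entry of the set to sample from.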
In a possible design, the encoding method further includes:
Setting the value of the first identifier of the first preset area to the first value, to indicate that the probability estimation result of the first preset area is used whenever the estimated coefficients in the first preset area are obtained by sampling; preprocessing the probability estimation result of the first preset area according to a scaling factor of the first preset area to obtain a processed probability estimation result; and storing the processed probability estimation result in a probability estimation result set and recording the index of the processed result in the set. Writing the probability estimation result into the compressed bitstream includes: writing the probability estimation result set, the index, size information of the first preset area, and the first identifier into the compressed bitstream.
The encoding end preprocesses the probability estimation result of the first preset area to obtain a processed probability estimation result, and the decoding end samples based on the processed result to obtain a reconstructed image. Different preprocessing methods yield reconstructed images of different quality, for example an image with high subjective quality or an image with high objective quality.
In a possible design, the encoding method further includes:
Setting the value of the first identifier of the first preset area to the first value, to indicate that the probability estimation result of the first preset area is used whenever the estimated coefficients in the first preset area are obtained by sampling. Writing the first probability estimation result into the compressed bitstream includes: writing the probability estimation result of the first preset area, the size information of the first preset area, and the first identifier into the bitstream.
Setting the value of the first identifier of the first preset area to the first value indicates to the decoding end that, after decoding the probability estimation result of the first preset area, it is to use that result whenever it samples the estimated coefficients in the first preset area. The size information indicates how many times sampling must be performed based on the probability estimation result of the first preset area in order to obtain all the estimated coefficients in the first preset area.
In a possible design, the encoding method further includes:
Preprocessing the probability estimation result of the first data to obtain a processed probability estimation result.
In a possible design, the probability estimation result of the first data includes the mean and the variance of a Gaussian distribution, and preprocessing the probability estimation result of the first data to obtain the processed probability estimation result includes:
Setting the variance of the Gaussian distribution to 0 as the processed variance, wherein the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance.
In a possible design, the probability estimation result of the first data includes the mean and the variance of a Gaussian distribution, and preprocessing the probability estimation result of the first data to obtain the processed probability estimation result includes: preprocessing the variance of the Gaussian distribution according to a scaling factor of the first data to obtain a processed variance, wherein the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance. In this case:
The scaling factor of the first data is the same as the scaling factor of the second data; or the scaling factor of the first data is different from the scaling factor of the second data; or
Preprocessing the probability estimation result of the first data according to content information of the preset area to which the first data belongs, to obtain the processed probability estimation result, includes: determining the scaling factor of the first data according to the content information of the preset area to which the first data belongs, and preprocessing the variance of the Gaussian distribution according to the scaling factor to obtain the processed variance. The content information of the preset area includes a texture resolution level or a texture complexity of the preset area.
As an example, the texture complexity can be computed: a preset area with complex texture is considered to have a high resolution level, and a preset area with smooth texture is considered to have a low resolution level. For first data and second data that both belong to a preset area with a high resolution level, the shrinkage factor of the first data differs from that of the second data; for first data and second data that both belong to a preset area with a low resolution level, the shrinkage factor of the first data is the same as that of the second data. As another example, for first data and second data that both belong to a preset area with high texture complexity, the shrinkage factor of the first data differs from that of the second data; for first data and second data that both belong to a preset area with low texture complexity, the shrinkage factor of the first data is the same as that of the second data.
The preset area may be an image block, a subband, a frequency band, or a channel, as described below.
If the first data and the second data belong to the same image block in the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different image blocks, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the image block to which the first data belongs;
or,
If the first data and the second data belong to the same subband among the plurality of subbands obtained by performing a wavelet transform on the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different subbands, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the subband to which the first data belongs;
or,
If the first data and the second data belong to the same frequency band among the plurality of frequency bands, or to the same transform block among the plurality of transform blocks, obtained by performing a DCT on the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different frequency bands or transform blocks, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the frequency band or transform block to which the first data belongs;
or,
If the first data and the second data belong to the same channel of the three-dimensional feature map obtained by performing feature extraction on the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different channels, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the channel to which the first data belongs.
It should be noted that the texture complexity of the image block to which the first data belongs may be determined according to the content of the corresponding image block in the image to be encoded or the decoded image; the texture complexity of the subband to which the first data belongs may be determined according to the content of the part of the image to be encoded or the decoded image that corresponds to the subband; the texture complexity of the frequency band to which the first data belongs may be determined according to the content of the part of the image to be encoded or the decoded image that corresponds to the frequency band; and the texture complexity of the channel to which the first data belongs may be determined according to the content of the part of the image to be encoded or the decoded image that corresponds to the channel. In one example, the greater the texture complexity of the first data, the greater the scaling factor of the first data.
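The relationship "greater texture complexity, greater scaling factor" can be sketched as follows. The text does not define how texture complexity is measured or how it is mapped to a scaling factor, so both choices here are assumptions: complexity is taken as the population variance of the pixel values in a block, and the mapping is an arbitrary monotonically increasing function with an illustrative `half_point` parameter.

```python
from statistics import pvariance

def texture_complexity(block):
    """Hypothetical measure: population variance of the pixel values in a block."""
    pixels = [p for row in block for p in row]
    return pvariance(pixels)

def scaling_factor(complexity, half_point=100.0):
    """Monotonically increasing map from complexity to [0, 1).

    half_point is an illustrative constant: the complexity at which the
    factor reaches 0.5.
    """
    return complexity / (complexity + half_point)

flat_block = [[128] * 4 for _ in range(4)]        # smooth texture
busy_block = [[0, 255, 0, 255] for _ in range(4)] # complex texture

flat_scale = scaling_factor(texture_complexity(flat_block))
busy_scale = scaling_factor(texture_complexity(busy_block))
```

With these assumptions the smooth block receives a scaling factor of 0 and the highly textured block a factor close to 1, so data in textured areas would keep more of their estimated variance.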
In a possible design, the encoding method further includes:
Preprocessing the probability estimation result of the second preset area to obtain a processed probability estimation result.
In a possible design, the probability estimation result of the first data includes the mean and the variance of a Gaussian distribution, and preprocessing the probability estimation result of the first data to obtain the processed probability estimation result includes:
Setting the variance of the Gaussian distribution to 0 as a first variance, wherein the processed probability estimation result includes the mean of the Gaussian distribution and the first variance; or processing the variance of the Gaussian distribution according to a scaling factor of the second preset area to obtain a second variance, wherein the processed probability estimation result includes the mean of the Gaussian distribution and the second variance, and the scaling factor of the first preset area and the scaling factor of the second preset area are the same or different.
In a possible design, the first context information includes some or all of the pixel values in the first image.
Preprocessing the probability estimation result makes it possible to obtain reconstructed images with different properties according to the user's needs, which improves the quality of the reconstructed image. For example, setting the variance of the probability estimation result to 0 as the processed variance yields the reconstructed image with the best signal quality (best objective quality), that is, it increases the peak signal-to-noise ratio (PSNR) of the image or reduces its mean squared error (MSE). Setting the scaling factors of multiple data items to the same value yields the image with the best subjective quality, which corresponds to a lower PSNR or a higher MSE. Setting the same scaling factor for data belonging to the same part of the image and different scaling factors for data belonging to different parts yields an image whose properties lie between best subjective quality and best objective quality.
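The effect of the two preprocessing options on MSE can be illustrated with a small simulation. It assumes that the scaling factor multiplies the Gaussian variance (the text only says the variance is preprocessed according to the factor), that the true coefficients really follow the estimated Gaussian, and that the decoder draws one sample per coefficient; the function names, seed, and parameter values are hypothetical.

```python
import random

def preprocess(mean, variance, scale=None):
    """scale=None models 'set the variance to 0'; otherwise the variance is
    multiplied by the scaling factor (an assumed form of preprocessing)."""
    if scale is None:
        return mean, 0.0
    return mean, scale * variance

def sample_coefficient(mean, variance, rng):
    """Decoder-side sampling of one estimated coefficient."""
    return rng.gauss(mean, variance ** 0.5)

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
mean, variance = 10.0, 4.0
# Hypothetical "true" coefficients that follow the estimated distribution.
truth = [rng.gauss(mean, variance ** 0.5) for _ in range(10000)]

recon_zero = [sample_coefficient(*preprocess(mean, variance), rng) for _ in truth]
recon_full = [sample_coefficient(*preprocess(mean, variance, 1.0), rng) for _ in truth]

mse_zero = mean_squared_error(truth, recon_zero)   # every sample equals the mean
mse_full = mean_squared_error(truth, recon_full)   # samples keep the full variance
```

Under these assumptions, sampling with zero variance always returns the mean, so its expected MSE is the variance itself, while sampling with the unscaled variance has roughly twice that expected MSE. This is consistent with the statement that a zero variance favors objective quality, whereas retaining variance trades PSNR for reconstructions that look more like natural samples.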
In a possible design, the encoding method further includes:
Transforming the first image to obtain a first transformed image, wherein: if the transform is a wavelet transform, the first context information includes some or all of the coefficients in the first transformed image, and the coefficients are wavelet coefficients or quantized wavelet coefficients; or if the transform is a DCT, the first context information includes some or all of the coefficients in the first transformed image, and the coefficients are DCT coefficients or quantized DCT coefficients; or if the transform is a feature transform, the first context information includes some or all of the coefficients in the first transformed image, and the coefficients are feature coefficients or quantized feature coefficients.
In a possible design, performing probability estimation according to the first context information to obtain the first probability estimation result includes:
Inputting the first context information into a first probability estimation network for processing to obtain parameters of a first probability distribution model, wherein the probability estimation result includes the parameters of the first probability distribution model;
or,
Inputting the first context information into a second probability estimation network for processing to obtain a target probability distribution, wherein the probability estimation result includes parameters of the target probability distribution, and the first probability estimation network and the second probability estimation network are implemented by neural networks.
According to a second aspect, this application relates to a method for encoding a video image. The method is performed by an encoding apparatus and includes:
Obtaining a plurality of coefficients according to an image to be encoded, the plurality of coefficients including a first coefficient; obtaining a first probability estimation result according to context information of the first coefficient; and writing the first coefficient and the first probability estimation result into a compressed bitstream.
The first coefficient may be a pixel in the image to be encoded, or a coefficient in a transformed image obtained by transforming the image to be encoded.
Probability estimation is performed at the encoding end to obtain a probability estimation result, and the result is transmitted to the decoding end, so that the decoding end can obtain a high-quality image by sampling based on the probability estimation result.
In a possible design, the plurality of coefficients further includes a second coefficient, and the encoding method further includes:
Obtaining a second probability estimation result according to context information of the second coefficient. Writing the first coefficient and the first probability estimation result into the compressed bitstream includes: writing the first coefficient, the first probability estimation result, the second coefficient, and the second probability estimation result into the compressed bitstream.
The encoding end computes the probability estimation result of each coefficient in the image to be encoded one by one and transmits each result to the decoding end, so that the decoding end can sample accurately based on the probability estimation result of each coefficient and thereby obtain a higher-quality reconstructed image.
In a possible design, the plurality of coefficients further includes a second coefficient, and the first coefficient and the second coefficient belong to the same preset area. The preset area is located in the image to be encoded, or in an image obtained by transforming the image to be encoded. Obtaining the first probability estimation result according to the context information of the first coefficient includes:
performing probability estimation according to the context information of the first coefficient to obtain a third probability estimation result; performing probability estimation according to the context information of the second coefficient to obtain a second probability estimation result; and determining the first probability estimation result from the third probability estimation result and the second probability estimation result.
Writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient and the first probability estimation result into the compressed code stream.
The preset area is an image block in the image to be encoded, or a subband obtained by performing wavelet transform on the image to be encoded, or a frequency band obtained by performing DCT on the image to be encoded, or a transform block obtained by performing DCT on the image to be encoded, or a channel of a three-dimensional feature map obtained by performing feature extraction on the image to be encoded.
Performing DCT on the image to be encoded in units of one or more image blocks yields one or more transform blocks.
For the data of a preset area, the encoder uses a single probability estimation result as the probability estimation result of all coefficients in that preset area. Only one probability estimation result then needs to be transmitted, which reduces the size of the transmitted code stream and the resources required to transmit it.
In a possible design, the plurality of coefficients further includes a second coefficient, and the first coefficient and the second coefficient belong to the same preset area. The preset area is located in the image to be encoded, or in an image obtained by transforming the image to be encoded. Obtaining the first probability estimation result according to the context information of the first coefficient includes:
performing probability estimation according to context information of the preset area to obtain the first probability estimation result, where the context information of the preset area includes the context information of the first coefficient. Writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient and the first probability estimation result into the compressed code stream.
For the data of a preset area, the encoder uses a single probability estimation result as the probability estimation result of all coefficients in that preset area. Only one probability estimation result then needs to be transmitted, which reduces the size of the transmitted code stream and the resources required to transmit it.
In a possible design, the encoding method further includes:
setting the value of a first identifier of the preset area to a first value, to indicate that the first probability estimation result is used for every estimated coefficient sampled in the preset area; and saving the first probability estimation result into a probability estimation result set and recording the index of the first probability estimation result in that set. Writing the first coefficient, the second coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, the probability estimation result set, the index, size information of the preset area and the first identifier into the compressed code stream.
When there are multiple preset areas with multiple probability estimation results, the encoder saves these results into the probability estimation result set and records the position (that is, the index) of each preset area's probability estimation result in the set. The decoder can then use the index to accurately determine each preset area's probability estimation result from the probability estimation result set decoded from the code stream, which ensures decoding accuracy. The size information indicates how many times sampling must be performed based on the preset area's probability estimation result in order to obtain all estimated coefficients in the preset area.
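The bookkeeping described here (a result set plus one index per preset area) can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the names `build_estimate_set` and `lookup_estimate`, the region labels, and the representation of an estimate as a `(mean, variance)` tuple are all hypothetical, and entropy coding of the set and indices is omitted.

```python
from typing import Dict, List, Tuple

# Assumption: a probability estimation result is a Gaussian (mean, variance) pair.
Estimate = Tuple[float, float]


def build_estimate_set(
    region_estimates: Dict[str, Estimate],
) -> Tuple[List[Estimate], Dict[str, int]]:
    """Encoder side: save each area's estimate and record its index in the set."""
    estimate_set: List[Estimate] = []
    region_index: Dict[str, int] = {}
    for region, estimate in region_estimates.items():
        region_index[region] = len(estimate_set)  # position in the set
        estimate_set.append(estimate)
    return estimate_set, region_index


def lookup_estimate(
    estimate_set: List[Estimate], region_index: Dict[str, int], region: str
) -> Estimate:
    """Decoder side: recover an area's estimate from the decoded set and index."""
    return estimate_set[region_index[region]]


estimate_set, region_index = build_estimate_set(
    {"block0": (0.5, 2.0), "block1": (-1.0, 0.25)}
)
print(lookup_estimate(estimate_set, region_index, "block1"))
```

In this sketch the decoder needs only the set and the per-area index to recover the estimate, and the size information would tell it how many coefficients to sample with that estimate.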
In a possible design, the encoding method further includes:
setting the value of the first identifier of the preset area to the first value, to indicate that the first probability estimation result is used for every estimated coefficient sampled in the preset area. Writing the first coefficient, the second coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, the first probability estimation result, size information of the preset area and the first identifier into the compressed code stream.
Setting the value of the first identifier of the preset area to the first value tells the decoder that, once it has decoded the probability estimation result of the preset area, that result is used for every estimated coefficient sampled in the preset area. The size information indicates how many times sampling must be performed based on the preset area's probability estimation result in order to obtain all estimated coefficients in the preset area.
In a possible design, the first coefficient and the second coefficient belong to the same preset area, and the encoding method further includes:
setting the value of the first identifier of the preset area to a second value, to indicate that each estimated coefficient sampled in the preset area uses its own probability estimation result. Writing the first coefficient, the first probability estimation result, the second coefficient and the second probability estimation result into the compressed code stream includes: writing the first coefficient, the first probability estimation result, the second coefficient, the second probability estimation result and the first identifier of the preset area into the compressed code stream.
In a possible design, the encoding method further includes:
preprocessing the probability estimation result of the first coefficient to obtain a processed probability estimation result.
In a possible design, the probability estimation result of the first coefficient includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the first coefficient to obtain the processed probability estimation result includes:
setting the variance of the Gaussian distribution to 0 as the processed variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance.
In a possible design, the probability estimation result of the first coefficient includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the first coefficient to obtain the processed probability estimation result includes:
preprocessing the variance of the Gaussian distribution according to a scaling factor of the first coefficient to obtain a processed variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance;
the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or,
preprocessing the probability estimation result of the first coefficient according to content information of the preset area to which the first coefficient belongs to obtain the processed probability estimation result, which includes: determining the scaling factor of the first coefficient according to the content information of the preset area to which the first coefficient belongs, and preprocessing the variance of the Gaussian distribution according to the scaling factor to obtain the processed variance. The content information of the preset area includes the texture resolution level or the texture complexity of the preset area.
As an example, the texture complexity can be calculated: a preset area with complex texture is regarded as having a high resolution level, and a preset area with smooth texture is regarded as having a low resolution level. For a first coefficient and a second coefficient that both belong to a preset area with a high resolution level, the scaling factor of the first coefficient differs from that of the second coefficient; for a first coefficient and a second coefficient that both belong to a preset area with a low resolution level, the two scaling factors are the same. As another example, for a first coefficient and a second coefficient that both belong to a preset area with high texture complexity, the scaling factor of the first coefficient differs from that of the second coefficient; for a first coefficient and a second coefficient that both belong to a preset area with low texture complexity, the two scaling factors are the same.
The preset area may be the image block, subband, frequency band, transform block or channel described below.
If the first coefficient and the second coefficient belong to the same image block in the image to be encoded, the scaling factor of the first coefficient is the same as that of the second coefficient; or if the first coefficient and the second coefficient belong to different image blocks, the two scaling factors are different; or the scaling factor of the first coefficient is determined according to the texture complexity of the image block to which the first coefficient belongs; or,
if the first coefficient and the second coefficient belong to the same subband among the multiple subbands obtained by performing wavelet transform on the image to be encoded, the scaling factor of the first coefficient is the same as that of the second coefficient; or if the first coefficient and the second coefficient belong to different subbands, the two scaling factors are different; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs;
or,
if the first coefficient and the second coefficient belong to the same frequency band among the multiple frequency bands obtained by performing DCT on the image to be encoded, the scaling factor of the first coefficient is the same as that of the second coefficient; or if the first coefficient and the second coefficient belong to different frequency bands, the two scaling factors are different; or the scaling factor of the first coefficient is determined according to the texture complexity of the frequency band to which the first coefficient belongs;
or,
if the first coefficient and the second coefficient belong to the same channel of the three-dimensional feature map obtained by performing feature extraction on the image to be encoded, the scaling factor of the first coefficient is the same as that of the second coefficient; or if the first coefficient and the second coefficient belong to different channels, the two scaling factors are different; or the scaling factor of the first coefficient is determined according to the texture complexity of the channel to which the first coefficient belongs.
It should be noted that the texture complexity of the image block to which the first coefficient belongs may be determined according to the content of that image block in the image to be encoded; the texture complexity of the subband to which the first coefficient belongs may be determined according to the content of the part of the image to be encoded that corresponds to that subband; the texture complexity of the frequency band to which the first coefficient belongs may be determined according to the content of the part of the image to be encoded that corresponds to that frequency band; and the texture complexity of the channel to which the first coefficient belongs may be determined according to the content of the part of the image to be encoded that corresponds to that channel. The greater the texture complexity associated with the first coefficient, the larger the scaling factor of the first coefficient.
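The rule that a larger texture complexity gives a larger scaling factor can be illustrated with a small sketch. The text does not specify how texture complexity is measured or how it maps to a scaling factor, so both choices below are assumptions made only for illustration: complexity is taken as the population variance of the pixel values in a block, and the mapping `c / (c + k)` with a hypothetical constant `k` is just one monotonically increasing function with values in [0, 1).

```python
from statistics import pvariance
from typing import Sequence


def texture_complexity(block: Sequence[Sequence[float]]) -> float:
    """Assumed complexity measure: population variance of the block's pixel values."""
    pixels = [float(p) for row in block for p in row]
    return float(pvariance(pixels))


def scaling_factor(complexity: float, k: float = 100.0) -> float:
    """Assumed mapping: increases with complexity, 0 for a flat block, below 1."""
    return complexity / (complexity + k)


flat_block = [[10, 10], [10, 10]]        # smooth texture
textured_block = [[0, 255], [255, 0]]    # complex texture

print(scaling_factor(texture_complexity(flat_block)))
print(scaling_factor(texture_complexity(textured_block)))
```

All coefficients in the same block share the block's scaling factor here, which matches the case where coefficients of one preset area use the same factor.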
In a possible design, the encoding method further includes:
preprocessing the probability estimation result of the preset area to obtain a processed probability estimation result.
In a possible design, the probability estimation result of the preset area includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the preset area to obtain the processed probability estimation result includes:
setting the variance of the Gaussian distribution to 0 as a first variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the first variance; or processing the variance of the Gaussian distribution according to a scaling factor of the preset area to obtain a second variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the second variance.
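The two preprocessing options can be sketched as follows. The text says only that the variance is processed "according to" the scaling factor; multiplying the variance by the factor is an assumption made here for illustration, and the names `GaussianEstimate` and `preprocess` are hypothetical.

```python
from typing import NamedTuple, Optional


class GaussianEstimate(NamedTuple):
    mean: float
    variance: float


def preprocess(
    estimate: GaussianEstimate, scale: Optional[float] = None
) -> GaussianEstimate:
    """Return the processed estimate; the mean is always kept unchanged.

    scale is None -> variance set to 0 (the "first variance" option).
    scale given   -> variance multiplied by scale (assumed form of the
                     "second variance" option).
    """
    if scale is None:
        return estimate._replace(variance=0.0)
    if scale < 0:
        raise ValueError("scaling factor must be non-negative")
    return estimate._replace(variance=estimate.variance * scale)


original = GaussianEstimate(mean=1.0, variance=4.0)
print(preprocess(original))        # zero-variance option
print(preprocess(original, 0.5))   # scaled-variance option
```

With zero variance, sampling always returns the mean, so the reconstruction is deterministic; a non-zero scaled variance keeps some randomness in the sampled coefficients.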
By preprocessing the probability estimation result, reconstructed images with different properties can be obtained according to the user's needs, which improves the quality of the reconstructed image. For example, setting the variance of the probability estimation result to 0 as the processed variance yields the reconstructed image with the best signal quality (best objective quality), that is, it increases the PSNR of the image (equivalently, decreases its MSE). Setting the scaling factors of the multiple coefficients to the same value yields the image with the best subjective quality, at the cost of a lower PSNR (a higher MSE). Setting the scaling factors of coefficients that belong to the same part of the image to the same value, and those of coefficients that belong to different parts to different values, yields an image whose properties lie between best subjective quality and best objective quality.
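For reference, PSNR and MSE are related by the standard definition below. The symbols are introduced here and are not taken from this application: $x$ is the original image, $\hat{x}$ the reconstructed image, $N$ the number of samples, and $\mathrm{MAX}_I$ the peak sample value (for example 255 for 8-bit samples). Because PSNR is a decreasing function of MSE, a higher PSNR always corresponds to a lower MSE.

```latex
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\bigl(x_i - \hat{x}_i\bigr)^2,
\qquad
\mathrm{PSNR} = 10\,\log_{10}\!\left(\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}\right)
```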
In a possible design, if the plurality of coefficients are a plurality of pixel values in the image to be encoded, the first context information includes some or all of the pixel values in the first image; or,
obtaining the plurality of coefficients according to the image to be encoded includes:
if wavelet transform is performed on the image to be encoded to obtain the plurality of coefficients, the plurality of coefficients are a plurality of wavelet coefficients, and the first context information includes some or all of the plurality of wavelet coefficients; or, if wavelet transform and quantization are performed on the image to be encoded to obtain the plurality of coefficients, the plurality of coefficients are a plurality of quantized wavelet coefficients, and the first context information includes some or all of the plurality of quantized wavelet coefficients; or, if DCT is performed on the image to be encoded to obtain the plurality of coefficients, the plurality of coefficients are a plurality of DCT coefficients, and the first context information includes some or all of the plurality of DCT coefficients; or, if DCT and quantization are performed on the image to be encoded to obtain the plurality of coefficients, the plurality of coefficients are a plurality of quantized DCT coefficients, and the first context information includes some or all of the plurality of quantized DCT coefficients; or, if feature extraction is performed on the image to be encoded to obtain the plurality of coefficients, the plurality of coefficients are a plurality of feature coefficients, and the first context information includes some or all of the plurality of feature coefficients; or, if feature extraction and quantization are performed on the image to be encoded to obtain the plurality of coefficients, the plurality of coefficients are a plurality of quantized feature coefficients, and the first context information includes some or all of the plurality of quantized feature coefficients.
In a possible design, obtaining the first probability estimation result according to the context information of the first coefficient includes:
obtaining a second probability distribution model; inputting the first context information into a third probability estimation network for processing to obtain parameters of the second probability distribution model; and obtaining the first probability estimation result according to the second probability distribution model and the parameters of the second probability distribution model;
or,
inputting the first context information into a fourth probability estimation network for processing to obtain the first probability estimation result; where the third probability estimation network and the fourth probability estimation network are implemented by neural networks.
In a third aspect, this application relates to a method for decoding a video image. The method is performed by a decoding apparatus and includes:
decoding a compressed code stream to obtain a first probability estimation result; performing sampling according to the first probability estimation result to obtain a first estimated coefficient; and obtaining a first reconstructed image according to the first estimated coefficient.
In a possible design, the decoding method further includes:
decoding the compressed code stream to obtain a second probability estimation result; and performing sampling according to the second probability estimation result to obtain a second estimated coefficient. Obtaining the first reconstructed image according to the first estimated coefficient includes: obtaining the first reconstructed image according to the first estimated coefficient and the second estimated coefficient.
In a possible design, decoding the compressed code stream to obtain the first probability estimation result includes:
decoding a first identifier from the compressed code stream; if the value of the first identifier is a first value, decoding the compressed code stream to obtain the first probability estimation result includes:
decoding a probability estimation result set and an index of a preset area from the compressed code stream, where the preset area includes the first estimated coefficient and is an area in the first reconstructed image; and determining the probability estimation result of the preset area from the probability estimation result set according to the index, where the first probability estimation result is the probability estimation result of the preset area. The value of the first identifier being the first value indicates that the probability estimation result of the preset area is used for every estimated coefficient sampled in the preset area.
In a possible design, the decoding method further includes:
decoding the first identifier from the compressed code stream; if the value of the first identifier is the first value, decoding the compressed code stream to obtain the first probability estimation result includes: decoding the probability estimation result of a preset area and size information of the preset area from the compressed code stream, where the preset area includes the first estimated coefficient and is an area in the first reconstructed image, and the probability estimation result of the preset area is the first probability estimation result. The value of the first identifier being the first value indicates that the probability estimation result of the preset area is used for every estimated coefficient sampled in the preset area.
In a possible design, the first estimated coefficient and the second estimated coefficient belong to the same preset area, the preset area is an area in the first reconstructed image, and the decoding method further includes:
decoding the first identifier from the compressed code stream, where the value of the first identifier being a second value indicates that each estimated coefficient sampled in the preset area uses its own probability estimation result.
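The decoder-side role of the first identifier can be sketched as follows, assuming the fields have already been parsed from the code stream. Everything concrete here is hypothetical: the numeric values chosen for the first and second values, the function name `estimates_for_area`, and the use of `(mean, variance)` tuples. The coefficient count stands in for what the size information of the preset area would provide.

```python
from typing import List, Optional, Sequence, Tuple

Estimate = Tuple[float, float]  # assumed (mean, variance) representation

FIRST_VALUE = 1   # assumption: shared estimate for the whole preset area
SECOND_VALUE = 0  # assumption: each coefficient has its own estimate


def estimates_for_area(
    identifier: int,
    num_coeffs: int,
    shared: Optional[Estimate] = None,
    per_coeff: Optional[Sequence[Estimate]] = None,
) -> List[Estimate]:
    """Return the estimate to sample from for each coefficient of the area."""
    if identifier == FIRST_VALUE:
        if shared is None:
            raise ValueError("shared estimate required for the first value")
        # One decoded estimate is reused num_coeffs times.
        return [shared] * num_coeffs
    if identifier == SECOND_VALUE:
        if per_coeff is None or len(per_coeff) != num_coeffs:
            raise ValueError("one estimate per coefficient is required")
        return list(per_coeff)
    raise ValueError(f"unknown identifier value: {identifier}")


print(estimates_for_area(FIRST_VALUE, 3, shared=(0.0, 1.0)))
print(estimates_for_area(SECOND_VALUE, 2, per_coeff=[(0.0, 1.0), (2.0, 0.5)]))
```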
In a possible design, the first probability estimation result includes the mean and variance of a Gaussian distribution, and performing sampling according to the first probability estimation result to obtain the first estimated coefficient includes:
obtaining a first random number; determining a first reference value according to the first random number, where the first reference value follows a Gaussian distribution; and determining the first estimated coefficient according to the first reference value and the mean and variance of the first probability estimation result.
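The three sampling steps can be sketched as follows. The text does not state how the random number is drawn, how it is mapped to the reference value, or how the reference value is combined with the mean and variance, so this sketch makes three assumptions: the random number is uniform on (0, 1), the reference value is obtained through the inverse CDF of the standard normal distribution, and the estimated coefficient is `mean + sqrt(variance) * reference`. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def draw_random_number(rng: random.Random) -> float:
    """Step 1 (assumed): a uniform random number strictly inside (0, 1)."""
    u = rng.random()
    while u == 0.0:  # inv_cdf is undefined at 0, so redraw in that rare case
        u = rng.random()
    return u


def reference_value(u: float) -> float:
    """Step 2 (assumed): map u to a standard-normal reference value."""
    return _STANDARD_NORMAL.inv_cdf(u)


def sample_coefficient(mean: float, variance: float, u: float) -> float:
    """Step 3 (assumed): shift and scale the reference value."""
    return mean + math.sqrt(variance) * reference_value(u)


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
print(sample_coefficient(3.0, 4.0, draw_random_number(rng)))
```

Under these assumptions a variance of 0 makes the result equal to the mean regardless of the random number, which is consistent with the zero-variance preprocessing option giving a deterministic reconstruction.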
In a possible design, the decoding method further includes:
preprocessing the variance of the first probability estimation result to obtain a processed variance.
Determining the first estimated coefficient according to the first reference value and the mean and variance of the first probability estimation result includes:
determining the first estimated coefficient according to the first reference value, the mean of the first probability estimation result and the processed variance.
In a possible design, preprocessing the variance of the first probability estimation result to obtain the processed variance includes:
setting the variance of the first probability estimation result to 0 as the processed variance.
In a possible design, the first estimated coefficient is a quantized wavelet coefficient, a wavelet coefficient, a quantized DCT coefficient, a DCT coefficient, a feature coefficient, or a quantized feature coefficient, and preprocessing the variance of the first probability estimation result to obtain the processed variance includes:
preprocessing the variance of the first probability estimation result according to a scaling factor of the first estimated coefficient to obtain the processed variance,
where the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or,
preprocessing the probability estimation result of the first estimated coefficient according to content information of the preset area to which the first estimated coefficient belongs to obtain the processed probability estimation result, which includes: determining the scaling factor of the first estimated coefficient according to the content information of the preset area to which the first estimated coefficient belongs, and preprocessing the variance of the Gaussian distribution according to the scaling factor to obtain the processed variance. The content information of the preset area includes the texture resolution level or the texture complexity of the preset area.
As an example, the texture complexity can be calculated: a preset area with complex texture is regarded as having a high resolution level, and a preset area with smooth texture is regarded as having a low resolution level. For a first estimated coefficient and a second estimated coefficient that both belong to a preset area with a high resolution level, the scaling factor of the first estimated coefficient differs from that of the second estimated coefficient; for a first estimated coefficient and a second estimated coefficient that both belong to a preset area with a low resolution level, the two scaling factors are the same. As another example, for a first estimated coefficient and a second estimated coefficient that both belong to a preset area with high texture complexity, the scaling factor of the first estimated coefficient differs from that of the second estimated coefficient; for a first estimated coefficient and a second estimated coefficient that both belong to a preset area with low texture complexity, the two scaling factors are the same.
The preset area may be the image block, subband, frequency band, transform block or channel described below.
When the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients or wavelet coefficients: if the two coefficients belong to the same subband, the scaling factor of the first estimated coefficient is the same as that of the second estimated coefficient; or, if the two coefficients belong to different subbands, the scaling factor of the first estimated coefficient is different from that of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs;
or,
when the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients or DCT coefficients: if the two coefficients belong to the same frequency band or transform block, the scaling factor of the first estimated coefficient is the same as that of the second estimated coefficient; or, if the two coefficients belong to different frequency bands or transform blocks, the scaling factor of the first estimated coefficient is different from that of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the frequency band or transform block to which the first estimated coefficient belongs;
or,
when the first estimated coefficient and the second estimated coefficient are feature coefficients or quantized feature coefficients: if the two coefficients belong to the same channel, the scaling factor of the first estimated coefficient is the same as that of the second estimated coefficient; or, if the two coefficients belong to different channels, the scaling factor of the first estimated coefficient is different from that of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
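As a non-limiting illustration of the rule above, the following minimal sketch assigns one scaling factor per region (subband, frequency band, transform block, or channel), so that coefficients in the same region share a factor and coefficients in different regions do not. The function name, the region labels, and the factor values are hypothetical and are not taken from this application.

```python
def assign_scaling_factors(region_of_coeff, factor_of_region):
    """Return one scaling factor per coefficient; coefficients of one region share it."""
    return [factor_of_region[region] for region in region_of_coeff]

# Hypothetical example: four wavelet coefficients located in subbands LL, LL, HL, HH.
regions = ["LL", "LL", "HL", "HH"]
# Illustrative factor values only; the application does not prescribe them.
factors = {"LL": 0.0, "HL": 0.5, "HH": 1.0}
per_coeff = assign_scaling_factors(regions, factors)
```

Here the two LL coefficients receive the same factor, while the HL and HH coefficients receive different ones.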
In a possible design, the first estimated coefficient and the second estimated coefficient are pixel values, and preprocessing the variance of the first probability estimation result to obtain the processed variance includes:
preprocessing the variance of the first probability estimation result according to the scaling factor of the first coefficient, to obtain the processed variance,
where the scaling factor of the first estimated coefficient is the same as, or different from, the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs.
It should be noted that the texture complexity of the image block to which the first estimated coefficient belongs may be determined according to the content of that image block in the first reconstructed image or the second reconstructed image; the texture complexity of the subband to which the first estimated coefficient belongs may be determined according to the content of the part corresponding to that subband in the first reconstructed image or the second reconstructed image; the texture complexity of the frequency band to which the first estimated coefficient belongs may be determined according to the content of the part corresponding to that frequency band in the first reconstructed image or the second reconstructed image; and the texture complexity of the channel to which the first estimated coefficient belongs may be determined according to the content of the part corresponding to that channel in the first reconstructed image or the second reconstructed image. The greater the texture complexity associated with the first estimated coefficient, the larger the scaling factor of the first estimated coefficient.
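The application does not specify how texture complexity is measured or how it is mapped to a scaling factor. The sketch below is one possible reading under stated assumptions: texture complexity is taken as the mean absolute difference between adjacent pixels of a reconstructed block, and the scaling factor is a clipped linear function of it, so it never decreases as complexity grows. The names `texture_complexity` and `scaling_factor` and the parameters `gain` and `max_factor` are hypothetical.

```python
def texture_complexity(block):
    """Assumed measure: mean absolute difference between adjacent pixels."""
    height, width = len(block), len(block[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:   # horizontal neighbour
                total += abs(block[y][x + 1] - block[y][x])
                count += 1
            if y + 1 < height:  # vertical neighbour
                total += abs(block[y + 1][x] - block[y][x])
                count += 1
    return total / count if count else 0.0

def scaling_factor(complexity, gain=0.01, max_factor=1.0):
    """Assumed mapping: non-decreasing in complexity, clipped to [0, max_factor]."""
    return min(max_factor, gain * complexity)

flat_block = [[10, 10], [10, 10]]
textured_block = [[0, 255], [255, 0]]
flat_factor = scaling_factor(texture_complexity(flat_block))
textured_factor = scaling_factor(texture_complexity(textured_block))
```

With this mapping a flat block gets a factor of 0, and a strongly textured block gets a larger factor, which matches the stated monotonic relationship.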
In a possible design, obtaining the first reconstructed image according to the first estimated coefficient and the second estimated coefficient includes:
if the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients, performing inverse quantization and inverse wavelet transform on them to obtain the first reconstructed image; or, if they are wavelet coefficients, performing inverse wavelet transform on them to obtain the first reconstructed image; or, if they are quantized DCT coefficients, performing inverse quantization and inverse DCT on them to obtain the first reconstructed image; or, if they are DCT coefficients, performing inverse DCT on them to obtain the first reconstructed image.
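For the quantized-DCT branch, the two steps (inverse quantization, then inverse DCT) can be sketched as follows. This is a minimal illustration that assumes uniform scalar quantization with a single step size and an orthonormal DCT-II/DCT-III pair; the application does not state which quantizer or DCT normalization is used, and the function names are hypothetical.

```python
import math

def idct_1d(coeffs):
    """Orthonormal inverse DCT (DCT-III) of a 1-D list of coefficients."""
    n = len(coeffs)
    out = []
    for k in range(n):
        value = coeffs[0] / math.sqrt(n)
        for u in range(1, n):
            value += (math.sqrt(2.0 / n) * coeffs[u]
                      * math.cos(math.pi * (2 * k + 1) * u / (2 * n)))
        out.append(value)
    return out

def idct_2d(block):
    """Separable 2-D inverse DCT: rows first, then columns."""
    rows = [idct_1d(row) for row in block]
    cols = [idct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def reconstruct_from_quantized_dct(q_block, step):
    """Inverse quantization (assumed uniform, step size `step`) followed by inverse DCT."""
    dequantized = [[q * step for q in row] for row in q_block]
    return idct_2d(dequantized)

# A 2x2 block that carries only a DC coefficient reconstructs to a constant block.
pixels = reconstruct_from_quantized_dct([[4, 0], [0, 0]], 2.0)
```

The unquantized-DCT branch corresponds to calling `idct_2d` directly; the wavelet branches would replace the inverse DCT with an inverse wavelet transform.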
Preprocessing the probability estimation result makes it possible to obtain reconstructed images with different properties according to user requirements, which improves the quality of the reconstructed image. For example, setting the variance of the probability estimation result to 0 and using it as the processed variance yields the reconstructed image with the best signal quality (best objective quality), that is, it increases the PSNR, or equivalently reduces the MSE, of the image. Setting the scaling factors of multiple pieces of data to the same value yields the image with the best subjective quality, at the cost of a lower PSNR or a higher MSE. Setting the scaling factors to the same value for data belonging to the same part of the image, and to different values for data belonging to different parts, yields an image whose properties lie between best subjective quality and best objective quality.
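The effect of the variance preprocessing can be illustrated with a minimal sketch. It assumes that the probability estimation result is a Gaussian described by a mean and a variance, and that preprocessing multiplies the variance by the scaling factor; both are assumptions made for illustration, as are the function names.

```python
import random

def preprocess_variance(variance, scaling_factor):
    """Assumed preprocessing: scale the estimated variance by a non-negative factor."""
    return scaling_factor * variance

def sample_coefficient(mean, variance, scaling_factor, rng):
    """Draw one coefficient from an assumed Gaussian with the processed variance."""
    processed = preprocess_variance(variance, scaling_factor)
    if processed == 0.0:
        return mean  # zero variance: the sample collapses to the estimated mean
    return rng.gauss(mean, processed ** 0.5)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
zero_variance_sample = sample_coefficient(3.0, 2.0, 0.0, rng)
scaled_variance_sample = sample_coefficient(3.0, 2.0, 1.0, rng)
```

With a scaling factor of 0 the sampled coefficient equals the estimated mean, which corresponds to the best-objective-quality case; larger factors let the sample deviate from the mean.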
In a possible design, the decoding method further includes:
decoding the compressed bitstream to obtain a plurality of reconstruction coefficients, and obtaining a second reconstructed image according to the plurality of reconstruction coefficients.
In a possible design, obtaining the second reconstructed image according to the plurality of reconstruction coefficients includes:
if the plurality of reconstruction coefficients are quantized wavelet coefficients, performing inverse quantization and inverse wavelet transform on them to obtain the second reconstructed image; or, if they are wavelet coefficients, performing inverse wavelet transform on them to obtain the second reconstructed image; or, if they are quantized DCT coefficients, performing inverse quantization and inverse DCT on them to obtain the second reconstructed image; or, if they are DCT coefficients, performing inverse DCT on them to obtain the second reconstructed image.
Because the sampling process is random, the sampling step in this application may be repeated to obtain multiple reconstructed images. The multiple reconstructed images may include a reconstructed image with the best subjective quality and a reconstructed image with the best objective quality. A reconstructed image may be used inside the coding loop as a reference for intra or inter prediction, or outside the coding loop as a post-processing means of optimizing image quality. For example, after multiple reconstructed images are obtained through the sampling step and the inverse transform step, the reconstructed image with the best subjective quality is placed in the decoded picture buffer (DPB) or in the reference frame set and serves as a reference image for in-loop intra or inter prediction; the reconstructed image with the best objective quality is used for post-processing, in which the subjective quality of the coded reconstructed image is adjusted to improve the quality of the image or video after compression and reconstruction.
It should be noted that, for the beneficial effects at the decoder side, reference may be made to the beneficial effects at the encoder side; they are not described again here.
According to a fourth aspect, this application relates to a video-image-based encoding apparatus. For its beneficial effects, refer to the description of the first aspect or the second aspect; details are not repeated here. The encoding apparatus has the function of implementing the behavior in the method examples of the first aspect or the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function.
According to a fifth aspect, this application relates to a video-image-based decoding apparatus. For its beneficial effects, refer to the description of the third aspect; details are not repeated here. The decoding apparatus has the function of implementing the behavior in the method examples of the third aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function.
The method according to the first aspect or the second aspect of this application may be performed by the apparatus according to the fourth aspect. Other features and implementations of the method according to the first aspect or the second aspect depend directly on the functionality and implementations of the apparatus according to the fourth aspect.
The method according to the third aspect of this application may be performed by the apparatus according to the fifth aspect. Other features and implementations of the method according to the third aspect depend directly on the functionality and implementations of the apparatus according to the fifth aspect.
According to a sixth aspect, this application relates to an apparatus for encoding a video stream, including a processor and a memory. The memory stores instructions that cause the processor to perform the method according to the first aspect or the second aspect.
According to a seventh aspect, this application relates to an apparatus for decoding a video stream, including a processor and a memory. The memory stores instructions that cause the processor to perform the method according to the third aspect.
According to an eighth aspect, a computer-readable storage medium is provided, storing instructions that, when executed, cause one or more processors to code video data. The instructions cause the one or more processors to perform the method according to the first, second, or third aspect, or any possible embodiment of the first, second, or third aspect.
According to a ninth aspect, this application relates to a computer program product including program code that, when run, performs the method according to the first, second, or third aspect, or any possible embodiment of the first, second, or third aspect.
One or more embodiments are described in detail in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, the drawings, and the claims.
Description of Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of this application; a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a block diagram of an example of a video coding system for implementing an embodiment of this application;
FIG. 2 is a block diagram of another example of a video coding system for implementing an embodiment of this application;
FIG. 3 is a schematic block diagram of a video coding device for implementing an embodiment of this application;
FIG. 4 is a schematic block diagram of a video coding device for implementing an embodiment of this application;
FIG. 5 is a schematic structural diagram of a video encoding and decoding apparatus according to an embodiment of this application;
FIG. 6a is a schematic diagram of the result of one wavelet transform;
FIG. 6b is a schematic diagram of first context information and second context information of first data;
FIG. 6c is a schematic diagram of first context information and second context information of a first preset region;
FIG. 6d is a schematic structural diagram of a probability estimation network according to an embodiment of this application;
FIG. 6e is a schematic structural diagram of a residual network according to an embodiment of this application;
FIG. 7 is a schematic structural diagram of a video codec according to an embodiment of this application;
FIG. 8 is a schematic diagram of an encoding process according to an embodiment of this application;
FIG. 9 is a schematic diagram of another encoding process according to an embodiment of this application;
FIG. 10 is a schematic diagram of a decoding process according to an embodiment of this application.
Detailed Description
The embodiments of this application provide an AI-based video image compression technology, in particular a neural-network-based video compression technology, and specifically a decoding method based on probability distribution and sampling, to improve the conventional hybrid video codec system.
Video coding generally refers to the processing of a sequence of images that form a video or video sequence. In the field of video coding, the terms "picture", "frame", and "image" may be used as synonyms. Video coding (or coding in general) includes two parts: video encoding and video decoding. Video encoding is performed at the source side and typically includes processing (for example, compressing) the original video images to reduce the amount of data needed to represent them (for more efficient storage and/or transmission). Video decoding is performed at the destination side and typically includes the inverse processing relative to the encoder to reconstruct the video images. References in the embodiments to "coding" of video images (or images in general) should be understood as "encoding" or "decoding" of video images or video sequences. The encoding part and the decoding part are also collectively referred to as the codec (encoding and decoding, CODEC).
In lossless video coding, the original video image can be reconstructed, that is, the reconstructed video image has the same quality as the original video image (assuming no transmission loss or other data loss during storage or transmission). In lossy video coding, further compression is performed, for example by quantization, to reduce the amount of data needed to represent the video image; the decoder side then cannot fully reconstruct the video image, that is, the quality of the reconstructed video image is lower than that of the original video image.
Because the embodiments of this application involve the application of neural networks, some terms used in the embodiments are explained below for ease of understanding. These terms also form part of the summary of the invention.
(1) Neural network
A neural network may be composed of neural units. A neural unit may be an operation unit that takes $x_s$ and an intercept of 1 as inputs, and the output of the operation unit may be:
$h_{W,b}(x) = f\left(\sum_{s=1}^{n} W_s x_s + b\right)$
where s = 1, 2, ..., n; n is a natural number greater than 1; $W_s$ is the weight of $x_s$; and b is the bias of the neural unit. f is the activation function of the neural unit, which introduces nonlinearity into the neural network and converts the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer, and the activation function may be a sigmoid function. A neural network is a network formed by connecting many such single neural units, that is, the output of one neural unit may be the input of another. The input of each neural unit may be connected to the local receptive field of the previous layer to extract the features of that local receptive field; a local receptive field may be a region composed of several neural units.
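The neural unit described above can be written out directly. The following minimal sketch uses a sigmoid activation, as the text suggests; the function names and the example values are illustrative only.

```python
import math

def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def neural_unit(xs, ws, b, f=sigmoid):
    """Output of one neural unit: f(sum_s W_s * x_s + b)."""
    return f(sum(w * x for w, x in zip(ws, xs)) + b)

# Two inputs whose weighted sum is 0.5 * 1.0 + (-0.25) * 2.0 = 0, with bias 0.
unit_output = neural_unit([1.0, 2.0], [0.5, -0.25], 0.0)
```

Because the weighted sum is 0 here, the sigmoid output is 0.5.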
(2) Deep neural network
A deep neural network (DNN), also called a multi-layer neural network, can be understood as a neural network with multiple hidden layers. When a DNN is divided by layer position, its layers fall into three types: the input layer, hidden layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all layers in between are hidden layers. The layers are fully connected, that is, every neuron in layer i is connected to every neuron in layer i+1.
Although a DNN looks complicated, the work of each layer is not. In simple terms, each layer evaluates the following linear relationship expression: $\vec{y} = a(W\vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, W is the weight matrix (also called the coefficients), and a() is the activation function. Each layer simply applies this operation to the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has many layers, the number of coefficients W and offset vectors $\vec{b}$ is also large. These parameters are defined in a DNN as follows, taking the coefficient W as an example: in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as $W^{3}_{24}$. The superscript 3 is the layer in which the coefficient W is located, and the subscripts correspond to the output third-layer index 2 and the input second-layer index 4.
In summary, the coefficient from the k-th neuron of layer L-1 to the j-th neuron of layer L is defined as $W^{L}_{jk}$.
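The per-layer expression and the index convention can be sketched as follows, where `W[j][k]` holds the coefficient from the k-th neuron of the previous layer to the j-th neuron of the current layer. The function names are hypothetical, and indices are zero-based in the code although the text counts from 1.

```python
import math

def layer_forward(W, x, b, a):
    """One layer: y_j = a(sum_k W[j][k] * x[k] + b[j])."""
    return [a(sum(W[j][k] * x[k] for k in range(len(x))) + b[j])
            for j in range(len(W))]

def forward(layers, x, a=math.tanh):
    """Apply a list of (W, b) layers in order; the input layer has no W."""
    for W, b in layers:
        x = layer_forward(W, x, b, a)
    return x

# One layer with an identity activation, to make the arithmetic easy to follow.
y = layer_forward([[1, 2], [3, 4]], [1, 1], [0, 1], lambda z: z)
```

For this example the outputs are 1*1 + 2*1 + 0 = 3 and 3*1 + 4*1 + 1 = 8.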
Note that the input layer has no W parameter. In a deep neural network, more hidden layers make the network better able to model complex real-world situations. In theory, a model with more parameters has higher complexity and a larger "capacity", which means it can handle more complex learning tasks. Training a deep neural network is the process of learning the weight matrices; its ultimate goal is to obtain the weight matrices of all layers of the trained network (the weight matrices formed by the vectors W of many layers).
(3) Convolutional neural network
A convolutional neural network (CNN) is a deep neural network with a convolutional structure. A CNN contains a feature extractor composed of convolutional layers and subsampling layers, and the feature extractor can be regarded as a filter. A convolutional layer is a layer of neurons in a CNN that performs convolution on the input signal. In a convolutional layer, a neuron may be connected to only some of the neurons in the adjacent layer. A convolutional layer usually contains several feature planes, and each feature plane may be composed of neural units arranged in a rectangle. Neural units in the same feature plane share weights, and the shared weights are the convolution kernel. Weight sharing can be understood as meaning that the way image information is extracted is independent of position. A convolution kernel may be initialized as a matrix of random size, and it acquires reasonable weights through learning during training of the CNN. A direct benefit of weight sharing is that it reduces the connections between the layers of the CNN while also lowering the risk of overfitting.
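Weight sharing can be made concrete with a minimal sketch of a single-channel convolution without padding, in which the same kernel is applied at every position of the input. As in most neural network frameworks, the kernel is not flipped (this is strictly a cross-correlation). The function name is hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]  # the same four weights are reused at every position
feature_plane = conv2d_valid(image, kernel)
```

Every output value is computed with the same four weights, which is why the number of parameters does not grow with the image size.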
(4) Recurrent neural network
A recurrent neural network (RNN) is used to process sequence data. In a conventional neural network model, data flows from the input layer through the hidden layers to the output layer; adjacent layers are fully connected, while the nodes within each layer are not connected to one another. Such ordinary neural networks solve many problems but remain powerless against many others. For example, predicting the next word of a sentence generally requires the preceding words, because the words in a sentence are not independent of one another. An RNN is called recurrent because the current output of a sequence also depends on the previous outputs. Concretely, the network memorizes earlier information and applies it to the computation of the current output: the nodes within the hidden layer are connected rather than unconnected, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step. In theory, an RNN can process sequence data of any length. An RNN is trained in the same way as a conventional CNN or DNN. RNNs are intended to give machines a human-like capacity for memory, so the output of an RNN depends on both the current input information and the memorized historical information.
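The dependence of the hidden state on both the current input and the previous hidden state can be illustrated with a scalar sketch. The tanh activation, the scalar weights, and the function names are assumptions chosen for brevity; practical RNNs use vectors and matrices.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """New hidden state from the current input and the previous hidden state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(xs, w_x=0.5, w_h=0.9, b=0.0, h0=0.0):
    """Process a sequence of any length, carrying the hidden state forward."""
    h, states = h0, []
    for x_t in xs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

with_memory = run_rnn([1.0, 0.0])
without_memory = run_rnn([1.0, 0.0], w_h=0.0)
```

With a nonzero recurrent weight, the second state is still positive even though the second input is 0, because the first input is remembered; with `w_h=0.0` the second state is exactly 0.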
(5) Loss function
When training a deep neural network, the output of the network should be as close as possible to the value that is actually desired. The predicted value of the current network is therefore compared with the desired target value, and the weight vector of each layer is updated according to the difference between the two. (There is usually an initialization process before the first update, in which parameters are preconfigured for each layer of the network.) For example, if the predicted value of the network is too high, the weight vectors are adjusted to make it predict lower, and the adjustment continues until the network can predict the desired target value or a value very close to it. It is therefore necessary to define in advance how to compare the predicted value with the target value. This is the role of the loss function or objective function, which are important equations for measuring the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) indicates a larger difference, so training the deep neural network becomes a process of reducing the loss as much as possible.
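As one common example of a loss function, the sketch below computes the mean squared error between predicted values and target values. The application does not state which loss function is used, so this choice is an assumption for illustration.

```python
def mse_loss(predictions, targets):
    """Mean squared error: larger values mean a larger prediction error."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

perfect_loss = mse_loss([1.0, 2.0], [1.0, 2.0])
poor_loss = mse_loss([0.0, 0.0], [1.0, 3.0])
```

The loss is 0 when the predictions match the targets and grows as they diverge.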
(6) Back propagation algorithm
A neural network may use the error back propagation (BP) algorithm to correct the parameter values of the initial neural network model during training, so that the reconstruction error loss of the model becomes smaller and smaller. Specifically, passing the input signal forward until the output produces an error loss, and the parameters of the initial neural network model are updated by propagating the error loss information backward, so that the error loss converges. The back propagation algorithm is a backward pass driven by the error loss, and it aims to obtain the optimal parameters of the neural network model, such as the weight matrices.
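The forward pass, backward pass, and parameter update can be illustrated on the smallest possible model: a single linear unit y = w * x trained with a squared-error loss and plain gradient descent. The model, the learning rate, and the function name are assumptions for illustration and are not taken from this application.

```python
def train_linear_unit(samples, w=0.0, lr=0.1, steps=50):
    """Fit y = w * x by gradient descent on the mean squared error."""
    losses = []
    n = len(samples)
    for _ in range(steps):
        grad, loss = 0.0, 0.0
        for x, target in samples:
            err = w * x - target      # forward pass: prediction error
            loss += err ** 2
            grad += 2.0 * err * x     # backward pass: d(err^2)/dw
        w -= lr * grad / n            # update the parameter against the gradient
        losses.append(loss / n)
    return w, losses

# Samples generated by target = 2 * x, so the optimal weight is 2.
trained_w, loss_history = train_linear_unit([(1.0, 2.0), (2.0, 4.0)])
```

The recorded loss shrinks from step to step as the weight approaches 2, which is the convergence behavior described above.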
In the following embodiments of the coding system 10, the encoder 20 and the decoder 30 are described with reference to FIG. 1 to FIG. 3.
FIG. 1 is a schematic block diagram of an exemplary coding system 10, for example a video coding system 10 (or coding system 10 for short) that may use the techniques of this application. The video encoder 20 (or encoder 20 for short) and the video decoder 30 (or decoder 30 for short) of the video coding system 10 represent devices that may be used to perform the techniques according to the various examples described in this application.
As shown in FIG. 1, the coding system 10 includes a source device 12, which provides encoded image data 21, such as an encoded image, to a destination device 14 that decodes the encoded image data 21.
The source device 12 includes an encoder 20 and may additionally, that is, optionally, include an image source 16, a preprocessor (or preprocessing unit) 18 such as an image preprocessor, and a communication interface (or communication unit) 22.
图像源16可包括或可以为任意类型的用于捕获现实世界图像等的图像捕获设备,和/或任意类型的图像生成设备,例如用于生成计算机动画图像的计算机图形处理器或任意类型的用于获取和/或提供现实世界图像、计算机生成图像(例如,屏幕内容、虚拟现实(virtual reality,VR)图像和/或其任意组合(例如增强现实(augmented reality,AR)图像)的设备。所述图像源可以为存储上述图像中的任意图像的任意类型的内存或存储器。Image source 16 may include or be any type of image capture device for capturing real world images, etc., and/or any type of image generation device, such as a computer graphics processor or any type of Devices for acquiring and/or providing real-world images, computer-generated images (e.g., screen content, virtual reality (VR) images, and/or any combination thereof (e.g., augmented reality (AR) images). So The image source may be any type of memory or storage that stores any of the above images.
为了区分预处理器(或预处理单元)18执行的处理,图像(或图像数据)17也可称为原始图像(或原始图像数据)17。To distinguish the processing performed by the preprocessor (or preprocessing unit) 18 , the image (or image data) 17 may also be referred to as an original image (or original image data) 17 .
Preprocessor 18 is configured to receive the (original) image data 17 and preprocess it to obtain a preprocessed image (or preprocessed image data) 19. For example, the preprocessing performed by preprocessor 18 may include cropping, color format conversion (for example, from RGB to YCbCr), color grading, or denoising. It can be understood that preprocessing unit 18 may be an optional component.
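As an illustration of the color format conversion mentioned above, the following is a minimal sketch of a per-pixel RGB to YCbCr conversion. The application does not specify which conversion matrix preprocessor 18 uses; the full-range BT.601 weights and the function name `rgb_to_ycbcr` are assumptions chosen only for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr.

    Assumption: full-range BT.601 weights (as used by JFIF); the
    application itself does not name a specific matrix.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(v):
        # Round to the nearest integer and keep the result in [0, 255].
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)


if __name__ == "__main__":
    # Neutral pixels carry no chroma, so Cb and Cr sit at the mid value 128.
    print(rgb_to_ycbcr(255, 255, 255))
    print(rgb_to_ycbcr(0, 0, 0))
```

Post-processor 32 would apply the inverse mapping when converting from YCbCr back to RGB.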
Video encoder (or encoder) 20 is configured to receive the preprocessed image data 19 and provide encoded image data 21 (described further below with reference to FIG. 2 and other figures).
Communication interface 22 in source device 12 may be used to receive the encoded image data 21 and send the encoded image data 21 (or any further processed version thereof) over communication channel 13 to another device, such as destination device 14, or to any other device, for storage or direct reconstruction.
Destination device 14 includes a decoder 30 and may additionally, that is, optionally, include a communication interface (or communication unit) 28, a post-processor (or post-processing unit) 32, and a display device 34.
Communication interface 28 in destination device 14 is configured to receive the encoded image data 21 (or any further processed version thereof) directly from source device 12 or from any other source, such as a storage device (for example, an encoded image data storage device), and to provide the encoded image data 21 to decoder 30.
Communication interface 22 and communication interface 28 may be used to send or receive the encoded image data (or encoded data) 21 over a direct communication link between source device 12 and destination device 14, such as a direct wired or wireless connection, or over any type of network, such as a wired network, a wireless network, or any combination thereof, or any type of private network, public network, or any combination thereof.
For example, communication interface 22 may be used to encapsulate the encoded image data 21 into a suitable format, such as packets, and/or to process the encoded image data using any type of transmission encoding or processing, for transmission over a communication link or communication network.
Communication interface 28 corresponds to communication interface 22 and may be used, for example, to receive the transmitted data and process it using any type of corresponding transmission decoding or processing and/or decapsulation to obtain the encoded image data 21.
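The encapsulation and decapsulation performed by communication interfaces 22 and 28 can be pictured with a small sketch. The application does not define a packet format; the four-byte marker `MAGIC`, the length field, and the function names below are purely hypothetical and stand in for whatever transport format an implementation chooses.

```python
import struct

# Hypothetical 4-byte marker identifying a message; not defined by the application.
MAGIC = b"EIMG"


def encapsulate(payload: bytes) -> bytes:
    """Sender side: wrap encoded image data in a header of marker + big-endian length."""
    return MAGIC + struct.pack(">I", len(payload)) + payload


def decapsulate(message: bytes) -> bytes:
    """Receiver side: check the header and return the encoded image data."""
    if message[:4] != MAGIC:
        raise ValueError("unexpected header")
    (length,) = struct.unpack(">I", message[4:8])
    payload = message[8:8 + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return payload


if __name__ == "__main__":
    bitstream = bytes([0x01, 0x02, 0x03])
    assert decapsulate(encapsulate(bitstream)) == bitstream
```

A real system would typically rely on an existing transport or container format rather than an ad hoc header like this one.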
Communication interface 22 and communication interface 28 may each be configured as a unidirectional communication interface, as indicated by the arrow of communication channel 13 pointing from source device 12 to destination device 14 in FIG. 1, or as a bidirectional communication interface, and may be used to send and receive messages, for example to establish a connection, and to acknowledge and exchange any other information related to the communication link and/or to data transmission such as the transmission of encoded image data.
Video decoder (or decoder) 30 is configured to receive the encoded image data 21 and provide decoded image data (or a decoded image) 31 (described further below with reference to FIG. 3 and other figures).
Post-processor 32 is configured to post-process the decoded image data 31 (also called reconstructed image data), such as a decoded image, to obtain post-processed image data 33, such as a post-processed image. The post-processing performed by post-processing unit 32 may include, for example, color format conversion (for example, from YCbCr to RGB), color grading, cropping, or resampling, or any other processing that prepares the decoded image data 31 for display by display device 34 or the like.
Display device 34 is configured to receive the post-processed image data 33 and display the image to a user or viewer. Display device 34 may be or include any type of display for presenting the reconstructed image, for example an integrated or external display screen or monitor. For example, the display may be a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or any other type of display.
Coding system 10 further includes a training engine 25. The specific training process implemented by training engine 25 is described later and is not detailed here.
Although FIG. 1 shows source device 12 and destination device 14 as separate devices, a device embodiment may also include both source device 12 and destination device 14, or the functions of both, that is, source device 12 or the corresponding function together with destination device 14 or the corresponding function. In such embodiments, source device 12 or the corresponding function and destination device 14 or the corresponding function may be implemented using the same hardware and/or software, separate hardware and/or software, or any combination thereof.
As will be apparent to a skilled person from the description, the existence and (exact) division of the different units or functions in source device 12 and/or destination device 14 shown in FIG. 1 may vary depending on the actual device and application.
Encoder 20 (for example, video encoder 20), decoder 30 (for example, video decoder 30), or both may be implemented by processing circuitry as shown in FIG. 2, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, dedicated video coding processors, or any combination thereof. Encoder 20 may be implemented by processing circuit 46 to include the various modules discussed with reference to encoder 20 of FIG. 2 and/or any other encoder system or subsystem described herein. Decoder 30 may be implemented by processing circuit 46 to include the various modules discussed with reference to decoder 30 of FIG. 3 and/or any other decoder system or subsystem described herein. Processing circuit 46 may be used to perform the various operations discussed below. As shown in FIG. 4, if some of the techniques are implemented in software, a device may store the software instructions in a suitable non-transitory computer-readable storage medium and execute the instructions in hardware using one or more processors, thereby performing the techniques of the present invention. Either of video encoder 20 and video decoder 30 may be integrated in a single device as part of a combined encoder/decoder (CODEC), as shown in FIG. 2.
Source device 12 and destination device 14 may include any of a variety of devices, including any type of handheld or stationary device, for example a notebook or laptop computer, mobile phone, smartphone, tablet or tablet computer, camera, desktop computer, set-top box, television, display device, digital media player, video game console, video streaming device (for example, a content service server or content distribution server), broadcast receiving device, or broadcast transmitting device, and may use no operating system or any type of operating system. In some cases, source device 12 and destination device 14 may be equipped with components for wireless communication. Accordingly, source device 12 and destination device 14 may be wireless communication devices.
In some cases, video coding system 10 shown in FIG. 1 is merely exemplary, and the techniques provided in this application are applicable to video coding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device. In other examples, data is retrieved from local storage, sent over a network, and so on. A video encoding device may encode data and store it in memory, and/or a video decoding device may retrieve data from memory and decode it. In some examples, encoding and decoding are performed by devices that do not communicate with each other but simply encode data to memory and/or retrieve data from memory and decode it.
FIG. 2 is an illustrative diagram of an example of a video coding system 40 that includes video encoder 20 of FIG. 2 and/or video decoder 30 of FIG. 3, according to an exemplary embodiment. Video coding system 40 may include an imaging device 41, video encoder 20, video decoder 30 (and/or a video encoder/decoder implemented by processing circuit 46), an antenna 42, one or more processors 43, one or more memories 44, and/or a display device 45.
As shown in FIG. 2, imaging device 41, antenna 42, processing circuit 46, video encoder 20, video decoder 30, processor 43, memory 44, and/or display device 45 can communicate with one another. In different examples, video coding system 40 may include only video encoder 20 or only video decoder 30.
In some examples, antenna 42 may be used to transmit or receive an encoded bitstream of video data. In addition, in some examples, display device 45 may be used to present video data. Processing circuit 46 may include application-specific integrated circuit (ASIC) logic, a graphics processor, a general-purpose processor, and the like. Video coding system 40 may also include an optional processor 43, which may similarly include ASIC logic, a graphics processor, a general-purpose processor, and the like. Memory 44 may be any type of memory, for example volatile memory (such as static random access memory (SRAM) or dynamic random access memory (DRAM)) or non-volatile memory (such as flash memory). In a non-limiting example, memory 44 may be implemented by cache memory. In other examples, processing circuit 46 may include memory (for example, a cache) for implementing an image buffer or the like.
In some examples, video encoder 20 implemented by logic circuitry may include an image buffer (for example, implemented by processing circuit 46 or memory 44) and a graphics processing unit (for example, implemented by processing circuit 46). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include video encoder 20 implemented by processing circuit 46 to implement the various modules discussed with reference to FIG. 2 and/or any other encoder system or subsystem described herein. The logic circuitry may be used to perform the various operations discussed herein.
In some examples, video decoder 30 may be implemented by processing circuit 46 in a similar manner to implement the various modules discussed with reference to video decoder 30 of FIG. 3 and/or any other decoder system or subsystem described herein. In some examples, video decoder 30 implemented by logic circuitry may include an image buffer (for example, implemented by processing circuit 46 or memory 44) and a graphics processing unit (for example, implemented by processing circuit 46). The graphics processing unit may be communicatively coupled to the image buffer. The graphics processing unit may include video decoder 30 implemented by processing circuit 46 to implement the various modules discussed with reference to FIG. 3 and/or any other decoder system or subsystem described herein.
In some examples, antenna 42 may be used to receive an encoded bitstream of video data. As discussed, the encoded bitstream may contain the data, indicators, index values, mode selection data, and the like related to encoded video frames discussed herein, for example data related to coding partitions (such as transform coefficients or quantized transform coefficients, optional indicators (as discussed), and/or data defining the coding partitions). Video coding system 40 may also include video decoder 30, which is coupled to antenna 42 and used to decode the encoded bitstream. Display device 45 is used to present video frames.
It should be understood that, for the examples described with reference to video encoder 20 in the embodiments of this application, video decoder 30 may be used to perform the reverse process. With regard to signaling syntax elements, video decoder 30 may be used to receive and parse such syntax elements and decode the related video data accordingly. In some examples, video encoder 20 may entropy-encode the syntax elements into the encoded video bitstream. In such examples, video decoder 30 may parse the syntax elements and decode the related video data accordingly.
For ease of description, embodiments of the present invention are described with reference to the Versatile Video Coding (VVC) reference software or to High-Efficiency Video Coding (HEVC), developed by the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Motion Picture Experts Group (MPEG). A person of ordinary skill in the art will understand that embodiments of the present invention are not limited to HEVC or VVC.
FIG. 3 is a schematic diagram of a video coding device 300 according to an embodiment of the present invention. Video coding device 300 is suitable for implementing the disclosed embodiments described herein. In one embodiment, video coding device 300 may be a decoder, such as video decoder 30 in FIG. 1, or an encoder, such as video encoder 20 in FIG. 1.
Video coding device 300 includes: an ingress port 310 (or input port 310) and a receiver unit (Rx) 320 for receiving data; a processor, logic unit, or central processing unit (CPU) 330 for processing data, where processor 330 may be, for example, a neural network processor 330; a transmitter unit (Tx) 340 and an egress port 350 (or output port 350) for transmitting data; and a memory 360 for storing data. Video coding device 300 may also include optical-to-electrical (OE) components and electrical-to-optical (EO) components coupled to ingress port 310, receiver unit 320, transmitter unit 340, and egress port 350, serving as an egress or ingress for optical or electrical signals.
Processor 330 is implemented by hardware and software. Processor 330 may be implemented as one or more processor chips, cores (for example, a multi-core processor), FPGAs, ASICs, and DSPs. Processor 330 communicates with ingress port 310, receiver unit 320, transmitter unit 340, egress port 350, and memory 360. Processor 330 includes a coding module 370 (for example, a neural network (NN) based coding module 370). Coding module 370 implements the embodiments disclosed above. For example, coding module 370 performs, processes, prepares, or provides various coding operations. Coding module 370 therefore provides a substantial improvement to the functionality of video coding device 300 and effects the switching of video coding device 300 to a different state. Alternatively, coding module 370 is implemented as instructions stored in memory 360 and executed by processor 330.
Memory 360 includes one or more disks, tape drives, and solid-state drives, and may be used as an overflow data storage device to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution. Memory 360 may be volatile and/or non-volatile, and may be read-only memory (ROM), random access memory (RAM), ternary content-addressable memory (TCAM), and/or static random-access memory (SRAM).
FIG. 4 is a simplified block diagram of an apparatus 400 according to an exemplary embodiment. Apparatus 400 may be used as either or both of source device 12 and destination device 14 in FIG. 1.
Processor 402 in apparatus 400 may be a central processing unit. Alternatively, processor 402 may be any other type of device, or multiple devices, existing now or developed in the future, capable of manipulating or processing information. Although the disclosed implementations can be practiced with a single processor such as processor 402 as shown, using more than one processor can achieve higher speed and efficiency.
In one implementation, memory 404 in apparatus 400 may be a read-only memory (ROM) device or a random access memory (RAM) device. Any other suitable type of storage device may be used as memory 404. Memory 404 may include code and data 406 that processor 402 accesses via bus 412. Memory 404 may also include an operating system 408 and application programs 410, where application programs 410 include at least one program that allows processor 402 to perform the methods described herein. For example, application programs 410 may include applications 1 through N, which further include a video coding application that performs the methods described herein.
Apparatus 400 may also include one or more output devices, such as display 418. In one example, display 418 may be a touch-sensitive display that combines a display with a touch-sensitive element operable to sense touch input. Display 418 may be coupled to processor 402 via bus 412.
Although bus 412 in apparatus 400 is described herein as a single bus, bus 412 may include multiple buses. In addition, secondary storage may be directly coupled to other components of apparatus 400 or accessed over a network, and may include a single integrated unit such as a memory card or multiple units such as multiple memory cards. Apparatus 400 may therefore have a wide variety of configurations.
Codecs and Codec Methods
FIG. 5 is a schematic block diagram of an example of a video codec for implementing the techniques of this application. In the example of FIG. 5, video encoder 20 includes an encoding unit 501, a forward transform unit 502, and a probability estimation unit 503; video decoder 30 includes a decoding unit 504, a sampling unit 505, and an inverse transform unit 506. The video codec shown in FIG. 5 may also be referred to as an end-to-end video codec or a video codec based on an end-to-end video codec.
Encoding unit 501
Encoding unit 501 performs image encoding on the image to be encoded to obtain a compressed bitstream.
Optionally, the image encoding may be a Joint Photographic Experts Group (JPEG) encoding method, a JPEG2000 encoding method, an H.264 intra-frame encoding method, an H.265 intra-frame encoding method, an H.266 intra-frame encoding method, or another image encoding method.
Forward transform unit 502
Forward transform unit 502 is configured to transform a first image to obtain a first transformed image.
The first image is the image to be encoded or an already decoded image.
Optionally, forward transform unit 502 is further configured to transform a second image to obtain a second transformed image.
The second image is the image to be encoded or an already decoded image, and the first image differs from the second image.
In one example, N wavelet transforms are performed on the first image to obtain 3N+1 subbands, where each subband includes one or more wavelet coefficients and N is an integer greater than 0.
The wavelet transform may be a conventional wavelet transform, a deep-network-based wavelet transform, or another similar transform method, which is not limited here. The deep-network-based wavelet transform differs from the conventional wavelet transform in that the transform and prediction are implemented using a deep network; the specific implementation of the deep network is not limited here. This application takes a single wavelet transform as an example, that is, N=1. As shown in FIG. 6a, one wavelet transform of the first image yields four subbands: LL1, HL1, LH1, and HH1.
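The relationship between N wavelet transforms and 3N+1 subbands can be illustrated with a small sketch. The application does not fix a wavelet filter; the Haar filter with averaging normalization, the assignment of the names HL and LH to horizontal and vertical detail, and the function names `haar_level` and `wavelet_decompose` are assumptions made only for this example.

```python
def haar_level(img):
    """One 2D Haar analysis step on an image with even height and width.

    Returns (LL, HL, LH, HH), each half the height and width of img.
    Assumption: averaging normalization (division by 4), not orthonormal.
    """
    ll, hl, lh, hh = [], [], [], []
    for y in range(0, len(img), 2):
        r_ll, r_hl, r_lh, r_hh = [], [], [], []
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            r_ll.append((a + b + c + d) / 4)  # local average
            r_hl.append((a - b + c - d) / 4)  # horizontal detail
            r_lh.append((a + b - c - d) / 4)  # vertical detail
            r_hh.append((a - b - c + d) / 4)  # diagonal detail
        ll.append(r_ll)
        hl.append(r_hl)
        lh.append(r_lh)
        hh.append(r_hh)
    return ll, hl, lh, hh


def wavelet_decompose(img, n):
    """Apply n levels, re-transforming the LL subband at each level.

    Image height and width are assumed to be multiples of 2**n.
    Each level contributes three detail subbands; the final LL subband
    adds one more, giving 3n + 1 subbands in total.
    """
    subbands = {}
    current = img
    for level in range(1, n + 1):
        current, hl, lh, hh = haar_level(current)
        subbands[f"HL{level}"] = hl
        subbands[f"LH{level}"] = lh
        subbands[f"HH{level}"] = hh
    subbands[f"LL{n}"] = current
    return subbands


if __name__ == "__main__":
    image = [[5] * 4 for _ in range(4)]
    print(sorted(wavelet_decompose(image, 1)))  # the four subbands for N=1
```

With N=1 the dictionary holds LL1, HL1, LH1, and HH1, matching the four subbands described for FIG. 6a.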
For the first image, the image formed by the subbands obtained by performing the wavelet transform on the first image is the first transformed image. Likewise, for the second image, the image formed by the subbands obtained by performing the wavelet transform on the second image is the second transformed image.
Optionally, after the wavelet transform yields multiple wavelet coefficients, each wavelet coefficient is quantized to obtain multiple quantized wavelet coefficients. Specifically, when quantizing the wavelet coefficients, the subbands may be processed in a preset order one, and the wavelet coefficients within the current subband may then be quantized in a preset order two to obtain the quantized wavelet coefficients. Preset order one may be an existing zigzag scan order, for example LL1→HL1→LH1→HH1. Preset order two may be an existing zigzag scan order, horizontal scan order, or vertical scan order.
It should be understood that preset order one and preset order two above are merely examples and do not limit this application; other orders may of course be used.
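The two-level ordering described above (subbands in preset order one, coefficients within a subband in preset order two) can be sketched as follows. The uniform quantizer with rounding to the nearest level, the step size, and all function names are assumptions for illustration; the application leaves the quantizer and the exact scan orders open.

```python
import math

# Preset order one from the text: LL1 -> HL1 -> LH1 -> HH1.
SUBBAND_ORDER = ["LL1", "HL1", "LH1", "HH1"]


def horizontal_scan(height, width):
    """Row by row, left to right."""
    return [(y, x) for y in range(height) for x in range(width)]


def vertical_scan(height, width):
    """Column by column, top to bottom."""
    return [(y, x) for x in range(width) for y in range(height)]


def quantize_subbands(subbands, step, scan=horizontal_scan):
    """Quantize every coefficient, visiting subbands and positions in order.

    Assumption: a uniform quantizer that rounds to the nearest level
    (ties rounded up), which is only one possible choice.
    """
    quantized = []
    for name in SUBBAND_ORDER:
        band = subbands[name]
        for y, x in scan(len(band), len(band[0])):
            quantized.append(math.floor(band[y][x] / step + 0.5))
    return quantized


if __name__ == "__main__":
    bands = {
        "LL1": [[10.0, 12.0], [14.0, 16.0]],
        "HL1": [[1.0, -1.0], [0.4, -0.4]],
        "LH1": [[0.0, 0.0], [0.0, 0.0]],
        "HH1": [[2.6, 0.0], [0.0, 0.0]],
    }
    print(quantize_subbands(bands, step=2))
    print(quantize_subbands(bands, step=2, scan=vertical_scan))
```

Switching `scan` changes only the order in which coefficients are visited within a subband, not their quantized values.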
Optionally, before each wavelet coefficient is quantized, the wavelet coefficients may be preprocessed, and the quantization is then applied to the preprocessed wavelet coefficients. For example, features are extracted from the obtained wavelet coefficients by a neural network, and the feature extraction result is then quantized. Processing the wavelet coefficients before quantization enables the decoder to decode a high-quality first reconstructed image.
For the first image, the image formed by the quantized wavelet coefficients, obtained by quantizing the wavelet coefficients produced by the wavelet transform of the first image, is the first transformed image. Likewise, for the second image, the image formed by the quantized wavelet coefficients, obtained by quantizing the wavelet coefficients produced by the wavelet transform of the second image, is the second transformed image.
In another example, a discrete cosine transform (DCT) is performed on the first image to obtain a DCT image. The DCT image includes multiple frequency bands, and each frequency band includes one or more DCT coefficients. After the first image is transformed, its low-frequency components are concentrated in the upper-left corner and its high-frequency components are distributed toward the lower-right corner. The coefficient in the first row and first column is the direct current (DC) coefficient, which represents the average value of the first image; the other coefficients are alternating current (AC) coefficients. The DC coefficient and the AC coefficients are collectively referred to as DCT coefficients.
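The layout of DC and AC coefficients can be checked with a small sketch. The application does not state which DCT variant or scaling is used; the orthonormal two-dimensional DCT-II below and the function name `dct2` are assumptions. Under this particular scaling the DC coefficient of an n×n block equals n times the block mean, so it is proportional to the average rather than identical to it.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows.

    out[0][0] is the DC coefficient; every other entry is an AC coefficient.
    """
    n = len(block)

    def alpha(k):
        # Normalization factors of the orthonormal DCT-II.
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                    )
            out[u][v] = alpha(u) * alpha(v) * total
    return out


if __name__ == "__main__":
    flat = [[10.0] * 4 for _ in range(4)]
    coeffs = dct2(flat)
    # A constant block has only a DC component: 4 * mean = 40, all AC near 0.
    print(round(coeffs[0][0], 6))
```

For a block with detail, the energy spreads into AC coefficients, with smooth content concentrated near the upper-left corner as described above.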
Optionally, the first image is divided into blocks to obtain multiple image blocks, and the DCT is then performed block by block to obtain transform blocks. For example: 1) the first image is divided into image blocks of a preset size, which may be 4x4, 8x8, 16x16, 32x32, 64x64, 128x128, 256x256, and so on; or 2) the first image is divided into one or more image blocks whose size is not limited. The first image may be divided using the quadtree, binary-tree, or ternary-tree partitioning methods of existing coding standards (H.266, H.265, H.264, AVS2, or AVS3) to obtain the one or more image blocks.
It should be noted that a frequency band can be understood either as one coefficient block (the coefficient block obtained by performing the DCT on one image block, since the DCT operates block by block), or as the set of coefficients located at the same position in each coefficient block, which together form one frequency band.
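The second reading of "frequency band" above, collecting the coefficient at the same position from every coefficient block, can be sketched as follows. The fixed-size tiling, the raster block order, and the function names `split_blocks` and `bands_by_position` are assumptions for illustration. To keep the example short, plain sample values stand in for the DCT coefficients of each block.

```python
def split_blocks(img, size):
    """Tile an image into size x size blocks in raster order.

    Image height and width are assumed to be multiples of size.
    """
    height, width = len(img), len(img[0])
    return [
        [row[x:x + size] for row in img[y:y + size]]
        for y in range(0, height, size)
        for x in range(0, width, size)
    ]


def bands_by_position(coeff_blocks):
    """Group coefficients by their (row, column) position inside each block.

    Band (0, 0) then holds the first-row, first-column coefficient of every
    block, band (0, 1) the next position, and so on.
    """
    size = len(coeff_blocks[0])
    return {
        (u, v): [block[u][v] for block in coeff_blocks]
        for u in range(size)
        for v in range(size)
    }


if __name__ == "__main__":
    image = [[4 * y + x for x in range(4)] for y in range(4)]
    blocks = split_blocks(image, 2)
    bands = bands_by_position(blocks)
    print(bands[(0, 0)])
```

Under the first reading, each element of `blocks` (after transformation) would itself be one frequency band.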
It should be understood that the image formed by the DCT coefficients obtained from the first image is the first transformed image described above. The second image may be processed in the same manner to obtain its DCT coefficients, which form the second transformed image described above.
Optionally, the obtained DCT coefficients are quantized, for example by uniform quantization, to obtain quantized DCT coefficients. In this case, the image formed by the quantized DCT coefficients obtained from the first image is the first transformed image and, likewise, the image formed by the quantized DCT coefficients obtained from the second image is the second transformed image.
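Uniform quantization can be sketched as follows. The step size and the round-to-nearest rule are assumptions made for illustration; this application does not fix either, and the function names are hypothetical.

```python
def quantize(coeffs, step):
    """Uniform quantization: index of the nearest multiple of `step`.

    Python's round() resolves exact ties to the nearest even integer.
    """
    return [[int(round(c / step)) for c in row] for row in coeffs]

def dequantize(levels, step):
    """Reconstruct approximate coefficient values from quantization indices."""
    return [[q * step for q in row] for row in levels]
```

The reconstruction error of each coefficient is at most half the step size, so a larger step gives fewer distinct symbols to entropy-code at the cost of larger distortion.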
In another example, feature extraction is performed on the first image to obtain a three-dimensional feature map, and this three-dimensional feature map is the first transformed image described above. Optionally, the feature coefficients in the three-dimensional feature map are quantized to obtain quantized feature coefficients, and the three-dimensional feature map formed by the quantized feature coefficients is the first transformed image.
It should be understood that the second image may be processed in the same manner, in which case the resulting three-dimensional feature map is the second transformed image described above; alternatively, the feature coefficients in that three-dimensional feature map are quantized, and the three-dimensional feature map formed by the quantized feature coefficients is the second transformed image.
It should be noted that the forward transform unit 502 is optional and is therefore drawn as a dashed box in FIG. 5. That is, when the forward transform unit 502 is absent, the input to the probability estimation unit 503 is the image in the pixel domain.
Probability estimation unit 503
The probability estimation unit 503 performs probability estimation based on first context information of first data to obtain a probability estimation result of the first data.
In one example, the first data is a pixel of the first image, and the first context information of the pixel includes all or some of the pixels in the first image. Further, the first context information of the pixel includes the pixels adjacent to the pixel in the first image, or some or all of the pixels in the image blocks adjacent to the pixel, or some or all of the pixels in the image block in which the pixel is located.
It should be noted that the "surrounding pixels" of the first data (the adjacent pixels referred to above) are pixels whose distance from the first data is less than a preset threshold, where the preset threshold is expressed in pixels.
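Collecting the surrounding pixels of a given position can be sketched as follows. The distance measure is not specified in this application; the sketch assumes the Chebyshev distance (the larger of the row and column offsets) and an integer threshold, and the function name is hypothetical.

```python
def surrounding_pixels(image, y, x, threshold):
    """Values of pixels whose Chebyshev distance to (y, x) is below `threshold`.

    The pixel at (y, x) itself is excluded, and positions outside the image
    are skipped. `image` is a list of rows.
    """
    h, w = len(image), len(image[0])
    context = []
    for dy in range(-threshold + 1, threshold):
        for dx in range(-threshold + 1, threshold):
            if dy == 0 and dx == 0:
                continue
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                context.append(image[yy][xx])
    return context
```

With a threshold of 2 the context is the 8-neighbourhood of the pixel; near the image border it shrinks to the neighbours that actually exist.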
In one example, the first data is a coefficient in the first transformed image. If the first data is a wavelet coefficient or a quantized wavelet coefficient, the first context information of the first data includes some or all of the coefficients in the first transformed image, the coefficients being wavelet coefficients or quantized wavelet coefficients. Further, the first context information of the first data includes the wavelet coefficients or quantized wavelet coefficients surrounding the first data in the first transformed image; or the first context information includes some or all of the coefficients in the subbands adjacent to the first data; or the first context information includes some or all of the coefficients in the subband in which the first data is located. In each case the coefficients are wavelet coefficients or quantized wavelet coefficients;
or,
if the first data is a DCT coefficient or a quantized DCT coefficient, the first context information of the first data includes some or all of the coefficients in the first transformed image, the coefficients being DCT coefficients or quantized DCT coefficients. Further, the first context information of the first data includes the DCT coefficients or quantized DCT coefficients surrounding the first data in the first transformed image; or the first context information includes some or all of the coefficients in the frequency bands adjacent to the first data; or the first context information includes some or all of the coefficients in the frequency band in which the first data is located. In each case the coefficients are DCT coefficients or quantized DCT coefficients;
or, if the first data is a feature coefficient or a quantized feature coefficient and the first transformed image is the three-dimensional feature map obtained by performing feature extraction on the first image, the first context information of the first data includes some or all of the coefficients in the first transformed image, the coefficients being feature coefficients or quantized feature coefficients. Further, the first context information of the first data includes the feature coefficients or quantized feature coefficients surrounding the first data in the first transformed image, or the first context information includes some or all of the coefficients in the channel in which the first data is located, the coefficients being feature coefficients or quantized feature coefficients.
The "surrounding wavelet coefficients or quantized wavelet coefficients" are the wavelet coefficients or quantized wavelet coefficients whose distance from the first data is less than a preset threshold, where the threshold is expressed in units of wavelet coefficients or quantized wavelet coefficients. The "surrounding DCT coefficients or quantized DCT coefficients" are the DCT coefficients or quantized DCT coefficients whose distance from the first data is less than a preset threshold, where the threshold is expressed in units of DCT coefficients or quantized DCT coefficients. The "surrounding feature coefficients or quantized feature coefficients" are the feature coefficients or quantized feature coefficients whose distance from the first data is less than a preset threshold, where the threshold is expressed in units of feature coefficients or quantized feature coefficients.
In one example, the probability estimation unit 503 performing probability estimation based on the first context information of the first data to obtain the probability estimation result of the first data includes:
the probability estimation unit 503 performs probability estimation based on the first context information and second context information of the first data to obtain the probability estimation result of the first data, where the first context information and the second context information are obtained from the first image and the second image, respectively.
For example, as shown in FIG. 6b, assume that the first data is the pixel at position P in the image to be encoded. The first context information of the first data includes the pixels surrounding the pixel at position P in the image to be encoded (the gray blocks in FIG. 6b), or some or all of the pixels in the image blocks adjacent to the pixel at position P, or some or all of the pixels in the image block in which the pixel at position P is located. The second context information includes the pixels surrounding the pixel at position P in the decoded image, or some or all of the pixels in the image blocks adjacent to the pixel at position P, or some or all of the pixels in the image block in which the pixel at position P is located.
Assume that the first data is the wavelet coefficient or quantized wavelet coefficient at position P in the first transformed image. The first context information of the first data includes the coefficients surrounding the coefficient at position P in the first transformed image, or some or all of the coefficients in the subbands adjacent to the coefficient at position P, or some or all of the coefficients in the subband in which the coefficient at position P is located. The second context information includes the coefficients surrounding the coefficient at position P in the second transformed image, or some or all of the coefficients in the subbands adjacent to the coefficient at position P, or some or all of the coefficients in the subband in which the coefficient at position P is located. In each case the coefficients are wavelet coefficients or quantized wavelet coefficients.
Assume that the first data is the DCT coefficient or quantized DCT coefficient at position P in the first transformed image. The first context information of the first data includes the coefficients surrounding the coefficient at position P in the first transformed image, or some or all of the coefficients in the frequency bands adjacent to the coefficient at position P, or some or all of the coefficients in the frequency band in which the coefficient at position P is located. The second context information includes the coefficients surrounding the coefficient at position P in the second transformed image, or some or all of the coefficients in the frequency bands adjacent to the coefficient at position P, or some or all of the coefficients in the frequency band in which the coefficient at position P is located. In each case the coefficients are DCT coefficients or quantized DCT coefficients.
Assume that the first data is the feature coefficient or quantized feature coefficient at position P in the first transformed image, that is, the first transformed image and the second transformed image are the three-dimensional feature maps obtained by performing feature extraction on the first image and the second image, respectively. The first context information of the first data includes the feature coefficients or quantized feature coefficients surrounding the coefficient at position P in the first transformed image, or some or all of the coefficients in the channels adjacent to the coefficient at position P, or some or all of the coefficients in the channel in which the coefficient at position P is located. The second context information includes the feature coefficients or quantized feature coefficients surrounding the coefficient at position P in the second transformed image, or some or all of the coefficients in the channels adjacent to the coefficient at position P, or some or all of the coefficients in the channel in which the coefficient at position P is located. In each case the coefficients are feature coefficients or quantized feature coefficients.
In one example, the probability estimation unit 503 further performs probability estimation based on first context information of second data to obtain a probability estimation result of the second data.
It should be noted that the second data and the first data are data at different positions in the same image (for example, the first image, or the first transformed image obtained by transforming the first image). For the specific process of obtaining the probability estimation result of the second data from its first context information, refer to the description above of obtaining the probability estimation result of the first data from its first context information; the details are not repeated here.
In a feasible embodiment, the first data and the second data belong to the same preset area. The preset area may be an image block of the first image, a subband obtained by applying a wavelet transform to the first image, a frequency band obtained by applying a DCT to the first image, or a channel of the three-dimensional feature map obtained by performing feature extraction on the first image. In this case, probability estimation may produce only one probability estimation result, which may be referred to as the probability estimation result of the preset area. Because only one probability estimation result is obtained for all the data in a preset area, only that one result (the probability estimation result of the preset area) needs to be transmitted, which saves bitstream overhead.
The following describes how to obtain the probability estimation result of a first preset area.
Mode 1: Each piece of data in the first preset area is processed in the manner described above for obtaining the probability estimation result of the first data, which yields the probability estimation results of all the data in the first preset area. For example, if the first preset area contains five pieces of data, five probability estimation results are obtained. A target probability estimation result is then selected from the probability estimation results of all the data in the first preset area and used as the probability estimation result of the first preset area. For example, the probability estimation result of the data located at the center, the upper-left corner, the upper-right corner, the lower-left corner, or the lower-right corner of the first preset area is used as the probability estimation result of the first preset area.
Mode 2: Probability estimation is performed based on first context information of the first preset area to obtain the probability estimation result of the first preset area; or probability estimation is performed based on the first context information and second context information of the first preset area to obtain the probability estimation result of the first preset area.
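The selection step of Mode 1 can be sketched as follows. The sketch assumes the per-data probability estimation results of a preset area are held in a two-dimensional list; the function name and the position keywords are hypothetical.

```python
def region_estimate(estimates, pick="center"):
    """Select one per-data estimate to represent the whole preset area.

    `estimates` is a 2-D list holding the probability estimation result of
    each piece of data in the area, for example a (mean, variance) tuple.
    """
    h, w = len(estimates), len(estimates[0])
    positions = {
        "center": (h // 2, w // 2),
        "top_left": (0, 0),
        "top_right": (0, w - 1),
        "bottom_left": (h - 1, 0),
        "bottom_right": (h - 1, w - 1),
    }
    y, x = positions[pick]
    return estimates[y][x]
```

Because encoder and decoder must pick the same position, the choice of `pick` would have to be fixed in advance or signalled; that detail is not addressed by this sketch.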
In one example, if the first preset area is an image block of the first image, the first context information of the first preset area includes some or all of the pixels in the first image; further, the first context information of the first preset area includes some or all of the pixels in the image blocks surrounding the first preset area in the first image;
if the first preset area is a subband of the first transformed image (obtained by applying a wavelet transform to the first image), the first context information of the first preset area includes some or all of the coefficients in the first transformed image; further, the first context information of the first preset area includes some or all of the coefficients in the subbands surrounding the first preset area in the first image, the coefficients being wavelet coefficients or quantized wavelet coefficients;
if the first preset area is a frequency band of the first transformed image (obtained by applying a DCT to the first image), the first context information of the first preset area includes some or all of the coefficients in the first transformed image; further, the first context information of the first preset area includes some or all of the coefficients in the frequency bands surrounding the first preset area in the first image, the coefficients being DCT coefficients or quantized DCT coefficients;
if the first preset area is a channel of the first transformed image (the three-dimensional feature map obtained by performing feature extraction on the first image), the first context information of the first preset area includes some or all of the coefficients in the first transformed image, the coefficients being feature coefficients or quantized feature coefficients; further, the first context information of the first preset area includes some or all of the coefficients in the channel to which the first preset area belongs in the first image, the coefficients being feature coefficients or quantized feature coefficients.
When the probability estimation unit 503 performs probability estimation based on the first context information and the second context information of the first preset area to obtain the probability estimation result of the first preset area, the first context information and the second context information are obtained from the first image and the second image, respectively, where the second image is an image to be encoded or a decoded image, and the first image is different from the second image.
For example, as shown in FIG. 6c, assume that the first preset area is area B in the first image, where area B is an image block of the first image. The first context information of the first preset area includes some or all of the pixels in the area surrounding area B in the first image (the gray blocks in the left part of FIG. 6c); the second context information includes some or all of the pixels in the area surrounding area B in the second image (the gray blocks in the right part of FIG. 6c).
Assume that the first preset area is subband B in the first transformed image. The first context information of the first preset area includes all or some of the coefficients in the subbands surrounding subband B in the first transformed image, and the second context information includes all or some of the coefficients in the subbands surrounding subband B in the second transformed image, the coefficients being wavelet coefficients or quantized wavelet coefficients.
Assume that the first preset area is frequency band B in the first transformed image. The first context information of the first preset area includes all or some of the coefficients in the frequency bands surrounding frequency band B in the first transformed image, and the second context information includes all or some of the coefficients in the frequency bands surrounding frequency band B in the second transformed image, the coefficients being DCT coefficients or quantized DCT coefficients.
Assume that the first preset area is channel B in the first transformed image. The first context information of the first preset area includes all or some of the coefficients in the channels surrounding channel B in the first transformed image, and the second context information includes all or some of the coefficients in the channels surrounding channel B in the second transformed image, the coefficients being feature coefficients or quantized feature coefficients.
In one example, to obtain the probability estimation result of the first data, the probability estimation unit 503 obtains a probability distribution model of the first data; processes the first context information and/or the second context information of the first data through a first probability estimation network to obtain parameters of the probability distribution model; and obtains the probability distribution of the first data from the probability distribution model and its parameters. The probability estimation result of the first data includes the probability distribution of the first data, or the parameters of the probability distribution model of the first data;
or,
the first context information and/or the second context information of the first data is processed through a second probability estimation network to obtain the probability distribution of the first data. The probability estimation result of the first data includes the probability distribution of the first data, or the parameters of the probability distribution model corresponding to that probability distribution. The first probability estimation network and the second probability estimation network are implemented based on neural networks.
The probability estimation result of the second data can be obtained in the same manner.
In one example, the probability estimation result of the first preset area can be obtained as follows:
The probability estimation unit 503 obtains a probability distribution model of the first preset area; processes the first context information and/or the second context information of the first preset area through a third probability estimation network to obtain parameters of the probability distribution model; and obtains the probability distribution of the first preset area from the probability distribution model and its parameters. The probability estimation result of the first preset area includes the probability distribution of the first preset area, or the parameters of the probability distribution model of the first preset area;
or,
the first context information and/or the second context information of the first preset area is processed through a fourth probability estimation network to obtain the probability distribution of the first preset area. The probability estimation result of the first preset area includes the probability distribution of the first preset area, or the parameters of the probability distribution model corresponding to that probability distribution. The third probability estimation network and the fourth probability estimation network are implemented based on neural networks.
Optionally, the probability distribution model may be a single Gaussian model (Gaussian single model, GSM), an asymmetric Gaussian model, a Gaussian mixture model (GMM), or a Laplace distribution model. The probability estimation network may be implemented based on a deep learning network, for example a recurrent neural network (RNN) or a pixel convolutional neural network (PixelCNN); no limitation is imposed here.
As an example, when the probability distribution model is a Gaussian model (a single Gaussian model, an asymmetric Gaussian model, or a Gaussian mixture model), the parameters of the probability distribution model are the parameters of the Gaussian model, including the mean μ and the variance σ.
As an example, when the probability distribution model is a Laplace distribution model, the parameters of the probability distribution model are the parameters of the Laplace distribution model, including the location parameter μ and the scale parameter b.
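How model parameters yield a probability distribution over integer-valued (for example, quantized) data can be sketched as follows. The sketch assigns each integer symbol the probability mass of the unit-width interval centered on it and renormalizes over a finite symbol range, which is a common construction but an assumption here. In the code, sigma is treated as the standard deviation of the Gaussian; the function names are hypothetical.

```python
import math

def gaussian_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma (> 0)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def laplace_cdf(x, mu, b):
    """CDF of a Laplace distribution with location mu and scale b (> 0)."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)

def discrete_pmf(cdf, symbols, *params):
    """P(k) = CDF(k + 0.5) - CDF(k - 0.5), renormalized over `symbols`."""
    raw = [cdf(k + 0.5, *params) - cdf(k - 0.5, *params) for k in symbols]
    total = sum(raw)
    return [p / total for p in raw]
```

For instance, discrete_pmf(gaussian_cdf, range(-8, 9), 0.0, 1.0) returns 17 probabilities that sum to 1 and peak at the symbol 0. An entropy coder could consume such a table directly, which is why transmitting either the distribution or only its parameters is sufficient.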
As an example, a typical PixelCNN-based probability estimation network (which may serve as the first, second, third, or fourth probability estimation network described above) is shown in FIG. 6d. "h×w" indicates that the current convolutional layer uses a convolution kernel of size h×w; "ResB" denotes a residual block, whose structure is shown in FIG. 6e; and "*/relu" indicates that a ReLU activation function is applied after the current layer.
In one example, after obtaining the probability estimation result of the first data, the probability estimation unit 503 preprocesses the probability estimation result of the first data to obtain a processed probability estimation result. Specifically, if the probability estimation result of the first data includes the mean and variance of a Gaussian distribution, the variance of the Gaussian distribution is processed to obtain a processed variance, and the mean of the Gaussian distribution and the processed variance are used as the processed probability estimation result of the first data; or,
the mean of the Gaussian distribution is processed to obtain a processed mean, and the variance of the Gaussian distribution and the processed mean are used as the processed probability estimation result of the first data.
In one example, processing the variance of the Gaussian distribution to obtain the processed variance includes:
setting the variance of the Gaussian distribution to 0 and using it as the processed variance.
In an example, processing the variance of the Gaussian distribution to obtain the processed variance includes:
processing the variance of the Gaussian distribution according to the scaling factor of the first data to obtain the processed variance;
where the scaling factor of the first data is the same as the scaling factor of the second data; or
the scaling factor of the first data is different from the scaling factor of the second data; or
if the first data and the second data belong to the same image block of the first image, the scaling factor of the first data is the same as that of the second data; or if the first data and the second data belong to different image blocks, the scaling factor of the first data is different from that of the second data; or the scaling factor of the first data is determined according to the texture complexity of the image block to which the first data belongs;
or
if the first data and the second data belong to the same subband among the multiple subbands obtained by performing a wavelet transform on the first image, the scaling factor of the first data is the same as that of the second data; or if the first data and the second data belong to different subbands, the scaling factor of the first data is different from that of the second data; or the scaling factor of the first data is determined according to the texture complexity of the subband to which the first data belongs;
or
if the first data and the second data belong to the same frequency band among the multiple frequency bands obtained by performing a DCT on the first image, the scaling factor of the first data is the same as that of the second data; or if the first data and the second data belong to different frequency bands, the scaling factor of the first data is different from that of the second data; or the scaling factor of the first data is determined according to the texture complexity of the frequency band to which the first data belongs;
or
if the first data and the second data belong to the same channel of the three-dimensional feature map obtained by performing feature extraction on the first image, the scaling factor of the first data is the same as that of the second data; or if the first data and the second data belong to different channels, the scaling factor of the first data is different from that of the second data; or the scaling factor of the first data is determined according to the texture complexity of the channel to which the first data belongs.
In an example, when the probability estimation result of the first data includes the location parameter and the scale parameter of a Laplace distribution, the scale parameter is processed according to the scaling factor of the first data, and the processed probability estimation result of the first data includes the processed scale parameter and the location parameter of the Laplace distribution.
In an example, when the probability estimation result of the first data includes the location parameter and the scale parameter of a Laplace distribution, the location parameter is processed according to the scaling factor of the first data, and the processed probability estimation result of the first data includes the processed location parameter and the scale parameter of the Laplace distribution.
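The preprocessing options above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the text does not say how a scaling factor is applied, so multiplication is assumed here, and the names `Gaussian`, `Laplace`, `preprocess`, and the per-subband table `SUBBAND_SCALE` are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gaussian:
    mean: float
    variance: float


@dataclass(frozen=True)
class Laplace:
    location: float
    scale: float


# Hypothetical scaling factors, one per wavelet subband. The text only says
# that data in the same subband may share a factor and that the factor may
# depend on texture complexity; these values are placeholders.
SUBBAND_SCALE = {"LL": 0.0, "LH": 0.5, "HL": 0.5, "HH": 1.0}


def preprocess(dist, scaling_factor=1.0, zero_variance=False):
    """Return a processed copy of a probability estimation result.

    Assumption: "processing according to the scaling factor" is modelled as
    multiplying the variance (Gaussian) or scale parameter (Laplace).
    """
    if isinstance(dist, Gaussian):
        if zero_variance:
            return replace(dist, variance=0.0)
        return replace(dist, variance=dist.variance * scaling_factor)
    if isinstance(dist, Laplace):
        return replace(dist, scale=dist.scale * scaling_factor)
    raise TypeError("unsupported probability estimation result")


processed = preprocess(Gaussian(mean=2.0, variance=4.0), SUBBAND_SCALE["LH"])
```

Processing the mean or the location parameter instead would follow the same pattern with a different field.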
In an example, after obtaining the probability estimation result of the first preset area, the probability estimation unit 503 preprocesses that result to obtain a processed probability estimation result. Specifically, if the probability estimation result of the first preset area includes the mean and variance of a Gaussian distribution, either the variance is processed to obtain a processed variance, and the mean together with the processed variance serves as the processed probability estimation result of the first preset area; or
the mean is processed to obtain a processed mean, and the variance together with the processed mean serves as the processed probability estimation result of the first preset area. In an example, processing the variance of the Gaussian distribution to obtain the processed variance includes: setting the variance of the Gaussian distribution to 0 as the processed variance. In an example, processing the variance of the Gaussian distribution to obtain the processed variance includes:
processing the variance of the Gaussian distribution according to the scaling factor of the first preset area to obtain the processed variance;
where the scaling factor of the first preset area is the same as the scaling factors of the other preset areas; or
the scaling factor of the first preset area is different from the scaling factors of the other preset areas.
In an example, when the probability estimation result of the first preset area includes the location parameter and the scale parameter of a Laplace distribution, the scale parameter is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed scale parameter and the location parameter of the Laplace distribution.
In an example, when the probability estimation result of the first preset area includes the location parameter and the scale parameter of a Laplace distribution, the location parameter is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed location parameter and the scale parameter of the Laplace distribution.
In an example, after obtaining the probability estimation result of the first data and the probability estimation result of the second data, the encoding unit 501 writes both results directly into the compressed bitstream. In an example, in video compression, the two probability estimation results may be carried in a sequence header, a picture header, a slice header, or supplemental enhancement information (SEI) and transmitted to the decoder 30.
In an example, after the probability estimation result of the first preset area is obtained, the first flag enable_flag of the first preset area is set to a first value (for example, 1 or true) to indicate that the decoder uses the same probability distribution, namely the probability estimation result of the first preset area, when sampling the estimated coefficients in the first preset area. The probability estimation result of the first preset area is saved into a probability estimation result set, and its index in that set and the size information of the first preset area are recorded. The encoding unit 501 writes the probability estimation result set and the enable_flag, index, and size information of the first preset area into the compressed bitstream.
It should be noted that multiple probability estimation results can be obtained for multiple different preset areas, and these results form a probability estimation result set. The position of a preset area's probability estimation result within the set is the index of that preset area.
In an example, the probability estimation result set may be transmitted to the decoder 30 in an adaptation parameter set (APS).
In an example, after the probability estimation result of the first preset area is obtained, the enable_flag of the first preset area is set to the first value (for example, 1 or true) to indicate that the decoder uses the same probability distribution, namely the probability estimation result of the first preset area, when sampling the estimated coefficients in the first preset area. The encoding unit 501 writes the probability estimation result, the enable_flag, and the size information of the first preset area into the compressed bitstream.
In an example, if every item of data in the first preset area uses its own probability estimation result during sampling, the enable_flag of the first preset area is set to a second value (for example, 0 or false), and the encoding unit 501 writes the individual probability estimation results of all data in the first preset area and the enable_flag of the first preset area into the compressed bitstream. Optionally, the encoding unit 501 also writes the size information of the first preset area into the compressed bitstream.
In an example, the encoding unit does not write the size information of the preset area into the bitstream. Instead, the encoder and the decoder may agree on the size of the preset area before encoding and decoding, and each side stores that size in advance.
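The signaling variants above can be sketched on the encoder side as follows. This is only an illustration under stated assumptions: the text names enable_flag, the index, and the size information but does not define a bitstream syntax, so the dictionary layout and the function names `signal_shared_region` and `signal_per_coefficient_region` are hypothetical.

```python
def signal_shared_region(result, size, result_set):
    """enable_flag = 1: every coefficient in the area shares one result.

    The result is stored once in result_set and referenced by its index,
    which is the position of the result inside the set.
    """
    if result not in result_set:
        result_set.append(result)
    return {"enable_flag": 1, "index": result_set.index(result), "size": size}


def signal_per_coefficient_region(results, size=None):
    """enable_flag = 0: each coefficient carries its own result.

    The size is optional because encoder and decoder may agree on it
    beforehand instead of transmitting it.
    """
    syntax = {"enable_flag": 0, "results": list(results)}
    if size is not None:
        syntax["size"] = size
    return syntax


# Each result here is a hypothetical (mean, variance) pair.
result_set = []
area_a = signal_shared_region((0.0, 1.0), (4, 4), result_set)
area_b = signal_shared_region((0.5, 2.0), (8, 8), result_set)
area_c = signal_per_coefficient_region([(0.0, 1.0)] * 4, size=(2, 2))
```

Keeping the results in a shared set means that two areas with the same distribution transmit it only once, which is presumably why the index is signaled separately.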
Decoding unit 504
The decoding unit 504 decodes the compressed bitstream to obtain a first probability estimation result.
In an example, the decoding unit 504 further decodes the compressed bitstream to obtain a second probability estimation result.
Optionally, the first probability estimation result includes a first probability distribution or the parameters of a first probability distribution model, and the second probability estimation result includes a second probability distribution or the parameters of a second probability distribution model.
In an example, the decoding unit 504 further decodes a first flag from the compressed bitstream. If the first flag has the first value, the same probability estimation result (namely the probability estimation result of the first preset area) is used when sampling all estimated coefficients in the first preset area, where the first preset area is an area of the enhanced image. The decoding unit 504 also decodes the probability estimation result set and the index of the first preset area from the compressed bitstream, where the set includes the probability estimation results of multiple preset areas, and obtains the probability estimation result of the first preset area from the set according to that index.
If the first flag has the second value, each estimated coefficient in the first preset area is sampled using its own probability estimation result. The decoding unit 504 decodes the size information H1*W1 of the first preset area from the bitstream, which indicates that the decoding unit 504 is to decode H1*W1 probability estimation results from the compressed bitstream. Using these H1*W1 probability estimation results, the sampling unit 505 can sample all estimated coefficients in the first preset area. H1 and W1 are both integers greater than 1.
In an example, the decoding unit 504 further decodes a first flag from the compressed bitstream, indicating that the same probability estimation result (namely the probability estimation result of the first preset area) is used when sampling all estimated coefficients in the first preset area, where the first preset area is an area of the enhanced image. The decoding unit 504 also decodes the probability estimation result of the first preset area and H1*W1 from the bitstream, and the sampling unit 505 samples H1*W1 times from that probability estimation result to obtain H1*W1 estimated coefficients. That is, the first preset area includes H1*W1 estimated coefficients.
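On the decoder side, the flag logic above amounts to expanding the decoded syntax into one probability estimation result per estimated coefficient. The sketch below assumes the syntax has already been parsed into a dictionary; the field names and the function `region_parameters` are hypothetical, not part of the application.

```python
def region_parameters(syntax, result_set=None):
    """Return one probability estimation result per coefficient of an area.

    syntax is a hypothetical parsed record with:
      enable_flag: 1 (shared result) or 0 (one result per coefficient)
      size:        (H1, W1) of the preset area
      index:       position of the shared result in result_set, or
      result:      the shared result carried directly, or
      results:     H1*W1 individual results when enable_flag is 0
    """
    h1, w1 = syntax["size"]
    count = h1 * w1
    if syntax["enable_flag"] == 1:
        if "index" in syntax:
            shared = result_set[syntax["index"]]
        else:
            shared = syntax["result"]
        # The same distribution is sampled H1*W1 times.
        return [shared] * count
    results = list(syntax["results"])
    if len(results) != count:
        raise ValueError("expected H1*W1 probability estimation results")
    return results


decoded_set = [(0.0, 1.0), (0.5, 2.0)]
params = region_parameters({"enable_flag": 1, "index": 1, "size": (2, 2)}, decoded_set)
```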
Sampling unit 505
The sampling unit 505 performs sampling according to the first probability estimation result to obtain the first estimated coefficient, and performs sampling according to the second probability estimation result to obtain the second estimated coefficient. Because the two sampling processes are identical, only sampling according to the first probability estimation result to obtain the first estimated coefficient is described below.
In an example, the first probability estimation result includes the mean and variance of a Gaussian distribution, and the sampling unit 505 performing sampling according to the first probability estimation result to obtain the first estimated coefficient includes:
obtaining a first random number; determining a first reference value according to the first random number, where the first reference value follows a Gaussian distribution; and determining the first estimated coefficient according to the first reference value and the mean and variance of the first probability estimation result.
Specifically, a uniformly distributed random number $u$ on $[0, 1]$ is generated using the linear congruential method, and $z_1$ is computed from $u$ as
Figure PCTCN2022100578-appb-000011
so that $z_1$ follows the standard Gaussian distribution. Here erf() is the Gaussian error function, which is closely related to the cumulative distribution function of the standard normal distribution and is defined as follows:
Figure PCTCN2022100578-appb-000012
Let $z_2 = \delta \cdot z_1 + \mu$. Then $z_2$ follows a Gaussian distribution with mean $\mu$ and variance $\delta$, and $z_2$ is the first estimated coefficient, where $\mu$ and $\delta$ are respectively the mean and the variance of the first probability estimation result.
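The sampling procedure can be sketched as below. Several points are assumptions rather than content of the application: the formula images are not available here, so the usual inverse-transform form $z_1 = \sqrt{2}\,\mathrm{erf}^{-1}(2u - 1)$ is assumed, computed through the equivalent standard normal inverse CDF from the Python standard library; the generator constants are a common textbook choice; and the multiplier applied to $z_1$ is treated as a standard deviation, since multiplying a standard Gaussian by a factor scales its variance by the square of that factor. The names `LCG` and `sample_gaussian` are hypothetical.

```python
from statistics import NormalDist


class LCG:
    """Linear congruential generator: state = (a * state + c) mod m.

    The constants are a widely used textbook choice and are an assumption;
    the application does not specify them.
    """

    def __init__(self, seed=1):
        self.m = 2 ** 32
        self.a = 1664525
        self.c = 1013904223
        self.state = seed % self.m

    def uniform(self):
        self.state = (self.a * self.state + self.c) % self.m
        # The half-step offset keeps u strictly inside (0, 1), where the
        # inverse CDF is finite.
        return (self.state + 0.5) / self.m


def sample_gaussian(rng, mean, std):
    """Draw one estimated coefficient from a Gaussian via inverse transform."""
    u = rng.uniform()
    # Equivalent to sqrt(2) * erfinv(2 * u - 1).
    z1 = NormalDist().inv_cdf(u)
    return std * z1 + mean


rng = LCG(seed=12345)
coefficient = sample_gaussian(rng, mean=3.0, std=2.0)
```

With `std` set to 0 the function returns the mean exactly, which corresponds to the zero-variance preprocessing option.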
Optionally, before sampling, the variance of the first probability estimation result is processed. Specifically, the variance of the first probability estimation result is set to 0 as the processed variance, and sampling is then performed in the manner described above according to the processed variance and the mean of the first probability estimation result to obtain the first estimated coefficient.
Optionally, before sampling, the variance of the first probability estimation result is processed according to the scaling factor of the first estimated coefficient, and sampling is then performed in the manner described above according to the processed variance and the mean of the first probability estimation result to obtain the first estimated coefficient.
Optionally, before sampling, the mean of the first probability estimation result is processed according to the scaling factor of the first estimated coefficient, and sampling is then performed in the manner described above according to the processed mean and the variance of the first probability estimation result to obtain the first estimated coefficient.
It should be understood that the second estimated coefficient can be obtained by sampling according to the second probability estimation result in the same manner.
Optionally, the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or
the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or
when the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients or wavelet coefficients: if they belong to the same subband, the scaling factor of the first estimated coefficient is the same as that of the second estimated coefficient; or if they belong to different subbands, the scaling factor of the first estimated coefficient is different from that of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the subband to which the first estimated coefficient belongs;
or
when the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients or DCT coefficients: if they belong to the same frequency band, the scaling factor of the first estimated coefficient is the same as that of the second estimated coefficient; or if they belong to different frequency bands, the scaling factor of the first estimated coefficient is different from that of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the frequency band to which the first estimated coefficient belongs;
or
when the first estimated coefficient and the second estimated coefficient are feature coefficients: if they belong to the same channel, the scaling factor of the first estimated coefficient is the same as that of the second estimated coefficient; or if they belong to different channels, the scaling factor of the first estimated coefficient is different from that of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
In an example, the first estimated coefficient and the second estimated coefficient are pixel values, and preprocessing the variance of the first probability estimation result to obtain the processed variance includes:
preprocessing the variance of the first probability distribution according to the scaling factor of the first coefficient to obtain the processed variance, where
the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
By preprocessing the first probability distribution, reconstructed images with different properties can be obtained according to user requirements. For example, setting the variance of the first probability distribution to 0 as the processed variance yields the reconstructed image with the best signal quality (best objective quality), that is, it increases the PSNR of the image or reduces its MSE. Setting the scaling factors of multiple coefficients to the same value yields the image with the best subjective quality, at the cost of a lower PSNR or a higher MSE. Setting the same scaling factor for coefficients that belong to the same part of the image, and different scaling factors for coefficients that belong to different parts, yields an image whose properties lie between the best subjective quality and the best objective quality.
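A short worked calculation may help explain why a zero variance favours MSE. It is a sketch under assumptions that the application does not state: the true coefficient $x$ is modelled as having mean $\mu$ and variance $\sigma^2$, the estimated coefficient $z$ is sampled independently of $x$ with the same mean, and the scaling factor $s \ge 0$ is taken to multiply the variance.

```latex
\begin{aligned}
\mathbb{E}\big[(x - z)^2\big]
  &= \operatorname{Var}(x) + \operatorname{Var}(z)
     + \big(\mathbb{E}[x] - \mathbb{E}[z]\big)^2 \\
  &= \sigma^2 + s\,\sigma^2 + 0 \\
  &= (1 + s)\,\sigma^2 .
\end{aligned}
```

Under these assumptions the expected squared error is smallest at $s = 0$, where every sample equals the mean, while $s = 1$ reproduces the estimated distribution and doubles the expected squared error.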
When the first probability estimation result includes the location parameter and the scale parameter of a Laplace distribution, performing sampling according to the first probability estimation result to obtain the first estimated coefficient includes:
generating two uniformly distributed random numbers $\mu_1$ and $\mu_2$, and letting $z_3 = b \cdot \log(\mu_1)$ and $z_4 = b \cdot \log(\mu_2)$. The first estimated coefficient is $z_5 = z_3 - z_4 + \mu$, where $\mu$ and $b$ are respectively the location parameter and the scale parameter of the Laplace distribution.
Optionally, before sampling, the scale parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and sampling is then performed in the manner described above according to the processed scale parameter and the location parameter of the Laplace distribution to obtain the first estimated coefficient.
Optionally, before sampling, the location parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and sampling is then performed in the manner described above according to the processed location parameter and the scale parameter of the Laplace distribution to obtain the first estimated coefficient.
In the manner described above, multiple estimated coefficients can be obtained, including the first estimated coefficient and the second estimated coefficient.
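The Laplace sampling step can be sketched as follows. The difference of the two scaled logarithms is a difference of two independent exponential variables, which follows a Laplace distribution. The function name `sample_laplace` is hypothetical, and Python's built-in generator is used here for brevity in place of any particular generator.

```python
import math
import random


def sample_laplace(rng, location, scale):
    """Draw one estimated coefficient from Laplace(location, scale)."""
    # 1 - random() lies in (0, 1], so log() never receives 0.
    u1 = 1.0 - rng.random()
    u2 = 1.0 - rng.random()
    z3 = scale * math.log(u1)
    z4 = scale * math.log(u2)
    return z3 - z4 + location


rng = random.Random(0)
samples = [sample_laplace(rng, location=1.0, scale=2.0) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
# For a Laplace distribution the mean absolute deviation from the location
# parameter equals the scale parameter.
mean_abs_dev = sum(abs(s - 1.0) for s in samples) / len(samples)
```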
Inverse transform unit 506
The inverse transform unit 506 obtains the enhanced image according to the multiple estimated coefficients.
Specifically, if the multiple estimated coefficients are quantized wavelet coefficients, the inverse transform unit 506 performs inverse quantization and an inverse wavelet transform on them to obtain the enhanced image; or
if the multiple estimated coefficients are wavelet coefficients, the inverse transform unit 506 performs an inverse wavelet transform on them to obtain the enhanced image; or
if the multiple estimated coefficients are quantized DCT coefficients, the inverse transform unit 506 performs inverse quantization and an inverse DCT on them to obtain the reconstructed image; or
if the multiple estimated coefficients are DCT coefficients, the inverse transform unit 506 performs an inverse DCT on them to obtain the enhanced image.
If the multiple estimated coefficients are pixel values, that is, reconstructed pixel values, the enhanced image is obtained directly from the multiple estimated coefficients.
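The DCT branches above can be illustrated in one dimension. This is a minimal sketch, not the codec's transform: it uses an orthonormal DCT-II and its inverse on a single row, and it assumes uniform inverse quantization with a hypothetical step size `step`. The function names are illustrative only.

```python
import math


def dct(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(norm * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def reconstruct_row(estimated, quantized, step=1.0):
    """Apply inverse quantization if needed, then the inverse DCT."""
    if quantized:
        coefficients = [level * step for level in estimated]
    else:
        coefficients = list(estimated)
    return idct(coefficients)


row = [1.0, 2.0, 3.0, 4.0]
restored = reconstruct_row(dct(row), quantized=False)
flat = reconstruct_row([2, 0, 0, 0], quantized=True, step=0.5)
```

A real codec would apply a two-dimensional transform per block; the wavelet branches follow the same pattern with an inverse wavelet transform in place of `idct`.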
In an example, after a feature map composed of multiple feature elements is obtained, the feature map may be passed through a neural network to output the enhanced image. The neural network may have any structure, such as a fully connected network, a convolutional neural network, or a recurrent neural network. A deep neural network with a multi-layer structure may be used to obtain a first reconstructed image or a second reconstructed image of better quality.
In an example, after a feature map composed of multiple feature elements is obtained, the feature map may be input into a machine-vision task module to perform the corresponding machine task, for example object classification, recognition, or segmentation.
It should be noted that, in the encoder-side solution of this embodiment, the image to be encoded is first encoded to obtain a compressed bitstream, and probability estimation is then performed with reference to the encoding information (for example, the compressed bitstream or the coefficient information obtained by transformation during encoding). The decoder-side solution of this embodiment is performed on the premise that the compressed bitstream has been decoded to obtain a decoded image. In other words, the decoder-side solution of this embodiment is a post-processing procedure.
As can be seen, probability estimation is performed at the encoder to obtain a probability estimation result, which is transmitted to the decoder. The decoder samples according to the probability estimation result to obtain estimated coefficients, and then obtains the enhanced image from the sampled estimated coefficients. Because the sampling process is random and therefore non-deterministic, decoding the same compressed bitstream multiple times in this manner can yield multiple high-quality images with different properties, such as an image with the best subjective quality and an image with the best objective quality.
In an example, during entropy encoding, the encoding unit 501 first performs probability estimation on the first data to obtain a probability estimation result of the first data, referred to as probability estimation result A, and then entropy-encodes the first data according to probability estimation result A. During entropy decoding, the decoding unit 504 first performs probability estimation on the first data to obtain a probability estimation result of the first data, which may also be referred to as probability estimation result A, and then performs entropy decoding according to probability estimation result A. The probability estimation result described in the foregoing embodiments is referred to as probability estimation result B.
Optionally, the encoder entropy-encodes the first data according to probability estimation result A. The decoder performs probability estimation on the first data in the same manner as the encoder to obtain a probability estimation result (which may also be regarded as probability estimation result A), performs entropy decoding according to it, and may also perform sampling according to it. The sampling manner is the same as in the foregoing embodiments.
Optionally, the encoder entropy-encodes the first data according to probability estimation result A and transmits probability estimation result A to the decoder. The decoder performs entropy decoding according to probability estimation result A and may also perform sampling according to it. The sampling manner is the same as in the foregoing embodiments.
Optionally, the encoder entropy-encodes the first data according to probability estimation result B and sends probability estimation result B to the decoder. The decoder performs entropy decoding according to probability estimation result B and may also perform sampling according to it. The sampling manner is the same as in the foregoing embodiments.
Optionally, the encoder entropy-encodes the first data according to probability estimation result B. The decoder performs probability estimation on the first data to obtain probability estimation result B, performs entropy decoding according to it, and may also perform sampling according to it. The sampling manner is the same as in the foregoing embodiments.
FIG. 7 is a schematic block diagram of another example of a video codec for implementing the techniques of this application. In the example of FIG. 7, the video encoder 20 includes a coefficient acquisition unit 701, a probability estimation unit 702, and an entropy encoding unit 703; the video decoder 30 includes an entropy decoding unit 704, a sampling unit 705, a first reconstruction unit 706, and a second reconstruction unit 707. The video codec shown in FIG. 7 may also be referred to as an end-to-end video codec or a video codec based on an end-to-end video codec.
Coefficient acquisition unit 701
The coefficient acquisition unit 701 obtains a plurality of coefficients from the image to be encoded, the plurality of coefficients including a first coefficient.
Optionally, the plurality of coefficients may be a plurality of pixels.
In one example, 1) the coefficient acquisition unit 701 divides the image to be encoded into image blocks of a preset size, which may be 4x4, 8x8, 16x16, 32x32, 64x64, 128x128, 256x256, and so on; or 2) the coefficient acquisition unit 701 divides the image to be encoded into one or more image blocks whose size is not limited. The quadtree, binary-tree, or ternary-tree partitioning methods of existing coding standards (H.266, H.265, H.264, AVS2, or AVS3) may be used to divide the image to be encoded into one or more image blocks. Each image block includes one or more pixels.
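The fixed-size partitioning in option 1) can be sketched as follows. This is a minimal illustration, not the method of the application: the function name `split_into_blocks` is hypothetical, and the handling of images whose size is not a multiple of the block size (smaller edge tiles) is an assumption, since the text does not address it.

```python
def split_into_blocks(image, block_size):
    """Split a 2-D image (a list of rows) into block_size x block_size tiles.

    Tiles on the right and bottom edges are smaller when the image size is
    not a multiple of block_size (an assumption made for this sketch).
    Returns a list of ((top, left), tile) pairs in raster order.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            tile = [row[left:left + block_size]
                    for row in image[top:top + block_size]]
            blocks.append(((top, left), tile))
    return blocks


# An 8x8 test image split into four 4x4 blocks.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
blocks = split_into_blocks(image, 4)
```

Tree-based partitioning as used in the cited standards would replace the two fixed-stride loops with a recursive split decision per block.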
In one example, N wavelet transforms are applied to the image to be encoded to obtain 3N+1 subbands, each of which includes one or more wavelet coefficients, where N is an integer greater than 0.
The wavelet transform may be a traditional wavelet transform, a wavelet transform based on a deep network, or another similar transform, which is not limited here. The deep-network-based wavelet transform differs from the traditional wavelet transform in that the transform and prediction steps are implemented with deep networks; the specific implementation of the deep network is not limited here. This application takes a single wavelet transform as an example, that is, N=1. As shown in FIG. 6a, one wavelet transform of the image to be encoded yields four subbands: LL1, HL1, LH1, and HH1.
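The count of 3N+1 subbands follows from re-applying the transform only to the low-frequency subband at each level. The sketch below illustrates this with a plain averaging Haar filter, which is an assumption chosen for brevity; the application does not fix the wavelet, and the function names are hypothetical. The image height and width are assumed to be divisible by 2^N.

```python
def haar_level(image):
    """One level of an averaging 2-D Haar transform on a list-of-rows image.

    Returns four half-size subbands. HL holds horizontal differences,
    LH vertical differences, HH diagonal differences (naming convention
    assumed for this sketch).
    """
    bands = {"LL": [], "HL": [], "LH": [], "HH": []}
    for r in range(0, len(image), 2):
        rows = {name: [] for name in bands}
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows["LL"].append((a + b + d + e) / 4)
            rows["HL"].append((a - b + d - e) / 4)
            rows["LH"].append((a + b - d - e) / 4)
            rows["HH"].append((a - b - d + e) / 4)
        for name in bands:
            bands[name].append(rows[name])
    return bands


def wavelet_decompose(image, levels):
    """Apply `levels` transforms, recursing on LL only: 3*levels + 1 subbands."""
    subbands = {}
    current = image
    for n in range(1, levels + 1):
        bands = haar_level(current)
        for name in ("HL", "LH", "HH"):
            subbands[f"{name}{n}"] = bands[name]
        current = bands["LL"]
    subbands[f"LL{levels}"] = current
    return subbands


flat = [[5] * 4 for _ in range(4)]          # constant 4x4 image
one_level = wavelet_decompose(flat, 1)      # LL1, HL1, LH1, HH1
two_levels = wavelet_decompose(flat, 2)     # 7 subbands
```

For a constant image all detail subbands are zero and LL carries the constant value, which is why the detail subbands of smooth content are cheap to code.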
For the image to be encoded, the image formed by the subbands obtained by wavelet-transforming the image to be encoded is the first transformed image described above. Similarly, for the decoded image, the image formed by the subbands obtained by wavelet-transforming the decoded image is the second transformed image described above.
Optionally, after the wavelet transform produces a plurality of wavelet coefficients, each wavelet coefficient is quantized to obtain a plurality of quantized wavelet coefficients. Specifically, when quantizing the wavelet coefficients, the subbands may be processed in preset order 1, and the wavelet coefficients within the current subband are then quantized in preset order 2 to obtain the quantized wavelet coefficients. Preset order 1 may be an existing zigzag scan order, for example LL1→HL1→LH1→HH1. Preset order 2 may be an existing zigzag scan order, horizontal scan order, or vertical scan order.
It should be understood that preset order 1 and preset order 2 above are only examples and do not limit this application; other orders may of course be used.
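The two-level ordering described above (subbands first, then coefficients inside the current subband) can be sketched as below. Uniform quantization with a single step size and a horizontal scan inside each subband are assumptions made for illustration; the function name `quantize_in_order` and the sample values are hypothetical.

```python
def quantize_in_order(subbands, step,
                      band_order=("LL1", "HL1", "LH1", "HH1")):
    """Quantize wavelet coefficients subband by subband.

    band_order plays the role of preset order 1; inside a subband the
    coefficients are visited row by row (a horizontal scan, standing in
    for preset order 2). Returns the quantized coefficients as a flat
    list of (band, row, col, level) tuples in processing order.
    """
    sequence = []
    for name in band_order:
        for r, row in enumerate(subbands[name]):
            for c, value in enumerate(row):
                # Uniform quantization; Python's round() resolves exact
                # ties to the nearest even integer.
                sequence.append((name, r, c, round(value / step)))
    return sequence


example = {
    "LL1": [[10.2, -3.7]],
    "HL1": [[0.4]],
    "LH1": [[1.6]],
    "HH1": [[-0.2]],
}
sequence = quantize_in_order(example, step=2.0)
```

The quantized values themselves do not depend on the visiting order; the order matters because later stages (context modelling and entropy coding) consume the coefficients as a sequence.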
Optionally, before each wavelet coefficient is quantized, the wavelet coefficients may be preprocessed, and the quantization is then applied to the preprocessed wavelet coefficients. For example, the wavelet coefficients are passed through a neural network for feature extraction, and the feature extraction result is then quantized. Processing the wavelet coefficients before quantization enables the decoder to decode a high-quality first reconstructed image.
The plurality of coefficients above may be a plurality of wavelet coefficients or quantized wavelet coefficients.
In another example, the coefficient acquisition unit 701 performs a DCT on the image to be encoded to obtain a DCT image. The DCT image includes a plurality of frequency bands, each of which includes one or more DCT coefficients. After the transform, the low-frequency components of the image to be encoded are concentrated in the upper-left corner and the high-frequency components are distributed toward the lower-right corner. The coefficient in the first row and first column is the direct-current (DC) coefficient, which represents the average value of the image to be encoded; the other coefficients are alternating-current (AC) coefficients. The DC coefficient and the AC coefficients are collectively referred to as DCT coefficients.
Optionally, the coefficient acquisition unit 701 divides the image to be encoded into a plurality of image blocks and then performs the DCT in units of image blocks to obtain transform blocks. For example, 1) the image to be encoded is divided into image blocks of a preset size, which may be 4x4, 8x8, 16x16, 32x32, 64x64, 128x128, 256x256, and so on; or 2) the image to be encoded is divided into one or more image blocks whose size is not limited. The quadtree, binary-tree, or ternary-tree partitioning methods of existing coding standards (H.266, H.265, H.264, AVS2, or AVS3) may be used to divide the image to be encoded into one or more image blocks.
It should be understood that the image formed by the DCT coefficients obtained from the image to be encoded is the first transformed image described above.
Optionally, the obtained DCT coefficients are quantized, for example by uniform quantization, to obtain quantized DCT coefficients. For the image to be encoded, the image formed by the quantized DCT coefficients obtained from the image to be encoded is the first transformed image described above.
The plurality of coefficients above may be a plurality of DCT coefficients or quantized DCT coefficients.
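To make the DC/AC distinction concrete, the sketch below computes a direct 2-D DCT-II of a square block. The orthonormal scaling is an assumption (the application does not specify a normalization); under it, the DC coefficient of an N x N block equals N times the block mean, so it represents the average value up to a constant factor. The function name `dct_2d` is hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square N x N block (direct O(N^4) form)."""
    n = len(block)

    def alpha(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


block = [[r * 4 + c for c in range(4)] for r in range(4)]
coeffs = dct_2d(block)
block_mean = sum(sum(row) for row in block) / 16
dc = coeffs[0][0]                      # first row, first column

# A constant block has no AC energy: only the DC coefficient is non-zero.
flat_coeffs = dct_2d([[7] * 4 for _ in range(4)])
```

Production codecs use fast separable integer transforms rather than this direct form; the direct form is used here only because it mirrors the definition.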
In another example, feature extraction is performed on the image to be encoded to obtain a three-dimensional feature map, which is the first transformed image described above. Optionally, the feature elements in the three-dimensional feature map are quantized to obtain quantized feature elements, and the three-dimensional feature map formed by the quantized feature elements is the first transformed image described above. In this case, the plurality of coefficients may be a plurality of feature coefficients or a plurality of quantized feature coefficients.
Probability estimation unit 702
The probability estimation unit 702 obtains a first probability estimation result according to the context information of the first coefficient.
In one example, the first coefficient is a pixel of the image to be encoded, and the first context information of that pixel includes all or some of the pixels in the image to be encoded. Further, the first context information of the pixel includes the pixels surrounding that pixel in the image to be encoded, or some or all of the pixels in the image blocks adjacent to that pixel, or some or all of the pixels in the image block in which that pixel is located.
It should be noted that the "surrounding pixels" above are pixels whose distance from the first coefficient is less than a preset threshold, the unit of which is "pixels".
In one example, the first coefficient is a coefficient in the first transformed image. If the first coefficient is a wavelet coefficient or a quantized wavelet coefficient, the first context information of the first coefficient includes some or all of the coefficients in the first transformed image, these coefficients being wavelet coefficients or quantized wavelet coefficients. Further, the first context information of the first coefficient includes the wavelet coefficients or quantized wavelet coefficients surrounding the first coefficient in the first transformed image; or the first context information includes some or all of the coefficients in the subbands adjacent to the first coefficient; or the first context information includes some or all of the coefficients in the subband in which the first coefficient is located; in each case the coefficients are wavelet coefficients or quantized wavelet coefficients;
or,
if the first coefficient is a DCT coefficient or a quantized DCT coefficient, the first context information of the first coefficient includes some or all of the coefficients in the first transformed image, these coefficients being DCT coefficients or quantized DCT coefficients. Further, the first context information of the first coefficient includes the DCT coefficients or quantized DCT coefficients surrounding the first coefficient in the first transformed image; or the first context information includes some or all of the coefficients in the subbands adjacent to the first coefficient; or the first context information includes some or all of the coefficients in the subband in which the first coefficient is located; in each case the coefficients are DCT coefficients or quantized DCT coefficients;
or, if the first coefficient is a feature coefficient or a quantized feature coefficient, the first context information of the first coefficient includes some or all of the coefficients in the first transformed image, these coefficients being feature coefficients or quantized feature coefficients. Further, the first context information of the first coefficient includes the feature coefficients or quantized feature coefficients surrounding the first coefficient in the first transformed image, or the first context information includes some or all of the coefficients in the channel in which the first coefficient is located, these coefficients being feature coefficients or quantized feature coefficients.
The "surrounding wavelet coefficients or quantized wavelet coefficients" above are wavelet coefficients or quantized wavelet coefficients whose distance from the first coefficient is less than a preset threshold, the unit of which is "wavelet coefficients or quantized wavelet coefficients". The "surrounding DCT coefficients or quantized DCT coefficients" above are DCT coefficients or quantized DCT coefficients whose distance from the first coefficient is less than a preset threshold, the unit of which is "DCT coefficients or quantized DCT coefficients". The "surrounding feature coefficients or quantized feature coefficients" above are feature coefficients or quantized feature coefficients whose distance from the first coefficient is less than a preset threshold, the unit of which is "feature coefficients or quantized feature coefficients".
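The "surrounding" definition above (distance below a preset threshold, measured in coefficient positions) can be sketched as a neighbourhood lookup. The distance metric is not stated in the text; the Chebyshev distance used here is an assumption, and the function name `surrounding_context` is hypothetical.

```python
def surrounding_context(coeffs, row, col, threshold):
    """Collect the coefficients surrounding position (row, col).

    A coefficient counts as surrounding when its Chebyshev distance to
    (row, col) is strictly smaller than `threshold` (metric assumed).
    The coefficient at (row, col) itself is excluded.
    """
    height, width = len(coeffs), len(coeffs[0])
    reach = threshold - 1            # integer positions: distance < threshold
    context = []
    for r in range(max(0, row - reach), min(height, row + reach + 1)):
        for c in range(max(0, col - reach), min(width, col + reach + 1)):
            if (r, c) != (row, col):
                context.append(coeffs[r][c])
    return context


grid = [[r * 5 + c for c in range(5)] for r in range(5)]
inner = surrounding_context(grid, 2, 2, threshold=2)    # 8 neighbours
corner = surrounding_context(grid, 0, 0, threshold=2)   # clipped at the border
```

When the decoder has to reproduce the same estimate without side information, the context would in practice be limited to positions that are already decoded; that restriction is not modelled in this sketch.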
In one example, the plurality of coefficients further includes a second coefficient, and the probability estimation unit 702 is further configured to perform probability estimation according to the context information of the second coefficient to obtain a second probability estimation result.
The second coefficient and the first coefficient are data at different positions in the same image (for example, the image to be encoded or the first transformed image obtained by transforming the image to be encoded). For the specific process of obtaining the second probability estimation result from the context information of the second coefficient, refer to the description above of obtaining the first probability estimation result from the context information of the first coefficient; it is not repeated here.
In one example, the first coefficient and the second coefficient belong to the same preset region. The preset region may be an image block in the image to be encoded, a subband obtained by wavelet-transforming the image to be encoded, a frequency band or transform block obtained by performing a DCT on the image to be encoded, or a channel of the three-dimensional feature map obtained by performing feature extraction on the image to be encoded. In this case only one probability estimation result may be produced during probability estimation, which may be called the probability estimation result of the preset region. Because only one probability estimation result is obtained for the data in a preset region, only one probability estimation result (that of the preset region) needs to be transmitted, which saves bitstream.
The following describes how to obtain the probability estimation result of the first preset region.
Manner 1: each coefficient in the first preset region is processed in the manner described above for obtaining the probability estimation result of the first coefficient, which yields the probability estimation results of all coefficients in the first preset region; for example, if the first preset region contains 5 coefficients, 5 probability estimation results are obtained. A target probability estimation result is then selected from the probability estimation results of all coefficients in the first preset region and used as the probability estimation result of the first preset region. For example, the probability estimation result of the coefficient located in the middle, the upper-left corner, the upper-right corner, the lower-left corner, or the lower-right corner of the first preset region is used as the probability estimation result of the first preset region.
Manner 2: probability estimation is performed according to the context information of the first preset region to obtain the probability estimation result of the first preset region.
Optionally, if the first preset region is an image block of the first image, the context information of the first preset region includes some or all of the pixels in the first image; further, the context information of the first preset region includes some or all of the pixels in the image blocks surrounding the first preset region in the first image;
if the first preset region is a subband of the first transformed image (obtained by wavelet-transforming the first image), the context information of the first preset region includes some or all of the coefficients in the first transformed image; further, the context information of the first preset region includes some or all of the coefficients in the subbands surrounding the first preset region in the first image, these coefficients being wavelet coefficients or quantized wavelet coefficients;
if the first preset region is a frequency band of the first transformed image (obtained by performing a DCT on the first image), the context information of the first preset region includes some or all of the coefficients in the first transformed image; further, the context information of the first preset region includes some or all of the coefficients in the frequency bands surrounding the first preset region in the first image, these coefficients being DCT coefficients or quantized DCT coefficients;
if the first preset region is a transform block of the first transformed image (obtained by performing a DCT on the first image), performing the DCT on the first image in units of one or more image blocks yields one or more transform blocks;
if the first preset region is a channel of the first transformed image (a three-dimensional feature map obtained by performing feature extraction on the first image), the context information of the first preset region includes some or all of the coefficients in the first transformed image, these coefficients being feature coefficients or quantized feature coefficients; further, the context information of the first preset region includes some or all of the coefficients in the channel to which the first preset region belongs in the first image, these coefficients being feature coefficients or quantized feature coefficients.
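The first of the two manners above (estimate every coefficient in the region, then keep one target result for the whole region) can be sketched as a simple selection. Representing each per-coefficient result as a `(mean, variance)` pair is an assumption for illustration, and the function name `region_estimate` and the position labels are hypothetical.

```python
def region_estimate(per_coeff_estimates, position="center"):
    """Pick one per-coefficient estimate to stand for the whole region.

    per_coeff_estimates is a 2-D grid with one probability estimation
    result per coefficient of the preset region (here a (mean, variance)
    pair). `position` names which coefficient supplies the region result.
    """
    height = len(per_coeff_estimates)
    width = len(per_coeff_estimates[0])
    positions = {
        "center": (height // 2, width // 2),
        "top_left": (0, 0),
        "top_right": (0, width - 1),
        "bottom_left": (height - 1, 0),
        "bottom_right": (height - 1, width - 1),
    }
    r, c = positions[position]
    return per_coeff_estimates[r][c]


# A 3x3 region: estimate (mean, variance) differs per coefficient.
estimates = [[(float(r), float(c) + 1.0) for c in range(3)] for r in range(3)]
shared = region_estimate(estimates, "center")
```

Only `shared` would then need to be transmitted or re-derived for the region, which is the bitstream saving the text refers to.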
In one example, for the first probability estimation result, the probability estimation unit 702 obtains the probability distribution model of the first coefficient; processes the context information of the first coefficient through a fifth probability estimation network to obtain the parameters of the probability distribution model; and obtains a first probability distribution according to the probability distribution model of the first coefficient and the parameters of that model. The first probability estimation result includes the first probability distribution or the parameters of the probability distribution model;
or,
the context information of the first coefficient is processed through a sixth probability estimation network to obtain the first probability distribution. The first probability estimation result includes the first probability distribution or the parameters of the probability distribution model corresponding to that distribution. The fifth probability estimation network and the sixth probability estimation network are implemented with neural networks.
The probability estimation result of the second coefficient can be obtained in the same manner.
In one example, the probability estimation result of the first preset region can be obtained as follows:
The probability estimation unit 702 obtains the probability distribution model of the first preset region; processes the context information of the first preset region through a seventh probability estimation network to obtain the parameters of the probability distribution model; and obtains the probability distribution of the first preset region according to the probability distribution model of the first preset region and the parameters of that model. The probability estimation result of the first preset region includes the probability distribution of the first preset region or the parameters of the probability distribution model of the first preset region;
or,
the context information of the first preset region is processed through an eighth probability estimation network to obtain the probability distribution of the first preset region. The probability estimation result of the first preset region includes the probability distribution of the first preset region or the parameters of the probability distribution model corresponding to that distribution. The seventh probability estimation network and the eighth probability estimation network are implemented with neural networks.
Optionally, the probability distribution model may be a single Gaussian model (GSM), an asymmetric Gaussian model, a Gaussian mixture model (GMM), or a Laplace distribution model. The probability estimation network may be implemented with a deep learning network such as an RNN or PixelCNN, which is not limited here.
As an example, when the probability distribution model is a Gaussian model (a single Gaussian model, an asymmetric Gaussian model, or a Gaussian mixture model), the parameters of the probability distribution model are the parameters of the Gaussian model, including the mean μ and the variance σ.
As an example, when the probability distribution model is a Laplace distribution model, the parameters of the probability distribution model are the parameters of the Laplace distribution model, including the location parameter μ and the scale parameter b.
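Going from model parameters to a probability distribution over quantized coefficients can be sketched as follows. The discretization used here (the mass of integer k is CDF(k + 0.5) - CDF(k - 0.5)) is a common convention and an assumption, not something the application specifies. The Gaussian is parameterized by its standard deviation, i.e. the square root of the variance. All function names are hypothetical.

```python
import math


def gaussian_cdf(x, mu, std):
    """CDF of a Gaussian with mean mu and standard deviation std."""
    return 0.5 * (1.0 + math.erf((x - mu) / (std * math.sqrt(2.0))))


def laplace_cdf(x, mu, b):
    """CDF of a Laplace distribution with location mu and scale b."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def discretized_pmf(cdf, symbols):
    """Probability mass per integer symbol: CDF(k + 0.5) - CDF(k - 0.5)."""
    return {k: cdf(k + 0.5) - cdf(k - 0.5) for k in symbols}


symbols = range(-20, 21)
gauss_pmf = discretized_pmf(lambda x: gaussian_cdf(x, 0.0, 2.0), symbols)
laplace_pmf = discretized_pmf(lambda x: laplace_cdf(x, 0.0, 1.5), symbols)

# Ideal entropy-coding cost of a symbol under the estimated distribution.
bits_for_zero = -math.log2(gauss_pmf[0])
bits_for_five = -math.log2(gauss_pmf[5])
```

An entropy coder driven by such a table spends fewer bits on symbols the model considers likely, which is why the quality of the estimated parameters directly affects the bit rate.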
As an example, a typical PixelCNN-based probability estimation network (which may serve as the fifth, sixth, seventh, or eighth probability estimation network above) is shown in FIG. 6d. "h×w" indicates that the current convolutional layer uses a convolution kernel of size h×w; "ResB" denotes a residual block, whose structure is shown in FIG. 6e; and "*/relu" indicates that a ReLU activation function is applied after the current layer.
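PixelCNN-style networks usually keep their prediction causal by masking each h×w convolution kernel so that a position only sees positions that precede it in raster order. Whether the network of FIG. 6d uses exactly this masking is not stated in the text, so the sketch below is an illustration of the general technique under that assumption; the function name `causal_mask` is hypothetical.

```python
def causal_mask(height, width, include_center):
    """Build a raster-order causal mask for a height x width kernel.

    An entry is 1 when the kernel tap lies before the centre tap in
    raster order. The centre tap itself is kept only when include_center
    is True (often called a type-B mask, used after the first layer);
    otherwise it is zeroed as well (type-A mask, used in the first layer).
    """
    centre_r, centre_c = height // 2, width // 2
    mask = []
    for r in range(height):
        row = []
        for c in range(width):
            before = r < centre_r or (r == centre_r and c < centre_c)
            is_centre = (r, c) == (centre_r, centre_c)
            row.append(1 if before or (is_centre and include_center) else 0)
        mask.append(row)
    return mask


first_layer_mask = causal_mask(3, 3, include_center=False)
later_layer_mask = causal_mask(3, 3, include_center=True)
```

Multiplying the kernel weights element-wise by such a mask before each convolution means the estimate for a coefficient depends only on coefficients that the decoder has already reconstructed.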
In one example, after obtaining the first probability estimation result, the probability estimation unit 702 preprocesses the first probability estimation result to obtain a processed probability estimation result. Specifically, if the first probability estimation result includes the mean and variance of a Gaussian distribution, the variance of the Gaussian distribution is processed to obtain a processed variance, and the mean of the Gaussian distribution together with the processed variance is used as the processed probability estimation result; or,
the mean of the Gaussian distribution is processed to obtain a processed mean, and the variance of the Gaussian distribution together with the processed mean is used as the processed probability estimation result.
In one example, processing the variance of the Gaussian distribution to obtain the processed variance includes:
setting the variance of the Gaussian distribution to 0 and using it as the processed variance.
In one example, processing the variance of the Gaussian distribution to obtain the processed variance includes:
processing the variance of the Gaussian distribution according to the scaling factor of the first coefficient to obtain the processed variance;
where the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or,
the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or,
if the first coefficient and the second coefficient belong to the same image block in the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different image blocks, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the image block to which the first coefficient belongs; or,
if the first coefficient and the second coefficient belong to the same subband among the plurality of subbands obtained by wavelet-transforming the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different subbands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs;
or,
if the first coefficient and the second coefficient belong to the same frequency band or transform block among the plurality of frequency bands obtained by performing a DCT on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different frequency bands or transform blocks, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the frequency band or transform block to which the first coefficient belongs;
or,
if the first coefficient and the second coefficient belong to the same channel of the three-dimensional feature map obtained by performing feature extraction on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different channels, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the channel to which the first coefficient belongs.
In one example, the probability estimation result is obtained based on a Gaussian distribution model, and the probability estimation result includes the Gaussian distribution, or the mean and/or variance of the Gaussian distribution.
In one example, when the probability estimation result of the first coefficient includes the location parameter and scale parameter of a Laplace distribution, the scale parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and the processed probability estimation result of the first coefficient includes the processed scale parameter and the location parameter of the Laplace distribution.
In one example, when the probability estimation result of the first coefficient includes the location parameter and scale parameter of a Laplace distribution, the location parameter of the Laplace distribution is processed according to the scaling factor of the first coefficient, and the processed probability estimation result of the first coefficient includes the processed location parameter and the scale parameter of the Laplace distribution.
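The parameter processing described above can be sketched as follows. The text says the variance or scale parameter is processed "according to" a scaling factor without giving the operation; multiplication is an assumption made here. The per-subband factor table and all function names are hypothetical, and the final lines illustrate, under the further assumption that sampling draws from the processed distribution, that a variance of 0 makes the sample equal to the mean.

```python
import random


def process_gaussian(mean, variance, scaling_factor=None):
    """Return (mean, processed variance).

    With no scaling factor the variance is set to 0; otherwise it is
    multiplied by the factor (the multiplication is an assumption).
    """
    if scaling_factor is None:
        return mean, 0.0
    return mean, variance * scaling_factor


def process_laplace_scale(location, scale, scaling_factor):
    """Return (location, processed scale), scaling by multiplication (assumed)."""
    return location, scale * scaling_factor


def sample_gaussian(mean, variance, rng):
    """Draw one sample; rng.gauss expects a standard deviation."""
    return rng.gauss(mean, variance ** 0.5)


# Hypothetical per-subband factors: coefficients in the same subband share
# a factor, coefficients in different subbands may use different ones.
factor_by_subband = {"LL1": 0.5, "HH1": 2.0}

narrowed = process_gaussian(1.0, 4.0, factor_by_subband["LL1"])
widened = process_gaussian(1.0, 4.0, factor_by_subband["HH1"])
collapsed = process_gaussian(3.0, 4.0)
laplace_processed = process_laplace_scale(0.0, 1.5, factor_by_subband["HH1"])

rng = random.Random(0)
deterministic_sample = sample_gaussian(*collapsed, rng)
```

A factor below 1 narrows the distribution, so samples stay closer to the mean; a factor above 1 widens it.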
In an example, after obtaining the probability estimation result of the first preset area, the probability estimation unit 702 preprocesses the probability estimation result of the first preset area to obtain a processed probability estimation result. Specifically, if the probability estimation result of the first preset area includes the mean and the variance of a Gaussian distribution, the variance of the Gaussian distribution is processed to obtain a processed variance, and the mean of the Gaussian distribution and the processed variance serve as the processed probability estimation result of the first preset area; or,
the mean of the Gaussian distribution is processed to obtain a processed mean, and the variance of the Gaussian distribution and the processed mean serve as the processed probability estimation result of the first preset area. In an example, processing the variance of the Gaussian distribution to obtain the processed variance includes: setting the variance of the Gaussian distribution to 0 as the processed variance. In an example, processing the variance of the Gaussian distribution to obtain the processed variance includes:
processing the variance of the Gaussian distribution according to the scaling factor of the first preset area to obtain the processed variance;
where the scaling factor of the first preset area is the same as the scaling factors of the other preset areas; or,
the scaling factor of the first preset area is different from the scaling factors of the other preset areas.
In an example, when the probability estimation result of the first preset area includes the location parameter and the scale parameter of a Laplace distribution, the scale parameter of the Laplace distribution is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed scale parameter and the location parameter of the Laplace distribution.
In an example, when the probability estimation result of the first preset area includes the location parameter and the scale parameter of a Laplace distribution, the location parameter of the Laplace distribution is processed according to the scaling factor of the first preset area, and the processed probability estimation result of the first preset area includes the processed location parameter and the scale parameter of the Laplace distribution.
In an example, if the probability estimation result is obtained based on a Laplace distribution, the probability estimation result includes the Laplace distribution, or the scale parameter and/or the location parameter of the Laplace distribution.
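The preprocessing options described above can be sketched as follows. This is a minimal illustration, not the method of this application: the text only states that a parameter is processed "according to the scaling factor", so treating that processing as multiplication is an assumption, and the class and function names (GaussianEstimate, LaplaceEstimate, zero_variance, scale_variance, scale_laplace_scale) are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GaussianEstimate:
    """Probability estimation result expressed as Gaussian parameters."""
    mean: float
    variance: float


@dataclass(frozen=True)
class LaplaceEstimate:
    """Probability estimation result expressed as Laplace parameters."""
    location: float
    scale: float


def zero_variance(est: GaussianEstimate) -> GaussianEstimate:
    # Set the variance to 0 and keep the mean unchanged.
    return replace(est, variance=0.0)


def scale_variance(est: GaussianEstimate, factor: float) -> GaussianEstimate:
    # ASSUMPTION: "processing according to the scaling factor" is modeled
    # as multiplying the variance by the factor.
    if factor < 0:
        raise ValueError("scaling factor must be non-negative")
    return replace(est, variance=est.variance * factor)


def scale_laplace_scale(est: LaplaceEstimate, factor: float) -> LaplaceEstimate:
    # Same assumption as above, applied to the Laplace scale parameter;
    # the location parameter is kept as is.
    if factor < 0:
        raise ValueError("scaling factor must be non-negative")
    return replace(est, scale=est.scale * factor)
```

Under this assumption, a factor below 1 narrows the distribution and a factor of 0 has the same effect as setting the variance to 0.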
Entropy encoding unit 703
The entropy encoding unit 703 writes the first coefficient, the second coefficient, the first probability estimation result, and the second probability estimation result into the compressed code stream.
In an example, in video compression, the first probability estimation result and the second probability estimation result may be carried in a sequence header, a picture header, a slice, or SEI and transmitted to the decoder 30.
In an example, after the probability estimation result of the first preset area is obtained, the first flag enable_flag of the first preset area is set to a first value (for example, 1 or true) to indicate that the decoder uses the same probability distribution, namely the probability estimation result of the first preset area, when sampling the estimated coefficients in the first preset area. The probability estimation result of the first preset area is saved into a probability estimation result set, and the index of that result in the set and the size information of the first preset area are recorded. The entropy encoding unit 703 writes all the coefficients in the first preset area, the probability estimation result set, and the enable_flag, index, and size information of the first preset area into the compressed code stream.
It should be noted that multiple probability estimation results can be obtained for multiple different preset areas, and these results form a probability estimation result set; the position of a preset area's probability estimation result in the set is the index of that preset area.
In an example, the probability estimation result set may be transmitted to the decoder 30 through an APS.
In an example, after the probability estimation result of the first preset area is obtained, the enable_flag of the first preset area is set to the first value (for example, 1 or true) to indicate that the decoder uses the same probability distribution, namely the probability estimation result of the first preset area, when sampling the estimated coefficients in the first preset area; the entropy encoding unit 703 writes all the coefficients in the first preset area, the probability estimation result of the first preset area, the enable_flag, and the size information of the first preset area into the compressed code stream.
In an example, if each coefficient in the first preset area uses its own probability estimation result during sampling, the enable_flag of the first preset area is set to a second value (for example, 0 or false), and the entropy encoding unit 703 writes all the coefficients in the first preset area, the respective probability estimation results of those coefficients, and the enable_flag of the first preset area into the compressed code stream. Optionally, the entropy encoding unit 703 also writes the size information of the first preset area into the compressed code stream.
It should be noted here that writing the above data into the compressed code stream specifically means that the entropy encoding unit 703 entropy-encodes the data to obtain the compressed code stream. Optionally, an entropy coding method such as Huffman coding, CABAC, or an entropy coding method in H.264/H.265/H.266 may be used.
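The signaling choices above can be summarized in a short sketch that collects the syntax elements for one preset area before entropy coding. It is illustrative only: the function name build_region_syntax, the dictionary keys, and the representation of a probability estimation result as an opaque value are hypothetical, and the actual entropy coding of these elements is omitted.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_region_syntax(
    coefficients: List[int],
    size: Tuple[int, int],
    estimate_set: List[Any],
    shared_estimate: Optional[Any] = None,
    per_coefficient_estimates: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Collect the elements to be entropy-coded for one preset area."""
    height, width = size
    if len(coefficients) != height * width:
        raise ValueError("coefficient count must equal H1*W1")

    if shared_estimate is not None:
        # One estimate for the whole area: store it in the set and
        # signal only its index (the set itself is sent separately).
        estimate_set.append(shared_estimate)
        return {
            "enable_flag": 1,
            "index": len(estimate_set) - 1,
            "size": size,
            "coefficients": coefficients,
        }

    if per_coefficient_estimates is None:
        raise ValueError("per-coefficient estimates are required when no shared estimate is given")
    if len(per_coefficient_estimates) != len(coefficients):
        raise ValueError("one estimate per coefficient is required")
    # Each coefficient keeps its own estimate, so no index is signaled.
    return {
        "enable_flag": 0,
        "estimates": per_coefficient_estimates,
        "size": size,
        "coefficients": coefficients,
    }
```

Signaling an index into a shared set instead of the estimate itself keeps the per-area overhead small when many areas are coded.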
Entropy decoding unit 704
The entropy decoding unit 704 decodes the compressed code stream to obtain the first probability estimation result.
In an example, the entropy decoding unit 704 further decodes the compressed code stream to obtain the second probability estimation result.
Optionally, the first probability estimation result includes a first probability distribution or parameters of a first probability distribution model, and the second probability estimation result includes a second probability distribution or parameters of a second probability distribution model.
In an example, the entropy decoding unit 704 further decodes a first flag from the compressed code stream. If the first flag is the first value, it indicates that the same probability estimation result (that is, the probability estimation result of the first preset area) is used when sampling all the estimated coefficients in the first preset area, where the first preset area is an area in the enhanced image. The entropy decoding unit 704 further decodes the probability estimation result set and the index of the first preset area from the compressed code stream, where the set includes the probability estimation results of multiple preset areas, and obtains the probability estimation result of the first preset area from the set according to the index of the first preset area;
if the first flag is the second value, it indicates that each estimated coefficient's own probability estimation result is used when sampling the estimated coefficients in the first preset area. The entropy decoding unit 704 decodes the size information H1*W1 of the first preset area from the code stream, which indicates that the entropy decoding unit 704 is to decode H1*W1 probability estimation results from the compressed code stream; the sampling unit 705 samples all the estimated coefficients in the first preset area from these H1*W1 probability estimation results. H1 and W1 are both integers greater than 1.
In an example, the entropy decoding unit 704 further decodes the first flag from the compressed code stream, which indicates that the same probability estimation result (that is, the probability estimation result of the first preset area) is used when sampling all the estimated coefficients in the first preset area, where the first preset area is an area in the enhanced image. The entropy decoding unit 704 further decodes the probability estimation result of the first preset area and H1*W1 from the code stream, and the sampling unit 705 samples H1*W1 times from the probability estimation result of the first preset area to obtain H1*W1 estimated coefficients; that is, the first preset area includes H1*W1 estimated coefficients.
The entropy decoding unit 704 is further configured to decode the compressed code stream to obtain multiple reconstruction coefficients.
It should be noted that the decoding method used by the entropy decoding unit 704 to decode the compressed code stream corresponds to the entropy coding method used by the entropy encoding unit 703.
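The decoder-side behavior for the two flag values can be sketched as follows. This is a hedged illustration rather than the decoder of this application: it assumes that every probability estimation result is a Gaussian given as a (mean, variance) pair, that the first and second flag values are 1 and 0, and the function name sample_region is hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple

Gaussian = Tuple[float, float]  # (mean, variance); an assumed representation


def sample_region(
    enable_flag: int,
    size: Tuple[int, int],
    shared: Optional[Gaussian],
    per_coefficient: Optional[Sequence[Gaussian]],
    rng: random.Random,
) -> List[float]:
    """Sample the H1*W1 estimated coefficients of one preset area."""
    height, width = size
    count = height * width

    if enable_flag == 1:
        # First value: every coefficient is drawn from the same estimate.
        if shared is None:
            raise ValueError("a shared estimate is required when enable_flag is 1")
        estimates = [shared] * count
    else:
        # Second value: H1*W1 estimates were decoded, one per coefficient.
        if per_coefficient is None or len(per_coefficient) != count:
            raise ValueError("H1*W1 per-coefficient estimates are required")
        estimates = list(per_coefficient)

    # random.gauss expects a standard deviation, hence the square root.
    return [rng.gauss(mean, variance ** 0.5) for mean, variance in estimates]
```

Passing a seeded random.Random makes a run repeatable, while a different seed yields a different set of estimated coefficients for the same decoded parameters.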
Sampling unit 705
For the specific process, refer to the description of the sampling unit 505 above; details are not repeated here.
First reconstruction unit 706
The first reconstruction unit 706 obtains a first reconstructed image according to the multiple estimated coefficients.
Specifically, if the multiple estimated coefficients are multiple pixel values, the first reconstructed image can be obtained based on the multiple pixel values.
If the multiple estimated coefficients are multiple quantized wavelet coefficients, the first reconstruction unit 706 performs inverse quantization and inverse wavelet transform on the multiple estimated coefficients to obtain the first reconstructed image; or,
if the multiple estimated coefficients are multiple wavelet coefficients, the first reconstruction unit 706 performs inverse wavelet transform on the multiple estimated coefficients to obtain the first reconstructed image; or,
if the multiple estimated coefficients are multiple quantized DCT coefficients, the first reconstruction unit 706 performs inverse quantization and inverse DCT on the multiple estimated coefficients to obtain the first reconstructed image; or,
if the multiple estimated coefficients are multiple DCT coefficients, the first reconstruction unit 706 performs inverse DCT on the multiple estimated coefficients to obtain the first reconstructed image; or,
if the multiple estimated coefficients are multiple feature coefficients, the first reconstruction unit 706 processes the feature map composed of the multiple feature coefficients to obtain the first reconstructed image; or,
if the multiple estimated coefficients are multiple quantized feature coefficients, the first reconstruction unit 706 performs inverse quantization on the multiple estimated coefficients to obtain multiple feature coefficients, and processes the feature map composed of the multiple feature coefficients to obtain the first reconstructed image.
In an example, the multiple estimated coefficients may be input into the second reconstruction unit for processing to obtain a reconstructed image, and the reconstructed image may be used as a reference image for subsequent image prediction.
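The pixel, DCT, and quantized-DCT cases listed above can be sketched for a one-dimensional signal as follows. This is an illustrative sketch under stated assumptions, not the reconstruction unit itself: it uses an orthonormal 1-D DCT-II with its inverse, a single uniform quantization step, and hypothetical names (dct, inverse_dct, dequantize, reconstruct); the wavelet and feature-map cases are not covered.

```python
import math
from typing import List, Sequence


def _scale(k: int, n: int) -> float:
    # Normalization that makes the DCT-II orthonormal.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(samples: Sequence[float]) -> List[float]:
    """Orthonormal 1-D DCT-II (forward transform, shown for the round trip)."""
    n = len(samples)
    return [
        _scale(k, n)
        * sum(samples[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def inverse_dct(coeffs: Sequence[float]) -> List[float]:
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
        for i in range(n)
    ]


def dequantize(levels: Sequence[float], step: float) -> List[float]:
    # ASSUMPTION: uniform scalar quantization with one step size.
    return [level * step for level in levels]


def reconstruct(coeffs: Sequence[float], kind: str, step: float = 1.0) -> List[float]:
    """Dispatch on the coefficient type, mirroring the cases in the text."""
    if kind == "pixels":
        return list(coeffs)
    if kind == "dct":
        return inverse_dct(coeffs)
    if kind == "quantized_dct":
        return inverse_dct(dequantize(coeffs, step))
    raise ValueError(f"unsupported coefficient type: {kind}")
```

For example, reconstruct(dct(x), "dct") returns x up to floating-point error, and the quantized path differs from x only by the quantization error.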
Second reconstruction unit 707
The second reconstruction unit 707 obtains a second reconstructed image according to the multiple reconstruction coefficients.
Specifically, if the multiple reconstruction coefficients are multiple pixel values, the second reconstructed image can be obtained based on the multiple pixel values.
If the multiple reconstruction coefficients are multiple quantized wavelet coefficients, the second reconstruction unit 707 performs inverse quantization and inverse wavelet transform on the multiple reconstruction coefficients to obtain the second reconstructed image; or,
if the multiple reconstruction coefficients are multiple wavelet coefficients, the second reconstruction unit 707 performs inverse wavelet transform on the multiple reconstruction coefficients to obtain the second reconstructed image; or,
if the multiple reconstruction coefficients are multiple quantized DCT coefficients, the second reconstruction unit 707 performs inverse quantization and inverse DCT on the multiple reconstruction coefficients to obtain the second reconstructed image; or,
if the multiple reconstruction coefficients are multiple DCT coefficients, the second reconstruction unit 707 performs inverse DCT on the multiple reconstruction coefficients to obtain the second reconstructed image; or,
if the multiple reconstruction coefficients are multiple feature coefficients, the second reconstruction unit 707 processes the feature map composed of the multiple feature coefficients to obtain the second reconstructed image; or,
if the multiple reconstruction coefficients are multiple quantized feature coefficients, the second reconstruction unit 707 performs inverse quantization on the multiple reconstruction coefficients to obtain multiple feature coefficients, and processes the feature map composed of the multiple feature coefficients to obtain the second reconstructed image.
Optionally, the second reconstruction unit 707 may be implemented in the same manner as the first reconstruction unit 706 or in a different manner, which is not limited here.
In an example, after a feature map composed of multiple feature elements is obtained, the feature map may be passed through a neural network to output the first reconstructed image or the second reconstructed image. The neural network may have any structure, such as a fully connected network, a convolutional neural network, or a recurrent neural network. A deep neural network with a multi-layer structure may be used to obtain a first reconstructed image or a second reconstructed image of better quality.
In an example, after a feature map composed of multiple feature elements is obtained, the feature map may be input into a machine-vision task module to perform the corresponding machine task, such as object classification, recognition, or segmentation.
In an example, the multiple estimated coefficients obtained by the sampling unit 705 may be input into the second reconstruction unit 707 together with the multiple reconstruction coefficients. Specifically, when the multiple estimated coefficients and the multiple reconstruction coefficients are all feature coefficients, the second reconstruction unit 707 processes the multiple estimated coefficients in the manner of the first reconstruction unit 706 to obtain a first feature map, obtains a second feature map based on the multiple reconstruction coefficients, and then processes the first feature map and the second feature map through the above neural network to obtain the second reconstructed image.
It can be seen that, because the sampling process is random, the sampling step in this application can be repeated to obtain multiple first reconstructed images. The first reconstructed images may be reconstructed images with the best subjective quality or reconstructed images with the best objective quality. A first reconstructed image can be used inside the codec loop as a reference for intra-frame or inter-frame prediction; it can also be used outside the codec loop as post-processing to optimize image quality. For example, after multiple first reconstructed images are obtained through the sampling step and the reconstruction step, the reconstructed image with the best subjective quality is placed in the decoded picture buffer (DPB) or the reference frame set and used as a reference image for intra-frame or inter-frame prediction inside the codec loop; the reconstructed image with the best objective quality is used for post-processing, in which the subjective quality of the reconstructed image after encoding and decoding is adjusted to improve the quality of the compressed and reconstructed image or video. Optionally, the second reconstructed image obtained according to the reconstruction coefficients can be used, in video compression, as a reference frame when predicting the next frame.
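Repeating the sampling step and keeping the best candidate can be sketched as follows. The text does not say how subjective or objective quality is measured, so the quality measure is passed in as a caller-supplied function; the names best_of_n, sample_fn, and score_fn are hypothetical, and the sketch only illustrates the select-among-samples idea.

```python
import random
from typing import Callable, List, Tuple

Image = List[float]  # a flattened image; an assumed representation


def best_of_n(
    sample_fn: Callable[[random.Random], Image],
    score_fn: Callable[[Image], float],
    n: int,
    seed: int = 0,
) -> Tuple[Image, List[Image]]:
    """Draw n candidate reconstructions and return the highest-scoring one."""
    if n < 1:
        raise ValueError("at least one candidate is required")
    rng = random.Random(seed)
    # Each call consumes fresh random numbers, so the candidates differ.
    candidates = [sample_fn(rng) for _ in range(n)]
    return max(candidates, key=score_fn), candidates
```

The same candidate list could be scored twice with two different score functions, once for the image placed in the reference set and once for the image used in post-processing.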
In an example, during entropy encoding, the entropy encoding unit 703 first performs probability estimation on the first coefficient to obtain a probability estimation result of the first coefficient, referred to as probability estimation result C, and then entropy-encodes the first coefficient according to probability estimation result C. During entropy decoding, the entropy decoding unit 704 first performs probability estimation on the first coefficient to obtain a probability estimation result of the first coefficient, which may also be referred to as probability estimation result C, and then performs entropy decoding according to probability estimation result C. The probability estimation result described in the above embodiments is referred to as probability estimation result D.
Optionally, the encoder entropy-encodes the first coefficient according to probability estimation result C. The decoder performs probability estimation on the first coefficient in the same manner as the encoder to obtain a probability estimation result (which can also be regarded as probability estimation result C), performs entropy decoding according to that result, and may also perform sampling according to that result; the sampling manner is consistent with the above embodiments.
Optionally, the encoder entropy-encodes the first coefficient according to probability estimation result C and transmits probability estimation result C to the decoder. The decoder performs entropy decoding according to probability estimation result C and may also perform sampling according to probability estimation result C; the sampling manner is consistent with the above embodiments.
Optionally, the encoder entropy-encodes the first coefficient according to probability estimation result D and sends probability estimation result D to the decoder. The decoder performs entropy decoding according to probability estimation result D and may also perform sampling according to probability estimation result D; the sampling manner is consistent with the above embodiments.
Optionally, the encoder entropy-encodes the first coefficient according to probability estimation result D. The decoder performs probability estimation on the first coefficient to obtain probability estimation result D, performs entropy decoding according to probability estimation result D, and may also perform sampling according to probability estimation result D; the sampling manner is consistent with the above embodiments.
FIG. 8 is a flowchart of a process 800 of an encoding method according to an embodiment of this application. Process 800 may be performed by the video encoder 20. Process 800 is described as a series of steps or operations; it should be understood that process 800 may be performed in various orders and/or concurrently and is not limited to the execution order shown in FIG. 8.
As shown in FIG. 8, the encoding method includes:
S801. Acquire a first image, where the first image is an image to be encoded or a decoded image.
S802. Perform probability estimation according to first context information to obtain a first probability estimation result, where the first context information is obtained from the first image.
The first context information may be pixels in the first image, or coefficients in a first transformed image obtained by transforming the first image.
In a possible design, the method of this embodiment further includes:
acquiring a second image, where the second image is an image to be encoded or a decoded image, and the second image is different from the first image; performing probability estimation according to the first context information to obtain the first probability estimation result includes:
performing probability estimation according to the first context information and second context information to obtain the first probability estimation result, where the second context information is obtained from the second image.
In a possible design, performing probability estimation according to the first context information to obtain the first probability estimation result includes:
performing probability estimation according to context information of first data to obtain a probability estimation result of the first data;
performing probability estimation according to context information of second data to obtain a probability estimation result of the second data; where the first data and the second data are obtained according to the first image, and the first context information includes the context information of the first data and the context information of the second data.
In a possible design, the first probability estimation result includes a probability estimation result of a first preset area, the first preset area includes the first data and the second data, and the first preset area is located in the first image or in an image obtained by transforming the first image; performing probability estimation according to the first context information to obtain the first probability estimation result includes:
performing probability estimation according to the context information of the first data to obtain the probability estimation result of the first data; performing probability estimation according to the context information of the second data to obtain the probability estimation result of the second data, where the first context information includes the context information of the first data and the context information of the second data; and selecting the probability estimation result of the first preset area from the probability estimation result of the first data and the probability estimation result of the second data, where the first probability estimation result includes the probability estimation result of the first preset area.
In a possible design, the first probability estimation result includes a probability estimation result of a second preset area, the second preset area is located in the first image or in an image obtained by transforming the first image, and the first context information includes context information of the second preset area; performing probability estimation according to the first context information to obtain the first probability estimation result includes: performing probability estimation according to the context information of the second preset area to obtain the probability estimation result of the second preset area, where the first probability estimation result includes the probability estimation result of the second preset area.
S803. Write the first probability estimation result into the compressed code stream.
In a possible design, the encoding method further includes: setting the first flag of the first preset area to a first value to indicate that the probability estimation result of the first preset area is used for all the estimated coefficients sampled in the first preset area; and saving the probability estimation result of the first preset area into a probability estimation result set and recording the index of that result in the set. Writing the first probability estimation result into the compressed code stream includes: writing the probability estimation result set, the index, the size information of the first preset area, and the first flag into the compressed code stream.
In a possible design, the encoding method further includes:
setting the first flag of the first preset area to a first value to indicate that the probability estimation result of the first preset area is used for all the estimated coefficients sampled in the first preset area; preprocessing the probability estimation result of the first preset area according to the scaling factor of the first preset area to obtain a processed probability estimation result; and saving the processed probability estimation result into a probability estimation result set and recording the index of the processed result in the set. Writing the first probability estimation result into the compressed code stream includes: writing the probability estimation result set, the index, the size information of the first preset area, and the first flag into the compressed code stream.
In a possible design, the encoding method further includes:
setting the first flag of the first preset area to a first value to indicate that the probability estimation result of the first preset area is used for all the estimated coefficients sampled in the first preset area. Writing the first probability estimation result into the compressed code stream includes: writing the probability estimation result of the first preset area, the size information of the first preset area, and the first flag into the code stream.
在一个可能的设计中,本编码方法还包括:In a possible design, this encoding method also includes:
对第一数据的概率估计结果进行预处理,得到处理后的概率估计结果。The probability estimation result of the first data is preprocessed to obtain the probability estimation result after processing.
In a possible design, the probability estimation result of the first data includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the first data to obtain the processed probability estimation result includes:
Setting the variance of the Gaussian distribution to 0 as the processed variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance.
In a possible design, the probability estimation result of the first data includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the first data to obtain the processed probability estimation result includes:
Preprocessing the variance of the Gaussian distribution according to the scaling factor of the first data to obtain a processed variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance;
The scaling factor of the first data is the same as the scaling factor of the second data; or the scaling factor of the first data is different from the scaling factor of the second data; or,
If the first data and the second data belong to the same image block in the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different image blocks, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the image block to which the first data belongs;
or,
If the first data and the second data belong to the same subband among the multiple subbands obtained by performing a wavelet transform on the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different subbands, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the subband to which the first data belongs;
or,
If the first data and the second data belong to the same frequency band or transform block among the multiple frequency bands obtained by performing a DCT on the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different frequency bands or transform blocks, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the frequency band or transform block to which the first data belongs;
or,
If the first data and the second data belong to the same channel of the three-dimensional feature map obtained by performing feature extraction on the first image, the scaling factor of the first data is the same as the scaling factor of the second data; or if the first data and the second data belong to different channels, the scaling factor of the first data is different from the scaling factor of the second data; or the scaling factor of the first data is determined according to the texture complexity of the channel to which the first data belongs.
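The two variance-preprocessing designs above can be illustrated with a minimal Python sketch. The function name `preprocess_variance` is hypothetical, and applying the scaling factor by multiplication is an assumption; the text only states that the variance is processed "according to" the scaling factor.

```python
def preprocess_variance(mean, variance, scale=None):
    """Return (mean, processed_variance) for a Gaussian probability estimate.

    scale=None models the design that sets the variance to 0.
    Otherwise the variance is multiplied by the scaling factor; the
    multiplication is an assumption made for this sketch only.
    """
    if variance < 0:
        raise ValueError("variance must be non-negative")
    if scale is None:
        return mean, 0.0
    if scale < 0:
        raise ValueError("scale must be non-negative")
    return mean, variance * scale


# The mean is left untouched in both designs.
zeroed = preprocess_variance(1.5, 4.0)
scaled = preprocess_variance(1.5, 4.0, scale=0.25)
```

In either case the processed result still consists of a mean and a variance, so it can be stored or written in the same way as an unprocessed result.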
In a possible design, the encoding method further includes:
Preprocessing the probability estimation result of the second preset area to obtain a processed probability estimation result.
In a possible design, the probability estimation result of the first data includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the first data to obtain the processed probability estimation result includes:
Setting the variance of the Gaussian distribution to 0 as a first variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the first variance; or processing the variance of the Gaussian distribution according to the scaling factor of the second preset area to obtain a second variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the second variance, and the scaling factor of the first preset area and the scaling factor of the second preset area are the same or different.
In a possible design, the first context information includes some or all of the pixel values in the first image.
In a possible design, the encoding method further includes:
Transforming the first image to obtain a first transformed image, where: if the transform is a wavelet transform, the first context information includes some or all of the coefficients in the first transformed image, the coefficients being wavelet coefficients or quantized wavelet coefficients; or, if the transform is a DCT, the first context information includes some or all of the coefficients in the first transformed image, the coefficients being DCT coefficients or quantized DCT coefficients; or, if the transform is a feature transform, the first context information includes some or all of the coefficients in the first transformed image, the coefficients being feature coefficients or quantized feature coefficients.
In a possible design, performing probability estimation according to the first context information to obtain the first probability estimation result includes:
Inputting the first context information into a first probability estimation network for processing to obtain the parameters of a first probability distribution model, where the probability estimation result includes the parameters of the first probability distribution model;
or,
Inputting the first context information into a second probability estimation network for processing to obtain a target probability distribution, where the probability estimation result includes the parameters of the target probability distribution; the first probability estimation network and the second probability estimation network are implemented by neural networks.
It should be noted that, for the specific implementation of the embodiment shown in FIG. 8, reference may be made to the descriptions of the encoding unit 501, the forward transform unit 502, and the probability estimation unit 503 in FIG. 5; details are not repeated here.
FIG. 9 is a flowchart of a process 900 of an encoding method according to an embodiment of this application. Process 900 may be performed by the video encoder 20. Process 900 is described as a series of steps or operations; it should be understood that process 900 may be performed in various orders and/or concurrently and is not limited to the execution order shown in FIG. 9.
As shown in FIG. 9, the encoding method includes:
S901. Obtain multiple coefficients according to an image to be encoded, where the multiple coefficients include a first coefficient.
S902. Obtain a first probability estimation result according to the context information of the first coefficient.
S903. Write the first coefficient and the first probability estimation result into the compressed code stream.
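Steps S901 to S903 can be traced with a toy Python sketch. Everything below is illustrative: `estimate_from_context` is a hypothetical stand-in for the probability estimation described in the text (it simply takes the mean and variance of the context), the context is assumed to be the preceding coefficients, and the "compressed code stream" is modeled as a plain list with no entropy coding.

```python
from statistics import fmean, pvariance


def estimate_from_context(context):
    """Hypothetical estimator: Gaussian (mean, variance) of the context."""
    if not context:
        return 0.0, 1.0  # arbitrary default when no context is available
    return fmean(context), pvariance(context)


def encode(coefficients, context_size=2):
    """Toy walk through S901-S903 for every coefficient."""
    stream = []  # stands in for the compressed code stream
    for i, coeff in enumerate(coefficients):
        # S902: context = up to `context_size` preceding coefficients (assumption)
        context = coefficients[max(0, i - context_size):i]
        mean, var = estimate_from_context(context)
        # S903: write the coefficient and its probability estimation result
        stream.append({"coeff": coeff, "mean": mean, "var": var})
    return stream


# S901: here the "coefficients" are simply given pixel values.
stream = encode([2, 4, 6])
```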
In a possible design, the multiple coefficients further include a second coefficient, and the encoding method further includes:
Obtaining a second probability estimation result according to the context information of the second coefficient; writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the first probability estimation result, the second coefficient, and the second probability estimation result into the compressed code stream.
In a possible design, the multiple coefficients further include a second coefficient, the first coefficient and the second coefficient belong to the same preset area, and the preset area is located in the image to be encoded or in an image obtained by transforming the image to be encoded; obtaining the first probability estimation result according to the context information of the first coefficient includes:
Performing probability estimation according to the context information of the first coefficient to obtain a third probability estimation result; performing probability estimation according to the context information of the second coefficient to obtain a second probability estimation result; and determining the first probability estimation result from the third probability estimation result and the second probability estimation result;
Writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream.
In a possible design, the multiple coefficients further include a second coefficient, the first coefficient and the second coefficient belong to the same preset area, and the preset area is located in the image to be encoded or in an image obtained by transforming the image to be encoded; obtaining the first probability distribution according to the context information of the first coefficient includes:
Performing probability estimation according to the context information of the preset area to obtain the first probability estimation result, where the context information of the preset area includes the context information of the first coefficient; writing the first coefficient and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream.
In a possible design, the encoding method further includes:
Setting the value of the first identifier of the preset area to a first value, which indicates that the first probability estimation result is used for sampling every estimated coefficient in the preset area; saving the first probability estimation result into a probability estimation result set, and recording the index of the first probability estimation result in the probability estimation result set; writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, the probability estimation result set, the index, the size information of the preset area, and the first identifier into the compressed code stream.
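The bookkeeping described above (save a result into the set, record its index, then write the set, index, size information, and identifier) might look as follows in Python. The names `add_to_result_set` and `pack_region`, the dictionary layout, and the reuse of an identical entry already in the set are assumptions made for illustration; the text does not specify a data layout or whether duplicates are merged.

```python
def add_to_result_set(result_set, result):
    """Save `result` in the set (a list here) and return its index.

    Reusing an identical existing entry is an assumption of this sketch.
    """
    if result in result_set:
        return result_set.index(result)
    result_set.append(result)
    return len(result_set) - 1


def pack_region(result_set, region_result, region_size, coeffs, identifier=1):
    """Collect what is written for one preset area besides the shared set."""
    index = add_to_result_set(result_set, region_result)
    return {
        "identifier": identifier,  # first value: one shared result for the area
        "index": index,            # position of the result in the shared set
        "size": region_size,       # size information of the preset area
        "coeffs": list(coeffs),
    }


result_set = []
area_a = pack_region(result_set, (0.0, 1.0), (8, 8), [1, 2])
area_b = pack_region(result_set, (2.0, 0.5), (8, 8), [3])
area_c = pack_region(result_set, (0.0, 1.0), (4, 4), [5])
```

With this layout, several preset areas that share the same probability estimation result only carry a small index each, while the result itself is stored once in the set.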
In a possible design, the encoding method further includes:
Setting the value of the first identifier of the preset area to the first value, which indicates that the first probability estimation result is used for sampling every estimated coefficient in the preset area; writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream includes: writing the first coefficient, the second coefficient, the first probability estimation result, the size information of the preset area, and the first identifier into the compressed code stream.
In a possible design, the first coefficient and the second coefficient belong to the same preset area, and the encoding method further includes:
Setting the value of the first identifier of the preset area to a second value, which indicates that each estimated coefficient in the preset area is sampled using its own probability estimation result; writing the first coefficient, the first probability estimation result, the second coefficient, and the second probability estimation result into the compressed code stream includes: writing the first coefficient, the first probability estimation result, the second coefficient, the second probability estimation result, and the first identifier of the preset area into the compressed code stream.
In a possible design, the encoding method further includes:
Preprocessing the probability estimation result of the first coefficient to obtain a processed probability estimation result.
In a possible design, the probability estimation result of the first coefficient includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the first coefficient to obtain the processed probability estimation result includes:
Setting the variance of the Gaussian distribution to 0 as the processed variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance.
In a possible design, the probability estimation result of the first coefficient includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the first coefficient to obtain the processed probability estimation result includes:
Preprocessing the variance of the Gaussian distribution according to the scaling factor of the first coefficient to obtain a processed variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the processed variance;
The scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or,
If the first coefficient and the second coefficient belong to the same image block in the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different image blocks, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs; or,
If the first coefficient and the second coefficient belong to the same subband among the multiple subbands obtained by performing a wavelet transform on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different subbands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the subband to which the first coefficient belongs;
or,
If the first coefficient and the second coefficient belong to the same frequency band among the multiple frequency bands obtained by performing a DCT on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different frequency bands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the frequency band to which the first coefficient belongs;
or,
If the first coefficient and the second coefficient belong to the same channel of the three-dimensional feature map obtained by performing feature extraction on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different channels, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to the texture complexity of the channel to which the first coefficient belongs.
In a possible design, the encoding method further includes:
Preprocessing the probability estimation result of the preset area to obtain a processed probability estimation result.
In a possible design, the probability estimation result of the preset area includes the mean and variance of a Gaussian distribution, and preprocessing the probability estimation result of the preset area to obtain the processed probability estimation result includes:
Setting the variance of the Gaussian distribution to 0 as a first variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the first variance; or processing the variance of the Gaussian distribution according to the scaling factor of the preset area to obtain a second variance, where the processed probability estimation result includes the mean of the Gaussian distribution and the second variance.
In a possible design, if the multiple coefficients are multiple pixel values in the image to be encoded, the first context information includes some or all of the pixel values in the image to be encoded; or,
Obtaining the multiple coefficients according to the image to be encoded includes:
If a wavelet transform is performed on the image to be encoded to obtain the multiple coefficients, the multiple coefficients are multiple wavelet coefficients, and the first context information includes some or all of the multiple wavelet coefficients; or, if a wavelet transform and quantization are performed on the image to be encoded to obtain the multiple coefficients, the multiple coefficients are multiple quantized wavelet coefficients, and the first context information includes some or all of the multiple quantized wavelet coefficients; or, if a DCT is performed on the image to be encoded to obtain the multiple coefficients, the multiple coefficients are multiple DCT coefficients, and the first context information includes some or all of the multiple DCT coefficients; or, if a DCT and quantization are performed on the image to be encoded to obtain the multiple coefficients, the multiple coefficients are multiple quantized DCT coefficients, and the first context information includes some or all of the multiple quantized DCT coefficients; or, if feature extraction is performed on the image to be encoded to obtain the multiple coefficients, the multiple coefficients are multiple feature coefficients, and the first context information includes some or all of the multiple feature coefficients; or, if feature extraction and quantization are performed on the image to be encoded to obtain the multiple coefficients, the multiple coefficients are multiple quantized feature coefficients, and the first context information includes some or all of the multiple quantized feature coefficients.
In a possible design, obtaining the first probability estimation result according to the context information of the first coefficient includes:
Obtaining a second probability distribution model, and inputting the first context information into a third probability estimation network for processing to obtain the parameters of the second probability distribution model; and obtaining the first probability estimation result according to the second probability distribution model and the parameters of the second probability distribution model;
or,
Inputting the first context information into a fourth probability estimation network for processing to obtain the probability estimation result; the third probability estimation network and the fourth probability estimation network are implemented by neural networks.
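As one illustration of how a probability estimation result could be derived from a distribution model and its parameters, the Python sketch below evaluates a Gaussian model on integer (quantized) coefficient values. The choice of a Gaussian, the unit-width bin around each integer, and the function names are assumptions common in learned compression, not details taken from this text; the network that would produce the mean and variance is not modeled.

```python
import math


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of N(mean, std**2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def quantized_probability(value, mean, variance):
    """Probability mass a Gaussian assigns to the bin [value-0.5, value+0.5)."""
    std = math.sqrt(variance)
    if std == 0.0:
        # Degenerate case: all mass sits at the mean.
        return 1.0 if value - 0.5 <= mean < value + 0.5 else 0.0
    upper = gaussian_cdf(value + 0.5, mean, std)
    lower = gaussian_cdf(value - 0.5, mean, std)
    return upper - lower


# Probability of the quantized value 0 under a standard Gaussian.
p0 = quantized_probability(0, 0.0, 1.0)
```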
It should be noted that, for the specific implementation of the embodiment shown in FIG. 9, reference may be made to the descriptions of the coefficient obtaining unit 701, the probability estimation unit 702, and the entropy encoding unit 703 in FIG. 7; details are not repeated here.
FIG. 10 is a flowchart of a process 1000 of a decoding method according to an embodiment of this application. Process 1000 may be performed by the video decoder 30. Process 1000 is described as a series of steps or operations; it should be understood that process 1000 may be performed in various orders and/or concurrently and is not limited to the execution order shown in FIG. 10.
As shown in FIG. 10, the decoding method includes:
S1001. Decode the compressed code stream to obtain a first probability estimation result.
S1002. Perform sampling according to the first probability estimation result to obtain a first estimated coefficient.
S1003. Obtain a first reconstructed image according to the first estimated coefficient.
In a possible design, the decoding method further includes:
Decoding the compressed code stream to obtain a second probability estimation result; performing sampling according to the second probability estimation result to obtain a second estimated coefficient; obtaining the first reconstructed image according to the first estimated coefficient includes: obtaining the first reconstructed image according to the first estimated coefficient and the second estimated coefficient.
In a possible design, decoding the compressed code stream to obtain the first probability estimation result includes:
Decoding a first identifier from the compressed code stream; if the value of the first identifier is a first value, decoding the compressed code stream to obtain the first probability estimation result includes:
Decoding the probability estimation result set and the index of a preset area from the compressed code stream, where the preset area includes the first estimated coefficient and is an area in the first reconstructed image; and determining the probability estimation result of the preset area from the probability estimation result set according to the index, the first probability estimation result being the probability estimation result of the preset area; the first identifier having the first value indicates that the probability estimation result of the preset area is used for sampling all estimated coefficients in the preset area.
In a possible design, the decoding method further includes:
Decoding a first identifier from the compressed code stream; if the value of the first identifier is the first value, decoding the compressed code stream to obtain the first probability estimation result includes: decoding the probability estimation result of a preset area and the size information of the preset area from the compressed code stream, where the preset area includes the first estimated coefficient and is an area in the first reconstructed image, and the probability estimation result of the preset area is the first probability estimation result; the first identifier having the first value indicates that the probability estimation result of the preset area is used for sampling all coefficients to be estimated in the preset area.
In a possible design, the first estimated coefficient and the second estimated coefficient belong to the same preset area, the preset area is an area in the first reconstructed image, and the decoding method further includes:
Decoding a first identifier from the compressed code stream; if the value of the first identifier is a second value, the second value indicates that each coefficient to be estimated in the preset area is sampled using its own probability estimation result.
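The role of the first identifier on the decoder side can be sketched as a small dispatch in Python. The concrete identifier values (1 for the first value, 0 for the second value) and the function name are hypothetical; the text only distinguishes a "first value" (one shared result for the whole preset area) from a "second value" (each coefficient uses its own result).

```python
SHARED = 1           # hypothetical encoding of the "first value"
PER_COEFFICIENT = 0  # hypothetical encoding of the "second value"


def results_for_area(identifier, num_coeffs, shared_result=None, own_results=None):
    """Return one probability estimation result per coefficient of a preset area."""
    if identifier == SHARED:
        if shared_result is None:
            raise ValueError("shared result required when identifier is SHARED")
        # Every coefficient in the area is sampled from the same result.
        return [shared_result] * num_coeffs
    if identifier == PER_COEFFICIENT:
        if own_results is None or len(own_results) != num_coeffs:
            raise ValueError("one result per coefficient is required")
        # Each coefficient is sampled from its own result.
        return list(own_results)
    raise ValueError("unknown identifier value")


shared = results_for_area(SHARED, 3, shared_result=(0.0, 1.0))
separate = results_for_area(PER_COEFFICIENT, 2, own_results=[(0.0, 1.0), (1.0, 2.0)])
```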
In a possible design, the first probability estimation result includes the mean and variance of a Gaussian distribution, and performing sampling according to the first probability estimation result to obtain the first estimated coefficient includes:
Obtaining a first random number; determining a first reference value according to the first random number, where the first reference value follows a Gaussian distribution; and determining the first estimated coefficient according to the first reference value and the mean and variance of the first probability estimation result.
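The sampling procedure above can be sketched in Python under two stated assumptions: the first random number is uniform on (0, 1) and is mapped to a standard Gaussian reference value through the inverse CDF, and the estimated coefficient is formed as mean + sqrt(variance) × reference value. The text does not fix either choice; `sample_coefficient` is a hypothetical name.

```python
import math
import random
from statistics import NormalDist


def sample_coefficient(mean, variance, rng):
    """Sample one estimated coefficient from N(mean, variance)."""
    # First random number, assumed uniform; clamped because inv_cdf needs 0 < p < 1.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    # First reference value: follows a standard Gaussian distribution.
    reference = NormalDist().inv_cdf(u)
    # Combine reference value with mean and variance (assumed affine form).
    return mean + math.sqrt(variance) * reference


rng = random.Random(0)  # seeded only to make the sketch reproducible
coeff = sample_coefficient(5.0, 4.0, rng)
```

Under this form, a variance of 0 makes the sampled coefficient equal to the mean regardless of the random number, which matches the effect of the zero-variance preprocessing design.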
In a possible design, the decoding method further includes:
Preprocessing the variance of the first probability estimation result to obtain a processed variance;
Determining the first estimated coefficient according to the first reference value and the mean and variance of the first probability estimation result includes:
Determining the first estimated coefficient according to the first reference value, the mean of the first probability estimation result, and the processed variance.
In a possible design, preprocessing the variance of the first probability estimation result to obtain the processed variance includes:
Setting the variance of the first probability distribution to 0 as the processed variance.
In a possible design, the first estimated coefficient is a quantized wavelet coefficient, a wavelet coefficient, a quantized DCT coefficient, a DCT coefficient, a feature coefficient, or a quantized feature coefficient, and preprocessing the variance of the first probability distribution to obtain the processed variance includes:
Preprocessing the variance of the first probability distribution according to the scaling factor of the first estimated coefficient to obtain the processed variance, where:
The scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or
When the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients or wavelet coefficients: if the first estimated coefficient and the second estimated coefficient belong to the same subband, their scaling factors are the same; or if they belong to different subbands, their scaling factors are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs;
or,
When the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients or DCT coefficients: if the first estimated coefficient and the second estimated coefficient belong to the same frequency band, their scaling factors are the same; or if they belong to different frequency bands, their scaling factors are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the frequency band to which the first estimated coefficient belongs;
or,
When the first estimated coefficient and the second estimated coefficient are feature coefficients or quantized feature coefficients: if the first estimated coefficient and the second estimated coefficient belong to the same channel, their scaling factors are the same; or if they belong to different channels, their scaling factors are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the channel to which the first estimated coefficient belongs.
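As an illustration of the subband case above, the following sketch assigns one scaling factor per wavelet subband and applies it to a variance. The text does not state how the scaling factor is applied, so multiplication is an assumption here; the table `SUBBAND_SCALING`, its values, and the function `preprocess_variance` are likewise hypothetical.

```python
# Hypothetical per-subband scaling factors; the values are illustrative only.
SUBBAND_SCALING = {"LL": 0.25, "LH": 0.5, "HL": 0.75, "HH": 1.0}


def preprocess_variance(variance, subband, scaling=SUBBAND_SCALING):
    """Return the processed variance for a coefficient in the given subband.

    Assumption: "preprocessing according to the scaling factor" is modeled
    as multiplying the variance by the factor of the coefficient's subband.
    """
    return variance * scaling[subband]


# Two coefficients in the same subband share a scaling factor ...
same_a = preprocess_variance(2.0, "LH")
same_b = preprocess_variance(2.0, "LH")
# ... while coefficients in different subbands use different factors.
other = preprocess_variance(2.0, "HH")
```

The same lookup structure would apply to DCT frequency bands or feature-map channels by changing the key used to index the table.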
In a possible design, the first estimated coefficient and the second estimated coefficient are pixel values, and preprocessing the variance of the first probability estimation result to obtain the processed variance includes:
Preprocessing the variance of the first probability estimation result according to the scaling factor of the first estimated coefficient to obtain the processed variance, where:
The scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are the same; or the scaling factor of the first estimated coefficient and the scaling factor of the second estimated coefficient are different; or the scaling factor of the first estimated coefficient is determined according to the texture complexity of the image block to which the first estimated coefficient belongs.
In a possible design, obtaining the first reconstructed image according to the first estimated coefficient and the second estimated coefficient includes:
If the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients, performing inverse quantization and inverse wavelet transform on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are wavelet coefficients, performing inverse wavelet transform on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients, performing inverse quantization and inverse DCT on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or, if the first estimated coefficient and the second estimated coefficient are DCT coefficients, performing inverse DCT on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image.
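The DCT branches above can be sketched as follows. This is a minimal illustration, not the implementation of the application: it assumes an orthonormal 2-D DCT-II applied separably to a square block and a uniform quantizer with a hypothetical step size `step`; all function names are illustrative.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def transform_2d(block, fn):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [fn(list(r)) for r in block]
    cols = [fn(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def dequantize(coeffs, step):
    """Uniform inverse quantization (an assumption): multiply by the step."""
    return [[v * step for v in row] for row in coeffs]


def reconstruct_from_dct(coeffs, quantized, step=1.0):
    """Quantized DCT coefficients are dequantized first, then inverse DCT."""
    if quantized:
        coeffs = dequantize(coeffs, step)
    return transform_2d(coeffs, idct_1d)
```

For example, `reconstruct_from_dct(transform_2d(block, dct_1d), quantized=False)` recovers `block` up to floating-point error, while the quantized path additionally scales the coefficients by `step` before the inverse transform.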
In a possible design, the decoding method further includes:
Decoding the compressed code stream to obtain a plurality of reconstruction coefficients; and obtaining a second reconstructed image according to the plurality of reconstruction coefficients.
In a possible design, obtaining the second reconstructed image according to the plurality of reconstruction coefficients includes:
If the plurality of reconstruction coefficients are quantized wavelet coefficients, performing inverse quantization and inverse wavelet transform on the plurality of reconstruction coefficients to obtain the second reconstructed image; or, if the plurality of reconstruction coefficients are wavelet coefficients, performing inverse wavelet transform on the plurality of reconstruction coefficients to obtain the second reconstructed image; or, if the plurality of reconstruction coefficients are quantized DCT coefficients, performing inverse quantization and inverse DCT on the plurality of reconstruction coefficients to obtain the second reconstructed image; or, if the plurality of reconstruction coefficients are DCT coefficients, performing inverse DCT on the plurality of reconstruction coefficients to obtain the second reconstructed image.
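The wavelet branches can be sketched in the same way. The text does not name a specific wavelet, so the one-level orthonormal Haar transform below is purely an assumption chosen for brevity, as are the uniform quantizer, the step size `step`, and all function names.

```python
import math

S = math.sqrt(2.0)


def haar_forward(x):
    """One-level orthonormal Haar transform: averages first, then details."""
    half = len(x) // 2
    approx = [(x[2 * i] + x[2 * i + 1]) / S for i in range(half)]
    detail = [(x[2 * i] - x[2 * i + 1]) / S for i in range(half)]
    return approx + detail


def haar_inverse(c):
    """Inverse of haar_forward."""
    half = len(c) // 2
    out = []
    for a, d in zip(c[:half], c[half:]):
        out += [(a + d) / S, (a - d) / S]
    return out


def apply_2d(image, fn):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [fn(list(r)) for r in image]
    cols = [fn(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def reconstruct_from_wavelet(coeffs, quantized, step=1.0):
    """Quantized wavelet coefficients are dequantized, then inverse transformed."""
    if quantized:
        # Uniform inverse quantization is an assumption of this sketch.
        coeffs = [[v * step for v in row] for row in coeffs]
    return apply_2d(coeffs, haar_inverse)
```

Because the row and column transforms are separable and linear, applying the inverse to rows and then columns undoes a forward transform applied in the same order.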
It should be noted that, for the specific implementation of the embodiment shown in FIG. 10, reference may be made to the descriptions of the decoding unit 504, the sampling unit 505, and the inverse transform unit 506 in the embodiment shown in FIG. 5, and of the entropy decoding unit 704, the sampling unit 705, the first reconstruction unit 706, and the second reconstruction unit 707 in the embodiment shown in FIG. 7. Details are not repeated here.
Those skilled in the art will appreciate that the functions described in conjunction with the various illustrative logical blocks, modules, and algorithm steps disclosed herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions described by the various illustrative logical blocks, modules, and steps may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any medium that facilitates transfer of a computer program from one place to another (e.g., according to a communication protocol). In this manner, computer-readable media may generally correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in this application. A computer program product may include a computer-readable medium.
By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuits. Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or to any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
The techniques of this application may be implemented in a wide variety of apparatuses or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this application to emphasize functional aspects of apparatuses configured to perform the disclosed techniques, but they do not necessarily need to be realized by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit together with suitable software and/or firmware, or may be provided by interoperating hardware units (including one or more processors as described above).
The foregoing is merely exemplary specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (50)

  1. An image processing method implemented by an encoding device, comprising:
    acquiring a first image, the first image being an image to be encoded or a decoded image;
    performing probability estimation according to first context information to obtain a first probability estimation result, wherein the first context information is obtained from the first image; and
    writing the first probability estimation result into a compressed code stream.
  2. The method according to claim 1, wherein the method further comprises:
    acquiring a second image, the second image being an image to be encoded or a decoded image, and the second image being different from the first image;
    wherein the performing probability estimation according to first context information to obtain the first probability estimation result comprises:
    performing probability estimation according to the first context information and second context information to obtain the first probability estimation result, wherein the second context information is obtained from the second image.
  3. The method according to claim 1 or 2, wherein the performing probability estimation according to first context information to obtain the first probability estimation result comprises:
    performing probability estimation according to context information of first data to obtain a probability estimation result of the first data;
    performing probability estimation according to context information of second data to obtain a probability estimation result of the second data;
    wherein the first data and the second data are obtained according to the first image; and
    the first context information comprises the context information of the first data and the context information of the second data.
  4. The method according to claim 1 or 2, wherein the first probability estimation result comprises a probability estimation result of a first preset area, the first preset area comprises first data and second data, the first preset area is located in the first image or in an image obtained by transforming the first image, and the performing probability estimation according to first context information to obtain the first probability estimation result comprises:
    performing probability estimation according to context information of the first data to obtain a probability estimation result of the first data;
    performing probability estimation according to context information of the second data to obtain a probability estimation result of the second data, wherein the first context information comprises the context information of the first data and the context information of the second data; and
    selecting the probability estimation result of the first preset area from the probability estimation result of the first data and the probability estimation result of the second data, wherein the first probability estimation result comprises the probability estimation result of the first preset area.
  5. The method according to claim 1 or 2, wherein the first probability estimation result comprises a probability estimation result of a second preset area, the second preset area is located in the first image or in an image obtained by transforming the first image, the first context information comprises context information of the second preset area, and the performing probability estimation according to first context information to obtain the first probability estimation result comprises:
    performing probability estimation according to the context information of the second preset area to obtain the probability estimation result of the second preset area, wherein the first probability estimation result comprises the probability estimation result of the second preset area.
  6. The method according to claim 4 or 5, wherein the method further comprises:
    setting a value of a first identifier of the first preset area to a first value, to indicate that the probability estimation result of the first preset area is used whenever estimated coefficients in the first preset area are obtained by sampling;
    saving the probability estimation result of the first preset area to a probability estimation result set, and recording an index of the probability estimation result of the first preset area in the probability estimation result set;
    wherein the writing the first probability estimation result into the compressed code stream comprises:
    writing the probability estimation result set, the index, size information of the first preset area, and the first identifier into the compressed code stream.
  7. The method according to claim 4 or 5, wherein the method further comprises:
    setting a value of a first identifier of the first preset area to a first value, to indicate that the probability estimation result of the first preset area is used whenever estimated coefficients in the first preset area are obtained by sampling;
    preprocessing the probability estimation result of the first preset area according to a scaling factor of the first preset area to obtain a processed probability estimation result, saving the processed probability estimation result to a probability estimation result set, and recording an index of the processed probability estimation result in the probability estimation result set;
    wherein the writing the first probability estimation result into the compressed code stream comprises:
    writing the probability estimation result set, the index, size information of the first preset area, and the first identifier into the compressed code stream.
  8. The method according to claim 4 or 5, wherein the method further comprises:
    setting a value of a first identifier of the first preset area to a first value, to indicate that the probability estimation result of the first preset area is used whenever estimated coefficients in the first preset area are obtained by sampling;
    wherein the writing the first probability estimation result into the compressed code stream comprises:
    writing the probability estimation result of the first preset area, size information of the first preset area, and the first identifier into the compressed code stream.
  9. The method according to any one of claims 3-4, 6, and 8, wherein the method further comprises:
    preprocessing the probability estimation result of the first data to obtain a processed probability estimation result.
  10. The method according to claim 9, wherein the probability estimation result of the first data comprises a mean and a variance of a Gaussian distribution, and the preprocessing the probability estimation result of the first data to obtain a processed probability estimation result comprises:
    setting the variance of the Gaussian distribution to 0 as a processed variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the processed variance.
  11. The method according to claim 9, wherein the probability estimation result of the first data comprises a mean and a variance of a Gaussian distribution, and the preprocessing the probability estimation result of the first data to obtain a processed probability estimation result comprises: preprocessing the variance of the Gaussian distribution according to a scaling factor of the first data to obtain a processed variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the processed variance; and the method further comprises: preprocessing the variance of the second probability distribution according to the scaling factor of the second coefficient, wherein:
    the scaling factor of the first data and a scaling factor of the second data are the same; or
    the scaling factor of the first data and the scaling factor of the second data are different; or
    if the first data and the second data belong to a same image block in the first image, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different image blocks, the scaling factor of the first data and the scaling factor of the second data are different; or the scaling factor of the first data is determined according to a texture complexity of the image block to which the first data belongs; or
    if the first data and the second data belong to a same subband among subbands obtained by performing wavelet transform on the first image, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different subbands, the scaling factor of the first data and the scaling factor of the second data are different; or the scaling factor of the first data is determined according to a texture complexity of the subband to which the first data belongs;
    or,
    if the first data and the second data belong to a same frequency band among frequency bands obtained by performing DCT on the first image, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different frequency bands, the scaling factor of the first data and the scaling factor of the second data are different; or the scaling factor of the first data is determined according to a texture complexity of the frequency band to which the first data belongs;
    or,
    if the first data and the second data belong to a same channel of a three-dimensional feature map obtained by performing feature extraction on the first image, the scaling factor of the first data and the scaling factor of the second data are the same; or if the first data and the second data belong to different channels, the scaling factor of the first data and the scaling factor of the second data are different; or the scaling factor of the first data is determined according to a texture complexity of the channel to which the first data belongs.
  12. The method according to claim 5, wherein the method further comprises:
    preprocessing the probability estimation result of the second preset area to obtain a processed probability estimation result.
  13. The method according to claim 12, wherein the probability estimation result of the first data comprises a mean and a variance of a Gaussian distribution, and the preprocessing the probability estimation result of the first data to obtain a processed probability estimation result comprises:
    setting the variance of the Gaussian distribution to 0 as a first variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the first variance; or
    processing the variance of the Gaussian distribution according to a scaling factor of the second preset area to obtain a second variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the second variance, and a scaling factor of the first preset area and the scaling factor of the second preset area are the same or different.
  14. The method according to any one of claims 1-13, wherein the first context information comprises some or all of the pixel values in the first image.
  15. The method according to any one of claims 1-13, wherein the method further comprises:
    transforming the first image to obtain a first transformed image;
    wherein, if the transform is a wavelet transform, the first context information comprises some or all of the coefficients in the first transformed image, the coefficients being wavelet coefficients or quantized wavelet coefficients; or
    if the transform is a discrete cosine transform (DCT), the first context information comprises some or all of the coefficients in the first transformed image, the coefficients being DCT coefficients or quantized DCT coefficients; or
    if the transform is a feature transform, the first context information comprises some or all of the coefficients in the first transformed image, the coefficients being feature coefficients or quantized feature coefficients.
  16. The method according to any one of claims 1-15, wherein the performing probability estimation according to first context information to obtain a first probability estimation result comprises:
    inputting the first context information into a first probability estimation network for processing to obtain parameters of a first probability distribution model, wherein the first probability estimation result comprises the parameters of the first probability distribution model;
    or,
    inputting the first context information into a second probability estimation network for processing to obtain a target probability distribution, wherein the first probability estimation result comprises parameters of the target probability distribution;
    wherein the first probability estimation network and the second probability estimation network are implemented by neural networks.
  17. An encoding method implemented by an encoding device, comprising:
    obtaining a plurality of coefficients according to an image to be encoded, the plurality of coefficients comprising a first coefficient;
    obtaining a first probability estimation result according to context information of the first coefficient; and
    writing the first coefficient and the first probability estimation result into a compressed code stream.
  18. The method according to claim 17, wherein the plurality of coefficients further comprise a second coefficient, and the method further comprises:
    obtaining a second probability estimation result according to context information of the second coefficient;
    wherein the writing the first coefficient and the first probability estimation result into a compressed code stream comprises:
    writing the first coefficient, the first probability estimation result, the second coefficient, and the second probability estimation result into the compressed code stream.
  19. The method according to claim 17, wherein the plurality of coefficients further comprise a second coefficient, the first coefficient and the second coefficient belong to a same preset area, the preset area is located in the image to be encoded or in an image obtained by transforming the image to be encoded, and the obtaining a first probability estimation result according to context information of the first coefficient comprises:
    performing probability estimation according to the context information of the first coefficient to obtain a third probability estimation result; performing probability estimation according to context information of the second coefficient to obtain a second probability estimation result; and determining the first probability estimation result from the third probability estimation result and the second probability estimation result;
    wherein the writing the first coefficient and the first probability estimation result into a compressed code stream comprises:
    writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream.
  20. The method according to claim 17, wherein the plurality of coefficients further comprise a second coefficient, the first coefficient and the second coefficient belong to a same preset area, the preset area is located in the image to be encoded or in an image obtained by transforming the image to be encoded, and the obtaining a first probability distribution according to context information of the first coefficient comprises:
    performing probability estimation according to context information of the preset area to obtain the first probability estimation result, wherein the context information of the preset area comprises the context information of the first coefficient;
    wherein the writing the first coefficient and the first probability estimation result into a compressed code stream comprises:
    writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream.
  21. The method according to claim 19 or 20, wherein the method further comprises:
    setting a value of a first identifier of the preset area to a first value, to indicate that the first probability estimation result is used when sampling every estimated coefficient in the preset area;
    saving the first probability estimation result into a probability estimation result set, and recording an index of the first probability estimation result in the probability estimation result set;
    and the writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream comprises:
    writing the first coefficient, the second coefficient, the probability estimation result set, the index, size information of the preset area, and the first identifier into the compressed code stream.
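The signalling in claim 21 can be illustrated with a minimal sketch. The claim only states that the probability estimation result is saved into a set and that its index is recorded; the container names (`StreamPayload`, `RegionRecord`), the constant `FIRST_VALUE`, the `(mean, variance)` tuple layout, and the reuse of an existing entry when the same estimate is already in the set are all assumptions made for illustration, not part of the claimed method.

```python
from dataclasses import dataclass, field

FIRST_VALUE = 1  # hypothetical encoding of the "first value" of the first identifier


@dataclass
class RegionRecord:
    flag: int            # first identifier of the preset area
    size: tuple          # size information of the preset area, e.g. (height, width)
    index: int           # index into the probability estimation result set
    coefficients: list   # coefficients of the preset area written to the stream


@dataclass
class StreamPayload:
    prob_set: list = field(default_factory=list)  # probability estimation result set
    regions: list = field(default_factory=list)


def add_shared_region(payload, coefficients, size, estimate):
    """Store one (mean, variance) estimate shared by a whole preset area.

    Assumption: an estimate already present in the set is reused rather
    than stored twice; the claim does not require this.
    """
    if estimate in payload.prob_set:
        index = payload.prob_set.index(estimate)
    else:
        payload.prob_set.append(estimate)
        index = len(payload.prob_set) - 1
    payload.regions.append(RegionRecord(FIRST_VALUE, size, index, list(coefficients)))
    return index
```

In this sketch a region costs one index rather than one full estimate per coefficient, which is presumably the motivation for signalling a set plus an index.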
  22. The method according to claim 19 or 20, wherein the method further comprises:
    setting a value of a first identifier of the preset area to a first value, to indicate that the first probability estimation result is used when sampling every estimated coefficient in the preset area;
    and the writing the first coefficient, the second coefficient, and the first probability estimation result into the compressed code stream comprises:
    writing the first coefficient, the second coefficient, the first probability estimation result, size information of the preset area, and the first identifier into the compressed code stream.
  23. The method according to claim 18, wherein the first coefficient and the second coefficient belong to a same preset area, and the method further comprises:
    setting a value of a first identifier of the preset area to a second value, to indicate that each estimated coefficient in the preset area is sampled using its own probability estimation result;
    and the writing the first coefficient, the first probability estimation result, the second coefficient, and the second probability estimation result into the compressed code stream comprises:
    writing the first coefficient, the first probability estimation result, the second coefficient, the second probability estimation result, and the first identifier of the preset area into the compressed code stream.
  24. The method according to any one of claims 17-19 and 21-23, wherein the method further comprises:
    preprocessing the probability estimation result of the first coefficient to obtain a processed probability estimation result.
  25. The method according to claim 24, wherein the probability estimation result of the first coefficient comprises a mean and a variance of a Gaussian distribution, and the preprocessing the probability estimation result of the first coefficient to obtain a processed probability estimation result comprises:
    setting the variance of the Gaussian distribution to 0 as a processed variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the processed variance.
  26. The method according to claim 24, wherein the probability estimation result of the first coefficient comprises a mean and a variance of a Gaussian distribution, and the preprocessing the probability estimation result of the first coefficient to obtain a processed probability estimation result comprises:
    preprocessing the variance of the Gaussian distribution according to a scaling factor of the first coefficient to obtain a processed variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the processed variance;
    the method further comprises: preprocessing a variance of the second probability distribution according to a scaling factor of the second coefficient, wherein:
    the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or,
    the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or,
    if the first coefficient and the second coefficient belong to a same image block in the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different image blocks, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to a texture complexity of the image block to which the first coefficient belongs; or,
    if the first coefficient and the second coefficient belong to a same subband among subbands obtained by performing wavelet transform on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different subbands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to a texture complexity of the subband to which the first coefficient belongs;
    or,
    if the first coefficient and the second coefficient belong to a same frequency band among frequency bands obtained by performing DCT on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different frequency bands, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to a texture complexity of the frequency band to which the first coefficient belongs;
    or,
    if the first coefficient and the second coefficient belong to a same channel of a three-dimensional feature map obtained by performing feature extraction on the image to be encoded, the scaling factor of the first coefficient is the same as the scaling factor of the second coefficient; or if the first coefficient and the second coefficient belong to different channels, the scaling factor of the first coefficient is different from the scaling factor of the second coefficient; or the scaling factor of the first coefficient is determined according to a texture complexity of the channel to which the first coefficient belongs.
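The two variance preprocessing options of claims 25 and 26 can be sketched as follows. The claims do not say how the scaling factor is applied; multiplying the variance by the factor is an assumption made here, and the function names, the group labels (`"LL"`, `"HH"`), and the `scale_by_group` mapping are hypothetical.

```python
def zero_variance(mean, variance):
    """Claim 25 option: keep the mean and set the variance to 0."""
    return mean, 0.0


def scale_variance(mean, variance, scale):
    """Claim 26 option. Assumption: 'preprocessing according to a scaling
    factor' is modelled as multiplying the variance by that factor."""
    return mean, variance * scale


def preprocess_by_group(estimates, groups, scale_by_group):
    """Apply one scaling factor per group (image block, subband, frequency
    band, or feature channel), so coefficients in the same group share a
    factor and coefficients in different groups may use different ones."""
    return [
        scale_variance(mean, variance, scale_by_group[group])
        for (mean, variance), group in zip(estimates, groups)
    ]
```

A per-group lookup table is only one way to realise "same group, same factor"; a factor derived from the texture complexity of the group would populate the same table.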
  27. The method according to claim 20, wherein the method further comprises:
    preprocessing the probability estimation result of the preset area to obtain a processed probability estimation result.
  28. The method according to claim 27, wherein the probability estimation result of the preset area comprises a mean and a variance of a Gaussian distribution, and the preprocessing the probability estimation result of the preset area to obtain a processed probability estimation result comprises:
    setting the variance of the Gaussian distribution to 0 as a first variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the first variance; or,
    processing the variance of the Gaussian distribution according to a scaling factor of the preset area to obtain a second variance, wherein the processed probability estimation result comprises the mean of the Gaussian distribution and the second variance.
  29. The method according to any one of claims 17-28, wherein, if the plurality of coefficients are a plurality of pixel values in the image to be encoded, the first context information comprises some or all of the pixel values in the first image; or,
    if the plurality of coefficients are obtained by performing wavelet transform on the image to be encoded, the plurality of coefficients are a plurality of wavelet coefficients, and the first context information comprises some or all of the plurality of wavelet coefficients; or,
    if the plurality of coefficients are obtained by performing wavelet transform and quantization on the image to be encoded, the plurality of coefficients are a plurality of quantized wavelet coefficients, and the first context information comprises some or all of the plurality of quantized wavelet coefficients; or,
    if the plurality of coefficients are obtained by performing DCT on the image to be encoded, the plurality of coefficients are a plurality of DCT coefficients, and the first context information comprises some or all of the plurality of DCT coefficients; or,
    if the plurality of coefficients are obtained by performing DCT and quantization on the image to be encoded, the plurality of coefficients are a plurality of quantized DCT coefficients, and the first context information comprises some or all of the plurality of quantized DCT coefficients; or,
    if the plurality of coefficients are obtained by performing feature extraction on the image to be encoded, the plurality of coefficients are a plurality of feature coefficients, and the first context information comprises some or all of the plurality of feature coefficients; or,
    if the plurality of coefficients are obtained by performing feature extraction and quantization on the image to be encoded, the plurality of coefficients are a plurality of quantized feature coefficients, and the first context information comprises some or all of the plurality of quantized feature coefficients.
  30. The method according to any one of claims 17-29, wherein the obtaining a first probability estimation result according to the context information of the first coefficient comprises:
    obtaining a second probability distribution model, inputting the first context information into a third probability estimation network for processing to obtain parameters of the second probability distribution model, and obtaining the first probability estimation result according to the second probability distribution model and the parameters of the second probability distribution model;
    or,
    inputting the first context information into a fourth probability estimation model for processing to obtain the probability estimation result;
    wherein the third probability estimation network and the fourth probability estimation network are implemented by neural networks.
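The first branch of claim 30 (context in, distribution parameters out) can be shown with a deliberately tiny stand-in. This is not the network of the application: a single hand-written linear layer replaces the neural network, the Gaussian model is taken from claims 25 and 26, and every name and weight below is hypothetical.

```python
import math


def toy_probability_network(context, w_mean, w_scale, b_mean=0.0, b_scale=0.0):
    """Map context coefficients to the parameters (mean, variance) of a
    Gaussian probability distribution model.

    Assumption: one linear layer per parameter; a softplus keeps the
    predicted variance strictly positive.
    """
    mean = sum(w * c for w, c in zip(w_mean, context)) + b_mean
    raw = sum(w * c for w, c in zip(w_scale, context)) + b_scale
    variance = math.log1p(math.exp(raw))  # softplus(raw) > 0
    return mean, variance


# Usage: two neighbouring coefficients serve as the context.
mean, variance = toy_probability_network([1.0, 3.0], [0.5, 0.5], [0.0, 0.0])
```

A trained network would learn the weights; the sketch only fixes the interface in which the model family is chosen first and the network supplies its parameters.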
  31. An image processing method implemented by a decoding device, comprising:
    obtaining a first probability estimation result by decoding from a compressed code stream;
    performing sampling according to the first probability estimation result to obtain a first estimated coefficient;
    obtaining a first reconstructed image according to the first estimated coefficient.
  32. The method according to claim 31, wherein the method further comprises:
    obtaining a second probability estimation result by decoding from the compressed code stream;
    performing sampling according to the second probability estimation result to obtain a second estimated coefficient;
    and the obtaining a first reconstructed image according to the first estimated coefficient comprises:
    obtaining the first reconstructed image according to the first estimated coefficient and the second estimated coefficient.
  33. The method according to claim 31, wherein the obtaining a first probability estimation result by decoding from a compressed code stream comprises:
    decoding a first identifier from the compressed code stream;
    if a value of the first identifier is a first value, the obtaining a first probability estimation result by decoding from the compressed code stream comprises:
    decoding a probability estimation result set and an index of a preset area from the compressed code stream, wherein the preset area comprises the first estimated coefficient and is an area in the first reconstructed image;
    determining a probability estimation result of the preset area from the probability estimation result set according to the index, wherein the first probability estimation result is the probability estimation result of the preset area;
    wherein the first identifier having the first value indicates that the probability estimation result of the preset area is used when sampling every estimated coefficient in the preset area.
  34. The method according to claim 31, wherein the method further comprises:
    decoding a first identifier from the compressed code stream;
    if a value of the first identifier is a first value, the obtaining a first probability estimation result by decoding from the compressed code stream comprises:
    decoding a probability estimation result of a preset area and size information of the preset area from the compressed code stream, wherein the preset area comprises the first estimated coefficient and is an area in the first reconstructed image, and the probability estimation result of the preset area is the first probability estimation result;
    wherein the first identifier having the first value indicates that the probability estimation result of the preset area is used when sampling every coefficient to be estimated in the preset area.
  35. The method according to claim 32, wherein the first estimated coefficient and the second estimated coefficient belong to a same preset area, the preset area is an area in the first reconstructed image, and the method further comprises:
    decoding a first identifier from the compressed code stream;
    wherein, if a value of the first identifier is a second value, the first identifier having the second value indicates that each coefficient to be estimated in the preset area is sampled using its own probability estimation result.
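The three decoder-side cases of claims 33 to 35 differ only in where each coefficient's probability estimation result comes from. The sketch below assumes the code stream has already been parsed into a dictionary; the key names (`flag`, `size`, `index`, `estimate`, `estimates`) and the numeric flag values are hypothetical, since the claims do not define a syntax.

```python
FIRST_VALUE = 1   # hypothetical: one shared estimate for the whole preset area
SECOND_VALUE = 2  # hypothetical: one estimate per coefficient


def estimates_for_region(region, prob_set=None):
    """Return one (mean, variance) pair per coefficient of a preset area."""
    height, width = region["size"]
    count = height * width
    if region["flag"] == FIRST_VALUE:
        if "index" in region:
            # Claim 33: look the shared estimate up in the decoded set.
            shared = prob_set[region["index"]]
        else:
            # Claim 34: the shared estimate is carried directly.
            shared = region["estimate"]
        return [shared] * count
    if region["flag"] == SECOND_VALUE:
        # Claim 35: every coefficient uses its own estimate.
        return list(region["estimates"])
    raise ValueError("unknown value of the first identifier")
```

Returning a flat per-coefficient list lets the sampling step treat all three cases the same way.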
  36. The method according to any one of claims 31-35, wherein the first probability estimation result comprises a mean and a variance of a Gaussian distribution, and the performing sampling according to the first probability estimation result to obtain a first estimated coefficient comprises:
    obtaining a first random number;
    determining a first reference value according to the first random number, wherein the first reference value follows a Gaussian distribution;
    determining the first estimated coefficient according to the first reference value and the mean and the variance of the first probability estimation result.
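The three sampling steps of claim 36 can be sketched as below. The claim does not give formulas, so two assumptions are made: the random number is drawn uniformly from (0, 1) and mapped to a standard normal reference value through the inverse CDF, and the estimated coefficient is formed as mean + sqrt(variance) × reference value. The function name and the clamping constant are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_coefficient(mean, variance, rng):
    """Sample one estimated coefficient from N(mean, variance)."""
    # Step 1: obtain a random number (assumed uniform on [0, 1)).
    u = rng.random()
    # Keep u strictly inside (0, 1), where the inverse CDF is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    # Step 2: turn it into a reference value that follows N(0, 1).
    reference = NormalDist().inv_cdf(u)
    # Step 3: combine the reference value with the decoded mean and variance.
    return mean + math.sqrt(variance) * reference


# Usage: a seeded generator makes the draw reproducible.
coefficient = sample_coefficient(2.5, 4.0, random.Random(7))
```

Under these assumptions a variance preprocessed to 0, as in claim 38, makes the sampled coefficient equal to the mean regardless of the random number.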
  37. The method according to claim 36, wherein the method further comprises:
    preprocessing the variance of the first probability estimation result to obtain a processed variance;
    and the determining the first estimated coefficient according to the first reference value and the mean and the variance of the first probability estimation result comprises:
    determining the first estimated coefficient according to the first reference value, the mean of the first probability estimation result, and the processed variance.
  38. The method according to claim 37, wherein the preprocessing the variance of the first probability estimation result to obtain a processed variance comprises:
    setting the variance of the first probability distribution to 0 as the processed variance.
  39. The method according to claim 37, wherein, when the first estimated coefficient is a quantized wavelet coefficient, a wavelet coefficient, a quantized discrete cosine transform (DCT) coefficient, a DCT coefficient, a feature coefficient, or a quantized feature coefficient, the preprocessing the variance of the first probability distribution to obtain a processed variance comprises:
    preprocessing the variance of the first probability distribution according to a scaling factor of the first estimated coefficient to obtain the processed variance, wherein:
    the scaling factor of the first estimated coefficient is the same as a scaling factor of the second estimated coefficient; or,
    the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or,
    when the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients or wavelet coefficients, if the first estimated coefficient and the second estimated coefficient belong to a same subband, the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or if the first estimated coefficient and the second estimated coefficient belong to different subbands, the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to a texture complexity of the image block to which the first estimated coefficient belongs;
    or,
    when the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients or DCT coefficients, if the first estimated coefficient and the second estimated coefficient belong to a same frequency band, the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or if the first estimated coefficient and the second estimated coefficient belong to different frequency bands, the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to a texture complexity of the frequency band to which the first estimated coefficient belongs;
    or,
    when the first estimated coefficient and the second estimated coefficient are feature coefficients or quantized feature coefficients, if the first estimated coefficient and the second estimated coefficient belong to a same channel, the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient; or if the first estimated coefficient and the second estimated coefficient belong to different channels, the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or the scaling factor of the first estimated coefficient is determined according to a texture complexity of the channel to which the first estimated coefficient belongs.
  40. The method according to claim 37, wherein, when the first estimated coefficient and the second estimated coefficient are pixel values, the preprocessing the variance of the first probability estimation result to obtain a processed variance comprises:
    preprocessing the variance of the first probability estimation result according to a scaling factor of the first coefficient to obtain the processed variance, wherein:
    the scaling factor of the first estimated coefficient is the same as a scaling factor of the second estimated coefficient, or the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or,
    if the first estimated coefficient and the second estimated coefficient belong to a same image block and a resolution of the image block is lower than a preset resolution, the scaling factor of the first estimated coefficient is different from the scaling factor of the second estimated coefficient; or if the first estimated coefficient and the second estimated coefficient belong to a same image block and the resolution of the image block is not lower than the preset resolution, the scaling factor of the first estimated coefficient is the same as the scaling factor of the second estimated coefficient.
  41. The method according to any one of claims 31-40, wherein the obtaining the first reconstructed image according to the first estimated coefficient and the second estimated coefficient comprises:
    if the first estimated coefficient and the second estimated coefficient are quantized wavelet coefficients, performing inverse quantization and inverse wavelet transform on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or,
    if the first estimated coefficient and the second estimated coefficient are wavelet coefficients, performing inverse wavelet transform on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or,
    if the first estimated coefficient and the second estimated coefficient are quantized DCT coefficients, performing inverse quantization and inverse DCT on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image; or,
    if the first estimated coefficient and the second estimated coefficient are DCT coefficients, performing inverse DCT on the first estimated coefficient and the second estimated coefficient to obtain the first reconstructed image.
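The DCT branches of claim 41 (optional inverse quantization followed by inverse DCT) can be illustrated in one dimension. The sketch assumes an orthonormal DCT-II/DCT-III pair and uniform scalar quantization with a single step size; the claim fixes neither choice, and the function names are hypothetical. An image codec would apply the transform separably along rows and columns.

```python
import math


def dct(signal):
    """Orthonormal 1-D DCT-II, included only to produce test coefficients."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(signal[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(total * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out


def idct(coeffs):
    """Orthonormal 1-D DCT-III, the inverse of dct() above."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1.0 / n)
        total += sum(
            coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(total)
    return out


def reconstruct(coeffs, step=None):
    """Inverse-quantize when a step size is given, then apply the inverse DCT."""
    if step is not None:
        coeffs = [level * step for level in coeffs]  # assumed uniform dequantization
    return idct(coeffs)
```

Passing `step=None` corresponds to the unquantized DCT branch of the claim, and passing a step size corresponds to the quantized branch.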
  42. The method according to any one of claims 31-41, wherein the method further comprises:
    decoding the compressed code stream to obtain a plurality of reconstruction coefficients;
    obtaining a second reconstructed image according to the plurality of reconstruction coefficients.
  43. The method according to claim 42, wherein the obtaining a second reconstructed image according to the plurality of coefficients comprises:
    if the plurality of reconstruction coefficients are quantized wavelet coefficients, performing inverse quantization and inverse wavelet transform on the plurality of reconstruction coefficients to obtain the second reconstructed image; or,
    if the plurality of reconstruction coefficients are wavelet coefficients, performing inverse wavelet transform on the plurality of reconstruction coefficients to obtain the second reconstructed image; or,
    if the plurality of reconstruction coefficients are quantized DCT coefficients, performing inverse quantization and inverse DCT on the plurality of reconstruction coefficients to obtain the second reconstructed image; or,
    if the plurality of reconstruction coefficients are DCT coefficients, performing inverse DCT on the plurality of reconstruction coefficients to obtain the second reconstructed image.
  44. A decoder, comprising a processing circuit configured to perform the method according to any one of claims 31-43.
  45. An encoder, comprising a processing circuit configured to perform the method according to any one of claims 1-30.
  46. A computer program product, comprising program code which, when executed on a computer or a processor, performs the method according to any one of claims 1-43.
  47. A decoder, comprising:
    one or more processors;
    a non-transitory computer-readable storage medium coupled to the processors and storing a program to be executed by the processors, wherein the program, when executed by the processors, causes the decoder to perform the method according to any one of claims 31-43.
  48. An encoder, comprising:
    one or more processors;
    a non-transitory computer-readable storage medium coupled to the processors and storing a program to be executed by the processors, wherein the program, when executed by the processors, causes the encoder to perform the method according to any one of claims 1-30.
  49. A non-transitory computer-readable storage medium, comprising program code which, when executed by a computer device, performs the method according to any one of claims 1-43.
  50. A non-transitory storage medium, comprising a bitstream encoded by the method according to any one of claims 1-43.
PCT/CN2022/100578 2021-07-09 2022-06-22 Method and apparatus for encoding and decoding video image WO2023279968A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110781903.8 2021-07-09
CN202110781903.8A CN115604486A (en) 2021-07-09 2021-07-09 Video image coding and decoding method and device

Publications (1)

Publication Number Publication Date
WO2023279968A1 true WO2023279968A1 (en) 2023-01-12

Family

ID=84800361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100578 WO2023279968A1 (en) 2021-07-09 2022-06-22 Method and apparatus for encoding and decoding video image

Country Status (2)

Country Link
CN (1) CN115604486A (en)
WO (1) WO2023279968A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107920250A (en) * 2017-11-15 2018-04-17 西安交通大学 A kind of compressed sensing image coding and transmission method
US10652581B1 (en) * 2019-02-27 2020-05-12 Google Llc Entropy coding in image and video compression using machine learning
US20200160565A1 (en) * 2018-11-19 2020-05-21 Zhan Ma Methods And Apparatuses For Learned Image Compression
CN111247797A (en) * 2019-01-23 2020-06-05 深圳市大疆创新科技有限公司 Method and apparatus for image encoding and decoding
CN111405283A (en) * 2020-02-20 2020-07-10 北京大学 End-to-end video compression method, system and storage medium based on deep learning
CN112929663A (en) * 2021-04-08 2021-06-08 中国科学技术大学 Knowledge distillation-based image compression quality enhancement method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JOOYOUNG LEE; SEUNGHYUN CHO; SEUNG-KWON BEACK: "Context-adaptive Entropy Model for End-to-end Optimized Image Compression", arXiv.org, Cornell University Library, 27 September 2018 (2018-09-27), XP080933978 *
LIU DONG; WANG YE-FEI; LIN JIAN-PING; MA HAI-CHUAN; YANG RUN-YU: "Advances in End-to-End Optimized Image Compression Technologies", COMPUTER SCIENCE, vol. 48, no. 3, 31 March 2021 (2021-03-31), XP093022615, DOI: 10.11896/jsjkx.201100134 *
MU LI; KAI ZHANG; WANGMENG ZUO; RADU TIMOFTE; DAVID ZHANG: "Learning Context-Based Non-local Entropy Modeling for Image Compression", arXiv.org, Cornell University Library, 10 May 2020 (2020-05-10), XP081666748 *

Also Published As

Publication number Publication date
CN115604486A (en) 2023-01-13

Similar Documents

Publication Publication Date Title
WO2023279961A1 (en) Video image encoding method and apparatus, and video image decoding method and apparatus
WO2022068716A1 (en) Entropy encoding/decoding method and device
WO2023000179A1 (en) Video super-resolution network, and video super-resolution, encoding and decoding processing method and device
WO2021136056A1 (en) Encoding method and encoder
WO2022253249A1 (en) Feature data encoding method and apparatus and feature data decoding method and apparatus
US20230209096A1 (en) Loop filtering method and apparatus
US10021398B2 (en) Adaptive tile data size coding for video and image compression
CN114125446A (en) Image encoding method, decoding method and device
US11638025B2 (en) Multi-scale optical flow for learned video compression
US20230396810A1 (en) Hierarchical audio/video or picture compression method and apparatus
WO2023193629A1 (en) Coding method and apparatus for region enhancement layer, and decoding method and apparatus for area enhancement layer
CN116965029A (en) Apparatus and method for decoding image using convolutional neural network
WO2022156688A1 (en) Layered encoding and decoding methods and apparatuses
WO2022100173A1 (en) Video frame compression method and apparatus, and video frame decompression method and apparatus
WO2023279968A1 (en) Method and apparatus for encoding and decoding video image
WO2022063267A1 (en) Intra frame prediction method and device
WO2021196087A1 (en) Video quality improvement method and apparatus
JP2024513693A (en) Configurable position of auxiliary information input to picture data processing neural network
WO2023165487A1 (en) Feature domain optical flow determination method and related device
WO2022194137A1 (en) Video image encoding method, video image decoding method and related devices
WO2024007820A1 (en) Data encoding and decoding method and related device
WO2023000182A1 (en) Image encoding, decoding and processing methods, image decoding apparatus, and device
JP2024511587A (en) Independent placement of auxiliary information in neural network-based picture processing
CN116797674A (en) Image coding and decoding method and device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22836722

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE