WO2023159883A1 - Image processing method and apparatus, electronic device, storage medium, computer program, and computer program product - Google Patents

Image processing method and apparatus, electronic device, storage medium, computer program, and computer program product

Info

Publication number
WO2023159883A1
Authority
WO
WIPO (PCT)
Prior art keywords
component
dct coefficient
target
row
information corresponding
Prior art date
Application number
PCT/CN2022/110266
Other languages
English (en)
French (fr)
Inventor
郭莉娜
王岩
王园园
秦红伟
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023159883A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146: Data rate or code amount at the encoder output
    • H04N19/149: Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/88: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving rearrangement of data among different coding units, e.g. shuffling, interleaving, scrambling or permutation of pixel data or permutation of transform coefficient data among different blocks
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91: Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • The present disclosure relates to, but is not limited to, the field of computer technology, and in particular to an image processing method and device, electronic equipment, storage media, computer programs and computer program products.
  • JPEG: Joint Photographic Experts Group
  • Embodiments of the present disclosure provide an image processing method and device, electronic equipment, storage media, computer programs and computer program products.
  • An image processing method, including: extracting the initial discrete cosine transform (DCT) coefficients of the three color components corresponding to a Joint Photographic Experts Group (JPEG) image to be compressed; arranging the initial DCT coefficients of the three color components by frequency to obtain the target DCT coefficients of the three color components; and performing entropy encoding on the target DCT coefficients of the three color components to obtain the target compressed data corresponding to the JPEG image to be compressed.
  • An image processing device, including: a DCT coefficient extraction part configured to extract the initial DCT coefficients of the three color components corresponding to a JPEG image to be compressed; a DCT coefficient rearrangement part configured to arrange the initial DCT coefficients of the three color components by frequency to obtain the target DCT coefficients of the three color components; and an entropy encoding part configured to perform entropy encoding on the target DCT coefficients of the three color components to obtain the target compressed data corresponding to the JPEG image to be compressed.
  • An electronic device, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to call the instructions stored in the memory to execute the method above.
  • a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the foregoing method is implemented.
  • A computer program, including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes it to implement the above method.
  • A computer program product, including a computer program or instructions; when the computer program or instructions are run on an electronic device, the electronic device is caused to execute the above method.
  • The initial discrete cosine transform (Discrete Cosine Transform, DCT) coefficients of the three color components corresponding to the JPEG image to be compressed are extracted, and the initial DCT coefficients of the three color components are arranged by frequency to obtain the target DCT coefficients of the three color components. The target DCT coefficients of the three color components are then entropy encoded to obtain the target compressed data corresponding to the JPEG image to be compressed. This realizes end-to-end compression of the JPEG image to be compressed, which effectively eliminates the redundancy of the JPEG image in the spatial dimension and the channel dimension, reduces the data size of the JPEG image to be compressed, and improves its compression rate.
  • FIG. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure
  • Fig. 2 shows a schematic diagram of initial DCT coefficients according to an embodiment of the present disclosure
  • Fig. 3 shows a schematic diagram of arranging the initial DCT coefficients shown in Fig. 2 according to different frequencies according to an embodiment of the present disclosure
  • Fig. 4 shows a schematic diagram of an image codec neural network according to an embodiment of the present disclosure
  • FIG. 5 shows a schematic diagram of determining a DCT coefficient matrix corresponding to a Y component according to an embodiment of the present disclosure
  • FIG. 6 shows a schematic diagram of an MLCC model according to an embodiment of the disclosure
  • Fig. 7 shows a block diagram of an image processing device according to an embodiment of the present disclosure
  • Fig. 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • The JPEG standard is currently a widely supported and widely used image compression standard, and JPEG files are widespread in data centers, cloud storage, and cloud file systems. According to surveys, JPEG files account for 35% of cloud storage file systems such as Dropbox. However, due to the limitations of the JPEG technology itself, it is difficult to fully eliminate data redundancy with hand-designed compression modules alone.
  • JPEG has been surpassed by many image compression technologies, such as JPEG2000, BPG, VVC/H.266 intra-frame coding, and numerous image compression technologies based on deep learning.
  • Although JPEG2000, BPG, and VVC/H.266 intra-frame coding achieve significantly better compression ratios than JPEG, these compression techniques are still not widely supported or used.
  • Moreover, these compression techniques differ significantly from JPEG, so they cannot handle the huge number of JPEG files that already exist.
  • Some related technologies introduce lossless compression on top of JPEG, so that the size of JPEG files can be reduced while the original JPEG files remain lossless, thereby saving large amounts of storage and bandwidth resources.
  • These compression techniques include Lepton, Packjpg, MozJPEG, JPEGrescan, JPEG XL, Cmix, etc. They usually require predictors and context models to be designed manually through feature engineering, which results in low compression rates.
  • The image processing method provided by the embodiments of the present disclosure can be applied to recompressing the existing massive number of JPEG files. The initial DCT coefficients of the three color components corresponding to the JPEG image to be compressed are extracted and arranged by frequency to obtain the target DCT coefficients of the three color components, and the target DCT coefficients of the three color components are entropy encoded to obtain the target compressed data corresponding to the JPEG image to be compressed. This realizes end-to-end compression of the JPEG image to be compressed, which effectively eliminates its redundancy in the spatial and channel dimensions, reduces its data size, and increases its compression rate.
  • Fig. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure.
  • the image processing method can be performed by electronic devices such as terminal equipment or servers, and the terminal equipment can be user equipment (User Equipment, UE), mobile equipment, user terminal, terminal, cellular phone, cordless phone, personal digital assistant (Personal Digital Assistant, PDA), handheld device, computing device, vehicle-mounted device, wearable device, etc.
  • the image processing method can be implemented by calling the computer-readable instructions stored in the memory by the processor.
  • the image processing method may be performed by a server.
  • the image processing method includes steps S11 to S13:
  • the initial DCT coefficients of the three color components are extracted from the code stream of the JPEG image to be compressed to perform a subsequent recompression process.
  • the initial DCT coefficients of the three color components are respectively arranged according to different frequencies to obtain target DCT coefficients of the three color components.
  • Fig. 2 shows a schematic diagram of initial DCT coefficients according to an embodiment of the present disclosure. What is shown in FIG. 2 may be the initial DCT coefficient of any color component. As shown in FIG. 2, the size of the initial DCT coefficients is 16 × 16 × 1, which includes 4 DCT blocks of 8 × 8 × 1 size, and each DCT block includes 64 coefficients of different frequencies. Wherein, coefficients with the same relative position in different DCT blocks have the same frequency, and coefficients with different relative positions have different frequencies. As shown in Fig. 2, each DCT block includes 0 to 63 different positions. The frequency of coefficients at position 0 in each DCT block is the same, the frequency of coefficients at position 1 is the same, and so on.
  • the target DCT coefficients can be obtained.
  • the process of arranging the initial DCT coefficients according to different frequencies to obtain the target DCT coefficients will be described in detail later in combination with possible implementations of the present disclosure, and will not be repeated here.
  • entropy encoding is performed on the target DCT coefficients of the three color components to obtain target compressed data corresponding to the JPEG image to be compressed.
  • the embodiments of the present disclosure perform entropy encoding on the target DCT coefficients of the three color components, effectively eliminating the redundancy of the JPEG image to be compressed in the spatial dimension and the channel dimension, so as to realize end-to-end image compression of the JPEG image to be compressed, and obtain the target compression data.
  • the process of performing entropy encoding on the target DCT coefficients of each color component will be described in detail later in combination with possible implementations of the present disclosure, and will not be repeated here.
  • The initial DCT coefficients of the three color components corresponding to the JPEG image to be compressed are extracted and arranged by frequency to obtain the target DCT coefficients of the three color components. The target DCT coefficients of the three color components are then entropy encoded to obtain the target compressed data corresponding to the JPEG image to be compressed. This realizes end-to-end compression of the JPEG image to be compressed, which effectively eliminates its redundancy in the spatial dimension and the channel dimension, reduces its data size, and increases its compression rate.
  • Arranging the initial DCT coefficients of the three color components by frequency to obtain the target DCT coefficients of the three color components includes: for the initial DCT coefficients of any color component, extracting the coefficients of the same frequency from the initial DCT coefficients to form the spatial dimension, with the coefficients of different frequencies forming the channel dimension, to obtain multi-channel DCT sub-coefficients; performing a zigzag scan on the initial DCT coefficients to determine the zigzag order; and, based on the zigzag order, arranging the multi-channel DCT sub-coefficients in the channel dimension to obtain the target DCT coefficients corresponding to the initial DCT coefficients.
  • The initial DCT coefficients of the color component are preprocessed and sorted in zigzag order in the channel dimension according to frequency, so that the resulting target DCT coefficients of the color component carry a certain amount of structural redundancy in both the spatial dimension and the channel dimension. The subsequent encoding process can make full use of this redundancy and effectively realize entropy encoding of the target DCT coefficients of the color component in the spatial and channel dimensions.
  • Fig. 3 shows a schematic diagram of arranging the initial DCT coefficients shown in Fig. 2 according to different frequencies according to an embodiment of the present disclosure.
  • The coefficients of the same frequency in the initial DCT coefficients are extracted to form the spatial dimension, and the coefficients of different frequencies form the channel dimension, yielding DCT sub-coefficients of 64 channels, where the DCT sub-coefficients of each channel have a size of 2 × 2 in the spatial dimension.
  • the zigzag sorting is to reorder the positions in different DCT blocks.
  • the zigzag order is 0, 1, 8, 16, 9, 2, 3, ..., 61, 54, 47, 55, 62, 63.
  • the DCT sub-coefficients of 64 channels are reordered in the channel dimension to obtain the target DCT coefficients in FIG. 3, where the size of the target DCT coefficients is 2 × 2 × 64.
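  • The rearrangement described above can be sketched in NumPy as follows. This is a minimal illustration, not the disclosed implementation: the function names `zigzag_order` and `rearrange_by_frequency` are hypothetical, and the input is assumed to be a single color component stored as an H × W array whose 8 × 8 blocks each hold DCT coefficients in raster order (positions 0 to 63).

```python
import numpy as np

def zigzag_order(n=8):
    """Raster positions of an n x n block listed in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s indexes the anti-diagonals (row + col == s)
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            # Even anti-diagonals are traversed from bottom-left to top-right.
            diag.reverse()
        order.extend(r * n + c for r, c in diag)
    return order

def rearrange_by_frequency(coeffs, n=8):
    """Turn an (H, W) coefficient plane into an (H/n, W/n, n*n) tensor.

    Same-frequency coefficients from different blocks share a channel;
    channels are sorted in zigzag order.
    """
    h, w = coeffs.shape
    blocks = coeffs.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3)
    channels = blocks.reshape(h // n, w // n, n * n)  # raster frequency order
    return channels[..., zigzag_order(n)]

# A 16 x 16 plane (four 8 x 8 blocks) becomes a 2 x 2 x 64 tensor.
plane = np.arange(16 * 16).reshape(16, 16)
print(rearrange_by_frequency(plane).shape)
```

  • With n = 8, `zigzag_order` starts 0, 1, 8, 16, 9, 2, 3 and ends 61, 54, 47, 55, 62, 63, which agrees with the order listed above.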
  • Entropy encoding the target DCT coefficients of the three color components to obtain the target compressed data corresponding to the JPEG image to be compressed includes: using the image codec neural network to entropy encode the target DCT coefficients of the three color components to obtain the target compressed data.
  • entropy encoding is performed on the target DCT coefficients of the three color components to achieve end-to-end image compression of the JPEG image to be compressed. Therefore, the target compressed data corresponding to the JPEG image to be compressed is directly output, and the compression rate of the JPEG image to be compressed is effectively improved.
  • the three color components include: a luminance component Y and two chrominance components Cb and Cr.
  • the Y component contains richer information in the JPEG image to be compressed.
  • The image codec neural network includes a multi-level cross-channel entropy model (Multi-Level Cross-Channel Entropy Model, MLCC), and the target compressed data includes coding information corresponding to the Y component. Entropy coding the target DCT coefficients of the three color components using the image codec neural network to obtain the target compressed data includes: using the MLCC model to perform multi-level channel autoregressive entropy coding on the Y component of the three color components in the channel dimension, to obtain the coding information corresponding to the Y component.
  • The MLCC model is used to perform multi-level channel autoregressive entropy coding on the Y component in the channel dimension, which effectively reduces the internal data redundancy of the Y component, realizes entropy coding of the target DCT coefficients of the Y component, and reduces the data size of the Y component.
  • FIG. 4 shows a schematic diagram of an image codec neural network according to an embodiment of the present disclosure.
  • the image codec neural network includes a cross-color entropy coding model 401 (Cross-color Entropy Model) and an MLCC model 402 .
  • Before performing entropy coding on the target DCT coefficients of the three color components, the image processing method further includes: fusing the target DCT coefficients of the three color components to obtain fused DCT coefficients; determining shared super prior (hyperprior) information based on the fused DCT coefficients; and splitting the shared super prior information to obtain the coding prior information corresponding to each color component.
  • Fusing the target DCT coefficients of the three color components allows the correlation between different color components to be exploited when the shared super prior information is determined from the fused DCT coefficients. Splitting the shared super prior information then yields the coding prior information corresponding to each color component for subsequent encoding.
  • The cross-color entropy coding model includes a coefficient fusion module 4011 (Coefficient Fusion Model, CFM), a super encoder 4012 (Hyper Encoder), a quantization module 4013 (Q), a super decoder 4014 (Hyper Decoder), a coefficient prior splitting module 4015 (Coefficient Prior Split Model, CPSM), and an entropy parameter prediction module 4016 (Entropy Parameters).
  • The coefficient fusion module 4011 is used to fuse the target DCT coefficients of the three color components Y, Cb, and Cr to obtain the fused DCT coefficients. The JPEG image to be compressed may be in different resolution formats such as YCbCr 4:4:4, YCbCr 4:1:1, and YCbCr 4:2:0. When it is not in the full-resolution YCbCr 4:4:4 format, the three color components Y, Cb, and Cr need to be resolution-aligned before fusion so that they can be fused properly.
  • For example, when the JPEG image to be compressed is in the YCbCr 4:2:0 resolution format, the Cb component and the Cr component have the same resolution, and the resolution of the Y component is 2 times that of the Cb and Cr components. In this case, the target DCT coefficients of the Y component are downsampled by a factor of 2, so that the downsampled target DCT coefficients of the Y component have the same resolution as the target DCT coefficients of the Cb component and the target DCT coefficients of the Cr component.
  • the target DCT coefficients of the Y component, the target DCT coefficients of the Cb component, and the target DCT coefficients of the Cr component after downsampling are fused to obtain fused DCT coefficients.
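  • The alignment-then-fusion step for a 4:2:0 image might look like the sketch below. The disclosure does not state how the downsampling or the fusion is performed, so two assumptions are made here purely for illustration: downsampling is 2 × 2 average pooling, and fusion is concatenation along the channel dimension. In the disclosed network these operations belong to the coefficient fusion module 4011 and may well be learned layers; the names `downsample2x` and `fuse_420` are hypothetical.

```python
import numpy as np

def downsample2x(x):
    """Halve both spatial axes of an (H, W, C) array by 2x2 average pooling.

    Assumption: average pooling stands in for the unspecified downsampling.
    """
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def fuse_420(y, cb, cr):
    """Align Y to the chroma resolution, then fuse along the channel axis.

    Assumption: channel concatenation stands in for the unspecified fusion.
    """
    y_small = downsample2x(y)
    if not (y_small.shape == cb.shape == cr.shape):
        raise ValueError("components are not resolution-aligned")
    return np.concatenate([y_small, cb, cr], axis=-1)

# Target DCT coefficients: Y is 4 x 4 x 64, Cb and Cr are 2 x 2 x 64.
y = np.ones((4, 4, 64))
cb = np.zeros((2, 2, 64))
cr = np.full((2, 2, 64), 2.0)
print(fuse_420(y, cb, cr).shape)  # (2, 2, 192)
```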
  • The fused DCT coefficients output by the coefficient fusion module 4011 are input into the super encoder 4012 for super-prior prediction to obtain the initial super prior information z. The initial super prior information z output by the super encoder 4012 is then input to the quantization module 4013 for quantization, which yields the shared super prior information.
  • After passing through the super decoder 4014 and the coefficient prior splitting module 4015, the shared super prior information output by the quantization module 4013 is split into the coding prior information corresponding to the three color components: the coding prior information Cr_prior corresponding to the Cr component, the coding prior information Cb_prior corresponding to the Cb component, and the coding prior information Y_prior corresponding to the Y component.
  • The target compressed data includes: super prior encoding information, encoding information corresponding to the Cr component, encoding information corresponding to the Cb component, and encoding information corresponding to the Y component. Entropy coding the target DCT coefficients of the three color components to obtain the target compressed data corresponding to the JPEG image to be compressed includes: entropy coding the shared super prior information to obtain the super prior coding information; and, based on the coding prior information corresponding to each color component, entropy coding the target DCT coefficients of the Cr component, the target DCT coefficients of the Cb component, and the target DCT coefficients of the Y component in sequence to obtain the coding information corresponding to the Cr component, the coding information corresponding to the Cb component, and the coding information corresponding to the Y component.
  • the entropy coding is arithmetic coding.
  • Using arithmetic coding in entropy coding to compress the JPEG image to be compressed can effectively improve the compression effect.
  • both the cross-color entropy coding model 401 and the MLCC model 402 shown in FIG. 4 include an arithmetic encoder 403 (Arithmetic Encoder, AE) for arithmetic coding.
  • The shared super prior information can be further entropy coded, and the shared super prior coding information obtained after coding is stored in the target compressed data as additional information.
  • The cross-color entropy coding model also includes a factorized entropy module 4017 (Factorized Entropy). Using the factorized entropy module 4017 and the arithmetic encoder 403 (AE), arithmetic coding is performed on the shared super prior information to obtain the shared super prior coding information.
  • Entropy encoding the target DCT coefficients of the Cr component, the target DCT coefficients of the Cb component, and the target DCT coefficients of the Y component in sequence, to obtain the encoding information corresponding to the Cr component, the Cb component, and the Y component, includes: based on the encoding prior information corresponding to the Cr component, performing entropy encoding on the target DCT coefficients of the Cr component to obtain the encoding information corresponding to the Cr component; based on the encoding prior information corresponding to the Cb component and the target DCT coefficients of the Cr component, performing entropy encoding on the target DCT coefficients of the Cb component to obtain the encoding information corresponding to the Cb component; and, based on the encoding prior information corresponding to the Y component, the target DCT coefficients of the Cr component, and the target DCT coefficients of the Cb component, performing entropy encoding on the target DCT coefficients of the Y component to obtain the encoding information corresponding to the Y component.
  • Based on the encoding prior information corresponding to the Cr component, the target DCT coefficients of the Cr component are entropy encoded directly. After the entropy encoding of the Cr component is completed, the target DCT coefficients of the Cr component are used as context information, together with the encoding prior information corresponding to the Cb component, to entropy encode the target DCT coefficients of the Cb component. After the entropy encoding of the Cb component is completed, the target DCT coefficients of the Cb and Cr components are used as context information, together with the encoding prior information corresponding to the Y component, to entropy encode the target DCT coefficients of the Y component, thereby effectively eliminating data redundancy between different color components.
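  • The coding order and the conditioning inputs of each component can be summarized in a few lines of Python. This only illustrates the dependency structure; `CODING_ORDER` and `conditioning_inputs` are hypothetical names, and the strings stand in for the actual tensors.

```python
# Components are entropy coded in this order: Cr first, then Cb, then Y.
CODING_ORDER = ("Cr", "Cb", "Y")

def conditioning_inputs(component):
    """Return what the entropy model may condition on when coding `component`.

    Each component uses its own coding prior plus the target DCT
    coefficients of every component coded before it.
    """
    position = CODING_ORDER.index(component)
    return {
        "prior": f"{component}_prior",
        "context": list(CODING_ORDER[:position]),
    }

for name in CODING_ORDER:
    print(name, conditioning_inputs(name))
```

  • Because each component conditions only on components that were coded earlier, a decoder that processes the components in the same order has every context available by the time it needs it.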
  • Performing entropy encoding on the target DCT coefficients of the Cr component based on the encoding prior information corresponding to the Cr component, to obtain the encoding information corresponding to the Cr component, includes: based on the encoding prior information corresponding to the Cr component, determining the probability mass function (Probability Mass Function, PMF) corresponding to the target DCT coefficients of the Cr component; and, based on that PMF, performing entropy coding on the target DCT coefficients of the Cr component to obtain the coding information corresponding to the Cr component.
  • The PMF corresponding to the target DCT coefficients of the Cr component is determined, so as to effectively realize the entropy encoding of the target DCT coefficients of the Cr component and reduce the data size of the Cr component.
  • The PMF corresponding to the target DCT coefficients of the Cr component is determined directly from the encoding prior information Cr_prior corresponding to the Cr component, and the arithmetic encoder 403 then performs arithmetic coding on the target DCT coefficients of the Cr component based on that PMF to obtain the coding information corresponding to the Cr component.
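  • To make the role of the PMF concrete, the sketch below builds a PMF over integer coefficient values and computes the ideal code length that an arithmetic coder approaches. The disclosure does not state which distribution family is used; a discretized Gaussian parameterized by a mean and a scale is assumed here only for illustration, and all function names are hypothetical.

```python
import math

def gaussian_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def discretized_gaussian_pmf(mean, scale, lo=-16, hi=16):
    """PMF over the integers lo..hi.

    Each integer k receives the Gaussian mass of [k - 0.5, k + 0.5],
    floored at a tiny value so every symbol stays codable, then the
    masses are renormalized to sum to 1.
    """
    support = range(lo, hi + 1)
    masses = [
        max(gaussian_cdf(k + 0.5, mean, scale) - gaussian_cdf(k - 0.5, mean, scale), 1e-12)
        for k in support
    ]
    total = sum(masses)
    return {k: m / total for k, m in zip(support, masses)}

def ideal_code_length_bits(symbols, pmf):
    """Sum of -log2 p(s): the length an ideal entropy coder would reach."""
    return sum(-math.log2(pmf[s]) for s in symbols)

pmf = discretized_gaussian_pmf(mean=0.0, scale=1.0)
print(ideal_code_length_bits([0, 0, 1, -1], pmf))  # well-predicted symbols
print(ideal_code_length_bits([5, -6, 7, 8], pmf))  # poorly predicted symbols
```

  • The better the PMF predicts the actual coefficients, the fewer bits the arithmetic coder spends, which is why the coding prior information matters.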
  • Performing entropy encoding on the target DCT coefficients of the Cb component to obtain the encoding information corresponding to the Cb component includes: based on the coding prior information corresponding to the Cb component and the target DCT coefficients of the Cr component, determining the PMF corresponding to the target DCT coefficients of the Cb component; and, based on that PMF, performing entropy encoding on the target DCT coefficients of the Cb component to obtain the encoding information corresponding to the Cb component.
  • The target DCT coefficients of the Cr component are used as context information, so that the PMF corresponding to the target DCT coefficients of the Cb component is determined based on the coding prior information corresponding to the Cb component and the target DCT coefficients of the Cr component, which effectively realizes the entropy coding of the target DCT coefficients of the Cb component and reduces the data size of the Cb component.
  • The cross-color entropy coding model 401 also includes an entropy parameter prediction module 4016 (Entropy Parameters). After the target DCT coefficients of the Cr component are entropy coded, the target DCT coefficients of the Cr component and the coding prior information Cb_prior corresponding to the Cb component are input to the entropy parameter prediction module 4016, which outputs the PMF corresponding to the target DCT coefficients of the Cb component.
  • the arithmetic encoder 403 is used to perform arithmetic coding on the target DCT coefficients of the Cb component based on the PMF corresponding to the target DCT coefficients of the Cb component, to obtain coding information corresponding to the Cb component.
  • Performing entropy coding on the target DCT coefficients of the Y component to obtain the coding information corresponding to the Y component includes: based on the coding prior information corresponding to the Y component, the target DCT coefficients of the Cr component, and the target DCT coefficients of the Cb component, determining the probability distribution parameters corresponding to the Y component; and, based on the probability distribution parameters corresponding to the Y component, using multi-level channel autoregression to entropy encode the target DCT coefficients of the Y component to obtain the encoding information corresponding to the Y component.
  • The target DCT coefficients of the Cr component and the target DCT coefficients of the Cb component are used as context information, so that the probability distribution parameters corresponding to the Y component are determined based on the coding prior information corresponding to the Y component, the target DCT coefficients of the Cr component, and the target DCT coefficients of the Cb component. Since the Y component contains richer information in the JPEG image to be compressed, constructing the PMF for entropy coding directly from the probability distribution parameters corresponding to the Y component is not accurate enough.
  • Therefore, multi-level channel autoregression is used to effectively reduce the internal data redundancy of the Y component, realize entropy coding of the target DCT coefficients of the Y component, and reduce the data size of the Y component.
  • When the JPEG image to be compressed is in the YCbCr 4:2:0 resolution format, the resolution of the Y component is 2 times that of the Cb and Cr components. In this case, the target DCT coefficients of the Cb and Cr components are each upsampled by a factor of 2, so that the upsampled target DCT coefficients of the Cb and Cr components have the same resolution as the target DCT coefficients of the Y component.
  • The cross-color entropy coding model 401 also includes an upsampling module 404 (Up), which is used to upsample the target DCT coefficients of the Cb and Cr components respectively, so that the upsampled target DCT coefficients of the Cb and Cr components have the same resolution as the target DCT coefficients of the Y component.
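  • A minimal sketch of the 2× upsampling is shown below. The disclosure does not specify the interpolation used by the upsampling module 404, which may be a learned layer; nearest-neighbor replication is assumed here purely for illustration, and `upsample2x` is a hypothetical name.

```python
import numpy as np

def upsample2x(x):
    """Double both spatial axes of an (H, W, C) array.

    Assumption: nearest-neighbor replication stands in for the
    unspecified upsampling operation.
    """
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

# Cb target DCT coefficients of size 2 x 2 x 64 are brought to the
# 4 x 4 x 64 resolution of the Y component.
cb = np.arange(2 * 2 * 64).reshape(2, 2, 64)
print(upsample2x(cb).shape)  # (4, 4, 64)
```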
  • After the upsampled target DCT coefficients of the Cb and Cr components are fused, they are input, together with the coding prior information Y_prior corresponding to the Y component, into the entropy parameter prediction module 4016, which outputs the probability distribution parameter hyper_y corresponding to the Y component.
  • The probability distribution parameter hyper_y corresponding to the Y component and the target DCT coefficient of the Y component are input into the MLCC model 402, so as to entropy encode the target DCT coefficient of the Y component by using multi-level channel autoregression and obtain the coding information corresponding to the Y component.
  • Using multi-level channel autoregression to perform entropy coding on the target DCT coefficient of the Y component to obtain the coding information corresponding to the Y component includes: converting the target DCT coefficient of the Y component from the spatial dimension to the channel dimension to obtain the converted DCT coefficient of the Y component; disassembling the converted DCT coefficient of the Y component according to a preset matrix form to obtain the DCT coefficient matrix corresponding to the Y component; and, based on the probability distribution parameters corresponding to the Y component, entropy encoding the DCT coefficient matrix by using multi-level channel autoregression to obtain the coding information corresponding to the Y component.
  • Fig. 5 shows a schematic diagram of determining a DCT coefficient matrix corresponding to a Y component according to an embodiment of the present disclosure.
  • When the size of the initial DCT coefficient of the Y component is 32×32×1, arranging the 32×32×1 initial DCT coefficients of the Y component yields the target DCT coefficients of the Y component with a size of 4×4×64 shown in FIG. 5.
  • The target DCT coefficients of the Y component are divided into 2×2 partitions in the spatial dimension, each partition including four different positions numbered 1 to 4. The conversion 501 from the spatial dimension to the channel dimension (Space-to-depth) is then performed to obtain the converted DCT coefficient of the Y component. As shown in FIG. 5, the size of the converted DCT coefficient of the Y component is 2×2×(64×4); that is, the converted DCT coefficients of the Y component increase from 64 channels to 256 channels in the channel dimension.
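  • The space-to-depth conversion described above can be illustrated with a short NumPy sketch. The function name `space_to_depth` and the ordering of the four positions inside each 2×2 partition (row-major here) are assumptions; the text does not fix the ordering.

```python
import numpy as np

def space_to_depth(x: np.ndarray, block: int = 2) -> np.ndarray:
    """Move each block x block spatial patch of an (H, W, C) array into channels.

    The output has shape (H/block, W/block, block*block*C). Output channels are
    ordered as (position inside the patch, original channel); this ordering is
    an assumption made for illustration.
    """
    h, w, c = x.shape
    assert h % block == 0 and w % block == 0, "H and W must be divisible by block"
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)  # -> (H/b, W/b, row in patch, col in patch, C)
    return x.reshape(h // block, w // block, block * block * c)

# Target DCT coefficients of the Y component: 4x4 spatial positions, 64 channels.
y_target = np.arange(4 * 4 * 64).reshape(4, 4, 64)
y_converted = space_to_depth(y_target)  # shape (2, 2, 256)
```

  • With this ordering, channels 0 to 63 of each output position hold position 1 of the corresponding partition, channels 64 to 127 hold position 2, and so on, which is what makes a later split into four 64-channel groups possible.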
  • Disassembling the converted DCT coefficients of the Y component according to the preset matrix form to obtain the DCT coefficient matrix corresponding to the Y component includes: disassembling the converted DCT coefficients of the Y component in the spatial dimension according to the preset matrix form to obtain multiple rows of the DCT coefficient matrix; and disassembling each row of the DCT coefficient matrix in the channel dimension to obtain multiple columns of each row.
  • The converted DCT coefficients of the Y component are disassembled in the spatial dimension and the channel dimension, so that the rows and columns of the resulting DCT coefficient matrix carry certain structural redundancy; by exploiting this redundancy, the subsequent multi-level channel autoregressive entropy coding of the DCT coefficient matrix can be carried out effectively.
  • The converted DCT coefficients of the Y component with a size of 2×2×(64×4) are disassembled in the spatial dimension 502 (Row Split) to obtain four rows of the DCT coefficient matrix: r^(1), r^(2), r^(3), r^(4), each of which includes DCT coefficients with a size of 2×2×64.
  • The number n of columns in each row and the number of channels contained in each column are the same, and the specific values can be set according to actual conditions, which are not specifically limited in the present disclosure.
  • Nine columns in r^(1) are obtained. The splitting of r^(2), r^(3), and r^(4) in the channel dimension is similar and will not be repeated here.
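  • The row split and column split can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, each row is assumed to be one contiguous 64-channel group of the converted coefficients, and because the per-column channel counts are not recoverable from the text, `np.array_split` is used to produce near-equal groups (one column of 8 channels and eight columns of 7 channels for n = 9).

```python
import numpy as np

def row_split(y_converted: np.ndarray, num_rows: int = 4) -> list:
    """Split a (2, 2, 256) tensor into num_rows equal channel groups (rows)."""
    return np.split(y_converted, num_rows, axis=-1)

def column_split(row: np.ndarray, n: int) -> list:
    """Split one row into n channel groups (columns).

    Assumption: near-equal group sizes; the actual allocation may differ.
    """
    return np.array_split(row, n, axis=-1)

# Stand-in for the converted DCT coefficients of the Y component.
y_converted = np.arange(2 * 2 * 256).reshape(2, 2, 256)

rows = row_split(y_converted)                   # r(1)..r(4), each 2x2x64
matrix = [column_split(r, n=9) for r in rows]   # 4 rows x 9 columns
column_channels = [c.shape[-1] for c in matrix[0]]
```

  • Concatenating the columns of a row along the channel dimension restores that row, so the split itself loses no information.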
  • After the DCT coefficient matrix corresponding to the Y component is obtained by disassembly, multi-level channel autoregressive entropy coding is performed on the DCT coefficient matrix based on the probability distribution parameters corresponding to the Y component.
  • FIG. 6 shows a schematic diagram of an MLCC model according to an embodiment of the disclosure.
  • the MLCC model includes an outer channel module 601 (Outer Channel) and an inner channel module 602 (Inner Channel).
  • the external channel module 601 includes a space dimension to channel dimension conversion unit 6011 (Space-to-depth), and a row disassembly unit 6012 (Row Split).
  • the space dimension to channel dimension conversion unit 6011 is used to convert the target DCT coefficient of the Y component from the space dimension to the channel dimension to obtain the converted DCT coefficient Y′ of the Y component.
  • The row disassembly unit 6012 is used to disassemble the converted DCT coefficient Y′ of the Y component in the spatial dimension to obtain four rows of the DCT coefficient matrix: r^(1), r^(2), r^(3), r^(4). Each of the four rows obtained by disassembly is input into the corresponding internal channel module 602 for multi-level channel autoregressive entropy coding.
  • Using multi-level channel autoregression to perform entropy coding on the DCT coefficient matrix to obtain the coding information corresponding to the Y component includes: based on the probability distribution parameters corresponding to the Y component, using multi-level channel autoregression to sequentially determine the PMF corresponding to each column in each row of the DCT coefficient matrix; and using the PMF corresponding to each column in each row of the DCT coefficient matrix to entropy encode each column in each row of the DCT coefficient matrix, obtaining the coding information corresponding to each column in each row of the DCT coefficient matrix; wherein the coding information corresponding to each column in each row of the DCT coefficient matrix constitutes the coding information corresponding to the Y component.
  • Sequentially determining the PMF corresponding to each column in each row of the DCT coefficient matrix by using multi-level channel autoregression includes: based on the probability distribution parameters corresponding to the Y component, using multi-level channel autoregression to sequentially determine the coding prior information corresponding to each row in the DCT coefficient matrix; and, for the i-th row of the DCT coefficient matrix, based on the coding prior information corresponding to the i-th row, using multi-level channel autoregression to determine the PMF corresponding to each column in the i-th row, where i denotes the row index.
  • Multi-level channel autoregression is used to determine the coding prior information corresponding to each row in the DCT coefficient matrix in turn. Then, for any i-th row in the DCT coefficient matrix, multi-level channel autoregression can be used, based on the coding prior information corresponding to the i-th row, to determine a highly accurate PMF for each column in the i-th row, which is used for the subsequent entropy coding of each column in the i-th row.
  • Using multi-level channel autoregression to sequentially determine the coding prior information corresponding to each row in the DCT coefficient matrix includes: for the first row of the DCT coefficient matrix, determining the coding prior information corresponding to the first row based on the probability distribution parameters corresponding to the Y component; and, in the case of i>1, for the i-th row of the DCT coefficient matrix, determining the coding prior information corresponding to the i-th row based on the probability distribution parameters corresponding to the Y component and the 1st to (i-1)-th rows of the DCT coefficient matrix.
  • i>1 means that i is greater than 1.
  • Any remaining i-th row uses the preceding 1st to (i-1)-th rows as context information to determine its corresponding coding prior information, so as to realize autoregression in the row direction and improve the accuracy of the coding prior information corresponding to each row.
  • For the i-th row in the DCT coefficient matrix, after the coding prior information corresponding to the i-th row is determined, it is used to perform entropy coding on each column in the i-th row. After the entropy coding of all the columns in the i-th row is completed, the i-th row is used as context information to determine the coding prior information corresponding to the (i+1)-th row.
  • As shown in Fig. 6, the spatial dimension to channel dimension conversion unit 6011 is used to convert the probability distribution parameter hyper_y corresponding to the Y component from the spatial dimension to the channel dimension, obtaining the converted probability distribution parameter h'.
  • The external channel module 601 also includes a parameter module 6013 (Param). The converted probability distribution parameter h' is input into the parameter module 6013, which outputs the coding prior information pri^(1) corresponding to the first row of the DCT coefficient matrix; the first row r^(1) of the DCT coefficient matrix and the coding prior information pri^(1) corresponding to the first row are input into the internal channel module 602, so as to entropy encode each column in the first row r^(1) of the DCT coefficient matrix.
  • The converted probability distribution parameter h' and the first row r^(1) of the DCT coefficient matrix are merged and input into the parameter module 6013, which outputs the coding prior information pri^(2) corresponding to the 2nd row of the DCT coefficient matrix;
  • After all the columns in the 2nd row r^(2) of the DCT coefficient matrix have been entropy coded, the converted probability distribution parameter h', the 1st row r^(1) of the DCT coefficient matrix, and the 2nd row r^(2) of the DCT coefficient matrix are merged and input into the parameter module 6013, which outputs the coding prior information pri^(3) corresponding to the 3rd row of the DCT coefficient matrix; the 3rd row r^(3) of the DCT coefficient matrix and the coding prior information pri^(3) corresponding to the 3rd row are input into the internal channel module 602 to entropy encode each column in the 3rd row r^(3) of the DCT coefficient matrix.
  • After all the columns in the 3rd row r^(3) of the DCT coefficient matrix have been entropy coded, the converted probability distribution parameter h', the 1st row r^(1), the 2nd row r^(2), and the 3rd row r^(3) of the DCT coefficient matrix are merged and input into the parameter module 6013, which outputs the coding prior information pri^(4) corresponding to the 4th row of the DCT coefficient matrix; the 4th row r^(4) of the DCT coefficient matrix and the coding prior information pri^(4) corresponding to the 4th row are input into the internal channel module 602 to entropy encode each column in the 4th row r^(4) of the DCT coefficient matrix.
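  • The row-direction autoregression described above can be sketched as a loop in which the context for row i is the converted parameter h' merged with rows 1 to i-1. In this sketch, `param_module` is a hypothetical placeholder (a simple channel mean) standing in for the parameter module 6013, which in practice would be a trained network; merging is assumed to be channel-wise concatenation, and all shapes are illustrative.

```python
import numpy as np

def param_module(context: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for Param 6013: a per-position channel mean."""
    return context.mean(axis=-1, keepdims=True)

def row_priors(h_prime: np.ndarray, rows: list) -> list:
    """Return pri(1)..pri(R), where pri(i) depends on h' and r(1)..r(i-1) only."""
    priors = []
    for i in range(len(rows)):
        # Context for the (i+1)-th row: h' plus all previously coded rows.
        context = np.concatenate([h_prime, *rows[:i]], axis=-1)
        priors.append(param_module(context))
    return priors

rng = np.random.default_rng(0)
h_prime = rng.normal(size=(2, 2, 8))                    # assumed shape of h'
rows = [rng.normal(size=(2, 2, 64)) for _ in range(4)]  # r(1)..r(4)
priors = row_priors(h_prime, rows)
```

  • Because pri(i) never looks at row i or later rows, a decoder that reconstructs the rows in the same order can recompute exactly the same priors.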
  • Determining the PMF corresponding to each column in the i-th row by using multi-level channel autoregression includes: for the 1st column of the i-th row of the DCT coefficient matrix, determining the PMF corresponding to row i, column 1 based on the coding prior information corresponding to the i-th row; and, in the case of j>1, for row i, column j of the DCT coefficient matrix, determining the PMF corresponding to row i, column j based on the coding prior information corresponding to the i-th row and columns 1 to j-1 in row i of the DCT coefficient matrix, where j denotes the column index.
  • Any other j-th column uses the preceding 1st to (j-1)-th columns as context information to determine its corresponding PMF, so as to realize autoregression in the column direction and improve the accuracy of the PMF corresponding to each column in the i-th row.
  • the PMF corresponding to row i and column j+1 is determined using row i and column j as context information.
  • The internal channel module 602 includes a column disassembly unit 6021 (Column Split). For the i-th row r^(i) of the DCT coefficient matrix, the column disassembly unit 6021 is used to disassemble r^(i) in the channel dimension to obtain the n columns in the i-th row r^(i) of the DCT coefficient matrix.
  • The coding prior information pri^(i) corresponding to the i-th row r^(i) and the first column in the i-th row r^(i) are merged and input into the parameter module 6013, which outputs the PMF corresponding to the second column in the i-th row r^(i); the second column in the i-th row r^(i) is then entropy encoded based on its corresponding PMF.
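  • The column-direction autoregression can be summarized as a coding schedule: for each row i and column j, the PMF is predicted from pri(i) and the already coded columns 1 to j-1 of the same row. The sketch below only enumerates that order and context symbolically; the labels `pri(i)` and `c(i,k)` are hypothetical names used for illustration.

```python
def coding_schedule(num_rows: int, num_cols: int):
    """Yield ((i, j), context) in coding order, using 1-based indices.

    The context of row i, column j lists what the PMF prediction may use:
    the row prior pri(i) followed by columns 1..j-1 of row i.
    """
    for i in range(1, num_rows + 1):
        for j in range(1, num_cols + 1):
            context = [f"pri({i})"] + [f"c({i},{k})" for k in range(1, j)]
            yield (i, j), context

schedule = list(coding_schedule(num_rows=4, num_cols=9))
```

  • A decoder can walk the same schedule: each column it needs as context has already been decoded by the time the next PMF is required.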
  • Using the PMF corresponding to each column in each row of the DCT coefficient matrix to entropy encode each column in each row of the DCT coefficient matrix, obtaining the coding information corresponding to each column in each row of the DCT coefficient matrix, includes: in the case of i≥1 and j≥1, entropy encoding row i, column j based on the PMF corresponding to row i, column j of the DCT coefficient matrix, to obtain the coding information corresponding to row i, column j.
  • i≥1 and j≥1 means that i is greater than or equal to 1 and j is greater than or equal to 1.
  • Entropy coding is performed on row i, column j to obtain the coding information corresponding to row i, column j, thereby effectively reducing the data size of row i, column j.
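  • Why a more accurate PMF reduces the data size can be shown with the ideal code length, the sum of -log2 p(s) over the coded symbols, which an arithmetic coder approaches. The sketch below is illustrative only: the symbols are assumed to be already mapped to non-negative indices into the PMF, and both PMFs are made-up values.

```python
import numpy as np

def ideal_code_length_bits(symbols, pmf) -> float:
    """Sum of -log2 p(s) over the symbols: the cost an ideal entropy coder approaches."""
    pmf = np.asarray(pmf, dtype=np.float64)
    probs = pmf[np.asarray(symbols)]
    return float(-np.log2(probs).sum())

# Hypothetical symbol indices for one column (mostly zeros, as is typical of
# quantized high-frequency DCT coefficients).
symbols = [0, 0, 1, 0, 2, 0, 0, 1]

uniform_pmf = [0.25, 0.25, 0.25, 0.25]    # uninformed model
peaked_pmf = [0.5, 0.25, 0.125, 0.125]    # model that expects small values

bits_uniform = ideal_code_length_bits(symbols, uniform_pmf)
bits_peaked = ideal_code_length_bits(symbols, peaked_pmf)
```

  • For these eight symbols the uniform model costs 16 bits, while the peaked model, which better matches the data, costs 12 bits.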
  • In the decoding process, the shared super prior information can be obtained by performing entropy decoding on the shared super prior coding information. By then splitting the shared super prior information, the coding prior information corresponding to each color component can be obtained and used for the subsequent decoding of the coding information of each color component.
  • the cross-color entropy coding model 401 also includes an arithmetic decoder 405 (Arithmetic Decoder, AD).
  • The factor entropy module 4017 and the arithmetic decoder 405 are used to perform entropy decoding on the shared super-prior encoding information to obtain the shared super-prior information.
  • the super decoder 4014 and the coefficient prior splitting module 4015 are used to effectively split the coding process to obtain the coding prior information Cr prior for entropy decoding the coding information corresponding to the Cr component, and for the Cb component
  • The process of entropy decoding the coding information corresponding to the Cr component by using the coding prior information Cr_prior corresponding to the Cr component is the inverse of entropy coding the target DCT coefficient of the Cr component using Cr_prior. After the coding information corresponding to the Cr component has been entropy decoded to obtain the target DCT coefficient of the Cr component, the process of entropy decoding the coding information corresponding to the Cb component by using the coding prior information Cb_prior corresponding to the Cb component and the target DCT coefficient of the Cr component is the inverse of entropy encoding the target DCT coefficients of the Cb component by using Cb_prior and the target DCT coefficient of the Cr component. After entropy decoding obtains the target DCT coefficients of the Cr component and the Cb component, use the coding prior information Y_prior corresponding to the Y component, the target DCT coefficient of the Cr component, and
  • the image processing method further includes: performing network training on the initial neural network to obtain a target image encoding and decoding neural network.
  • the network structure of the initial neural network is the same as that of the target image encoding and decoding neural network, and the network parameters of the cross-color entropy coding model and the MLCC model in the target image encoding and decoding neural network are obtained by performing network training on the initial neural network.
  • the specific process of network training may adopt the network training process in related technologies, which is not specifically limited in the embodiments of the present disclosure.
  • The present disclosure also provides image processing devices, electronic equipment, computer-readable storage media, computer programs and computer program products, all of which can be used to implement any image processing method provided in the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method part, which will not be repeated here.
  • Fig. 7 shows a block diagram of an image processing device according to an embodiment of the present disclosure. As shown in Figure 7, the device 70 includes:
  • the DCT coefficient extraction part 71 is configured to extract the initial transformed DCT coefficients of the three color components corresponding to the JPEG image to be compressed;
  • the DCT coefficient rearrangement part 72 is configured to respectively arrange the initial DCT coefficients of the three color components according to different frequencies to obtain target DCT coefficients of the three color components;
  • the entropy encoding part 73 is configured to perform entropy encoding on the target DCT coefficients of the three color components to obtain target compressed data corresponding to the JPEG image to be compressed.
  • the DCT coefficient rearrangement part 72 is specifically configured as:
  • the coefficients of the same frequency are formed into the spatial dimension, and the coefficients of different frequencies are formed into the channel dimension to obtain multi-channel DCT sub-coefficients;
  • the multi-channel DCT sub-coefficients are arranged in the channel dimension to obtain the target DCT coefficients corresponding to the initial DCT coefficients.
  • the three color components include: the luminance component Y and the chrominance components Cb and Cr;
  • the target compressed data includes: super-prior encoding information, encoding information corresponding to the Cr component, encoding information corresponding to the Cb component, and encoding information corresponding to the Y component;
  • Entropy coding part 73 including:
  • the super-prior information coding subpart is configured to perform entropy coding on the shared super-prior information to obtain super-prior encoding information;
  • the color component encoding subpart is configured to perform entropy encoding on the target DCT coefficient of the Cr component, the target DCT coefficient of the Cb component, and the target DCT coefficient of the Y component in turn, based on the encoding prior information corresponding to each color component, to obtain the coding information corresponding to the Cr component, the coding information corresponding to the Cb component, and the coding information corresponding to the Y component.
  • the device 70 further includes: a coefficient fusion part configured to fuse the target DCT coefficients of the three color components before performing entropy coding on the target DCT coefficients of the three color components, to obtain the fused DCT coefficient;
  • the shared super prior information determining part is configured to determine the shared super prior information based on the fused DCT coefficients
  • the encoding prior information determination part is configured to split the shared super prior information to obtain the encoding prior information corresponding to each color component.
  • the color component encoding subpart includes:
  • the Cr component coding unit is configured to perform entropy coding on the target DCT coefficient of the Cr component based on the coding prior information corresponding to the Cr component, and obtain the coding information corresponding to the Cr component;
  • the Cb component coding unit is configured to perform entropy coding on the target DCT coefficient of the Cb component based on the coding prior information corresponding to the Cb component and the target DCT coefficient of the Cr component to obtain coding information corresponding to the Cb component;
  • the Y component encoding unit is configured to perform entropy encoding on the target DCT coefficient of the Y component based on the encoding prior information corresponding to the Y component, the target DCT coefficient of the Cr component, and the target DCT coefficient of the Cb component, to obtain encoding information corresponding to the Y component.
  • the Cr component coding unit is specifically configured as:
  • entropy coding is performed on the target DCT coefficient of the Cr component to obtain coding information corresponding to the Cr component.
  • the Cb component coding unit is specifically configured as:
  • entropy coding is performed on the target DCT coefficient of the Cb component to obtain coding information corresponding to the Cb component.
  • the Y component coding unit includes:
  • the probability distribution parameter determination subunit is configured to determine the probability distribution parameter corresponding to the Y component based on the coding prior information corresponding to the Y component, the target DCT coefficient of the Cr component, and the target DCT coefficient of the Cb component;
  • the Y component encoding subunit is configured to perform entropy encoding on the target DCT coefficient of the Y component based on the probability distribution parameters corresponding to the Y component by using multi-level channel autoregression to obtain encoding information corresponding to the Y component.
  • the Y component encoding subunit is specifically configured as:
  • the converted DCT coefficients of the Y component are disassembled to obtain the DCT coefficient matrix corresponding to the Y component;
  • the DCT coefficient matrix is entropy encoded by using the multi-level channel autoregressive to obtain the encoding information corresponding to the Y component.
  • the Y component encoding subunit is further specifically configured as:
  • the converted DCT coefficients of the Y component are disassembled in spatial dimensions to obtain multiple rows of the DCT coefficient matrix
  • the plurality of rows and the plurality of columns of each row in the plurality of rows are determined as a DCT coefficient matrix corresponding to the Y component.
  • the Y component encoding subunit is further specifically configured as:
  • the PMF corresponding to each column in each row of the DCT coefficient matrix is sequentially determined by using multi-level channel autoregression;
  • entropy encoding is performed on each column in each row of the DCT coefficient matrix to obtain the corresponding encoding information of each column in each row of the DCT coefficient matrix;
  • the coding information corresponding to each column in each row of the DCT coefficient matrix constitutes the coding information corresponding to the Y component.
  • the Y component encoding subunit is further specifically configured as:
  • the multi-level channel autoregressive is used to sequentially determine the coding prior information corresponding to each row in the DCT coefficient matrix
  • the PMF corresponding to each column in the i-th row is determined by using multi-level channel autoregression, where i denotes the row index.
  • the Y component encoding subunit is further specifically configured as:
  • the first column to the (j-1)-th column in the i-th row of the DCT coefficient matrix are used to determine the PMF corresponding to row i and column j, where j denotes the column index.
  • the Y component encoding subunit is further specifically configured as:
  • the entropy coding is arithmetic coding.
  • the entropy coding part 73 is specifically configured as:
  • the target DCT coefficients of the three color components are entropy coded by using the image codec neural network to obtain the target compressed data.
  • the image encoding and decoding neural network includes an MLCC model
  • the target compressed data includes encoding information corresponding to the luminance component Y;
  • the entropy coding part 73 is specifically configured as:
  • multi-level channel autoregressive entropy coding is performed on the Y component of the three color components in the channel dimension to obtain the coding information corresponding to the Y component.
  • the DCT coefficient extraction part 71, the DCT coefficient rearrangement part 72, and the entropy coding part 73 can all be processors or processing components.
  • This method has a specific technical relationship with the internal structure of the computer system, and it can solve the technical problems of how to improve the hardware computing efficiency or execution effect (including reducing the amount of data storage, reducing the amount of data transmission, increasing the processing speed of the hardware, etc.), so as to obtain the technical effect of improving the internal performance of the computer system in conformity with the laws of nature.
  • The functions or modules included in the device provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above; for their specific implementation, refer to the description of the method embodiments above. For brevity, details are not repeated here.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which computer program instructions are stored, and the above-mentioned method is implemented when the computer program instructions are executed by a processor.
  • Computer readable storage media may be volatile or nonvolatile computer readable storage media.
  • An embodiment of the present disclosure also proposes an electronic device, including: a processor; a memory for storing instructions executable by the processor; wherein the processor is configured to invoke the instructions stored in the memory to execute the above method.
  • Electronic devices may be provided as terminals, servers, or other forms of devices.
  • Fig. 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 80 may be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or other terminal equipment.
  • the hardware entity of this electronic equipment 80 comprises: processor 81, communication interface 82 and memory 83, wherein:
  • the processor 81 generally controls the overall operation of the electronic device 80.
  • the communication interface 82 enables the electronic device to communicate with other terminals or servers through the network.
  • the memory 83 is configured to store instructions and applications executable by the processor 81, and can also cache data to be processed or already processed by the processor 81 and each module in the electronic device 80 (for example, image data, audio data, voice communication data, and video communication data); it can be realized by flash memory (FLASH) or random access memory (Random Access Memory, RAM). Data transmission can be performed between the processor 81, the communication interface 82, and the memory 83 through the bus 84.
  • An embodiment of the present disclosure also provides a computer program, where the computer program includes computer readable codes, and when the computer readable codes run on a device, a processor in the device executes the above method.
  • An embodiment of the present disclosure also provides a computer program product, including computer-readable codes, or a non-volatile computer-readable storage medium carrying computer-readable codes; when the computer-readable codes run in a processor of an electronic device, the processor in the electronic device executes the above method.
  • the present disclosure can be a system, method and/or computer program product.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present disclosure.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove with instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.
  • Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, etc., and conventional procedural programming languages such as the “C” language or similar programming languages.
  • Computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • an electronic circuit such as a programmable logic circuit, field programmable gate array (FPGA), or programmable logic array (PLA)
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for implementing the functions/actions specified in one or more blocks of the flowchart and/or block diagram.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium, and these instructions cause computers, programmable data processing devices, and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions includes an article of manufacture comprising instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified function or action, or may be implemented by a combination of dedicated hardware and computer instructions.
  • the computer program product can be specifically realized by means of hardware, software or a combination thereof.
  • in one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).
  • the order in which the steps are written does not imply a strict execution order, nor does it constitute any limitation on the implementation process.
  • the specific execution order of each step should be determined by its function and possible internal logic.
  • if the disclosed technical solution involves personal information, the products applying the disclosed technical solution clearly notify individuals of the personal information processing rules and obtain the individual's voluntary consent before processing personal information.
  • if the disclosed technical solution involves sensitive personal information, the products applying the disclosed technical solution obtain the individual's separate consent before processing sensitive personal information and also meet the requirement of "express consent". For example, at a personal information collection device such as a camera, a clear and prominent sign is set up to inform individuals that they have entered the personal information collection range and that personal information will be collected.
  • the personal information processing rules may include information such as the personal information processor, the purpose of personal information processing, the method of processing, and the types of personal information processed.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

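The present disclosure relates to an image processing method and apparatus, an electronic device, a storage medium, a computer program, and a computer program product. The method includes: extracting initial DCT coefficients of three color components corresponding to a JPEG image to be compressed; arranging the initial DCT coefficients of each of the three color components by frequency to obtain target DCT coefficients of the three color components; and entropy encoding the target DCT coefficients of the three color components to obtain target compressed data corresponding to the JPEG image to be compressed.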

Description

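Image processing method and apparatus, electronic device, storage medium, computer program, and computer program product
Cross-reference to related applications
The present disclosure is based on, and claims priority to, Chinese patent application No. 202210178603.5, filed on February 25, 2022 and entitled "Image processing method and apparatus, electronic device and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to, but is not limited to, the field of computer technology, and in particular to an image processing method and apparatus, an electronic device, a storage medium, a computer program, and a computer program product.
Background
The Joint Photographic Experts Group (JPEG) standard is currently a widely supported and widely used image compression standard, and JPEG files are ubiquitous in data centers, cloud storage, and cloud file systems. According to surveys, JPEG accounts for 35% in cloud storage file systems such as Dropbox. However, because of the limitations of JPEG itself, which relies on hand-designed compression modules, it is difficult to fully eliminate data redundancy. To save massive storage and bandwidth resources, the huge number of existing JPEG files needs to be compressed further.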
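Summary
Embodiments of the present disclosure provide an image processing method and apparatus, an electronic device, a storage medium, a computer program, and a computer program product.
According to one aspect of the embodiments of the present disclosure, an image processing method is provided, including: extracting initial discrete cosine transform (DCT) coefficients of three color components corresponding to a Joint Photographic Experts Group (JPEG) image to be compressed; arranging the initial DCT coefficients of each of the three color components by frequency to obtain target DCT coefficients of the three color components; and entropy encoding the target DCT coefficients of the three color components to obtain target compressed data corresponding to the JPEG image to be compressed.
According to one aspect of the embodiments of the present disclosure, an image processing apparatus is provided, including: a DCT coefficient extraction part configured to extract the initial DCT coefficients of the three color components corresponding to a JPEG image to be compressed; a DCT coefficient rearrangement part configured to arrange the initial DCT coefficients of each of the three color components by frequency to obtain target DCT coefficients of the three color components; and an entropy encoding part configured to entropy encode the target DCT coefficients of the three color components to obtain target compressed data corresponding to the JPEG image to be compressed.
According to one aspect of the embodiments of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to perform the above method.
According to one aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored; the computer program instructions implement the above method when executed by a processor.
According to one aspect of the embodiments of the present disclosure, a computer program is provided, the computer program including computer-readable code; when the computer-readable code runs on a device, a processor in the device performs the above method.
According to one aspect of the embodiments of the present disclosure, a computer program product is provided, the computer program product including a computer program or instructions which, when run on an electronic device, cause the electronic device to perform the above method.
In the embodiments of the present disclosure, initial discrete cosine transform (DCT) coefficients of three color components corresponding to a JPEG image to be compressed are extracted; the initial DCT coefficients of each of the three color components are arranged by frequency to obtain target DCT coefficients of the three color components; and the target DCT coefficients of the three color components are entropy encoded to obtain target compressed data corresponding to the JPEG image to be compressed. The JPEG image is thereby compressed end to end, which effectively removes its redundancy in the spatial and channel dimensions, reduces its data size, and improves its compression ratio.
It should be understood that the general description above and the detailed description below are merely exemplary and explanatory and do not limit the present disclosure. Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the drawings.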
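Brief description of the drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the embodiments are described below.
The drawings here are incorporated into and form part of this specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain its technical solutions.
Figure 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure;
Figure 2 shows a schematic diagram of initial DCT coefficients according to an embodiment of the present disclosure;
Figure 3 shows a schematic diagram of arranging the initial DCT coefficients shown in Figure 2 by frequency according to an embodiment of the present disclosure;
Figure 4 shows a schematic diagram of an image codec neural network according to an embodiment of the present disclosure;
Figure 5 shows a schematic diagram of determining the DCT coefficient matrix corresponding to the Y component according to an embodiment of the present disclosure;
Figure 6 shows a schematic diagram of the MLCC model according to an embodiment of the present disclosure;
Figure 7 shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure;
Figure 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.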
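Detailed description
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the drawings. Identical reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as preferred over or better than other embodiments.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of them; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the detailed description below to better explain the present disclosure. Those skilled in the art will understand that the present disclosure can be practiced without certain specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present disclosure.
The JPEG standard is currently a widely supported and widely used image compression standard, and JPEG files are ubiquitous in data centers, cloud storage, and cloud file systems. According to surveys, JPEG accounts for 35% in cloud storage file systems such as Dropbox. However, because of the limitations of JPEG itself, which relies on hand-designed compression modules, it is difficult to fully eliminate data redundancy.
In the related art, JPEG has been surpassed by many image compression techniques, for example JPEG2000, BPG, VVC/H.266 intra coding, and numerous deep-learning-based image compression techniques. Although these techniques achieve compression ratios significantly better than JPEG, they are still not widely supported or used. Moreover, they differ substantially from JPEG and therefore cannot handle the huge number of existing JPEG files.
To handle the huge number of existing JPEG files, some compression techniques in the related art introduce lossless compression on top of JPEG, reducing the size of a JPEG file while keeping the original JPEG file lossless, thereby saving massive storage and bandwidth resources. These techniques include Lepton, Packjpg, MozJPEG, JPEGrescan, JPEG XL, Cmix, and others. They typically rely on feature engineering, with manually designed predictors and context models, which results in relatively low compression ratios.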
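The image processing method provided by the embodiments of the present disclosure can be applied to recompressing the huge number of existing JPEG files. Initial DCT coefficients of three color components corresponding to a JPEG image to be compressed are extracted; the initial DCT coefficients of each of the three color components are arranged by frequency to obtain target DCT coefficients of the three color components; and the target DCT coefficients of the three color components are entropy encoded to obtain target compressed data corresponding to the JPEG image to be compressed. The JPEG image is thereby compressed end to end, which effectively removes its redundancy in the spatial and channel dimensions, reduces its data size, and improves its compression ratio.
The image processing method for recompressing a JPEG image to be compressed, as provided by the embodiments of the present disclosure, is described in detail below.
Figure 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure. The method may be performed by an electronic device such as a terminal device or a server. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like, and the method may be implemented by a processor invoking computer-readable instructions stored in a memory. Alternatively, the method may be performed by a server. As shown in Figure 1, the image processing method includes steps S11 to S13:
In S11, initial DCT coefficients of three color components corresponding to the JPEG image to be compressed are extracted.
When the JPEG image to be compressed needs to be recompressed, the initial DCT coefficients of the three color components are extracted from the bitstream of the JPEG image for the subsequent recompression process.
In S12, the initial DCT coefficients of each of the three color components are arranged by frequency to obtain target DCT coefficients of the three color components.
Figure 2 shows a schematic diagram of initial DCT coefficients according to an embodiment of the present disclosure. Figure 2 may show the initial DCT coefficients of any one color component. As shown in Figure 2, the initial DCT coefficients have a size of 16×16×1 and comprise four DCT blocks of size 8×8×1, each containing 64 coefficients of different frequencies. Coefficients at the same relative position in different DCT blocks have the same frequency, and coefficients at different relative positions have different frequencies. As shown in Figure 2, each DCT block contains 64 positions numbered 0 to 63. The coefficients at position 0 of every DCT block have the same frequency, the coefficients at position 1 have the same frequency, and so on.
Arranging the initial DCT coefficients by frequency yields the target DCT coefficients. This process is described in detail later in connection with possible implementations of the present disclosure and is not elaborated here.
In S13, the target DCT coefficients of the three color components are entropy encoded to obtain target compressed data corresponding to the JPEG image to be compressed.
The embodiments of the present disclosure entropy encode the target DCT coefficients of the three color components, effectively removing the redundancy of the JPEG image to be compressed in the spatial and channel dimensions, so that the JPEG image is compressed end to end into the target compressed data. The entropy encoding of the target DCT coefficients of each color component is described in detail later in connection with possible implementations of the present disclosure and is not elaborated here.
In the embodiments of the present disclosure, initial DCT coefficients of three color components corresponding to the JPEG image to be compressed are extracted; the initial DCT coefficients of each of the three color components are arranged by frequency to obtain target DCT coefficients of the three color components; and the target DCT coefficients of the three color components are entropy encoded to obtain target compressed data corresponding to the JPEG image to be compressed. The JPEG image is thereby compressed end to end, which effectively removes its redundancy in the spatial and channel dimensions, reduces its data size, and improves its compression ratio.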
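In a possible implementation, arranging the initial DCT coefficients of each of the three color components by frequency to obtain target DCT coefficients of the three color components includes: for the initial DCT coefficients of any one color component, forming the spatial dimension from coefficients of the same frequency and the channel dimension from coefficients of different frequencies, to obtain multi-channel DCT sub-coefficients; performing a zigzag scan on the initial DCT coefficients to determine a zigzag order; and arranging the multi-channel DCT sub-coefficients along the channel dimension according to the zigzag order, to obtain the target DCT coefficients corresponding to the initial DCT coefficients.
For any one color component, the initial DCT coefficients of that component are preprocessed by zigzag ordering them by frequency along the channel dimension, so that the resulting target DCT coefficients carry structured redundancy in both the spatial and channel dimensions. The subsequent encoding process can exploit this redundancy to entropy encode the target DCT coefficients of the component effectively in both dimensions.
Figure 3 shows a schematic diagram of arranging the initial DCT coefficients shown in Figure 2 by frequency according to an embodiment of the present disclosure. As shown in Figure 3, for the 16×16×1 initial DCT coefficients of Figure 2, coefficients of the same frequency are gathered to form the spatial dimension and coefficients of different frequencies form the channel dimension, giving DCT sub-coefficients with 64 channels, each of spatial size 2×2.
A zigzag scan of the initial DCT coefficients shown in Figure 2 gives the zigzag order, which reorders the positions within each DCT block. For each DCT block, the zigzag order is 0, 1, 8, 16, 9, 2, 3, ..., 61, 54, 47, 55, 62, 63.
Based on this zigzag order, the 64 channels of DCT sub-coefficients are reordered along the channel dimension to obtain the target DCT coefficients in Figure 3, whose size is 2×2×64.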
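The rearrangement illustrated in Figures 2 and 3 can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the implementation of the disclosure: the function names `zigzag_order` and `rearrange_by_frequency`, and the nested-list layout `[block_row][block_col][channel]`, are chosen for the example only.

```python
def zigzag_order(n=8):
    """Return raster positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):              # s = row + col indexes an anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:                      # even diagonals are walked bottom-left to top-right
            rows = reversed(rows)
        for r in rows:
            order.append(r * n + (s - r))
    return order


def rearrange_by_frequency(coeffs, n=8):
    """Turn an H x W coefficient plane into (H/n) x (W/n) x (n*n).

    Each n x n DCT block becomes one spatial location; its n*n
    coefficients become channels, ordered by the zigzag scan.
    """
    h, w = len(coeffs), len(coeffs[0])
    assert h % n == 0 and w % n == 0, "plane must consist of whole blocks"
    zz = zigzag_order(n)
    out = []
    for by in range(h // n):
        out_row = []
        for bx in range(w // n):
            # Flatten one block in raster order, then permute to zigzag order.
            block = [coeffs[by * n + r][bx * n + c]
                     for r in range(n) for c in range(n)]
            out_row.append([block[p] for p in zz])
        out.append(out_row)
    return out
```

For a 16×16 plane this returns a 2×2 grid of 64-channel vectors, matching the 2×2×64 size stated above; channel k at every spatial location then holds the same frequency.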
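In a possible implementation, entropy encoding the target DCT coefficients of the three color components to obtain target compressed data corresponding to the JPEG image to be compressed includes: entropy encoding the target DCT coefficients of the three color components using an image codec neural network to obtain the target compressed data.
A pre-trained image codec neural network model for lossless recompression of JPEG images is used to entropy encode the target DCT coefficients of the three color components. The JPEG image to be compressed is thus compressed end to end, the target compressed data is output directly, and the compression ratio is effectively improved.
In a possible implementation, the three color components include one luminance component Y and two chrominance components Cb and Cr.
Compared with the Cb and Cr components, the Y component contains richer information about the JPEG image to be compressed.
In a possible implementation, the image codec neural network includes a multi-level cross-channel autoregressive entropy model (Multi-Level Cross-Channel Entropy Model, MLCC), and the target compressed data includes encoded information corresponding to the Y component. Entropy encoding the target DCT coefficients of the three color components using the image codec neural network to obtain the target compressed data includes: using the MLCC model to perform multi-level channel autoregressive entropy encoding on the Y component along the channel dimension, to obtain the encoded information corresponding to the Y component.
Because the Y component contains richer information about the JPEG image to be compressed, the MLCC model is used to perform multi-level channel autoregressive entropy encoding on the Y component along the channel dimension. This effectively reduces the internal data redundancy of the Y component, entropy encodes its target DCT coefficients, and reduces its data size.
Figure 4 shows a schematic diagram of an image codec neural network according to an embodiment of the present disclosure. As shown in Figure 4, the image codec neural network includes a cross-color entropy model 401 (Cross-color Entropy Model) and an MLCC model 402.
Taking Figure 4 as an example, after the initial DCT coefficients of the three color components of the JPEG image to be compressed have been preprocessed into the target DCT coefficients of the Y, Cb, and Cr components, these target DCT coefficients are fed into the cross-color entropy model 401 of the image codec neural network for subsequent encoding.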
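In a possible implementation, before the target DCT coefficients of the three color components are entropy encoded, the image processing method further includes: fusing the target DCT coefficients of the three color components to obtain fused DCT coefficients; determining shared hyper-prior information based on the fused DCT coefficients; and splitting the shared hyper-prior information to obtain coding prior information corresponding to each color component.
Fusing the target DCT coefficients of the three color components makes it possible to exploit the correlation between color components: the shared hyper-prior information is determined from the fused DCT coefficients and then split into the coding prior information for each color component, which is used in subsequent encoding.
Still taking Figure 4 as an example, the cross-color entropy model includes a coefficient fusion module 4011 (Coefficient Fusion Model, CFM), a hyper encoder 4012 (Hyper Encoder), a quantization module 4013 (Q), a hyper decoder 4014 (Hyper Decoder), a coefficient prior split module 4015 (Coefficient Prior Split Model, CPSM), and an entropy parameter prediction module 4016 (Entropy Parameters).
The coefficient fusion module 4011 fuses the target DCT coefficients of the Y, Cb, and Cr components into the fused DCT coefficients. A JPEG image to be compressed may use different resolution formats such as YCbCr 4:4:4, YCbCr 4:1:1, or YCbCr 4:2:0. When the image is not in the full-resolution YCbCr 4:4:4 format, the resolutions of the Y, Cb, and Cr components need to be aligned before fusion so that the three components can be fused well.
In some embodiments, when the JPEG image to be compressed is in the YCbCr 4:2:0 format, the Cb and Cr components have the same resolution and the resolution of the Y component is twice that of the Cb and Cr components. In this case, the target DCT coefficients of the Y component are downsampled by a factor of 2 so that they have the same resolution as the target DCT coefficients of the Cb and Cr components. The downsampled target DCT coefficients of the Y component are then fused with the target DCT coefficients of the Cb and Cr components to obtain the fused DCT coefficients.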
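To make the resolution relationship between the formats concrete, the short sketch below tabulates the standard chroma subsampling factors of the three formats mentioned and derives the chroma plane size from the luma size. The names `CHROMA_FACTORS`, `chroma_size`, and `needs_alignment` are hypothetical helpers for illustration; the disclosure does not define them.

```python
# Horizontal and vertical subsampling factors of Cb/Cr relative to Y.
CHROMA_FACTORS = {
    "4:4:4": (1, 1),   # chroma at full resolution
    "4:2:0": (2, 2),   # chroma halved in both directions
    "4:1:1": (4, 1),   # chroma quartered horizontally only
}


def chroma_size(y_width, y_height, fmt):
    """Return (width, height) of each chroma plane for a given luma size."""
    fx, fy = CHROMA_FACTORS[fmt]
    # Ceiling division, so that partial sample groups still get a chroma sample.
    return (-(-y_width // fx), -(-y_height // fy))


def needs_alignment(fmt):
    """True when Y and Cb/Cr differ in resolution and must be aligned."""
    return CHROMA_FACTORS[fmt] != (1, 1)
```

Since the frequency rearrangement divides every plane by the same block size, the same factors apply to the spatial size of the target DCT coefficients.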
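Still taking Figure 4 as an example, the fused DCT coefficients output by the coefficient fusion module 4011 are fed into the hyper encoder 4012 for hyper-prior prediction to obtain initial hyper-prior information z; z is then fed into the quantization module 4013 for quantization, yielding the shared hyper-prior information $\hat{z}$.
After passing through the hyper decoder 4014 and the coefficient prior split module 4015, the shared hyper-prior information $\hat{z}$ output by the quantization module 4013 is split into the coding prior information for the three color components: Cr_prior for the Cr component, Cb_prior for the Cb component, and Y_prior for the Y component.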
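In a possible implementation, the target compressed data includes: hyper-prior encoding information, encoded information corresponding to the Cr component, encoded information corresponding to the Cb component, and encoded information corresponding to the Y component. Entropy encoding the target DCT coefficients of the three color components to obtain target compressed data corresponding to the JPEG image to be compressed includes: entropy encoding the shared hyper-prior information to obtain the hyper-prior encoding information; and, based on the coding prior information corresponding to each color component, entropy encoding in turn the target DCT coefficients of the Cr component, the Cb component, and the Y component, to obtain the encoded information corresponding to the Cr component, the Cb component, and the Y component.
Entropy encoding in turn the shared hyper-prior information and the target DCT coefficients of the Cr, Cb, and Y components effectively removes the redundancy of the JPEG image to be compressed in the spatial and channel dimensions and yields the target compressed data.
In a possible implementation, the entropy encoding is arithmetic coding.
Compressing the JPEG image with arithmetic coding, a form of entropy encoding, can effectively improve the compression result.
Still taking Figure 4 as an example, both the cross-color entropy model 401 and the MLCC model 402 shown in Figure 4 include an arithmetic encoder 403 (Arithmetic Encoder, AE) for arithmetic coding.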
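After the shared hyper-prior information $\hat{z}$ has been determined, it can be entropy encoded, and the resulting shared hyper-prior encoding information is stored in the target compressed data as side information.
Still taking Figure 4 as an example, the cross-color entropy model also includes a factorized entropy module 4017 (Factorized Entropy). The factorized entropy module 4017 and the arithmetic encoder 403 (AE) are used to arithmetically encode the shared hyper-prior information $\hat{z}$, obtaining the shared hyper-prior encoding information.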
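In a possible implementation, entropy encoding in turn the target DCT coefficients of the Cr, Cb, and Y components based on the coding prior information corresponding to each color component includes: entropy encoding the target DCT coefficients of the Cr component based on the coding prior information corresponding to the Cr component, to obtain the encoded information corresponding to the Cr component; entropy encoding the target DCT coefficients of the Cb component based on the coding prior information corresponding to the Cb component and the target DCT coefficients of the Cr component, to obtain the encoded information corresponding to the Cb component; and entropy encoding the target DCT coefficients of the Y component based on the coding prior information corresponding to the Y component, the target DCT coefficients of the Cr component, and the target DCT coefficients of the Cb component, to obtain the encoded information corresponding to the Y component.
The target DCT coefficients of the Cr component are entropy encoded directly from the coding prior information of the Cr component. Once this is complete, the target DCT coefficients of the Cb component are entropy encoded using the coding prior information of the Cb component, with the target DCT coefficients of the Cr component as context. Once the Cb component is complete, the target DCT coefficients of the Y component are entropy encoded using the coding prior information of the Y component, with the target DCT coefficients of the Cr and Cb components as context. This effectively removes the data redundancy between color components.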
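The dependency chain described above (Cr first, then Cb conditioned on Cr, then Y conditioned on both) can be written down as a small schedule. This sketch only lists which inputs each step may use; the function name `coding_plan` and the dictionary keys are hypothetical, and no actual entropy coding is performed.

```python
def coding_plan():
    """List, per component, its prior and the already-coded components it may use."""
    order = ["Cr", "Cb", "Y"]               # coding order stated in the text
    plan = []
    for k, comp in enumerate(order):
        plan.append({
            "component": comp,
            "prior": comp + "_prior",       # output of the coefficient prior split
            "context": order[:k],           # components coded earlier
        })
    return plan
```

A decoder has to follow the same order, because the context of each component consists only of components that are already available at that point.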
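In a possible implementation, entropy encoding the target DCT coefficients of the Cr component based on the coding prior information corresponding to the Cr component includes: determining, based on the coding prior information corresponding to the Cr component, a probability mass function (PMF) corresponding to the target DCT coefficients of the Cr component; and entropy encoding the target DCT coefficients of the Cr component based on that PMF, to obtain the encoded information corresponding to the Cr component.
Determining the PMF of the target DCT coefficients of the Cr component from its coding prior information makes it possible to entropy encode those coefficients effectively and reduce the data size of the Cr component.
Still taking Figure 4 as an example, the coding prior information Cr_prior of the Cr component is used to determine the PMF of the target DCT coefficients of the Cr component directly; the arithmetic encoder 403 then arithmetically encodes the target DCT coefficients of the Cr component based on that PMF, to obtain the encoded information corresponding to the Cr component.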
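Why an accurate PMF matters can be seen from the ideal code length: an arithmetic coder spends close to -log2 p(s) bits on a symbol s of probability p(s). The helper below is an illustrative assumption (the name `ideal_code_length` does not come from the disclosure) and computes this bound rather than running an actual arithmetic coder.

```python
import math


def ideal_code_length(symbols, pmf):
    """Total of -log2 p(s) over the symbols, in bits.

    This is the length an ideal arithmetic coder approaches when it
    codes `symbols` with the probability model `pmf` (symbol -> probability).
    """
    return sum(-math.log2(pmf[s]) for s in symbols)


symbols = [0, 0, 0, 1]
uniform = {0: 0.5, 1: 0.5}      # model that ignores the data statistics
matched = {0: 0.75, 1: 0.25}    # model that matches the symbol frequencies
bits_uniform = ideal_code_length(symbols, uniform)
bits_matched = ideal_code_length(symbols, matched)
```

With the uniform model the four symbols cost 4 bits; the matched model costs fewer, which is the effect the predicted PMFs aim for.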
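In a possible implementation, entropy encoding the target DCT coefficients of the Cb component based on the coding prior information corresponding to the Cb component and the target DCT coefficients of the Cr component includes: determining, based on the coding prior information corresponding to the Cb component and the target DCT coefficients of the Cr component, the PMF corresponding to the target DCT coefficients of the Cb component; and entropy encoding the target DCT coefficients of the Cb component based on that PMF, to obtain the encoded information corresponding to the Cb component.
To make full use of the structural redundancy between the Cr and Cb components, after the Cr component has been entropy encoded, its target DCT coefficients are used as context, so that the PMF of the target DCT coefficients of the Cb component is determined from the coding prior information of the Cb component together with the target DCT coefficients of the Cr component. This makes it possible to entropy encode the Cb component effectively and reduce its data size.
Still taking Figure 4 as an example, the cross-color entropy model 401 also includes the entropy parameter prediction module 4016 (Entropy Parameters). After the target DCT coefficients of the Cr component have been entropy encoded, the target DCT coefficients of the Cr component and the coding prior information Cb_prior of the Cb component are fed into the entropy parameter prediction module 4016, which outputs the PMF of the target DCT coefficients of the Cb component. The arithmetic encoder 403 then arithmetically encodes the target DCT coefficients of the Cb component based on that PMF, to obtain the encoded information corresponding to the Cb component.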
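In a possible implementation, entropy encoding the target DCT coefficients of the Y component based on the coding prior information corresponding to the Y component, the target DCT coefficients of the Cr component, and the target DCT coefficients of the Cb component includes: determining probability distribution parameters corresponding to the Y component based on the coding prior information corresponding to the Y component, the target DCT coefficients of the Cr component, and the target DCT coefficients of the Cb component; and entropy encoding the target DCT coefficients of the Y component using multi-level channel autoregression based on those probability distribution parameters, to obtain the encoded information corresponding to the Y component.
To make full use of the structural redundancy among the Cr, Cb, and Y components, after the Cr and Cb components have been entropy encoded, their target DCT coefficients are used as context, so that the probability distribution parameters of the Y component are determined from the coding prior information of the Y component together with the target DCT coefficients of the Cr and Cb components. Because the Y component contains richer information about the JPEG image to be compressed, a PMF for entropy encoding built directly from these probability distribution parameters would not be accurate. Multi-level channel autoregression is therefore applied on top of the probability distribution parameters, which effectively reduces the internal data redundancy of the Y component, entropy encodes its target DCT coefficients, and reduces its data size.
When the JPEG image to be compressed is not in the full-resolution YCbCr 4:4:4 format, the resolutions of the Y, Cb, and Cr components need to be aligned so that the Cb and Cr components can serve well as context for the Y component.
In some embodiments, when the JPEG image to be compressed is in the YCbCr 4:2:0 format, the Cb and Cr components have the same resolution and the resolution of the Y component is twice that of the Cb and Cr components. In this case, the target DCT coefficients of the Cb and Cr components are each upsampled by a factor of 2 so that they have the same resolution as the target DCT coefficients of the Y component.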
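The disclosure does not state which interpolation the 2× upsampling uses, and in a neural network the upsampling module may well be learned. Purely as an illustration of what resolution alignment means, the sketch below uses nearest-neighbour replication on a single channel; the function name `upsample2x_nearest` and the choice of nearest-neighbour replication are assumptions of this example.

```python
def upsample2x_nearest(plane):
    """Double the width and height of a 2-D plane by repeating each value.

    `plane` is a list of rows. Nearest-neighbour replication is an
    assumption of this sketch, not the method of the disclosure.
    """
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]   # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                      # repeat the row vertically
    return out
```

After this step a chroma plane of spatial size h×w has size 2h×2w and can be combined location by location with the Y plane.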
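Still taking Figure 4 as an example, the cross-color entropy model 401 also includes an upsampling module 404 (Up). The upsampling module upsamples the target DCT coefficients of the Cb and Cr components so that they have the same resolution as the target DCT coefficients of the Y component. The upsampled target DCT coefficients of the Cb and Cr components are then fused and, together with the coding prior information Y_prior of the Y component, fed into the entropy parameter prediction module 4016, which outputs the probability distribution parameters hyper_y corresponding to the Y component.
The probability distribution parameters hyper_y and the target DCT coefficients of the Y component are fed into the MLCC model 402, which entropy encodes the target DCT coefficients of the Y component using multi-level channel autoregression, to obtain the encoded information corresponding to the Y component.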
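In a possible implementation, entropy encoding the target DCT coefficients of the Y component using multi-level channel autoregression based on the probability distribution parameters corresponding to the Y component includes: converting the target DCT coefficients of the Y component from the spatial dimension to the channel dimension to obtain converted DCT coefficients of the Y component; splitting the converted DCT coefficients according to a preset matrix form to obtain a DCT coefficient matrix corresponding to the Y component; and entropy encoding the DCT coefficient matrix using multi-level channel autoregression based on the probability distribution parameters corresponding to the Y component, to obtain the encoded information corresponding to the Y component.
The DCT coefficient matrix corresponding to the Y component is determined, and then, based on the probability distribution parameters corresponding to the Y component, the structured redundancy present in the DCT coefficient matrix is exploited to entropy encode the matrix with multi-level channel autoregression, reducing the data size of the Y component.
Figure 5 shows a schematic diagram of determining the DCT coefficient matrix corresponding to the Y component according to an embodiment of the present disclosure. When the initial DCT coefficients of the Y component have a size of 32×32×1, arranging them by frequency in the manner shown in Figure 3 yields the target DCT coefficients of the Y component shown in Figure 5, with a size of 4×4×64.
As shown in Figure 5, the target DCT coefficients of the Y component are partitioned in the spatial dimension into 2×2 partitions, each containing four positions numbered 1 to 4. The partitioned coefficients then undergo a space-to-depth conversion 501 (Space-to-depth), giving the converted DCT coefficients of the Y component. As shown in Figure 5, the converted coefficients have a size of 2×2×(64×4); that is, the number of channels grows from 64 to 256.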
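A space-to-depth conversion of this kind can be sketched in a few lines. The layout `[y][x][channel]` and the output channel order (the four partition positions in raster order, each contributing its full channel vector) are assumptions of this example; the disclosure fixes only the resulting size.

```python
def space_to_depth(t, block=2):
    """Fold each block x block spatial patch of t into the channel dimension.

    `t` has layout [y][x][channel]. An H x W x C input becomes
    (H/block) x (W/block) x (C * block * block).
    """
    h, w = len(t), len(t[0])
    assert h % block == 0 and w % block == 0, "size must be a multiple of block"
    out = []
    for y in range(0, h, block):
        out_row = []
        for x in range(0, w, block):
            cell = []
            for dy in range(block):          # partition positions in raster order
                for dx in range(block):
                    cell.extend(t[y + dy][x + dx])
            out_row.append(cell)
        out.append(out_row)
    return out
```

Applied to a 4×4×64 input this gives 2×2×256, the size of the converted DCT coefficients described above.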
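In a possible implementation, splitting the converted DCT coefficients of the Y component according to the preset matrix form to obtain the DCT coefficient matrix corresponding to the Y component includes: splitting the converted DCT coefficients of the Y component in the spatial dimension according to the preset matrix form, to obtain multiple rows of the DCT coefficient matrix; and splitting each row of the DCT coefficient matrix in the channel dimension, to obtain multiple columns of each row.
Splitting the converted DCT coefficients of the Y component in the spatial and channel dimensions according to the preset matrix form gives the rows and columns of the DCT coefficient matrix structured redundancy, which can be exploited in the subsequent multi-level channel autoregressive entropy encoding of the matrix.
Still taking Figure 5 as an example, the 2×2×(64×4) converted DCT coefficients of the Y component undergo a row split 502 (Row Split) in the spatial dimension according to the preset matrix form, giving the four rows of the DCT coefficient matrix: r^(1), r^(2), r^(3), r^(4). Each row contains DCT coefficients of size 2×2×64.
Each 2×2×64 row of the DCT coefficient matrix then undergoes a column split 503 (Column Split) in the channel dimension, giving the n columns of row r^(i), where i = 1, 2, 3, 4.
The number of columns n is the same for every row, as is the number of channels in each corresponding column. The specific values can be set according to the actual situation and are not specifically limited by the present disclosure.
For example, with n = 9, each row of the DCT coefficient matrix contains 9 columns, indexed by j = 1, 2, ..., 9 for each row i = 1, 2, 3, 4, and the columns contain 28, 8, 7, 6, 5, 4, 3, 2, and 1 channels respectively. Splitting r^(1) in the channel dimension then gives the 9 columns of r^(1).
The channel-dimension splits of r^(2), r^(3), and r^(4) are similar and are not repeated here.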
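The row and column splits can be illustrated on the 256-channel vector at a single spatial location. The sketch assumes that the four rows occupy consecutive groups of 64 channels (consistent with a position-major space-to-depth order) and uses the example column sizes 28, 8, 7, 6, 5, 4, 3, 2, 1; the names `split_channels` and `row_and_column_split` are hypothetical.

```python
COLUMN_SIZES = [28, 8, 7, 6, 5, 4, 3, 2, 1]   # example with n = 9; sizes sum to 64


def split_channels(channels, sizes):
    """Cut a channel list into consecutive groups of the given sizes."""
    assert sum(sizes) == len(channels), "sizes must cover all channels"
    parts, start = [], 0
    for size in sizes:
        parts.append(channels[start:start + size])
        start += size
    return parts


def row_and_column_split(vec256):
    """Split 256 channels into 4 rows of 64, then each row into 9 columns."""
    rows = split_channels(vec256, [64] * 4)            # row split
    return [split_channels(r, COLUMN_SIZES) for r in rows]   # column split
```

One observation, offered as a reading rather than something the disclosure states: if the 64 channels of a row are in zigzag order, the first 28 channels are exactly the first seven anti-diagonals of the 8×8 block (1 + 2 + ... + 7 = 28), and the remaining sizes 8, 7, ..., 1 match the remaining anti-diagonals one by one.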
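After the DCT coefficient matrix corresponding to the Y component has been obtained by splitting, the matrix is entropy encoded with multi-level channel autoregression based on the probability distribution parameters corresponding to the Y component.
Figure 6 shows a schematic diagram of the MLCC model according to an embodiment of the present disclosure. As shown in Figure 6, the MLCC model includes an outer channel module 601 (Outer Channel) and an inner channel module 602 (Inner Channel).
The outer channel module 601 includes a space-to-depth conversion unit 6011 (Space-to-depth) and a row split unit 6012 (Row Split). The space-to-depth conversion unit 6011 converts the target DCT coefficients of the Y component from the spatial dimension to the channel dimension, giving the converted DCT coefficients Y'. The row split unit 6012 then splits Y' into rows in the spatial dimension, giving the four rows of the DCT coefficient matrix: r^(1), r^(2), r^(3), r^(4). Each of the four rows is fed into the corresponding inner channel module 602 for multi-level channel autoregressive entropy encoding.
In a possible implementation, entropy encoding the DCT coefficient matrix using multi-level channel autoregression based on the probability distribution parameters corresponding to the Y component includes: determining in turn, using multi-level channel autoregression and based on the probability distribution parameters corresponding to the Y component, the PMF corresponding to each column of each row of the DCT coefficient matrix; and entropy encoding each column of each row of the DCT coefficient matrix using its PMF, to obtain the encoded information corresponding to each column of each row. The encoded information of all columns of all rows constitutes the encoded information corresponding to the Y component.
To make full use of the structural redundancy among the rows and columns of the DCT coefficient matrix, the PMF of each column of each row is determined in turn using multi-level channel autoregression based on the probability distribution parameters corresponding to the Y component, so that each column of each row can be entropy encoded effectively and the data size of the Y component is reduced.
Still taking Figure 6 as an example, for each column in the i-th row r^(i) of the DCT coefficient matrix, the PMF of that column is determined and then used to entropy encode the column.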
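In a possible implementation, determining in turn the PMF corresponding to each column of each row of the DCT coefficient matrix includes: determining in turn, using multi-level channel autoregression and based on the probability distribution parameters corresponding to the Y component, the coding prior information corresponding to each row of the DCT coefficient matrix; and, for the i-th row of the DCT coefficient matrix, determining the PMF of each column in the i-th row using multi-level channel autoregression based on the coding prior information of the i-th row, where i denotes the row index.
The coding prior information of each row of the DCT coefficient matrix is determined in turn using multi-level channel autoregression based on the probability distribution parameters corresponding to the Y component. For any i-th row, a more accurate PMF for each column of the row can then be determined from the coding prior information of that row, again using multi-level channel autoregression, for the subsequent entropy encoding of each column.
In a possible implementation, determining in turn the coding prior information corresponding to each row of the DCT coefficient matrix includes: for row 1 of the DCT coefficient matrix, determining the coding prior information of row 1 based on the probability distribution parameters corresponding to the Y component; and, when i > 1 (that is, i is greater than 1), for the i-th row of the DCT coefficient matrix, determining the coding prior information of the i-th row based on the probability distribution parameters corresponding to the Y component and rows 1 to i-1 of the DCT coefficient matrix.
Among the rows of the DCT coefficient matrix, only row 1 has its coding prior information determined from the probability distribution parameters of the Y component alone. Every other row i uses the preceding rows 1 to i-1 as context to determine its coding prior information. This realizes autoregression in the row direction and improves the accuracy of the coding prior information of each row.
For the i-th row of the DCT coefficient matrix, once its coding prior information has been determined, each column of the i-th row is entropy encoded using that coding prior information. After all columns of the i-th row have been entropy encoded, the i-th row is used as context to determine the coding prior information of row i+1.
Still taking Figure 6 as an example, the space-to-depth conversion unit 6011 converts the probability distribution parameters hyper_y of the Y component from the spatial dimension to the channel dimension, giving the converted probability distribution parameters h'.
As shown in Figure 6, the outer channel module 601 also includes a parameter module 6013 (Param). The converted probability distribution parameters h' are fed into the parameter module 6013, which outputs the coding prior information pri^(1) of row 1 of the DCT coefficient matrix; row 1 r^(1) and its coding prior information pri^(1) are fed into the inner channel module 602 to entropy encode each column of r^(1).
After all columns of row 1 r^(1) have been entropy encoded, h' and r^(1) are concatenated and fed into the parameter module 6013, which outputs the coding prior information pri^(2) of row 2; row 2 r^(2) and pri^(2) are fed into the inner channel module 602 to entropy encode each column of r^(2).
After all columns of row 2 r^(2) have been entropy encoded, h', r^(1), and r^(2) are concatenated and fed into the parameter module 6013, which outputs the coding prior information pri^(3) of row 3; row 3 r^(3) and pri^(3) are fed into the inner channel module 602 to entropy encode each column of r^(3).
After all columns of row 3 r^(3) have been entropy encoded, h', r^(1), r^(2), and r^(3) are concatenated and fed into the parameter module 6013, which outputs the coding prior information pri^(4) of row 4; row 4 r^(4) and pri^(4) are fed into the inner channel module 602 to entropy encode each column of r^(4).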
In a possible implementation, determining the PMF of each column in the i-th row using multi-level channel autoregression based on the coding prior information of the i-th row includes: for column 1 of the i-th row of the DCT coefficient matrix, determining the PMF of column 1 of the i-th row based on the coding prior information of the i-th row; and, when j > 1, for column j of the i-th row, determining the PMF of column j of the i-th row based on the coding prior information of the i-th row and columns 1 to j-1 of the i-th row, where j denotes the column index.
Among the columns of any i-th row of the DCT coefficient matrix, only column 1 has its PMF determined from the coding prior information of the i-th row alone. Every other column j uses the preceding columns 1 to j-1 as context to determine its PMF. This realizes autoregression in the column direction and improves the accuracy of the PMF of each column in the i-th row.
For column j of the i-th row of the DCT coefficient matrix, once its PMF has been determined, column j of the i-th row is entropy encoded using that PMF. After this is complete, column j of the i-th row is used as context to determine the PMF of column j+1 of the i-th row.
Still taking Figure 6 as an example, the inner channel module 602 includes a column split unit 6021 (Column Split). For the i-th row r^(i) of the DCT coefficient matrix, the column split unit 6021 performs a column split on r^(i) in the channel dimension, giving the n columns of r^(i).
For column 1 of row r^(i), the coding prior information pri^(i) of row r^(i) is fed into the parameter module 6013, which outputs the PMF of column 1; column 1 of row r^(i) is then entropy encoded based on that PMF.
After column 1 of row r^(i) has been entropy encoded, pri^(i) and column 1 of row r^(i) are concatenated and fed into the parameter module 6013, which outputs the PMF of column 2; column 2 of row r^(i) is then entropy encoded based on that PMF.
After column 2 of row r^(i) has been entropy encoded, pri^(i), column 1, and column 2 of row r^(i) are concatenated and fed into the parameter module 6013, which outputs the PMF of column 3; column 3 of row r^(i) is then entropy encoded based on that PMF.
And so on, until all columns of row r^(i) have been entropy encoded; the details are not repeated here.
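The two nested levels of autoregression (rows in the outer loop, columns in the inner loop) can be summarized as a schedule that records which inputs are available at each coding step. The sketch is symbolic: it produces labels such as `pri(2)` or `col(2,1)` rather than tensors, and the function name `mlcc_schedule` and the label format are assumptions of this example.

```python
def mlcc_schedule(n_rows=4, n_cols=9):
    """Enumerate coding steps and the inputs each one is allowed to use."""
    steps = []
    for i in range(1, n_rows + 1):
        # Outer level: the row prior pri(i) depends on h' and all earlier rows.
        row_prior_inputs = ["h'"] + [f"r({k})" for k in range(1, i)]
        for j in range(1, n_cols + 1):
            # Inner level: the PMF of column j depends on pri(i) and earlier columns.
            pmf_inputs = [f"pri({i})"] + [f"col({i},{k})" for k in range(1, j)]
            steps.append({
                "row": i,
                "col": j,
                "row_prior_inputs": row_prior_inputs,
                "pmf_inputs": pmf_inputs,
            })
    return steps
```

With 4 rows and 9 columns there are 36 coding steps, and no step uses a row or column that has not yet been coded, which is what allows a decoder to reproduce the same PMFs.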
In a possible implementation, entropy encoding each column of each row of the DCT coefficient matrix using its PMF includes: when i ≥ 1 and j ≥ 1 (that is, i and j are each greater than or equal to 1), for column j of the i-th row of the DCT coefficient matrix, entropy encoding column j of the i-th row based on its PMF, to obtain the encoded information corresponding to column j of the i-th row.
Entropy encoding column j of the i-th row of the DCT coefficient matrix based on its PMF yields the corresponding encoded information and effectively reduces the data size of that column.
Still taking Figure 6 as an example, based on the PMF of column 1 of row r^(i), the AE arithmetically encodes column 1 of row r^(i) to obtain its encoded information; based on the PMF of column 2 of row r^(i), the AE arithmetically encodes column 2 of row r^(i) to obtain its encoded information; and so on, until all columns of row r^(i) have been arithmetically encoded. The details are not repeated here.
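In practice, after the JPEG image to be compressed has been entropy encoded into the target compressed data, the target compressed data must be entropy decoded to recover the JPEG image whenever the image needs to be viewed.
During this decoding, the shared hyper-prior encoding information can be entropy decoded to obtain the shared hyper-prior information $\hat{z}$, which is then split to obtain the coding prior information of each color component for decoding the encoded information of that component.
Still taking Figure 4 as an example, the cross-color entropy model 401 also includes an arithmetic decoder 405 (Arithmetic Decoder, AD). During decoding, the factorized entropy module 4017 and the arithmetic decoder 405 entropy decode the shared hyper-prior encoding information to obtain the shared hyper-prior information $\hat{z}$. The hyper decoder 4014 and the coefficient prior split module 4015 then split it, during decoding, into the coding prior information Cr_prior used to entropy decode the encoded information of the Cr component, the coding prior information Cb_prior used to entropy decode the encoded information of the Cb component, and the coding prior information Y_prior used to entropy decode the encoded information of the Y component.
Entropy decoding the encoded information of the Cr component using Cr_prior is the inverse of entropy encoding the target DCT coefficients of the Cr component using Cr_prior. After the target DCT coefficients of the Cr component have been recovered, entropy decoding the encoded information of the Cb component using Cb_prior and the target DCT coefficients of the Cr component is the inverse of entropy encoding the target DCT coefficients of the Cb component using the same inputs. After the target DCT coefficients of the Cr and Cb components have been recovered, entropy decoding the encoded information of the Y component using Y_prior and the target DCT coefficients of the Cr and Cb components is the inverse of entropy encoding the target DCT coefficients of the Y component using the same inputs. The specific entropy decoding process is not repeated here.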
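Because decoding is the inverse of encoding, every deterministic preprocessing step on the encoder side must be invertible for the recompression to stay lossless. The sketch below checks this for the zigzag channel ordering of a single 8×8 block; the names `zigzag`, `to_channels`, and `from_channels` are hypothetical, and the flat 64-element list is an assumed layout.

```python
def zigzag(n=8):
    """Raster positions of an n x n block in zigzag order, built by sorting."""
    def key(p):
        r, c = divmod(p, n)
        s = r + c                       # anti-diagonal index
        # Odd diagonals run top to bottom (row increases),
        # even diagonals run bottom to top (column increases).
        return (s, r if s % 2 else c)
    return sorted(range(n * n), key=key)


def to_channels(block):
    """Encoder side: raster-ordered block -> zigzag-ordered channels."""
    return [block[p] for p in zigzag()]


def from_channels(channels):
    """Decoder side: zigzag-ordered channels -> raster-ordered block."""
    block = [0] * len(channels)
    for ch, p in enumerate(zigzag()):
        block[p] = channels[ch]
    return block
```

Applying `from_channels` to the output of `to_channels` returns the original block, so the decoder can restore the block layout needed to rebuild the JPEG bitstream.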
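In a possible implementation, the image processing method further includes: training an initial neural network to obtain the target image codec neural network.
The initial neural network has the same network structure as the target image codec neural network; training the initial neural network yields the network parameters of the cross-color entropy model and the MLCC model in the target image codec neural network. The training may follow network training procedures in the related art, which the embodiments of the present disclosure do not specifically limit.
It can be understood that the method embodiments mentioned in the present disclosure can be combined with one another to form combined embodiments, provided that principle and logic are not violated; for reasons of space, these are not described further. Those skilled in the art will understand that, in the above methods of the detailed description, the specific execution order of the steps should be determined by their functions and possible internal logic.
In addition, the present disclosure also provides an image processing apparatus, an electronic device, a computer-readable storage medium, a computer program, and a computer program product, all of which can be used to implement any image processing method provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method section; they are not repeated here.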
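Figure 7 shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in Figure 7, the apparatus 70 includes:
a DCT coefficient extraction part 71, configured to extract the initial DCT coefficients of the three color components corresponding to the JPEG image to be compressed;
a DCT coefficient rearrangement part 72, configured to arrange the initial DCT coefficients of each of the three color components by frequency to obtain target DCT coefficients of the three color components;
an entropy encoding part 73, configured to entropy encode the target DCT coefficients of the three color components to obtain the target compressed data corresponding to the JPEG image to be compressed.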
在一种可能的实现方式中,DCT系数重排部分72,具体配置为:
针对任意一个颜色分量的初始DCT系数,从初始DCT系数中,将相同频率的系数构成空间维度,将不同频率的系数构成通道维度,得到多通道DCT子系数;
对初始DCT系数进行zigzag扫描,确定zigzag排序;
基于zigzag排序,对多通道DCT子系数在通道维度进行排列,得到初始DCT系数 对应的目标DCT系数。
在一种可能的实现方式中,三个颜色分量包括:亮度分量Y和色度分量Cb、Cr;目标压缩数据包括:超先验编码信息、Cr分量对应的编码信息、Cb分量对应的编码信息、Y分量对应的编码信息;
熵编码部分73,包括:
超先验信息编码子部分,配置为对共享超先验信息进行熵编码,得到超先验编码信息;
颜色分量编码子部分,配置为基于每个颜色分量对应的编码先验信息,依次对Cr分量的目标DCT系数、Cb分量的目标DCT系数、Y分量的目标DCT系数进行熵编码,得到Cr分量对应的编码信息、Cb分量对应的编码信息、Y分量对应的编码信息。
在一种可能的实现方式中,装置70还包括:系数融合部分,配置为在对三个颜色分量的目标DCT系数进行熵编码之前,对三个颜色分量的目标DCT系数进行融合,得到融合后DCT系数;
共享超先验信息确定部分,配置为基于融合后DCT系数,确定共享超先验信息;
编码先验信息确定部分,配置为对共享超先验信息进行拆分,得到每个颜色分量对应的编码先验信息。
在一种可能的实现方式中,颜色分量编码子部分,包括:
Cr分量编码单元,配置为基于Cr分量对应的编码先验信息,对Cr分量的目标DCT系数进行熵编码,得到Cr分量对应的编码信息;
Cb分量编码单元,配置为基于Cb分量对应的编码先验信息、Cr分量的目标DCT系数,对Cb分量的目标DCT系数进行熵编码,得到Cb分量对应的编码信息;
Y分量编码单元,配置为基于Y分量对应的编码先验信息、Cr分量的目标DCT系数、Cb分量的目标DCT系数,对Y分量的目标DCT系数进行熵编码,得到Y分量对应的编码信息。
在一种可能的实现方式中,Cr分量编码单元,具体配置为:
基于Cr分量对应的编码先验信息,确定Cr分量的目标DCT系数对应的PMF;
基于Cr分量的目标DCT系数对应的PMF,对Cr分量的目标DCT系数进行熵编码,得到Cr分量对应的编码信息。
在一种可能的实现方式中,Cb分量编码单元,具体配置为:
基于Cb分量对应的编码先验信息、Cr分量的目标DCT系数,确定Cb分量的目标DCT系数对应的PMF;
基于Cb分量的目标DCT系数对应的PMF,对Cb分量的目标DCT系数进行熵编码,得到Cb分量对应的编码信息。
在一种可能的实现方式中,Y分量编码单元,包括:
概率分布参数确定子单元,配置为基于Y分量对应的编码先验信息、Cr分量的目标DCT系数、Cb分量的目标DCT系数,确定Y分量对应的概率分布参数;
Y分量编码子单元,配置为基于Y分量对应的概率分布参数,利用多级通道自回归,对Y分量的目标DCT系数进行熵编码,得到Y分量对应的编码信息。
在一种可能的实现方式中,Y分量编码子单元,具体配置为:
对Y分量的目标DCT系数进行空间维度到通道维度的转换,得到Y分量的转换后DCT系数;
按照预设矩阵形式,对Y分量的转换后DCT系数进行拆解,得到Y分量对应的DCT系数矩阵;
基于Y分量对应的概率分布参数,利用多级通道自回归,对DCT系数矩阵进行熵编码,得到Y分量对应的编码信息。
在一种可能的实现方式中,Y分量编码子单元,还具体配置为:
按照预设矩阵形式,对Y分量的转换后DCT系数进行空间维度的拆解,得到DCT系数矩阵的多个行;
对DCT系数矩阵的每一行进行通道维度的拆解,得到每一行的多个列;
将多个行和多个行中每一行的多个列,确定为Y分量对应的DCT系数矩阵。
在一种可能的实现方式中,Y分量编码子单元,还具体配置为:
基于Y分量对应的概率分布参数,利用多级通道自回归,依次确定DCT系数矩阵的每一行中每一列对应的PMF;
利用DCT系数矩阵的每一行中每一列对应的PMF,对DCT系数矩阵的每一行中每一列进行熵编码,得到DCT系数矩阵的每一行中每一列对应的编码信息;
其中,DCT系数矩阵的每一行中每一列对应的编码信息,构成Y分量对应的编码信息。
在一种可能的实现方式中,Y分量编码子单元,还具体配置为:
基于Y分量对应的概率分布参数,利用多级通道自回归,依次确定DCT系数矩阵中每一行对应的编码先验信息;
针对DCT系数矩阵的第i行,基于第i行对应的编码先验信息,利用多级通道自回归,确定第i行中每一列对应的PMF,其中,i表示行数。
在一种可能的实现方式中,Y分量编码子单元,还具体配置为:
针对DCT系数矩阵的第1行,基于Y分量对应的概率分布参数,确定第1行对应的编码先验信息;
在i>1的情况下,针对DCT系数矩阵的第i行,基于Y分量对应的概率分布参数、DCT系数矩阵的第1行至第i-1行,确定第i行对应的编码先验信息。
在一种可能的实现方式中,Y分量编码子单元,还具体配置为:
针对DCT系数矩阵的第i行第1列,基于第i行对应的编码先验信息,确定第i行第1列对应的PMF;
在j>1的情况下,针对DCT系数矩阵的第i行第j列,基于第i行对应的编码先验信息、DCT系数矩阵的第i行中的第1列至第j-1列,确定第i行第j列对应的PMF,其中,j表示列数。
在一种可能的实现方式中,Y分量编码子单元,还具体配置为:
在i≥1且j≥1的情况下,针对DCT系数矩阵的第i行第j列,基于第i行第j列对应的PMF,对第i行第j列进行熵编码,得到第i行第j列对应的编码信息。
在一种可能的实现方式中,熵编码为算术编码。
在一种可能的实现方式中,熵编码部分73,具体配置为:
利用图像编解码神经网络,对三个颜色分量的目标DCT系数进行熵编码,得到目标压缩数据。
在一种可能的实现方式中,图像编解码神经网络中包括MLCC模型,目标压缩数据中包括亮度分量Y对应的编码信息;
熵编码部分73,具体配置为:
利用MLCC模型,对三个颜色分量中的Y分量在通道维度进行多级通道自回归熵编码,得到Y分量对应的编码信息。
在本实施例中,DCT系数提取部分71、DCT系数重排部分72、熵编码部分73均 可以是处理器或处理组件。
该方法与计算机系统的内部结构存在特定技术关联,且能够解决如何提升硬件运算效率或执行效果的技术问题(包括减少数据存储量、减少数据传输量、提高硬件处理速度等),从而获得符合自然规律的计算机系统内部性能改进的技术效果。
在一些实施例中,本公开实施例提供的装置具有的功能或包含的模块可以用于执行上文方法实施例描述的方法,其具体实现可以参照上文方法实施例的描述,为了简洁,这里不再赘述。
本公开实施例还提出一种计算机可读存储介质,其上存储有计算机程序指令,所述计算机程序指令被处理器执行时实现上述方法。计算机可读存储介质可以是易失性或非易失性计算机可读存储介质。
本公开实施例还提出一种电子设备,包括:处理器;用于存储处理器可执行指令的存储器;其中,所述处理器被配置为调用所述存储器存储的指令,以执行上述方法。
电子设备可以被提供为终端、服务器或其它形态的设备。
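Figure 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure. Referring to Figure 8, the electronic device 80 may be a terminal device such as user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or a wearable device.
Referring to Figure 8, the hardware entities of the electronic device 80 include a processor 81, a communication interface 82, and a memory 83, where:
the processor 81 generally controls the overall operation of the electronic device 80.
The communication interface 82 enables the electronic device to communicate with other terminals or servers over a network.
The memory 83 is configured to store instructions and applications executable by the processor 81, and can also cache data that is to be processed or has already been processed by the processor 81 and by the modules of the electronic device 80 (for example, image data, audio data, voice communication data, and video communication data). It can be implemented with flash memory (FLASH) or random access memory (RAM). The processor 81, the communication interface 82, and the memory 83 can exchange data over a bus 84.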
本公开实施例还提供了一种计算机程序,所述计算机程序包括计算机可读代码,在计算机可读代码在设备上运行的情况下,设备中的处理器执行时实现上述方法。
本公开实施例还提供了一种计算机程序产品,包括计算机可读代码,或者承载有计算机可读代码的非易失性计算机可读存储介质,当所述计算机可读代码在电子设备的处理器中运行时,所述电子设备中的处理器执行上述方法。
本公开可以是系统、方法和/或计算机程序产品。计算机程序产品可以包括计算机可读存储介质,其上载有用于使处理器实现本公开的各个方面的计算机可读程序指令。
计算机可读存储介质可以是可以保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以是(但不限于)电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、静态随机存取存储器(SRAM)、便携式压缩盘只读存储器(CD-ROM)、数字多功能盘(DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。这里所使用的计算机可读存储介质不被解释为瞬时信号本身,诸如无线电波或者其他自由传播的电磁波、通过波导或其他传输媒介传播的电磁波(例如,通过光纤电缆的光脉冲)、或者通过电线传输的电信号。
这里所描述的计算机可读程序指令可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。
用于执行本公开操作的计算机程序指令可以是汇编指令、指令集架构(ISA)指令、机器指令、机器相关指令、微代码、固件指令、状态设置数据、或者以一种或多种编程语言的任意组合编写的源代码或目标代码,所述编程语言包括面向对象的编程语言—诸如Smalltalk、C++等,以及常规的过程式编程语言—诸如“C”语言或类似的编程语言。计算机可读程序指令可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络—包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。在一些实施例中,通过利用计算机可读程序指令的状态信息来个性化定制电子电路,例如可编程逻辑电路、现场可编程门阵列(FPGA)或可编程逻辑阵列(PLA),该电子电路可以执行计算机可读程序指令,从而实现本公开的各个方面。
这里参照根据本公开实施例的方法、装置(系统)、计算机程序和计算机程序产品的流程图和/或框图描述了本公开的各个方面。应当理解,流程图和/或框图的每个方框以及流程图和/或框图中各方框的组合,都可以由计算机可读程序指令实现。
这些计算机可读程序指令可以提供给通用计算机、专用计算机或其它可编程数据处理装置的处理器,从而生产出一种机器,使得这些指令在通过计算机或其它可编程数据处理装置的处理器执行时,产生了实现流程图和/或框图中的一个或多个方框中规定的功能/动作的装置。也可以把这些计算机可读程序指令存储在计算机可读存储介质中,这些指令使得计算机、可编程数据处理装置和/或其他设备以特定方式工作,从而,存储有指令的计算机可读介质则包括一个制造品,其包括实现流程图和/或框图中的一个或多个方框中规定的功能/动作的各个方面的指令。
也可以把计算机可读程序指令加载到计算机、其它可编程数据处理装置、或其它设备上,使得在计算机、其它可编程数据处理装置或其它设备上执行一系列操作步骤,以产生计算机实现的过程,从而使得在计算机、其它可编程数据处理装置、或其它设备上执行的指令实现流程图和/或框图中的一个或多个方框中规定的功能/动作。
附图中的流程图和框图显示了根据本公开的多个实施例的系统、方法、计算机程序和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段或指令的一部分,所述模块、程序段或指令的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个连续的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或动作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
该计算机程序产品可以具体通过硬件、软件或其结合的方式实现。在一个可选实施例中,所述计算机程序产品具体体现为计算机存储介质,在另一个可选实施例中,计算 机程序产品具体体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。
上文对各个实施例的描述倾向于强调各个实施例之间的不同之处,其相同或相似之处可以互相参考,为了简洁,本文不再赘述。
本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。
若本公开技术方案涉及个人信息,应用本公开技术方案的产品在处理个人信息前,已明确告知个人信息处理规则,并取得个人自主同意。若本公开技术方案涉及敏感个人信息,应用本公开技术方案的产品在处理敏感个人信息前,已取得个人单独同意,并且同时满足“明示同意”的要求。例如,在摄像头等个人信息采集装置处,设置明确显著的标识告知已进入个人信息采集范围,将会对个人信息进行采集,若个人自愿进入采集范围即视为同意对其个人信息进行采集;或者在个人信息处理的装置上,利用明显的标识/信息告知个人信息处理规则的情况下,通过弹窗信息或请个人自行上传其个人信息等方式获得个人授权;其中,个人信息处理规则可包括个人信息处理者、个人信息处理目的、处理方式以及处理的个人信息种类等信息。
以上已经描述了本公开的各实施例,上述说明是示例性的,并非穷尽性的,并且也不限于所披露的各实施例。在不偏离所说明的各实施例的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。本文中所用术语的选择,旨在最好地解释各实施例的原理、实际应用或对市场中的技术的改进,或者使本技术领域的其它普通技术人员能理解本文披露的各实施例。

Claims (23)

  1. 一种图像处理方法,包括:
    提取待压缩联合图像专家组JPEG图像对应的三个颜色分量的初始离散余弦变换DCT系数;
    分别对所述三个颜色分量的初始DCT系数按照不同频率进行排列,得到所述三个颜色分量的目标DCT系数;
    对所述三个颜色分量的目标DCT系数进行熵编码,得到所述待压缩JPEG图像对应的目标压缩数据。
  2. 根据权利要求1所述的方法,其中,所述分别对所述三个颜色分量的初始DCT系数按照不同频率进行排列,得到所述三个颜色分量的目标DCT系数,包括:
    针对任意一个颜色分量的初始DCT系数,从所述初始DCT系数中,将相同频率的系数构成空间维度,将不同频率的系数构成通道维度,得到多通道DCT子系数;
    对所述初始DCT系数进行之字形zigzag扫描,确定zigzag排序;
    基于所述zigzag排序,对所述多通道DCT子系数在通道维度进行排列,得到所述初始DCT系数对应的目标DCT系数。
  3. 根据权利要求2所述的方法,其中,所述三个颜色分量包括:亮度分量Y和色度分量Cb、Cr;所述目标压缩数据包括:超先验编码信息、Cr分量对应的编码信息、Cb分量对应的编码信息、Y分量对应的编码信息;
    所述对所述三个颜色分量的目标DCT系数进行熵编码,得到所述待压缩JPEG图像对应的目标压缩数据,包括:
    对共享超先验信息进行熵编码,得到所述超先验编码信息;
    基于每个颜色分量对应的编码先验信息,依次对所述Cr分量的目标DCT系数、所述Cb分量的目标DCT系数、所述Y分量的目标DCT系数进行熵编码,得到所述Cr分量对应的编码信息、所述Cb分量对应的编码信息、所述Y分量对应的编码信息。
  4. 根据权利要求3所述的方法,其中,在对所述三个颜色分量的目标DCT系数进行熵编码之前,所述方法还包括:
    对所述三个颜色分量的目标DCT系数进行融合,得到融合后DCT系数;
    基于所述融合后DCT系数,确定所述共享超先验信息;
    对所述共享超先验信息进行拆分,得到所述每个颜色分量对应的编码先验信息。
  5. 根据权利要求3或4所述的方法,其中,所述基于每个颜色分量对应的编码先验信息,依次对所述Cr分量的目标DCT系数、所述Cb分量的目标DCT系数、所述Y分量的目标DCT系数进行熵编码,得到所述Cr分量对应的编码信息、所述Cb分量对应的编码信息、所述Y分量对应的编码信息,包括:
    基于所述Cr分量对应的编码先验信息,对所述Cr分量的目标DCT系数进行熵编码,得到所述Cr分量对应的编码信息;
    基于所述Cb分量对应的编码先验信息、所述Cr分量的目标DCT系数,对所述Cb分量的目标DCT系数进行熵编码,得到所述Cb分量对应的编码信息;
    基于所述Y分量对应的编码先验信息、所述Cr分量的目标DCT系数、所述Cb分量的目标DCT系数,对所述Y分量的目标DCT系数进行熵编码,得到所述Y分量对应的编码信息。
  6. 根据权利要求5所述的方法,其中,所述基于所述Cr分量对应的编码先验信息,对所述Cr分量的目标DCT系数进行熵编码,得到所述Cr分量对应的编码信息,包括:
    基于所述Cr分量对应的编码先验信息,确定所述Cr分量的目标DCT系数对应的 概率质量函数PMF;
    基于所述Cr分量的目标DCT系数对应的PMF,对所述Cr分量的目标DCT系数进行熵编码,得到所述Cr分量对应的编码信息。
  7. 根据权利要求5或6所述的方法,其中,所述基于所述Cb分量对应的编码先验信息、所述Cr分量的目标DCT系数,对所述Cb分量的目标DCT系数进行熵编码,得到所述Cb分量对应的编码信息,包括:
    基于所述Cb分量对应的编码先验信息、所述Cr分量的目标DCT系数,确定所述Cb分量的目标DCT系数对应的PMF;
    基于所述Cb分量的目标DCT系数对应的PMF,对所述Cb分量的目标DCT系数进行熵编码,得到所述Cb分量对应的编码信息。
  8. 根据权利要求5至7中任意一项所述的方法,其中,所述基于所述Y分量对应的编码先验信息、所述Cr分量的目标DCT系数、所述Cb分量的目标DCT系数,对所述Y分量的目标DCT系数进行熵编码,得到所述Y分量对应的编码信息,包括:
    基于所述Y分量对应的编码先验信息、所述Cr分量的目标DCT系数、所述Cb分量的目标DCT系数,确定所述Y分量对应的概率分布参数;
    基于所述Y分量对应的概率分布参数,利用多级通道自回归,对所述Y分量的目标DCT系数进行熵编码,得到所述Y分量对应的编码信息。
  9. 根据权利要求8所述的方法,其中,所述基于所述Y分量对应的概率分布参数,利用多级通道自回归,对所述Y分量的目标DCT系数进行熵编码,得到所述Y分量对应的编码信息,包括:
    对所述Y分量的目标DCT系数进行空间维度到通道维度的转换,得到所述Y分量的转换后DCT系数;
    按照预设矩阵形式,对所述Y分量的转换后DCT系数进行拆解,得到所述Y分量对应的DCT系数矩阵;
    基于所述Y分量对应的概率分布参数,利用多级通道自回归,对所述DCT系数矩阵进行熵编码,得到所述Y分量对应的编码信息。
  10. 根据权利要求9所述的方法,其中,所述按照预设矩阵形式,对所述Y分量的转换后DCT系数进行拆解,得到所述Y分量对应的DCT系数矩阵,包括:
    按照所述预设矩阵形式,对所述Y分量的转换后DCT系数进行空间维度的拆解,得到所述DCT系数矩阵的多个行;
    对所述DCT系数矩阵的每一行进行通道维度的拆解,得到每一行的多个列;
    将所述多个行和所述多个行中每一行的多个列,确定为所述Y分量对应的DCT系数矩阵。
  11. 根据权利要求10所述的方法,其中,所述基于所述Y分量对应的概率分布参数,利用多级通道自回归,对所述DCT系数矩阵进行熵编码,得到所述Y分量对应的编码信息,包括:
    基于所述Y分量对应的概率分布参数,利用多级通道自回归,依次确定所述DCT系数矩阵的每一行中每一列对应的PMF;
    利用所述DCT系数矩阵的每一行中每一列对应的PMF,对所述DCT系数矩阵的每一行中每一列进行熵编码,得到所述DCT系数矩阵的每一行中每一列对应的编码信息;
    其中,所述DCT系数矩阵的每一行中每一列对应的编码信息,构成所述Y分量对应的编码信息。
  12. 根据权利要求11所述的方法,其中,所述基于所述Y分量对应的概率分布参数,利用多级通道自回归,依次确定所述DCT系数矩阵的每一行中每一列对应的PMF, 包括:
    基于所述Y分量对应的概率分布参数,利用多级通道自回归,依次确定所述DCT系数矩阵中每一行对应的编码先验信息;
    针对所述DCT系数矩阵的第i行,基于所述第i行对应的编码先验信息,利用多级通道自回归,确定所述第i行中每一列对应的PMF,其中,i表示行数。
  13. 根据权利要求12所述的方法,其中,所述基于所述Y分量对应的概率分布参数,利用多级通道自回归,依次确定所述DCT系数矩阵中每一行对应的编码先验信息,包括:
    针对所述DCT系数矩阵的第1行,基于所述Y分量对应的概率分布参数,确定所述第1行对应的编码先验信息;
    在i>1的情况下,针对所述DCT系数矩阵的第i行,基于所述Y分量对应的概率分布参数、所述DCT系数矩阵的第1行至第i-1行,确定所述第i行对应的编码先验信息。
  14. 根据权利要求12或13所述的方法,其中,所述基于所述第i行对应的编码先验信息,利用多级通道自回归,确定所述第i行中每一列对应的PMF,包括:
    针对所述DCT系数矩阵的第i行第1列,基于所述第i行对应的编码先验信息,确定所述第i行第1列对应的PMF;
    在j>1的情况下,针对所述DCT系数矩阵的第i行第j列,基于所述第i行对应的编码先验信息、所述DCT系数矩阵的第i行中的第1列至第j-1列,确定所述第i行第j列对应的PMF,其中,j表示列数。
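  15. The method according to any one of claims 11 to 14, wherein performing entropy coding on each column in each row of the DCT coefficient matrix using the PMF corresponding to that column, to obtain the coding information corresponding to each column in each row of the DCT coefficient matrix, comprises:
    in the case of i≥1 and j≥1, for the j-th column of the i-th row of the DCT coefficient matrix, performing entropy coding on the j-th column of the i-th row based on the PMF corresponding to the j-th column of the i-th row, to obtain the coding information corresponding to the j-th column of the i-th row.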
  16. The method according to any one of claims 1 to 15, wherein the entropy coding is arithmetic coding.
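For readers unfamiliar with arithmetic coding, the following is a minimal textbook sketch using exact rational arithmetic. It assumes a single fixed PMF for every symbol, whereas the claims describe a separate PMF per component or matrix cell; the PMF values and function names are hypothetical, and a practical coder would use finite-precision integer ranges and emit bits.

```python
from fractions import Fraction

def cumulative(pmf):
    """Map each symbol to its [start, end) slice of the unit interval."""
    ranges, start = {}, Fraction(0)
    for sym, p in pmf.items():
        ranges[sym] = (start, start + p)
        start += p
    return ranges

def encode(symbols, pmf):
    """Narrow [low, low + width) once per symbol; any value inside it is a code."""
    ranges = cumulative(pmf)
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        lo, hi = ranges[s]
        low, width = low + width * lo, width * (hi - lo)
    return low, width

def decode(value, count, pmf):
    """Recover `count` symbols from a value inside the final interval."""
    ranges = cumulative(pmf)
    out = []
    for _ in range(count):
        for sym, (lo, hi) in ranges.items():
            if lo <= value < hi:
                out.append(sym)
                value = (value - lo) / (hi - lo)  # rescale to the unit interval
                break
    return out

# Hypothetical PMF over quantized coefficient values (assumption).
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), -1: Fraction(1, 4)}
message = [0, 1, 0, -1]
low, width = encode(message, pmf)
decoded = decode(low, len(message), pmf)
print(low, width, decoded)
```

The final interval width equals the product of the symbol probabilities, so roughly -log2(width) bits are enough to name a point inside it; this is how a sharper PMF translates into fewer bits.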
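  17. The method according to any one of claims 1 to 16, wherein performing entropy coding on the target DCT coefficients of the three color components, to obtain the target compressed data corresponding to the JPEG image to be compressed, comprises:
    performing entropy coding on the target DCT coefficients of the three color components using an image codec neural network, to obtain the target compressed data.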
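  18. The method according to claim 17, wherein the image codec neural network comprises a multi-level cross-channel autoregressive entropy coding (MLCC) model, and the target compressed data comprises coding information corresponding to a luminance component Y;
    performing entropy coding on the target DCT coefficients of the three color components using the image codec neural network, to obtain the target compressed data, comprises:
    performing, using the MLCC model, multi-level channel autoregressive entropy coding in the channel dimension on the Y component among the three color components, to obtain the coding information corresponding to the Y component.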
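  19. An image processing apparatus, comprising:
    a DCT coefficient extraction part, configured to extract initial DCT coefficients of three color components corresponding to a JPEG image to be compressed;
    a DCT coefficient rearrangement part, configured to arrange the initial DCT coefficients of the three color components according to different frequencies, respectively, to obtain target DCT coefficients of the three color components;
    an entropy coding part, configured to perform entropy coding on the target DCT coefficients of the three color components, to obtain target compressed data corresponding to the JPEG image to be compressed.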
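One plausible reading of "arranging the DCT coefficients according to different frequencies" is to gather, across all blocks of a component, the coefficients that share the same frequency index. The sketch below shows that regrouping under this assumption; the block contents are made up, 4-coefficient blocks stand in for JPEG's 64-coefficient blocks, and the function name is illustrative.

```python
def rearrange_by_frequency(blocks):
    """blocks: list of flattened DCT blocks. Returns one list per frequency index."""
    n = len(blocks[0])
    assert all(len(b) == n for b in blocks)
    # Entry k collects the k-th coefficient of every block.
    return [[b[k] for b in blocks] for k in range(n)]

# Three hypothetical blocks, each flattened to 4 coefficients for brevity.
blocks = [[100, 5, 3, 0], [98, 4, 2, 1], [101, 6, 3, 0]]
by_freq = rearrange_by_frequency(blocks)
print(by_freq[0])
```

Grouping by frequency places statistically similar values together (for example, all DC terms), which tends to make them easier to model with a shared probability distribution.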
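  20. An electronic device, comprising:
    a processor;
    a memory for storing processor-executable instructions;
    wherein the processor is configured to invoke the instructions stored in the memory to perform the method according to any one of claims 1 to 18.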
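  21. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 18.
  22. A computer program, comprising computer-readable code, wherein, when the computer-readable code runs on a device, a processor in the device performs the method according to any one of claims 1 to 18.
  23. A computer program product, comprising a computer program or instructions, wherein, when the computer program or instructions run on an electronic device, the electronic device is caused to perform the method according to any one of claims 1 to 18.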
PCT/CN2022/110266 2022-02-25 2022-08-04 Image processing method and apparatus, electronic device, storage medium, computer program, and computer program product WO2023159883A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210178603.5 2022-02-25
CN202210178603.5A CN114554226A (zh) 2022-02-25 2022-02-25 Image processing method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023159883A1 true WO2023159883A1 (zh) 2023-08-31

Family

ID=81678602

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/110266 WO2023159883A1 (zh) 2022-02-25 2022-08-04 Image processing method and apparatus, electronic device, storage medium, computer program, and computer program product

Country Status (2)

Country Link
CN (1) CN114554226A (zh)
WO (1) WO2023159883A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114554226A (zh) * 2022-02-25 2022-05-27 北京市商汤科技开发有限公司 Image processing method and apparatus, electronic device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
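JP2006060657A (ja) * 2004-08-23 2006-03-02 Victor Co Of Japan Ltd Image data compression device and image data compression method
CN103581678A (zh) * 2012-07-19 2014-02-12 豪威科技股份有限公司 Method and system for improving decoder performance through quantization control
CN111868753A (zh) * 2018-07-20 2020-10-30 谷歌有限责任公司 Data compression using conditional entropy models
CN114554226A (zh) * 2022-02-25 2022-05-27 北京市商汤科技开发有限公司 Image processing method and apparatus, electronic device, and storage medium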

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7903894B2 (en) * 2006-10-05 2011-03-08 Microsoft Corporation Color image coding using inter-color correlation
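CN101951524B (zh) * 2009-07-10 2013-06-19 比亚迪股份有限公司 JPEG compression method and apparatus for color digital images
CN114071141A (zh) * 2020-08-06 2022-02-18 华为技术有限公司 Image processing method and device thereof
CN113301347B (zh) * 2021-05-08 2023-05-05 广东工业大学 Optimization method for HEVC high-definition video coding
CN113810693B (zh) * 2021-09-01 2022-11-11 上海交通大学 Method, system, and apparatus for lossless compression and decompression of JPEG images
CN114067009A (zh) * 2021-10-22 2022-02-18 深圳力维智联技术有限公司 Image processing method and apparatus based on a Transformer model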

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHU YANQIU, CHEN HEXIN, DAI YISONG: "Compression Coding of Color Image via 3D-DCT Transform", JOURNAL OF IMAGE AND GRAPHICS, vol. 11, no. 2, 30 November 1997 (1997-11-30), pages 795 - 800, XP009548471 *

Also Published As

Publication number Publication date
CN114554226A (zh) 2022-05-27

Similar Documents

Publication Publication Date Title
US20240078712A1 (en) Data compression using conditional entropy models
US20200160565A1 (en) Methods And Apparatuses For Learned Image Compression
US11386583B2 (en) Image coding apparatus, probability model generating apparatus and image decoding apparatus
US11257252B2 (en) Image coding apparatus, probability model generating apparatus and image compression system
CN106937111B (zh) Method and system for optimizing image compression quality
WO2020237646A1 (zh) Image processing method, device, and computer-readable storage medium
US11335034B2 (en) Systems and methods for image compression at multiple, different bitrates
US20190356330A1 (en) Data compression by local entropy encoding
EP4099694A1 (en) Video stream processing method and apparatus, and electronic device and computer-readable medium
WO2023159883A1 (zh) 图像处理方法及装置、电子设备、存储介质、计算机程序和计算机程序产品
CN114973049B (zh) 一种统一卷积与自注意力的轻量视频分类方法
WO2022028197A1 (zh) 一种图像处理方法及其设备
WO2023124148A1 (zh) 数据处理方法及装置、电子设备和存储介质
US8582876B2 (en) Hybrid codec for compound image compression
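CN108900532B (zh) Electronic device, method, storage medium, and apparatus for message processing
WO2022246986A1 (zh) Data processing method, apparatus, device, and computer-readable storage medium
CN110738666A (zh) Image semantic segmentation method and apparatus based on discrete cosine transform
WO2023193629A1 (zh) Encoding and decoding method and apparatus for a region enhancement layer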
US10362325B2 (en) Techniques for compressing multiple-channel images
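KR20200094363A (ko) Method and system for improving compression ratio through pixel transformation of image files
CN103139566A (zh) Method for efficient decoding of variable-length codes
CN111859210A (zh) Image processing method, apparatus, device, and storage medium
CN115103191A (zh) Image processing method, apparatus, device, and storage medium
CN113592966A (zh) Image processing method and apparatus, electronic device, and storage medium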
Bao et al. Taylor series based dual-branch transformation for learned image compression

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22928150

Country of ref document: EP

Kind code of ref document: A1