WO2023093377A1 - Encoding method, decoding method, and electronic device

Encoding method, decoding method, and electronic device

Info

Publication number
WO2023093377A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
feature map
decoded
group
channels
Application number
PCT/CN2022/125944
Other languages
English (en)
Chinese (zh)
Inventor
师一博
王晶
赵寅
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2023093377A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91: Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • The embodiments of the present application relate to the field of data processing, and in particular to an encoding and decoding method and an electronic device.
  • An AI (Artificial Intelligence) image compression algorithm is an image compression algorithm implemented based on deep learning, in contrast to traditional image compression technologies such as JPEG (Joint Photographic Experts Group) and BPG (Better Portable Graphics).
  • The AI image compression algorithm predicts the entropy estimation feature of a channel in the feature map matrix based on the information of all channels of the feature map matrix.
  • However, the correlation between the channels of the feature map matrix is low. Therefore, if the information of all channels of the feature map matrix is fused, a large amount of unusable information is introduced, which affects encoding and decoding efficiency.
  • In view of this, the present application provides an encoding and decoding method and an electronic device.
  • In a first aspect, an embodiment of the present application provides an encoding method. The method includes: first, acquiring an image to be encoded; next, generating a first feature map matrix based on the image to be encoded, where the first feature map matrix includes first feature maps of c channels and c is a positive integer; subsequently, performing intra-group fusion on a feature map group consisting of the first feature maps of k channels to obtain a first entropy estimation feature corresponding to the feature map group, where k is a positive integer smaller than c; and then determining a probability distribution corresponding to the first feature map matrix according to the first entropy estimation feature, and encoding the first feature map matrix according to the probability distribution to obtain a code stream.
  • In this way, the entropy estimation feature is determined by performing intra-group fusion on a feature map group composed of the feature maps of only some of the channels. Compared with an entropy estimation feature determined by fusing the feature maps of all channels, this reduces the introduction of invalid information, which in turn reduces the coding computing power required and improves coding efficiency.
  • Among the first feature maps of the c channels, the first feature maps of k channels may be used to form a feature map group; N feature map groups may thus be obtained.
  • N is an integer greater than 1, and N is determined according to c and k.
  • N groups of first entropy estimation features can be obtained.
  • In this way, the amount of invalid information introduced can be reduced, thereby reducing the coding computing power required and improving coding efficiency.
  • In addition, the quality of the reconstructed image can also be improved.
  • The N groups of first entropy estimation features are combined to obtain the first entropy estimation feature corresponding to the first feature map matrix; the probability distribution of the first feature map matrix is then determined according to the first entropy estimation feature corresponding to the first feature map matrix.
  • The number of channels k contained in each feature map group may be the same or different, and this application is not limited thereto.
  • different feature map groups may contain the first feature map of the same channel.
  • different feature map groups may contain first feature maps of different channels.
  • Performing intra-group fusion on the feature map group composed of the first feature maps of k channels to obtain the first entropy estimation feature corresponding to the feature map group includes: using the autoregressive weight matrix corresponding to the feature map group to extract local spatial information from the feature map group, so as to obtain the first entropy estimation feature corresponding to the feature map group.
  • M is the total number of output channels of the autoregressive model.
  • Exemplarily, the autoregressive weight matrix corresponding to a feature map group can be regarded as M_i groups of k weight maps of size ks1*ks2, where "ks1*ks2" represents the size of the convolution kernel of the autoregressive model, and ks1 may or may not be equal to ks2, which is not limited in this application. That is to say, among the M_i output channels, each output channel corresponds to k weight maps whose size is ks1*ks2.
  • In a possible implementation, the first feature map matrix includes a second feature map matrix and a third feature map matrix, where the second feature map matrix includes second feature maps of c channels and the third feature map matrix includes third feature maps of c channels. Performing intra-group fusion on the feature map group composed of the first feature maps of k channels to obtain the first entropy estimation feature corresponding to the feature map group includes: performing intra-group fusion on a feature map group composed of the second feature maps of k channels to obtain the first entropy estimation feature corresponding to the feature map group composed of the third feature maps of k channels. Determining the probability distribution corresponding to the first feature map matrix according to the first entropy estimation feature includes: determining the probability distribution corresponding to the third feature map matrix according to the first entropy estimation feature corresponding to the feature map group composed of the third feature maps of k channels. Encoding the first feature map matrix according to the probability distribution to obtain a code stream includes: encoding the third feature map matrix according to the probability distribution corresponding to the third feature map matrix to obtain the code stream.
  • the second feature map matrix and the third feature map matrix can be obtained by space-dividing the first feature map matrix.
  • the second feature map and the third feature map of each channel are added to obtain the first feature map of the channel.
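  • As an illustration only (the application does not specify the particular spatial division), the sketch below shows one hypothetical way to space-divide a first feature map matrix into a second and a third feature map matrix using a checkerboard pattern, such that adding the two parts per channel restores the first feature map:

```python
import numpy as np

# Hypothetical illustration: one way to "space-divide" a first feature map
# matrix (c channels, h*w feature points each) into a second and a third
# matrix using a checkerboard pattern. Positions not selected are set to
# zero, so adding the two parts recovers the original matrix.
def checkerboard_split(first_fm: np.ndarray):
    c, h, w = first_fm.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    mask = ((ys + xs) % 2 == 0)        # "anchor" positions (assumed pattern)
    second_fm = first_fm * mask        # second feature map matrix
    third_fm = first_fm * (~mask)      # third feature map matrix
    return second_fm, third_fm

x = np.random.randn(4, 6, 6)           # toy first feature map matrix, c = 4
second, third = checkerboard_split(x)
assert np.allclose(second + third, x)  # adding them restores the original
```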
  • In another possible implementation, the first feature map matrix includes a third feature map matrix, and the third feature map matrix includes third feature maps of c channels. Performing intra-group fusion on the feature map group composed of the first feature maps of k channels to obtain the first entropy estimation feature corresponding to the feature map group includes: performing intra-group fusion on a feature map group composed of the third feature maps of k channels to obtain the first entropy estimation feature corresponding to the feature map group composed of the third feature maps of k channels. Determining the probability distribution corresponding to the first feature map matrix according to the first entropy estimation feature includes: determining the probability distribution corresponding to the third feature map matrix according to the first entropy estimation feature corresponding to the feature map group composed of the third feature maps of k channels. Encoding the first feature map matrix according to the probability distribution to obtain a code stream includes: encoding the third feature map matrix according to the probability distribution corresponding to the third feature map matrix to obtain the code stream.
  • The method further includes: performing feature extraction on the second feature map matrix included in the first feature map matrix to obtain a fourth feature map matrix; determining a second entropy estimation feature according to the fourth feature map matrix; and determining the probability distribution corresponding to the second feature map matrix according to the second entropy estimation feature.
  • the method further includes: encoding the fourth feature map matrix to obtain a code stream. In this way, it is convenient for the decoding end to decode the second feature map matrix from the code stream.
  • In a second aspect, an embodiment of the present application provides a decoding method. The method includes: obtaining a code stream; decoding, from the code stream, the feature values corresponding to the feature points of c channels to obtain a first feature map matrix, where c is a positive integer; and then performing image reconstruction based on the first feature map matrix and outputting the reconstructed image.
  • Wherein, a decoded information group corresponding to a first feature point to be decoded is obtained, where the decoded information group includes the decoded information of the channel corresponding to the first feature point to be decoded and the decoded information of other k-1 channels, and k is a positive integer less than c; intra-group fusion is performed on the decoded information group to obtain the first entropy estimation feature corresponding to the first feature point to be decoded.
  • According to the first entropy estimation feature, the probability distribution corresponding to the first feature point to be decoded is determined, and the first feature point to be decoded is then decoded according to the probability distribution to obtain the corresponding feature value, where the first feature point to be decoded is any feature point to be decoded.
  • In this way, the entropy estimation feature corresponding to a feature point to be decoded is determined from the decoded information of only some of the channels. Compared with determining the entropy estimation feature by fusing the decoded information of all channels, this reduces the introduction of invalid information, thereby reducing the decoding computing power required and improving decoding efficiency.
  • the decoded information includes feature values corresponding to the decoded feature points.
  • Performing intra-group fusion on the decoded information group to obtain the first entropy estimation feature corresponding to the first feature point to be decoded includes: using the autoregressive weight matrix corresponding to the decoded information group to extract local spatial information from the decoded information group, so as to obtain the first entropy estimation feature corresponding to the first feature point to be decoded.
  • In a possible implementation, the feature points include feature points at a first preset position and feature points at a second preset position, and the first feature point to be decoded is a feature point at the first preset position. The method includes: decoding a fourth feature map matrix from the code stream, where the fourth feature map matrix includes features obtained by performing feature extraction on the feature values corresponding to the feature points at the second preset position in the first feature map matrix; and, for a second feature point to be decoded at the second preset position: determining, based on the fourth feature map matrix, a second entropy estimation feature corresponding to the second feature point to be decoded; determining, according to the second entropy estimation feature, a probability distribution corresponding to the second feature point to be decoded; and decoding the second feature point to be decoded according to the probability distribution to obtain the corresponding feature value.
  • Since the computing power required to determine the first entropy estimation feature is greater, determining the first entropy estimation feature only for the feature points to be decoded at the first preset position (and using the second entropy estimation feature for the feature points at the second preset position) can reduce the decoding computing power and thus improve decoding efficiency.
  • The decoded information group may include the feature values corresponding to the decoded feature points at the second preset position in the channel corresponding to the first feature point to be decoded, and the feature values corresponding to the decoded feature points at the second preset position in the other k-1 channels. In this way, parallel decoding can be performed on the first feature points to be decoded at the first preset position, thereby further improving decoding efficiency.
  • The decoded information group may include the feature values corresponding to the decoded feature points at the first preset position in the channel corresponding to the first feature point to be decoded, and the feature values corresponding to the decoded feature points at the first preset position in the other k-1 channels.
  • the second aspect and any implementation manner of the second aspect correspond to the first aspect and any implementation manner of the first aspect respectively.
  • For the technical effects corresponding to the second aspect and any implementation manner of the second aspect, reference may be made to the technical effects corresponding to the above-mentioned first aspect and any implementation manner of the first aspect; details are not repeated here.
  • the embodiment of the present application provides an encoder, configured to implement the encoding method in the first aspect or any possible implementation manner of the first aspect.
  • the third aspect and any implementation manner of the third aspect correspond to the first aspect and any implementation manner of the first aspect respectively.
  • the technical effects corresponding to the third aspect and any one of the implementation manners of the third aspect refer to the above-mentioned first aspect and the technical effects corresponding to any one of the implementation manners of the first aspect, which will not be repeated here.
  • the embodiment of the present application provides a decoder, configured to perform the decoding method in the second aspect or any possible implementation manner of the second aspect.
  • the fourth aspect and any implementation manner of the fourth aspect correspond to the second aspect and any implementation manner of the second aspect respectively.
  • the technical effects corresponding to the fourth aspect and any one of the implementation manners of the fourth aspect refer to the above-mentioned second aspect and the technical effects corresponding to any one of the implementation manners of the second aspect, and details are not repeated here.
  • an embodiment of the present application provides an electronic device, including: a memory and a processor, the memory is coupled to the processor; the memory stores program instructions, and when the program instructions are executed by the processor, the electronic device executes the first aspect or An encoding method in any possible implementation manner of the first aspect.
  • the fifth aspect and any implementation manner of the fifth aspect correspond to the first aspect and any implementation manner of the first aspect respectively.
  • the technical effects corresponding to the fifth aspect and any one of the implementation manners of the fifth aspect refer to the technical effects corresponding to the above-mentioned first aspect and any one of the implementation manners of the first aspect, and details are not repeated here.
  • an embodiment of the present application provides an electronic device, including: a memory and a processor, the memory is coupled to the processor; the memory stores program instructions, and when the program instructions are executed by the processor, the electronic device executes the second aspect or The decoding method in any possible implementation manner of the second aspect.
  • the sixth aspect and any implementation manner of the sixth aspect correspond to the second aspect and any implementation manner of the second aspect respectively.
  • the technical effects corresponding to the sixth aspect and any one of the implementation manners of the sixth aspect refer to the above-mentioned second aspect and the technical effects corresponding to any one of the implementation manners of the second aspect, and details are not repeated here.
  • An embodiment of the present application provides a chip, including one or more interface circuits and one or more processors; the interface circuit is used to receive signals from the memory of the electronic device and send the signals to the processor, and the signals include computer instructions stored in the memory; when the processor executes the computer instructions, the electronic device is made to execute the encoding method in the first aspect or any possible implementation manner of the first aspect.
  • the seventh aspect and any implementation manner of the seventh aspect correspond to the first aspect and any implementation manner of the first aspect respectively.
  • the technical effects corresponding to the seventh aspect and any one of the implementation manners of the seventh aspect refer to the above-mentioned first aspect and the technical effects corresponding to any one of the implementation manners of the first aspect, and details are not repeated here.
  • An embodiment of the present application provides a chip, including one or more interface circuits and one or more processors; the interface circuit is used to receive signals from the memory of the electronic device and send the signals to the processor, and the signals include computer instructions stored in the memory; when the processor executes the computer instructions, the electronic device is made to execute the decoding method in the second aspect or any possible implementation manner of the second aspect.
  • the eighth aspect and any implementation manner of the eighth aspect correspond to the second aspect and any implementation manner of the second aspect respectively.
  • the technical effects corresponding to the eighth aspect and any one of the implementation manners of the eighth aspect refer to the above-mentioned second aspect and the technical effects corresponding to any one of the implementation manners of the second aspect, and details are not repeated here.
  • An embodiment of the present application provides a computer storage medium. The computer-readable storage medium stores a computer program, and when the computer program runs on a computer or a processor, the computer or the processor is caused to execute the encoding method in the first aspect or any possible implementation manner of the first aspect.
  • the ninth aspect and any implementation manner of the ninth aspect correspond to the first aspect and any implementation manner of the first aspect respectively.
  • the technical effects corresponding to the ninth aspect and any one of the implementation manners of the ninth aspect refer to the above-mentioned first aspect and the technical effects corresponding to any one of the implementation manners of the first aspect, and details are not repeated here.
  • An embodiment of the present application provides a computer storage medium. The computer-readable storage medium stores a computer program, and when the computer program runs on a computer or a processor, the computer or the processor is caused to execute the decoding method in the second aspect or any possible implementation manner of the second aspect.
  • the tenth aspect and any implementation manner of the tenth aspect correspond to the second aspect and any implementation manner of the second aspect respectively.
  • An embodiment of the present application provides a computer program product. The computer program product includes a software program, and when the software program is executed by a computer or a processor, the steps of the method in the first aspect or any possible implementation manner of the first aspect are executed.
  • the eleventh aspect and any implementation manner of the eleventh aspect correspond to the first aspect and any implementation manner of the first aspect respectively.
  • the technical effects corresponding to the eleventh aspect and any one of the implementation manners of the eleventh aspect refer to the above-mentioned first aspect and the technical effects corresponding to any one of the implementation manners of the first aspect, and details are not repeated here.
  • the embodiment of the present application provides a computer program product.
  • the computer program product includes a software program.
  • When the software program is executed by a computer or a processor, the steps of the method in the second aspect or any possible implementation manner of the second aspect are executed.
  • the twelfth aspect and any implementation manner of the twelfth aspect correspond to the second aspect and any implementation manner of the second aspect respectively.
  • For the technical effects corresponding to the twelfth aspect and any one of the implementation manners of the twelfth aspect, reference may be made to the technical effects corresponding to the above-mentioned second aspect and any one of the implementation manners of the second aspect; details are not repeated here.
  • Fig. 1 is a schematic diagram showing an exemplary system framework structure
  • FIG. 2 is a schematic diagram of an exemplary encoding process
  • Fig. 3a is a schematic diagram of an exemplary feature map group
  • Fig. 3b is a schematic diagram of an exemplary feature map group
  • Fig. 3c is a schematic diagram of an exemplary fusion process within a group
  • Figure 3d is a schematic diagram of an exemplary fusion process within a group
  • FIG. 4 is a schematic diagram of an exemplary decoding process
  • Fig. 5a is a schematic diagram of an exemplary decoding process
  • Fig. 5b is a schematic diagram of an exemplary decoded information group
  • Figure 5c is a schematic diagram of an exemplary fusion process within a group
  • Figure 5d is a schematic diagram of an exemplary fusion process within a group
  • Fig. 5e is a schematic diagram showing an exemplary compression effect
  • FIG. 6 is a schematic structural diagram of an exemplary encoding and decoding framework
  • FIG. 7 is a schematic diagram of an exemplary encoding process
  • FIG. 8 is a schematic diagram of an exemplary feature map division process
  • FIG. 9 is a schematic diagram of an exemplary decoding process
  • FIG. 10 is a schematic diagram of an exemplary encoding process
  • Fig. 11a is a schematic diagram of an exemplary decoding process
  • Fig. 11b is a schematic diagram showing an exemplary compression effect
  • Fig. 12 is a schematic structural diagram of an exemplary device.
  • Terms such as "first" and "second" in the description and claims of the embodiments of the present application are used to distinguish different objects, rather than to describe a specific order of the objects.
  • For example, a first target object and a second target object are used to distinguish different target objects, rather than to describe a specific order of the target objects.
  • In the embodiments of the present application, words such as "exemplary" or "for example" are used to serve as examples, illustrations, or descriptions. Any embodiment or design scheme described as "exemplary" or "for example" in the embodiments of the present application shall not be interpreted as being more preferred or more advantageous than other embodiments or design schemes. Rather, the use of words such as "exemplary" or "for example" is intended to present related concepts in a concrete manner.
  • multiple processing units refer to two or more processing units; multiple systems refer to two or more systems.
  • Fig. 1 is a schematic diagram showing an exemplary system framework structure. It should be understood that the system shown in FIG. 1 is only an example, and the system of the present application may have more or fewer components than those shown in the figure, may combine two or more components, or may have different component configuration.
  • the various components shown in FIG. 1 may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
  • the image compression process may be as follows: the image to be encoded is input to the AI encoding unit, and after being processed by the AI encoding unit, the feature value and probability distribution corresponding to the feature points to be encoded are output. Then, the feature value and probability distribution corresponding to the feature point to be encoded are input to the entropy encoding unit, and the entropy encoding unit performs entropy encoding on the feature value of the feature point to be encoded according to the probability distribution corresponding to the feature point to be encoded, and outputs a code stream.
  • Exemplarily, the image decompression process may be as follows: after the entropy decoding unit obtains the code stream, it performs entropy decoding on the feature points to be decoded in the code stream according to the probability distributions predicted by the AI decoding unit for those feature points, and outputs the feature values corresponding to the decoded feature points to the AI decoding unit.
  • the AI decoding unit performs image reconstruction based on the feature values corresponding to the decoded feature points, and outputs the reconstructed image.
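  • The following minimal sketch (an assumption about the interface, not the application's actual implementation) illustrates the division of labour shown in Fig. 1: the AI encoding unit produces quantized feature values together with a probability for each value, and the entropy encoding unit consumes both; here the entropy encoder is stubbed out by the ideal code length -log2(p), which an arithmetic coder would approach.

```python
import numpy as np

def ai_encoding_unit(image: np.ndarray):
    # Placeholder "analysis + entropy estimation": in a real system the
    # quantized values and per-value probabilities come from neural networks.
    values = np.round(image).astype(np.int32)
    probs = np.full(values.shape, 0.1)        # assumed per-symbol probabilities
    return values, probs

def entropy_encoding_unit(values: np.ndarray, probs: np.ndarray) -> float:
    # Ideal total code length (in bits) for the given probability model;
    # stands in for actual arithmetic/range coding of the values.
    return float(np.sum(-np.log2(probs)))

img = np.random.randn(8, 8) * 3
vals, p = ai_encoding_unit(img)
print("estimated code-stream size:", entropy_encoding_unit(vals, p), "bits")
```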
  • Entropy coding refers to coding that, in accordance with the principle of entropy, does not lose any information in the coding process.
  • Entropy coding includes many kinds, such as Shannon coding, Huffman coding, and arithmetic coding; this application is not limited thereto.
  • The image to be encoded that is input to the AI encoding unit may be any one of a RAW (unprocessed) image, an RGB (red, green, blue) image, or a YUV image (where "Y" indicates luminance (Luma) and "U" and "V" indicate chrominance (Chroma)), which is not limited in the present application.
  • the compression process and the decompression process may be performed in the same electronic device, or may be performed in different electronic devices, which is not limited in the present application.
  • the AI encoding unit and the AI decoding unit may be set in an NPU (Neural network Processing Unit, embedded neural network processor) or a GPU (Graphics Processing Unit, graphics processing unit).
  • Exemplarily, the entropy encoding unit and the entropy decoding unit may be set in a CPU (Central Processing Unit).
  • the present application may be applied to compressing and decompressing an independent image, and may also be applied to compressing and decompressing multiple frames of images in a video sequence, which is not limited in the present application.
  • This application can be applied to a variety of scenarios, for example, scenarios where Huawei Cloud stores (or transmits) images (or videos), video surveillance scenarios, and live broadcast scenarios; this application does not impose any limitation on this.
  • Fig. 2 is a schematic diagram of an exemplary encoding process.
  • the encoding end may obtain the image to be encoded, and then may refer to S202 to S205 to encode the image to be encoded to obtain a corresponding code stream.
  • the image to be encoded may be subjected to space transformation, and the image to be encoded may be transformed into another space, so as to reduce temporal redundancy and spatial redundancy of the image to be encoded, to obtain the first feature map matrix.
  • the first feature map matrix includes first feature maps of c channels, where c is a positive integer.
  • each first feature map may include h*w feature points.
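  • The "space transformation" of S202 is typically a learned analysis transform; the toy function below (purely illustrative, with made-up weights) only demonstrates the resulting shape change from an H*W*3 image to a first feature map matrix of c channels, each containing h*w feature points.

```python
import numpy as np

# Minimal, hypothetical stand-in for the space transformation: a fixed strided
# patch projection mapping the image to c channels of h*w feature points.
def toy_analysis_transform(image: np.ndarray, c: int = 8, stride: int = 4):
    H, W, _ = image.shape
    h, w = H // stride, W // stride
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((stride * stride * 3, c))      # assumed weights
    patches = image[:h * stride, :w * stride].reshape(h, stride, w, stride, 3)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(h, w, -1)
    fm = patches @ proj                                        # (h, w, c)
    return np.round(fm.transpose(2, 0, 1))                    # quantized, (c, h, w)

first_fm = toy_analysis_transform(np.random.rand(64, 64, 3))
print(first_fm.shape)   # (8, 16, 16): c = 8 channels of h*w feature points
```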
  • an autoregressive model may be used to determine the first entropy estimation feature corresponding to the first feature map matrix.
  • the total number of input channels of the autoregressive model is c, and the total number of output channels is M; wherein, M is a positive integer, and M can be greater than c, or less than c, or equal to c, which can be set according to requirements.
  • one input channel of the autoregressive model may correspond to at least one output channel
  • one output channel of the autoregressive model may correspond to at least one input channel.
  • The first feature maps of the c channels of the first feature map matrix can be used as c-channel inputs and input to the autoregressive model.
  • The autoregressive model can use the first feature maps of k channels in the first feature map matrix to form a feature map group; in this way, N feature map groups can be obtained, where N is an integer greater than 1 and can be determined according to k and c.
  • the number k of channels included in each feature map group may be the same or different, which is not limited in the present application.
  • Fig. 3a is a schematic diagram of an exemplary feature map group. Wherein, a rectangle in Fig. 3a represents a first feature map. Among them, the number k of channels contained in each feature map group in Figure 3a is the same.
  • each feature map group includes 2 first feature maps.
  • Fig. 3b is a schematic diagram of an exemplary feature map group. Among them, a rectangle in Figure 3b represents a first feature map, and the number of channels k contained in each feature map group in Figure 3b is different.
  • For example, among the first feature maps of the c channels, the first feature map of 1 channel may be used to form feature map group 1; the first feature maps of 2 channels may be used to form feature map group 2; the first feature maps of 3 channels may be used to form feature map group 3; and so on.
  • FIG. 3a and FIG. 3b are only an example of the present application, and k can also be set to other values according to requirements, which is not limited in the present application.
  • Different feature map groups may contain the first feature map of the same channel; for example, feature map group 1 may include the first feature map of channel 1 and the first feature map of channel 2, while feature map group 2 may include the first feature map of channel 2, the first feature map of channel 3, and the first feature map of channel 4.
  • Different feature map groups may contain first feature maps of different channels, as shown in Fig. 3a and Fig. 3b; the present application does not limit this.
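  • A small sketch of forming feature map groups as in Fig. 3a, assuming non-overlapping groups of equal size k (as noted above, the application also allows unequal or overlapping groups):

```python
import numpy as np

# Partition the first feature maps of c channels into N groups of k channels.
def make_feature_map_groups(first_fm: np.ndarray, k: int):
    c = first_fm.shape[0]
    assert c % k == 0, "for this simple sketch c must be divisible by k"
    n = c // k
    return [first_fm[i * k:(i + 1) * k] for i in range(n)]   # N groups, each (k, h, w)

fm = np.random.randn(8, 16, 16)            # c = 8 channels
groups = make_feature_map_groups(fm, k=2)  # N = 4 groups of k = 2 channels
print(len(groups), groups[0].shape)        # 4 (2, 16, 16)
```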
  • the fusion within a group may refer to fusing the feature maps of k channels in the feature map group, so that the first entropy estimation feature corresponding to the feature map group can be obtained.
  • the autoregressive model may perform intra-group fusion on the N feature map groups to obtain the first entropy estimation features respectively corresponding to the N feature map groups.
  • the determination of the first entropy estimation feature corresponding to the i-th feature map group will be described below as an example.
  • Here, i is an integer between 1 and N, inclusive.
  • each output channel corresponds to k weight maps whose size is ks1*ks2.
  • For example, assuming M_i = 1, the number of output channels corresponding to the i-th feature map group is 1, and that output channel corresponds to two weight maps of size ks1*ks2 (k = 2 in this example).
  • the autoregressive weight matrix corresponding to the i-th feature map group may be used to extract local spatial information from the i-th feature map group to obtain the first entropy estimation feature corresponding to the i-th feature map group.
  • Exemplarily, the k weight maps of the j-th output channel corresponding to the i-th feature map group can be convolved with the first feature maps of the k channels in the i-th feature map group to obtain k convolution results; the k convolution results are fused to obtain the first entropy estimation feature of the i-th feature map group corresponding to the j-th output channel. The first entropy estimation features of the i-th feature map group corresponding to the M_i output channels are then merged to obtain the first entropy estimation feature corresponding to the i-th feature map group (a sketch of this per-group fusion follows below).
  • Here, j is an integer between 1 and M_i, inclusive.
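  • The sketch below illustrates the per-group fusion just described, assuming the k convolution results are fused by summation (the application does not fix the fusion operation): the j-th output channel has k weight maps of size ks1*ks2, each is convolved with the corresponding first feature map in the group, and the k results are fused into the group's first entropy estimation feature at that output channel.

```python
import numpy as np

def correlate2d_same(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """2-D cross-correlation with zero padding (odd kernel sizes), same size as x."""
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * w)
    return out

def intra_group_fusion(group: np.ndarray, weights: np.ndarray) -> np.ndarray:
    # group: (k, h, w) first feature maps; weights: (M_i, k, ks1, ks2)
    m_i, k = weights.shape[:2]
    out = np.zeros((m_i,) + group.shape[1:])
    for j in range(m_i):
        conv_results = [correlate2d_same(group[t], weights[j, t]) for t in range(k)]
        out[j] = np.sum(conv_results, axis=0)   # fuse the k convolution results (assumed: sum)
    return out                                   # (M_i, h, w)

grp = np.random.randn(2, 6, 6)                   # k = 2 first feature maps
w = np.random.randn(3, 2, 3, 3)                  # M_i = 3 output channels, ks1 = ks2 = 3
feat = intra_group_fusion(grp, w)
print(feat.shape)                                # (3, 6, 6)
```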
  • Fig. 3c is a schematic diagram of an exemplary intra-group fusion process.
  • the number of output channels corresponding to feature map group 1 in FIG. 3c(1) is 1, and output channel 1 corresponds to two weight maps: weight map 11 and weight map 12 .
  • Exemplarily, the weight map 11 can be used to convolve with the first feature map of input channel 1 to obtain the convolution result 11, and the weight map 12 can be used to convolve with the first feature map of input channel 2 to obtain the convolution result 12.
  • the convolution result 11 and the convolution result 12 are fused to obtain the first entropy estimation feature corresponding to the output channel 1 of the feature map group 1. In this way, the first entropy estimation feature corresponding to feature map group 1 can be obtained.
  • the number of output channels corresponding to feature map group 1 in Figure 3c(2) is 2, and output channel 1 corresponds to two weight maps: weight map 11 and weight map 12, and output channel 2 corresponds to 2 weight maps: weight map 21 and weight map 22.
  • the weight map 11 can be used to convolve with the first feature map of input channel 1 to obtain the convolution result 11
  • The weight map 12 can be used to convolve with the first feature map of input channel 2 to obtain the convolution result 12.
  • the convolution result 11 and the convolution result 12 are fused to obtain the first entropy estimation feature corresponding to the output channel 1 of the feature map group 1.
  • the weight map 21 can be used to convolve with the first feature map of the input channel 1 to obtain the convolution result 13
  • the weight map 22 can be used to convolve with the first feature map of the input channel 2 to obtain the convolution result 14 .
  • the convolution result 13 and the convolution result 14 are fused to obtain the first entropy estimation feature corresponding to the output channel 2 of the feature map group 1.
  • Assuming that the number of output channels corresponding to feature map group 1 is 3: output channel 1 corresponds to 2 weight maps (weight map 11 and weight map 12), output channel 2 corresponds to 2 weight maps (weight map 21 and weight map 22), and output channel 3 corresponds to 2 weight maps (weight map 31 and weight map 32).
  • the weight map 11 can be used to convolve with the first feature map of input channel 1 to obtain the convolution result 11
  • The weight map 12 can be used to convolve with the first feature map of input channel 2 to obtain the convolution result 12.
  • the convolution result 11 and the convolution result 12 are fused to obtain the first entropy estimation feature corresponding to the output channel 1 of the feature map group 1.
  • the weight map 21 can be used to perform convolution with the first feature map of input channel 1 to obtain the convolution result 13
  • the weight map 22 can be used to perform convolution with the first feature map of input channel 2 to obtain the convolution result 14 .
  • the convolution result 13 and the convolution result 14 are fused to obtain the first entropy estimation feature corresponding to the output channel 2 of the feature map group 1.
  • the weight map 31 can be used to perform convolution with the first feature map of input channel 1 to obtain the convolution result 15
  • the weight map 32 can be used to perform convolution with the first feature map of input channel 2 to obtain the convolution result 16 .
  • the convolution result 15 and the convolution result 16 are fused to obtain the first entropy estimation feature corresponding to the output channel 3 of the feature map group 1 .
  • By merging the first entropy estimation feature of feature map group 1 corresponding to output channel 1, the first entropy estimation feature of feature map group 1 corresponding to output channel 2, and the first entropy estimation feature of feature map group 1 corresponding to output channel 3, the first entropy estimation feature corresponding to feature map group 1 can be obtained.
  • It should be noted that this application does not limit which weight map of an output channel is convolved with the first feature map of which channel in the feature map group to obtain the first entropy estimation feature of the feature map group corresponding to that output channel. For example, for feature map group 1, the weight map 12 can also be used to convolve with the first feature map of input channel 1 to obtain the convolution result 11, and the weight map 11 can be used to convolve with the first feature map of input channel 2 to obtain the convolution result 12; the convolution result 11 and the convolution result 12 are then fused to obtain the first entropy estimation feature of feature map group 1 corresponding to output channel 1.
  • the weight map 22 can be used to convolve with the first feature map of the input channel 1 to obtain the convolution result 13 and the weight map 21 can be used to convolve with the first feature map of the input channel 2 to obtain the convolution result 14; then The convolution result 13 and the convolution result 14 are fused to obtain the first entropy estimation feature corresponding to the output channel 2 of the feature map group 1.
  • the weight map 32 can be used to convolve with the first feature map of the input channel 1 to obtain the convolution result 15, and the weight map 31 can be used to convolve with the first feature map of the input channel 2 to obtain the convolution result 16; then The convolution result 15 and the convolution result 16 are fused to obtain the first entropy estimation feature corresponding to the output channel 3 of the feature map group 1.
  • Exemplarily, the weight maps of the j-th output channel corresponding to the i-th feature map group can be used to extract spatial information, respectively, from the ks1*ks2 areas centered on position (g1, g2) in the first feature maps of the k channels in the i-th feature map group, so as to obtain the first entropy estimation feature, at the j-th output channel, of the g-th group of feature points in the i-th feature map group (the g-th group consists of the feature points located at (g1, g2) in the first feature maps of the k channels, where g1 is an integer between 1 and h (inclusive), g2 is an integer between 1 and w (inclusive), and g is an integer between 1 and h*w (inclusive)).
  • Here, (g1, g2) is an integer index of a position in the first feature map, where g1 and g2 represent the coordinate indexes in the horizontal and vertical directions, respectively.
  • Specifically, the weight maps of the j-th output channel corresponding to the i-th feature map group can be convolved, respectively, with the feature values corresponding to the feature points in the ks1*ks2 areas centered on (g1, g2) in the first feature maps of the k channels in the i-th feature map group to obtain k convolution results; the k convolution results are fused to obtain the first entropy estimation feature of the g-th group of feature points in the i-th feature map group corresponding to the j-th output channel.
  • Fig. 3d is a schematic diagram of an exemplary intra-group fusion process.
  • channel 1 and channel 2 in feature map group 1 correspond to input channel 1 and input channel 2 of the autoregressive model, respectively, and the number of output channels corresponding to feature map group 1 is 2.
  • the thirteenth group of feature points includes feature point A1 in the first feature map of input channel 1 and feature point A2 in the first feature map of input channel 2 .
  • Exemplarily, a 3*3 weight map corresponding to output channel 1 can be convolved with the feature values corresponding to the feature points in the 3*3 area centered on feature point A1 in the first feature map of input channel 1 (the gray squares in Figure 3d) to obtain the convolution result 21; another 3*3 weight map corresponding to output channel 1 can be convolved with the feature values corresponding to the feature points in the 3*3 area centered on feature point A2 in the first feature map of input channel 2 to obtain the convolution result 22. The convolution result 21 and the convolution result 22 are then fused to obtain the first entropy estimation feature of the 13th group of feature points corresponding to output channel 1.
  • Similarly, a 3*3 weight map corresponding to output channel 2 can be convolved with the feature values corresponding to the feature points in the 3*3 area centered on feature point A1 in the first feature map of input channel 1 (the gray squares in Figure 3d) to obtain the convolution result 23; another 3*3 weight map corresponding to output channel 2 can be convolved with the feature values corresponding to the feature points in the 3*3 area centered on feature point A2 in the first feature map of input channel 2 (the gray squares in Figure 3d) to obtain the convolution result 24. The convolution result 23 and the convolution result 24 are then fused to obtain the first entropy estimation feature of the 13th group of feature points corresponding to output channel 2.
  • the N feature map groups are respectively fused within the group to obtain the first entropy estimation features respectively corresponding to the N feature map groups, that is, to obtain N groups of first entropy estimation features.
  • the decoder predicts the first entropy estimation feature of the feature point to be decoded according to the feature value corresponding to the decoded feature point.
  • the feature points in the area of ks1*ks2 centered on the corresponding position of the feature point to be decoded in the first feature map include decoded feature points and undecoded feature points, and undecoded feature points cannot participate in the calculation.
  • Therefore, in the weight map, the weight value at the position corresponding to an undecoded feature point is 0.
  • the decoding end decodes the feature points of each channel according to a preset decoding sequence, so it can determine which positions in the weight map have a weight value of 0.
  • the preset decoding sequence can be set according to requirements, which is not limited in this application.
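  • The following sketch illustrates the zero-weight rule described above for a raster-scan decoding order: positions below the current row, or to the right in the current row, are not yet decoded and therefore get weight 0. Whether the centre position itself is also masked is an assumption here.

```python
import numpy as np

def causal_mask(ks1: int, ks2: int) -> np.ndarray:
    # 1 at positions that are already decoded under raster-scan order, 0 elsewhere.
    mask = np.zeros((ks1, ks2))
    cy, cx = ks1 // 2, ks2 // 2
    mask[:cy, :] = 1          # rows above the centre: already decoded
    mask[cy, :cx] = 1         # same row, left of the centre: already decoded
    return mask

weight_map = np.random.randn(5, 5)
masked = weight_map * causal_mask(5, 5)   # undecoded positions contribute 0
print(causal_mask(3, 3))
# [[1. 1. 1.]
#  [1. 0. 0.]
#  [0. 0. 0.]]
```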
  • N groups of first entropy estimation features may be combined to obtain the first entropy estimation features corresponding to the first feature map matrix.
  • Exemplarily, the first entropy estimation feature corresponding to the first feature map matrix belongs to R^(c2*h*w), where c2 is determined by M (the total number of output channels of the autoregressive model).
  • the probability estimation may be performed according to the first entropy estimation feature corresponding to the first feature map matrix to obtain the probability distribution corresponding to the first feature map matrix.
  • Exemplarily, the probability distribution belongs to R^(c*h*w*P); that is to say, the number of channels of the probability distribution is the same as the number of channels of the first feature map matrix, and each feature point in the first feature map matrix corresponds to P parameters (such as mean, variance, etc.), where P is an integer greater than 0, which is not limited by the present application.
  • feature extraction may be performed on the first feature map matrix to obtain a fifth feature map matrix; then, according to the fifth feature map matrix, the second entropy estimation feature corresponding to the first feature map matrix is determined.
  • probability estimation may be performed by combining the first entropy estimation feature corresponding to the first feature map matrix and the second entropy estimation feature corresponding to the first feature map matrix, to obtain a probability distribution corresponding to the first feature map matrix.
  • the first entropy estimation feature and the second entropy estimation feature may be aggregated (such as concatenated), and probability estimation may be performed according to the aggregation result to obtain a probability distribution corresponding to the first feature map matrix.
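  • A minimal sketch of the aggregation and probability estimation just described, using concatenation along the channel axis and a placeholder linear projection (the real probability-estimation network is not specified here):

```python
import numpy as np

# Aggregate the first and second entropy estimation features and map the
# result to P parameters per feature point of the c channels.
def estimate_probability(entropy_feat_1, entropy_feat_2, c, P=2, seed=0):
    agg = np.concatenate([entropy_feat_1, entropy_feat_2], axis=0)   # (c2+c3, h, w)
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((c * P, agg.shape[0]))                # assumed weights
    h, w = agg.shape[1:]
    params = proj @ agg.reshape(agg.shape[0], -1)                    # (c*P, h*w)
    return params.reshape(c, P, h, w)    # e.g. params[:, 0] = mean, params[:, 1] = scale

f1 = np.random.randn(12, 16, 16)         # first entropy estimation feature
f2 = np.random.randn(6, 16, 16)          # second entropy estimation feature
dist = estimate_probability(f1, f2, c=8)
print(dist.shape)                         # (8, 2, 16, 16): P = 2 parameters per feature point
```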
  • Exemplarily, the first feature map matrix may be encoded according to the probability distribution corresponding to the first feature map matrix to obtain a code stream corresponding to the image to be encoded; a small sketch of the per-symbol probabilities that such entropy coding consumes follows below.
  • the code stream corresponding to the image to be encoded may be stored, or the code stream corresponding to the image to be encoded may be transmitted to the decoding end.
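  • As an illustration of what the entropy coding step consumes, the sketch below converts a (mean, variance)-style probability distribution into the probability of a quantized feature value, i.e. the Gaussian mass on the unit quantization bin; the Gaussian model and the unit bin are assumptions, not mandated by the text.

```python
from math import erf, sqrt, log2

def gaussian_cdf(x: float, mean: float, std: float) -> float:
    return 0.5 * (1.0 + erf((x - mean) / (std * sqrt(2.0))))

def symbol_probability(v: int, mean: float, std: float) -> float:
    # Probability mass of the quantized value v on the bin [v - 0.5, v + 0.5].
    return gaussian_cdf(v + 0.5, mean, std) - gaussian_cdf(v - 0.5, mean, std)

p = symbol_probability(1, mean=0.7, std=1.2)
print(f"P(v=1) = {p:.4f}, ideal code length = {-log2(p):.2f} bits")
```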
  • Exemplarily, when the fifth feature map matrix is determined, the fifth feature map matrix may be encoded to obtain a code stream corresponding to the fifth feature map matrix. The code stream corresponding to the fifth feature map matrix can then be stored, or transmitted to the decoding end.
  • In this way, the amount of invalid information introduced can be reduced, thereby reducing the coding computing power required and improving coding efficiency. In addition, the quality of the reconstructed image can also be improved.
  • Fig. 4 is a schematic diagram of an exemplary decoding process.
  • the decoding end can obtain the code stream, and then can decode the code stream, and refer to the following S402-S403.
  • S402. Decode the feature values corresponding to the feature points of c channels from the code stream to obtain a first feature map matrix.
  • Exemplarily, the code stream may include the encoding information corresponding to each feature point in the first feature maps of the c channels, and the encoding information corresponding to each feature point may be decoded to obtain the feature value corresponding to that feature point; the feature values corresponding to all feature points form the first feature map matrix.
  • the decoding end may perform parallel decoding or serial decoding on the feature points of different channels.
  • the feature points of the same channel at the decoding end may be serially decoded or parallelly decoded, which is not limited in the present application.
  • Fig. 5a is a schematic diagram of an exemplary decoding process.
  • Fig. 5a shows a preset decoding sequence for feature points in a channel.
  • Fig. 5a is an example of a first feature map of one channel in the first feature map matrix, and the size of the first feature map is 10*10, wherein each square represents a feature point.
  • Exemplarily, the decoder can decode the feature points in the first feature map sequentially according to the order shown in Fig. 5a: the feature points in the first row are decoded from left to right; after all the feature points in the first row are decoded, the feature points in the second row can be decoded from left to right, and so on, until the decoding of all the feature points in the first feature map matrix is completed.
  • It should be noted that this application can also use preset decoding sequences other than the one shown in Figure 5a to serially decode each feature point; the sequence can be set according to requirements, which is not limited in this application.
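  • The following sketch shows the serial, raster-scan decoding order of Fig. 5a for one channel; the prediction and entropy-decoding steps are placeholders, not the application's actual model.

```python
import numpy as np

# Visit feature points row by row, left to right; each point is decoded using
# a probability distribution predicted from already-decoded neighbours.
def decode_channel(h: int, w: int, predict, decode_symbol) -> np.ndarray:
    fm = np.zeros((h, w))
    decoded = np.zeros((h, w), dtype=bool)
    for y in range(h):                      # first row, second row, ...
        for x in range(w):                  # left to right within a row
            prob = predict(fm, decoded, y, x)   # uses decoded points only
            fm[y, x] = decode_symbol(prob)      # entropy-decode the value
            decoded[y, x] = True
    return fm

# Toy usage with trivial stand-ins for prediction and entropy decoding.
recon = decode_channel(4, 4,
                       predict=lambda fm, d, y, x: 0.5,
                       decode_symbol=lambda p: 0.0)
print(recon.shape)   # (4, 4)
```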
  • all the feature points in the first feature map matrix may be sequentially determined as the first feature points to be decoded according to the preset decoding order, and then decoding may be performed on the first feature points to be decoded referring to S4021-S4024.
  • The decoded information group corresponding to the first feature point to be decoded is obtained, where the decoded information group includes the decoded information of the channel corresponding to the first feature point to be decoded and the decoded information of other k-1 channels, and k is a positive integer less than c.
  • the decoded information may include feature values corresponding to the decoded feature points.
  • the decoder may use the feature value corresponding to the feature point as an input to the autoregressive model.
  • the feature values of all decoded feature points of the same channel are input from the same channel of the autoregressive model.
  • the corresponding probability distribution may be determined according to the feature value corresponding to the decoded feature point.
  • The autoregressive model can use the decoded information of k channels (that is, the feature values corresponding to the decoded feature points, where k is a positive integer less than c) to form a decoded information group; in this way, N decoded information groups can be obtained, where N is an integer greater than 1 and can be determined according to k and c.
  • The number k of channels included in each decoded information group may be the same or different, which is not limited in the present application.
  • Fig. 5b is a schematic diagram of an exemplary decoded information group. Wherein, it is assumed that the decoding end performs parallel decoding on c channels, and performs serial decoding on the feature points of each channel according to the preset decoding sequence in FIG. 5a.
  • a large square in Figure 5b represents a first feature map, a small square in the large square represents a feature point, a gray rectangle represents a decoded feature point, and a white rectangle represents an undecoded feature point.
  • each decoded information group includes 2 channels of decoded information.
  • the number k of channels included in each decoded information group may also be different, which is not limited in this application.
  • the way that the decoding end uses the decoded information of c channels to form N decoded information groups is the same as the way that the encoding end uses the first feature maps of c channels to form N feature map groups.
  • Exemplarily, the channel where the first feature point to be decoded is located can be determined, and then, from the N decoded information groups, the decoded information group to which that channel belongs can be determined.
  • the decoded information group to which the channel where the first feature point to be decoded belongs can be referred to as the i-th decoded information group.
  • the channels included in the i-th decoded information group are: the channel corresponding to the first feature point to be decoded and other k-1 channels; the i-th decoded information group includes the channel corresponding to the first feature point to be decoded The decoded information and the decoded information of other k-1 channels.
  • the autoregressive weight matrix corresponding to the i-th decoded information group may be used to extract local spatial information from the i-th decoded information group to obtain the first entropy estimation feature corresponding to the first feature point to be decoded.
  • Exemplarily, the k weight maps of the j-th output channel corresponding to the i-th decoded information group can be convolved with the decoded information of the k channels in the i-th decoded information group to obtain k convolution results; the k convolution results are fused to obtain the first entropy estimation feature of the i-th decoded information group corresponding to the j-th output channel. The first entropy estimation features of the i-th decoded information group corresponding to the M_i output channels are combined to obtain the first entropy estimation feature corresponding to the i-th decoded information group.
  • Here, j is an integer between 1 and M_i, inclusive.
  • the first entropy estimation feature corresponding to the i-th decoded information group is the first entropy estimation feature corresponding to the first feature point to be decoded.
  • Fig. 5c is a schematic diagram of an exemplary intra-group fusion process.
  • the number of output channels corresponding to decoded information group 1 in Figure 5c is 2, output channel 1 corresponds to 2 weight maps: weight map 11 and weight map 12, and output channel 2 corresponds to 2 weight maps: Weight Map 21 and Weight Map 22.
  • Exemplarily, the weight map 11 can be used to convolve with the decoded information of input channel 1 to obtain the convolution result 11, and the weight map 12 can be used to convolve with the decoded information of input channel 2 to obtain the convolution result 12.
  • the convolution result 11 and the convolution result 12 are fused to obtain the first entropy estimation feature corresponding to the output channel 1 of the ith decoded information group.
  • the weight map 21 can be used to convolve with the decoded information of the input channel 1 to obtain the convolution result 13
  • the weight map 22 can be used to perform convolution with the decoded information of the input channel 2 to obtain the convolution result 14 . Then, the convolution result 13 and the convolution result 14 are fused to obtain the first entropy estimation feature corresponding to the output channel 2 of the ith decoded information group.
  • this application does not limit which weight map of an output channel is used to convolve with the decoded information of which channel in the decoded information group to obtain the first entropy estimation feature of that output channel.
  • for example, it is also possible to use the weight map 12 to convolve with the decoded information of input channel 1 to obtain convolution result 11, and to use the weight map 11 to convolve with the decoded information of input channel 2 to obtain convolution result 12; the convolution result 11 and the convolution result 12 are then fused to obtain the first entropy estimation feature corresponding to output channel 1 of the i-th decoded information group.
  • similarly, the weight map 22 can be used to convolve with the decoded information of input channel 1 to obtain convolution result 13, and the weight map 21 can be used to convolve with the decoded information of input channel 2 to obtain convolution result 14; the convolution result 13 and the convolution result 14 are then fused to obtain the first entropy estimation feature corresponding to output channel 2 of the i-th decoded information group.
  • for example, the weight map of the j-th output channel corresponding to the i-th decoded information group can be used to extract spatial information from the decoded information in the ks1*ks2 area centered on (g1, g2) (the position corresponding to the first feature point to be decoded) in each of the k channels of the i-th decoded information group, so as to obtain the first entropy estimation feature, in the j-th output channel, of the g-th group of feature points of the i-th decoded information group. The g-th group of feature points consists of the k undecoded feature points located at (g1, g2) in the k channels contained in the i-th decoded information group, where g1 is an integer between 1 and h (including 1 and h), g2 is an integer between 1 and w (including 1 and w), and g is an integer between 1 and h*w (including 1 and h*w).
  • in this way, the first entropy estimation feature, in the j-th output channel, of the first feature point to be decoded in the i-th decoded information group can be obtained.
  • specifically, the weight map of the j-th output channel corresponding to the i-th decoded information group can be used to convolve with the decoded information in the ks1*ks2 area of each of the k channels, so as to obtain k convolution results; the k convolution results are fused to obtain the first entropy estimation feature, in the j-th output channel, of the first feature point to be decoded in the i-th decoded information group.
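  • A per-point variant can be sketched as below; it assumes odd kernel sizes and represents the "decoded" positions with a 0/1 mask (both are illustrative assumptions, not details taken from the embodiment):

```python
import numpy as np

def entropy_feature_at_point(group_info, decoded_mask, weight_maps, g1, g2):
    """First entropy estimation feature of one feature point, for all M output channels.

    group_info:   (k, h, w) feature values of the k channels in the group
                  (values at undecoded positions are ignored thanks to the mask).
    decoded_mask: (k, h, w) 1.0 where a feature point has already been decoded, else 0.0.
    weight_maps:  (M, k, ks1, ks2) weight maps of the M output channels.
    (g1, g2):     position of the first feature point to be decoded.
    """
    M, k, ks1, ks2 = weight_maps.shape
    assert ks1 % 2 == 1 and ks2 % 2 == 1, "odd kernel sizes assumed in this sketch"
    _, h, w = group_info.shape
    feat = np.zeros(M)
    for j in range(M):
        for c in range(k):
            for dy in range(-(ks1 // 2), ks1 // 2 + 1):
                for dx in range(-(ks2 // 2), ks2 // 2 + 1):
                    y, x = g1 + dy, g2 + dx
                    if 0 <= y < h and 0 <= x < w:  # positions outside the map contribute 0
                        feat[j] += (group_info[c, y, x] * decoded_mask[c, y, x]
                                    * weight_maps[j, c, dy + ks1 // 2, dx + ks2 // 2])
    return feat
```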
  • Fig. 5d is a schematic diagram of an intra-group fusion process exemplarily shown.
  • channel 1 and channel 2 in the decoded information group 1 correspond to input channel 1 and input channel 2 of the autoregressive model respectively, and the number of output channels corresponding to the decoded information group 1 is 2.
  • the gray squares in input channel 1 and input channel 2 are decoded feature points, and the white squares are undecoded feature points.
  • for the first feature point to be decoded A1, a 3*3 weight map corresponding to output channel 1 can be used to convolve with the feature values corresponding to the decoded feature points in the 3*3 area centered on feature point A1 of input channel 1 (Area1 in Fig. 5d) to obtain convolution result 21, and another 3*3 weight map corresponding to output channel 1 can be used to convolve with the feature values corresponding to the decoded feature points in the 3*3 area centered on feature point A2 of input channel 2 (Area2 in Fig. 5d) to obtain convolution result 22; the convolution result 21 and the convolution result 22 are then fused to obtain the first entropy estimation feature of the first feature point to be decoded A1 in output channel 1.
  • similarly, a 3*3 weight map corresponding to output channel 2 can be used to convolve with the feature values corresponding to the decoded feature points in the 3*3 area centered on feature point A1 of input channel 1 (Area1 in Fig. 5d) to obtain convolution result 23, and another 3*3 weight map corresponding to output channel 2 can be used to convolve with the feature values corresponding to the decoded feature points in the 3*3 area centered on feature point A2 of input channel 2 (Area2 in Fig. 5d) to obtain convolution result 24; the convolution result 23 and the convolution result 24 are then fused to obtain the first entropy estimation feature of the first feature point to be decoded A1 in output channel 2.
  • the first entropy estimation feature corresponding to the first feature point to be decoded is determined.
  • the probability estimation may be performed according to the first entropy estimation feature corresponding to the first feature point to be decoded, to obtain the probability distribution corresponding to the first feature point to be decoded.
  • the probability distribution corresponding to the first feature point to be decoded corresponds to a set of probability distribution parameters.
  • each group of probability distribution parameters may include at least one parameter, such as mean value, variance, etc., which is not limited in the present application.
  • the decoding end can extract the fifth feature map matrix from the code stream. Then, according to the fifth feature map matrix, the second entropy estimation features corresponding to all the feature points in the code stream can be determined.
  • the second entropy estimation feature corresponding to the first feature point to be decoded can be determined from the second entropy estimation features corresponding to all the feature points in the code stream, and probability estimation can then be performed by combining the first entropy estimation feature corresponding to the first feature point to be decoded with this second entropy estimation feature, to obtain the probability distribution corresponding to the first feature point to be decoded.
  • the first entropy estimation feature and the second entropy estimation feature corresponding to the first feature point to be decoded can be aggregated (for example, spliced), and the probability estimation is performed according to the aggregation result to obtain the probability distribution corresponding to the first feature point to be decoded.
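  • One possible realization of the aggregation and probability estimation, sketched with placeholder layer sizes (the actual network structure is not specified here), is to concatenate the two entropy estimation features and map them to two probability parameters per feature point:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Aggregation(nn.Module):
    """Concatenate phi (first entropy estimation feature) and psi (second entropy
    estimation feature) and predict two probability parameters per feature point."""

    def __init__(self, phi_channels, psi_channels, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(phi_channels + psi_channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2, kernel_size=1),  # e.g. mean and scale per feature point
        )

    def forward(self, phi, psi):
        params = self.net(torch.cat([phi, psi], dim=1))  # splice, then estimate
        mean, scale = params.chunk(2, dim=1)
        return mean, F.softplus(scale)  # keep the scale parameter positive
```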
  • the first feature point to be decoded can be decoded according to the probability distribution corresponding to the first feature point to be decoded to obtain the feature value corresponding to the first feature point to be decoded. At this time, the first feature point to be decoded becomes a decoded feature point.
  • the decoding end may perform spatial inverse transformation on the first feature map matrix to perform image reconstruction to obtain a reconstructed image.
  • in this way, the entropy estimation feature corresponding to the feature point to be decoded is determined using the decoded information of only some of the channels; compared with determining the entropy estimation feature using the decoded information of all the channels, this can reduce the introduction of invalid information, thereby reducing the decoding computing power and improving decoding efficiency.
  • Fig. 5e is a schematic diagram showing an exemplary compression effect.
  • the ordinate in Fig. 5e is the PSNR (Peak Signal to Noise Ratio), in dB (decibel), which can be used to characterize the quality of image reconstruction: the larger the PSNR, the higher the image reconstruction quality.
  • the abscissa is bits per pixel (the number of bits used to store each pixel; the smaller this value, the stronger the compression), and the unit is BPP (bit/pixel).
  • the dotted curve is the relationship curve between the image reconstruction quality of the present application and the size of the code stream, and the solid curve is that of the prior art. Comparing the two curves, it can be seen that when the size of the code stream is the same, the compression/decompression scheme of the present application achieves higher image reconstruction quality.
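  • For reference, the two axes of Fig. 5e can be computed as in the following sketch (straightforward definitions, not code from the embodiment):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; larger values mean better reconstruction."""
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def bits_per_pixel(code_stream_num_bytes, height, width):
    """Average number of bits spent per pixel; smaller values mean stronger compression."""
    return 8.0 * code_stream_num_bytes / (height * width)
```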
  • the following description takes, as an example, the case where the feature map matrix Y2 represents the first feature map matrix, the feature map matrix Z2 represents the fifth feature map matrix, the entropy estimation feature phi represents the first entropy estimation feature, and the entropy estimation feature psi represents the second entropy estimation feature.
  • FIG. 6 is a schematic structural diagram of an encoding and decoding framework exemplarily shown.
  • the encoding network, quantization unit D1, autoregressive unit, aggregation unit, super-encoding network, quantization unit D2, super-decoding network and probability estimation unit belong to the AI encoding unit in FIG. 1 .
  • the decoding network, autoregressive unit, aggregation unit, super decoding network and probability estimation unit belong to the AI decoding unit in FIG. 1 .
  • the entropy coding unit A1 and the entropy coding unit B1 belong to the entropy coding unit in FIG. 1 .
  • the entropy decoding unit A2 and the entropy decoding unit B2 belong to the entropy decoding unit in FIG. 1 .
  • the AI coding unit and the AI decoding unit can be jointly trained, so that each network and unit in the AI coding unit and the AI decoding unit can learn corresponding parameters.
  • the autoregressive unit, aggregation unit, super decoding network and probability estimation unit in the AI coding unit, and the autoregressive unit, aggregation unit, super decoding network and probability estimation unit in the AI decoding unit can be shared.
  • the encoding network can be used to perform spatial transformation on the image to be encoded, and transform the image to be encoded to another space.
  • the encoding network may be a convolutional neural network.
  • the super-encoding network can be used to extract features.
  • the super-encoding network may be a convolutional neural network.
  • the quantization unit (including the quantization unit D1 and the quantization unit D2 ) may be used to perform quantization processing.
  • the aggregation unit can be used to perform probability estimation based on entropy estimation features, and output probability distribution.
  • the aggregation unit may be a convolutional neural network.
  • the probability estimation unit can be used for probability estimation and output probability distribution.
  • the probability estimation unit C2 may be a discrete probability estimation unit.
  • the entropy encoding unit A1 may be configured to perform encoding according to the probability distribution determined by the aggregation unit, so as to reduce the statistical redundancy of the output features.
  • the entropy encoding unit B1 may be configured to perform encoding according to the probability distribution determined by the probability estimation unit, so as to reduce the statistical redundancy of the output features.
  • the entropy decoding unit A2 may be configured to perform decoding according to the probability distribution determined by the aggregation unit.
  • the entropy decoding unit B2 may be configured to perform decoding according to the probability distribution determined by the probability estimation unit.
  • the decoding network can be used to perform inverse spatial transformation on information obtained by entropy decoding, and output a reconstructed image.
  • the decoding network may be a convolutional neural network.
  • a superdecoding network may be used to determine features associated with entropy estimation.
  • the super-decoding network can be a convolutional neural network.
  • the autoregressive unit may include an autoregressive model, configured to determine entropy estimation features according to the autoregressive weight matrix.
  • the image to be encoded is input to the encoding network, the image to be encoded is transformed into another space through the encoding network, and the feature map matrix Y1 is output.
  • the feature map matrix Y1 is input to the quantization unit D1, the feature map matrix Y1 is quantized through the quantization unit D1, and the feature map matrix Y2 is output.
  • the quantization unit D1 may perform quantization processing on the feature value corresponding to each feature point in the feature map matrix Y1 according to a preset quantization step size to obtain a feature map matrix Y2 ⁇ R c*h*w .
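  • A minimal sketch of such step-based quantization (the step value itself is a placeholder, not a value taken from the embodiment):

```python
import numpy as np

def quantize(y1, step):
    """Quantize each feature value of Y1 with a preset quantization step to obtain Y2."""
    return np.round(y1 / step).astype(np.int32)
```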
  • the feature map matrix Y2 is input to the super-encoding network, and feature extraction is performed on the feature map matrix Y2 through the super-encoding network to obtain the feature map matrix Z1, and then the feature map matrix Z1 is input to the quantization unit D2. After quantization processing is performed on the feature map matrix Z1 by the quantization unit D2, the feature map matrix Z2 is output. Then, on the one hand, the feature map matrix Z2 is input to the probability estimation unit, processed by the probability estimation unit, and the probability distribution PB1 of each feature point in the feature map matrix Z2 is output to the entropy coding unit B1. On the other hand, the feature map matrix Z2 is input to the entropy encoding unit B1.
  • the entropy encoding unit B1 encodes the feature map matrix Z2 according to the probability distribution PB1, and outputs the code stream SB to the entropy decoding unit B2.
  • the probability estimation unit can predict the probability distribution PB2 of the feature points to be decoded in the code stream SB, and input the probability distribution PB2 to the entropy decoding unit B2.
  • the entropy decoding unit B2 can decode the feature points to be decoded in the code stream SB according to the probability distribution PB2, and output the feature map matrix Z2 to the super decoding network.
  • the super-decoding network obtains the feature map matrix Z2, it can convert the feature map matrix Z2 into the entropy estimation feature psi, and input the entropy estimation feature psi to the aggregation unit.
  • the feature map matrix Y2 may be input to the autoregressive unit, the feature map matrix Y2 is processed through the autoregressive unit, and the entropy estimation feature phi is output to the aggregation unit. This process can refer to the description above, and will not be repeated here.
  • the aggregation unit can perform probability estimation based on the entropy estimation feature phi and the entropy estimation feature psi, predict the probability distribution PA1 corresponding to each feature point in the feature map matrix Y2, and input the probability distribution PA1 to the entropy encoding unit A1.
  • the entropy encoding unit A1 can encode each feature point in the feature map matrix Y2 according to the probability distribution PA1, and output a code stream SA. So far, the encoding of the image to be encoded is completed.
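  • The encoding-side data flow just described can be summarized by the sketch below; `nets` is a hypothetical bundle of the trained components of Fig. 6, and every function name and signature is an assumption used only to show the order of operations:

```python
def encode_image(image, nets):
    """Encoding-side flow of Fig. 6; `nets` bundles the trained components."""
    y1 = nets.encoding_network(image)              # spatial transform of the image to be encoded
    y2 = nets.quantization_d1(y1)                  # feature map matrix Y2
    z1 = nets.super_encoding_network(y2)           # feature extraction
    z2 = nets.quantization_d2(z1)                  # feature map matrix Z2
    pb1 = nets.probability_estimation(z2)          # probability distribution PB1
    stream_sb = nets.entropy_encoding_b1(z2, pb1)  # code stream SB
    psi = nets.super_decoding_network(z2)          # second entropy estimation feature psi
    phi = nets.autoregressive(y2)                  # first entropy estimation feature phi
    pa1 = nets.aggregation(phi, psi)               # probability distribution PA1
    stream_sa = nets.entropy_encoding_a1(y2, pa1)  # code stream SA
    return stream_sa, stream_sb
```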
  • both the code stream SA obtained by encoding the feature map matrix Y2 and the code stream SB obtained by encoding the feature map matrix Z2 can be sent to the entropy decoding unit A2. Or, during decoding, the code stream SA and the code stream SB are obtained by the entropy decoding unit A2.
  • the decoding process can be as follows:
  • the entropy decoding unit A2 first decodes the feature map matrix Z2 from the code stream SB, and sends the feature map matrix Z2 to the super decoding network. Then, the super-decoding network converts the feature map matrix Z2 into an entropy estimation feature psi, which is output to the aggregation unit.
  • the code stream SA contains the encoding information of each feature point in the feature map matrix Y2
  • the entropy decoding unit A2 decodes the encoding information of each feature point in the code stream SA to obtain the feature value corresponding to each feature point, and thus the feature map matrix Y2 is obtained.
  • the entropy decoding unit A2 can input the feature values corresponding to the decoded feature points to the autoregressive unit, and determine, via the autoregressive unit, the entropy estimation feature phi corresponding to the first feature point to be decoded, which can refer to the description above and will not be repeated here; the entropy estimation feature phi is then output to the aggregation unit. Then, the aggregation unit performs probability estimation based on the entropy estimation feature phi and the entropy estimation feature psi, predicts the probability distribution PA2 corresponding to the first feature point to be decoded, and inputs the probability distribution PA2 to the entropy decoding unit A2.
  • the entropy decoding unit A2 may decode the first feature point to be decoded according to the probability distribution PA2 corresponding to the first feature point to be decoded, to obtain the corresponding feature value.
  • the entropy decoding unit A2 can decode the code stream SA, output the feature map matrix Y2 to the decoding network, and perform spatial inverse transformation on the feature map matrix Y2 through the decoding network to obtain a reconstructed image.
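  • The decoding of the code stream SA is sequential in the sense that each feature point is decoded only after the decoded values it depends on are available; a hedged sketch of this loop (all interfaces are hypothetical, introduced only for illustration) is:

```python
def decode_stream_sa(stream_sa, psi, nets, shape):
    """Sequentially decode the feature map matrix Y2 from code stream SA and reconstruct."""
    y2 = nets.empty_feature_map(shape)                # will hold the decoded feature values
    for point in nets.decoding_order(shape):          # visit the feature points to be decoded
        phi = nets.autoregressive_at(y2, point)       # first entropy estimation feature phi
        pa2 = nets.aggregation_at(phi, psi, point)    # probability distribution PA2
        y2[point] = nets.entropy_decoding_a2(stream_sa, pa2)  # point becomes decoded
    return nets.decoding_network(y2)                  # inverse spatial transform -> image
```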
  • the entropy decoding unit A2 may decode the feature points of different channels in parallel or serially.
  • the entropy decoding unit A2 can decode the feature points of the same channel serially or in parallel, which is not limited in this application.
  • the feature map matrix Y1 can also be input to the super-encoding network, and the feature map matrix Z2 can be obtained through the super-encoding network and the quantization unit D2, which is not limited in this application.
  • the networks and units in the dotted-line box on the right side of Fig. 6 may also be other networks and other units, which can be set according to requirements; this application does not limit this.
  • the AI encoding unit, AI decoding unit, entropy encoding unit and entropy decoding unit of the present application may also include other networks and units for generating other entropy estimation features, and then input other entropy estimation features into the aggregation unit, the aggregation unit performs probability estimation according to the entropy estimation feature phi, the entropy estimation feature psi and other entropy estimation features to generate a probability distribution, which is not limited in the present application.
  • the AI encoding unit, AI decoding unit, entropy encoding unit, and entropy decoding unit of this application may not include the networks and units in the dotted-line box on the right side of Figure 6, which can be set according to requirements; the embodiment of this application does not limit this.
  • when the AI encoding unit, AI decoding unit, entropy encoding unit, and entropy decoding unit do not include the networks and units in the dotted-line box on the right side of Figure 6, there is no need to generate the entropy estimation feature psi during the encoding and decoding process, and the aggregation unit can perform probability estimation according to the entropy estimation feature phi only.
  • Fig. 7 is a schematic diagram of an encoding process exemplarily shown.
  • the first feature map matrix may include a second feature map matrix and a third feature map matrix, wherein the second feature map matrix includes second feature maps of c channels, and the third feature map matrix includes c channels of The third feature map.
  • the second feature map and the third feature map of each channel are added to obtain the first feature map of the channel.
  • the first feature map matrix may be divided spatially to obtain the second feature map matrix and the third feature map matrix. It should be understood that other manners may also be used to determine the second feature map matrix and the third feature map matrix, which is not limited in the present application; the present application takes the spatial division of the first feature map matrix into the second feature map matrix and the third feature map matrix as an example for illustration.
  • the second feature map matrix includes second feature maps of c channels
  • the third feature map matrix includes third feature maps of c channels.
  • the first feature map matrix may be space-divided to obtain the second feature map matrix and the third feature map matrix.
  • performing spatial division on the first feature map matrix may refer to dividing the first feature map of each channel into a second feature map and a third feature map according to a preset division rule. In this way, a second feature map matrix including the second feature map of c channels, and a third feature map matrix including the third feature map of c channels can be obtained.
  • the preset division rule can be set according to requirements; for example, the feature points at the second preset position in the first feature map are divided into feature points of the second feature map, and the feature points at the first preset position in the first feature map are divided into feature points of the third feature map, which is not limited in the present application.
  • in the second feature map, the feature value corresponding to the feature point at the second preset position is the feature value corresponding to the feature point at the second preset position in the first feature map, and the feature value corresponding to the feature point at the first preset position is 0.
  • in the third feature map, the feature value corresponding to the feature point at the first preset position is the feature value corresponding to the feature point at the first preset position in the first feature map, and the feature value corresponding to the feature point at the second preset position is 0.
  • the first preset position and the second preset position can be set according to requirements. For example, assuming that the position of a feature point in the first feature map matrix is (wi, hi), the second preset position can be the positions where wi+hi is an odd number and the first preset position the positions where wi+hi is an even number, which is not limited in this application.
  • alternatively, the second preset position can be the positions where wi is an odd number and the first preset position the positions where wi is an even number; or the second preset position can be the positions where hi is an odd number and the first preset position the positions where hi is an even number, which is not limited in this application.
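  • Taking the rule "wi+hi odd is the second preset position, wi+hi even is the first preset position" as an example, the spatial division can be sketched as follows (0-based indices are used here, which only flips which half of the checkerboard is which):

```python
import numpy as np

def split_feature_map_matrix(first_fm):
    """Split the first feature map matrix (c, h, w) into the second and third
    feature map matrices using the wi+hi parity rule (checkerboard-style split)."""
    c, h, w = first_fm.shape
    hi, wi = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    second_mask = (wi + hi) % 2 == 1    # second preset positions
    first_mask = ~second_mask           # first preset positions
    second_fm = first_fm * second_mask  # feature values kept only at second preset positions
    third_fm = first_fm * first_mask    # feature values kept only at first preset positions
    return second_fm, third_fm
```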
  • FIG. 8 is a schematic diagram of a feature map division process schematically shown.
  • the size of the first feature map is 5*5. If the first preset position is the positions where wi+hi is an even number and the second preset position the positions where wi+hi is an odd number, then the 2nd and 4th feature points in the first row of the first feature map, the 1st, 3rd and 5th feature points in the second row, the 2nd and 4th feature points in the third row, the 1st, 3rd and 5th feature points in the fourth row, and the 2nd and 4th feature points in the fifth row are determined as the second feature map.
  • the feature points at the first preset position are shown as gray squares in FIG. 7 .
  • the 1st, 3rd and 5th feature points in the first row of the first feature map, the 2nd and 4th feature points in the second row, the 1st, 3rd and 5th feature points in the third row, the 2nd and 4th feature points in the fourth row, and the 1st, 3rd and 5th feature points in the fifth row are determined as the third feature map.
  • the feature points at the second preset position are shown as gray squares in FIG. 7 .
  • feature extraction can be performed on the second feature map matrix to obtain a fourth feature map matrix; then, according to the fourth feature map matrix, the second entropy estimate corresponding to the second feature map matrix is determined feature; and then estimate the feature according to the second entropy to determine the probability distribution corresponding to the second feature map matrix.
  • the encoding end may also encode the fourth feature map matrix to obtain a code stream corresponding to the fourth feature map matrix.
  • the probability distribution corresponding to the third feature map matrix can be determined according to the above method of determining the probability distribution corresponding to the first feature map matrix; refer to S705:
  • an autoregressive model may be used to determine the first entropy estimation feature corresponding to the third feature map matrix.
  • the second feature maps of the c channels of the second feature map matrix may be input to the autoregressive model as c channels of input.
  • the autoregressive model can use the second feature maps of k channels in the second feature map matrix to form a feature map group; in this way, N feature map groups can be obtained, where N is an integer greater than 1, which can be determined according to k and c.
  • the number k of channels contained in each feature map group may be the same or different, and this application is not limited thereto.
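  • The grouping of the c channels into N groups can be illustrated by the sketch below; equal group sizes are assumed for simplicity, although, as stated above, the groups may also differ in size:

```python
def group_channels(c, k):
    """Partition the channel indices 0..c-1 into N = c // k groups of k channels each
    (equal group sizes assumed; the text above also allows groups of different sizes)."""
    assert c % k == 0, "unequal group sizes would require a different split"
    return [list(range(i, i + k)) for i in range(0, c, k)]

# Example: c = 8 channels and k = 2 channels per group give N = 4 groups:
# group_channels(8, 2) == [[0, 1], [2, 3], [4, 5], [6, 7]]
```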
  • the autoregressive model performs intra-group fusion on the feature map group composed of the second feature maps of k channels, and obtains the first entropy estimation feature corresponding to the feature map group composed of the third feature maps of the same k channels; the first entropy estimation feature corresponding to the third feature map matrix is then determined.
  • the third feature map matrix does not need to be input to the autoregressive model, but the third feature maps of k channels in the third feature map matrix can still be used to form a feature map group; in this way, N feature map groups can be obtained, where N is an integer greater than 1, which can be determined according to k and c.
  • the number k of channels included in each feature map group may be the same or different, which is not limited in the present application.
  • the feature map group composed of the second feature maps of k channels can be called feature map group A,
  • and the feature map group composed of the third feature maps of the same k channels can be called feature map group B.
  • the N feature map groups A and the N feature map groups B are in one-to-one correspondence, that is, each feature map group A and its corresponding feature map group B contain the same channels.
  • each output channel corresponds to k weight maps whose size is ks1*ks2.
  • for example, taking the case where Mi is 1 and k is 2 as an example, the number of output channels corresponding to the i-th feature map group A is 1, and that output channel corresponds to two weight maps of size ks1*ks2.
  • the autoregressive weight matrix corresponding to the i-th feature map group A can be used to extract the local spatial information of the i-th feature map group A, so as to obtain the first entropy estimation feature corresponding to the i-th feature map group B.
  • the weight map of the j-th output channel corresponding to the i-th feature map group A can be used to convolve with the second feature maps of the k channels in the i-th feature map group A to obtain k convolution results; the k convolution results are fused to obtain the first entropy estimation feature of the i-th feature map group B corresponding to the j-th output channel.
  • j is a number between 1 and M i , including 1 and M i . This can refer to the above description of "determining the i-th feature map group corresponds to the first entropy estimation feature of the j-th output channel", and will not be repeated here.
  • the manner of determining the first entropy estimation feature of each group of feature points in the i-th feature map group B in the j-th output channel can refer to the above description, and will not be repeated here.
  • the probability estimation may be performed according to the first entropy estimation feature corresponding to the third feature map matrix to obtain the probability distribution corresponding to the third feature map matrix.
  • the probability distribution belongs to R^(c*h*w*P); that is to say, the number of channels of the probability distribution is the same as the number of channels of the third feature map matrix, and each feature point in the third feature map matrix corresponds to P parameters (such as mean, variance, etc.), where P is an integer greater than 0, which is not limited by the present application.
  • feature extraction may be performed on the third feature map matrix to obtain a sixth feature map matrix; then, according to the sixth feature map matrix, the second entropy estimation feature corresponding to the third feature map matrix is determined.
  • probability estimation may be performed by combining the first entropy estimation feature corresponding to the third feature map matrix and the second entropy estimation feature corresponding to the third feature map matrix to obtain a probability distribution corresponding to the third feature map matrix.
  • the first entropy estimation feature and the second entropy estimation feature may be aggregated (such as concatenated), and probability estimation may be performed according to the aggregation result to obtain a probability distribution corresponding to the third feature map matrix.
  • when the sixth feature map matrix is determined, the sixth feature map matrix may be encoded to obtain a code stream corresponding to the sixth feature map matrix.
  • the code stream corresponding to the image to be encoded, the code stream corresponding to the fourth feature map matrix, and the code stream corresponding to the sixth feature map matrix can be stored, or the code stream corresponding to the image to be encoded, the code stream corresponding to the fourth feature map matrix, and the code stream corresponding to the sixth feature map matrix can be sent to the decoding end.
  • Fig. 9 is a schematic diagram of a decoding process exemplarily shown.
  • the decoder can decode the fourth feature map matrix from the code stream; then, according to the fourth feature map matrix, decode from the code stream the feature values corresponding to the feature points to be decoded at the second preset positions, and then obtain the second feature map matrix.
  • all the feature points at the second preset position may be sequentially determined as the second feature points to be decoded, and decoding is then performed for each second feature point to be decoded.
  • for the second feature point to be decoded: based on the fourth feature map matrix, determine the second entropy estimation feature corresponding to the second feature point to be decoded; according to the second entropy estimation feature corresponding to the second feature point to be decoded, determine the probability distribution corresponding to the second feature point to be decoded; and decode the second feature point to be decoded according to the probability distribution corresponding to the second feature point to be decoded to obtain a corresponding feature value.
  • the eigenvalues corresponding to all the second to-be-decoded feature points at the second preset position can be obtained, and then the second feature map matrix can be obtained.
  • all the feature points at the first preset position may be sequentially determined as the first feature points to be decoded, and decoding is then performed for each first feature point to be decoded; refer to the following S9025-S9028:
  • the decoded information group includes the feature value corresponding to the decoded feature point at the second preset position in the channel corresponding to the first feature point to be decoded, and the feature values corresponding to the decoded feature points at the second preset position in the other k-1 channels, where k is a positive integer smaller than c.
  • similar to the manner of obtaining the first entropy estimation feature corresponding to the feature map group composed of the third feature maps of k channels, the second feature map matrix can be input into the autoregressive model, and the autoregressive model determines the first entropy estimation feature corresponding to the first feature point to be decoded according to the second feature map matrix.
  • the autoregressive model can use the feature values corresponding to the feature points at the second preset positions (that is, the second feature maps) of k channels (k is a positive integer less than c) to form a decoded information group; in this way, N decoded information groups can be obtained, where N is an integer greater than 1, which can be determined according to k and c.
  • the channel where the first feature point to be decoded is located can be determined, and then, from the N decoded information groups, the decoded information group to which that channel belongs can be determined.
  • the decoded information group to which the channel of the first feature point to be decoded belongs may be referred to as the i-th decoded information group.
  • the channels included in the i-th decoded information group are: the channel corresponding to the first feature point to be decoded and other k-1 channels; the i-th decoded information group includes the second feature map of the channel corresponding to the first feature point to be decoded (the feature values corresponding to the feature points at the second preset position), and the second feature maps of the other k-1 channels (the feature values corresponding to the feature points at the second preset position).
  • the weight map of the j-th output channel corresponding to the i-th decoded information group can be used to convolve with the second feature maps of the k channels in the i-th decoded information group to obtain k convolution results; the k convolution results are fused to obtain the first entropy estimation feature corresponding to the j-th output channel of the i-th decoded information group.
  • the manner of determining the first entropy estimation feature of each group of feature points in the i-th decoded information group in the j-th output channel can refer to the above description, and will not be repeated here.
  • probability estimation may be performed according to the first entropy estimation feature corresponding to the first feature point to be decoded, to obtain the probability distribution corresponding to the first feature point to be decoded.
  • the probability distribution corresponding to the first feature point to be decoded corresponds to a set of probability distribution parameters.
  • each group of probability distribution parameters may include at least one parameter, such as mean value, variance, etc., which is not limited in the present application.
  • the decoding end can extract the sixth feature map matrix from the code stream. Then, according to the sixth feature map matrix, the second entropy estimation features corresponding to all the feature points in the code stream can be determined.
  • the second entropy estimation feature corresponding to the first feature point to be decoded can be determined from the second entropy estimation features corresponding to all the feature points in the code stream, and probability estimation can then be performed by combining the first entropy estimation feature corresponding to the first feature point to be decoded and the second entropy estimation feature corresponding to the first feature point to be decoded, to obtain the probability distribution corresponding to the first feature point to be decoded.
  • the first entropy estimation feature and the second entropy estimation feature may be aggregated (such as concatenated), and probability estimation is performed according to the aggregation result to obtain a probability distribution corresponding to the first feature point to be decoded.
  • the first feature point to be decoded can be decoded according to the probability distribution corresponding to the first feature point to be decoded to obtain the corresponding feature value, that is, the feature value corresponding to the feature point in the third feature map matrix.
  • a third feature map matrix can be obtained.
  • the second feature map matrix and the third feature map matrix can be superimposed according to channels to obtain the first feature map matrix; image reconstruction is then performed based on the first feature map matrix to obtain the reconstructed image.
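  • Since every feature point belongs to exactly one of the two preset position sets, the superposition amounts to a channel-wise addition (a sketch matching the division shown earlier):

```python
def merge_feature_map_matrices(second_fm, third_fm):
    """Superimpose the second and third feature map matrices channel by channel to recover
    the first feature map matrix; each position is non-zero in exactly one of them."""
    return second_fm + third_fm
```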
  • moreover, compared with determining the second entropy estimation feature, determining the first entropy estimation feature requires more computing power; therefore, determining the first entropy estimation feature for only part of the feature points to be decoded can further improve decoding efficiency.
  • FIG. 10 is a schematic diagram of an exemplary encoding process.
  • the second feature map matrix includes second feature maps of c channels
  • the third feature map matrix includes the third feature maps of c channels.
  • the autoregressive weight matrix corresponding to the i-th feature map group B can be used to extract the local spatial information of the i-th feature map group, so as to obtain the first entropy estimation feature corresponding to the i-th feature map group B.
  • the weight map of the j-th output channel corresponding to the i-th feature map group B can be used to convolve with the feature values corresponding to the decoded feature points at the first preset positions in the k channels of the i-th feature map group, to obtain k convolution results; the k convolution results are fused to obtain the first entropy estimation feature of the i-th feature map group B corresponding to the j-th output channel. The first entropy estimation features of the i-th feature map group B corresponding to the Mi output channels are combined to obtain the first entropy estimation feature corresponding to the i-th feature map group B.
  • j is a number between 1 and M i , including 1 and M i . This can refer to the above description of "determining the first entropy estimation feature of the jth output channel corresponding to the i-th feature map group", and will not be repeated here.
  • Fig. 11a is a schematic diagram of a decoding process exemplarily shown.
  • S1102. Decode the feature values corresponding to the feature points of the c channels from the code stream to obtain the first feature map matrix.
  • the corresponding feature value can be determined by referring to the following S11021-S11028:
  • the decoded information group includes the feature value corresponding to the decoded feature point at the first preset position in the channel corresponding to the first feature point to be decoded, and the feature values corresponding to the decoded feature points at the first preset position in the other k-1 channels, where k is a positive integer smaller than c.
  • similar to the manner of obtaining the first entropy estimation feature corresponding to the feature map group composed of the third feature maps of k channels, the feature values corresponding to the decoded feature points at the first preset positions can be input into the autoregressive model, and the autoregressive model determines, according to these feature values, the first entropy estimation feature corresponding to the first feature point to be decoded.
  • the autoregressive model can use the feature values corresponding to the feature points at the first preset positions of k channels (k is a positive integer less than c) to form a decoded information group; in this way, N decoded information groups can be obtained, where N is an integer greater than 1, which can be determined according to k and c.
  • the channel where the first feature point to be decoded is located can be determined, and then, from the N decoded information groups, the decoded information group to which that channel belongs can be determined.
  • the decoded information group to which the channel of the first feature point to be decoded belongs may be referred to as the i-th decoded information group.
  • the channels included in the i-th decoded information group are: the channel corresponding to the first feature point to be decoded and other k-1 channels; the i-th decoded information group includes the feature values corresponding to the feature points at the first preset position in the channel corresponding to the first feature point to be decoded, and the feature values corresponding to the feature points at the first preset position in the other k-1 channels.
  • intra-group fusion is performed on the decoded information group to obtain the first entropy estimation feature corresponding to the first feature point to be decoded.
  • in this way, the decoding computing power can be reduced and the decoding efficiency can be improved; moreover, compared with determining the second entropy estimation feature, determining the first entropy estimation feature requires more computing power; therefore, determining the first entropy estimation feature for only part of the feature points to be decoded can further improve decoding efficiency.
  • Fig. 11b is a schematic diagram showing an exemplary compression effect.
  • the ordinate in Fig. 11b is the PSNR (Peak Signal to Noise Ratio), in dB (decibel), which can be used to characterize the quality of image reconstruction: the larger the PSNR, the higher the image reconstruction quality.
  • the abscissa is bits per pixel (the number of bits used to store each pixel; the smaller this value, the stronger the compression), and the unit is BPP (bits per pixel).
  • the dotted curve is the relationship curve between the image reconstruction quality of the present application and the size of the code stream, and the solid curve is that of the prior art. Comparing the two curves, it can be seen that when the size of the code stream is the same, the compression/decompression scheme of the present application achieves higher image reconstruction quality.
  • FIG. 12 shows a schematic block diagram of an apparatus 1200 according to an embodiment of the present application.
  • the apparatus 1200 may include: a processor 1201 and a transceiver/transceiving pin 1202 , and optionally, a memory 1203 .
  • bus 1204 includes a power bus, a control bus, and a status signal bus in addition to a data bus.
  • the various buses are referred to as bus 1204 in the figure.
  • the memory 1203 may be used to store the instructions in the foregoing method embodiments.
  • the processor 1201 can be used to execute instructions in the memory 1203, and control the receiving pin to receive signals, and control the sending pin to send signals.
  • Apparatus 1200 may be the electronic device or the chip of the electronic device in the foregoing method embodiments.
  • This embodiment also provides a computer storage medium, in which computer instructions are stored, and when the computer instructions are run on the electronic device, the electronic device is made to execute the above related method steps to implement the codec method in the above embodiment.
  • This embodiment also provides a computer program product, which, when running on a computer, causes the computer to execute the above related steps, so as to implement the encoding and decoding method in the above embodiment.
  • the embodiments of the present application also provide a device, which may specifically be a chip, a component or a module, and the device may include a connected processor and a memory; wherein the memory is used to store computer-executable instructions, and when the device is running, The processor can execute the computer-executable instructions stored in the memory, so that the chip executes the encoding and decoding methods in the foregoing method embodiments.
  • the electronic device, computer storage medium, computer program product, or chip provided in this embodiment are all used to execute the corresponding methods provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which will not be repeated here.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of modules or units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component shown as a unit may be one physical unit or multiple physical units, which may be located in one place or distributed to multiple different places. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • an integrated unit is realized in the form of a software function unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • the technical solution of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes: various media that can store program codes such as U disk, mobile hard disk, read only memory (ROM), random access memory (random access memory, RAM), magnetic disk or optical disk.
  • the steps of the methods or algorithms described in connection with the disclosure of the embodiments of the present application may be implemented in the form of hardware, or may be implemented in the form of a processor executing software instructions.
  • the software instructions can be composed of corresponding software modules, and the software modules can be stored in random access memory (Random Access Memory, RAM), flash memory, read-only memory (Read Only Memory, ROM), erasable programmable read-only memory ( Erasable Programmable ROM, EPROM), Electrically Erasable Programmable Read-Only Memory (Electrically EPROM, EEPROM), registers, hard disk, removable hard disk, CD-ROM, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • the storage medium may also be a component of the processor.
  • the processor and storage medium can be located in the ASIC.
  • the functions described in the embodiments of the present application may be implemented by hardware, software, firmware or any combination thereof.
  • the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
  • Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
  • a storage media may be any available media that can be accessed by a general purpose or special purpose computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of the present application relate to an encoding method, a decoding method and an electronic device. The encoding method comprises: first, acquiring an image to be encoded; then generating a first feature map matrix on the basis of the image to be encoded, the first feature map matrix comprising first feature maps of c channels; then performing intra-group fusion on a feature map group composed of first feature maps of k channels, so as to obtain a first entropy estimation feature corresponding to the feature map group, k being smaller than c; then, according to the first entropy estimation features, determining a probability distribution corresponding to the first feature map matrix, and encoding the first feature map matrix according to the probability distribution, so as to obtain a code stream. In this way, an entropy estimation feature is determined by performing intra-group fusion on a feature map group composed of the feature maps of some of the channels; compared with determining an entropy estimation feature by performing fusion on the feature maps of all the channels, the introduction of invalid information is reduced, thereby reducing the encoding computing power and improving the encoding efficiency.
PCT/CN2022/125944 2021-11-24 2022-10-18 Procédé de codage, procédé de décodage et dispositif électronique WO2023093377A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111407946.6A CN116170596A (zh) 2021-11-24 2021-11-24 编解码方法及电子设备
CN202111407946.6 2021-11-24

Publications (1)

Publication Number Publication Date
WO2023093377A1 true WO2023093377A1 (fr) 2023-06-01

Family

ID=86411882

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125944 WO2023093377A1 (fr) 2021-11-24 2022-10-18 Procédé de codage, procédé de décodage et dispositif électronique

Country Status (2)

Country Link
CN (1) CN116170596A (fr)
WO (1) WO2023093377A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117409091A (zh) * 2022-07-07 2024-01-16 华为技术有限公司 编解码方法及电子设备

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200154145A1 (en) * 2018-11-08 2020-05-14 Alibaba Group Holding Limited Content-weighted deep residual learning for video in-loop filtering
CN111986278A (zh) * 2019-05-22 2020-11-24 富士通株式会社 图像编码装置、概率模型生成装置和图像压缩系统
CN111988629A (zh) * 2019-05-22 2020-11-24 富士通株式会社 图像编码方法和装置、图像解码方法和装置

Also Published As

Publication number Publication date
CN116170596A (zh) 2023-05-26

Similar Documents

Publication Publication Date Title
WO2019105179A1 (fr) Procédé et dispositif de prédiction intra-trame pour composante de couleur
US20230026080A1 (en) Method and device for coding the geometry of a point cloud
CN111131828B (zh) 一种图像压缩方法、装置、电子设备和存储介质
CN112053408B (zh) 基于深度学习的人脸图像压缩方法及装置
WO2019114225A1 (fr) Procédé et dispositif de détermination de division d'unité de codage, dispositif informatique et support de stockage lisible
WO2023093377A1 (fr) Procédé de codage, procédé de décodage et dispositif électronique
WO2022155974A1 (fr) Codage et décodage vidéo ainsi que procédé et appareil d'apprentissage de modèle
WO2019184489A1 (fr) Procédé de transformation lors d'un codage de bloc d'image, ainsi que procédé et appareil de transformation inverse lors d'un décodage de bloc d'image
CN107682699B (zh) 一种近无损图像压缩方法
KR101303503B1 (ko) 컬러 이미지들을 위한 조인트 스칼라 임베디드 그래픽 코딩
WO2020113827A1 (fr) Procédé de compression d'image
CN115442609A (zh) 特征数据编解码方法和装置
WO2022133753A1 (fr) Procédés et systèmes de codage et de décodage de nuage de points, codeur de nuage de points et décodeur de nuage de points
WO2023225808A1 (fr) Compression et décompression d'image apprise à l'aide d'un module d'attention long et court
WO2024007753A1 (fr) Procédé de codage, procédé de décodage et dispositif électronique
CN114979711B (zh) 音视频或图像分层压缩方法和装置
KR20190122479A (ko) 3차원 데이터를 프로세싱하기 위한 방법 및 장치
WO2023082955A1 (fr) Procédé de codage, procédé de décodage et dispositif électronique
CN111669579B (zh) 进行编码和解码的方法、编码端、解码端和系统
WO2023024842A1 (fr) Procédé, appareil et dispositif de codage/décodage de nuage de points, et support de stockage
WO2022258009A1 (fr) Procédés et appareils de codage/décodage entropique
WO2022140937A1 (fr) Procédé et système de codage de nuage de points, procédé et système de décodage de nuage de points, codeur de nuage de points et décodeur de nuage de points
WO2023040745A1 (fr) Procédé et appareil de codage de carte de caractéristiques et procédé et appareil de décodage de carte de caractéristiques
WO2023169190A1 (fr) Procédé de codage et de décodage, et dispositif électronique
WO2023206420A1 (fr) Procédé et appareil de codage et décodage vidéo, dispositif, système, et support de stockage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22897460

Country of ref document: EP

Kind code of ref document: A1