WO2022028197A1 - Procédé de traitement d'image et dispositif correspondant - Google Patents

Image processing method and corresponding device (Procédé de traitement d'image et dispositif correspondant)

Info

Publication number
WO2022028197A1
WO2022028197A1 · PCT/CN2021/105097 · CN2021105097W
Authority
WO
WIPO (PCT)
Prior art keywords
data
entropy
estimation result
compressed data
priori
Prior art date
Application number
PCT/CN2021/105097
Other languages
English (en)
Chinese (zh)
Inventor
王晶
白博
冯义晖
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022028197A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation

Definitions

  • the embodiments of the present application relate to the fields of artificial intelligence and image compression, and in particular, to an image compression method and device thereof.
  • Image compression refers to the use of relatively few bits of data to represent the original image.
  • the purpose of data compression is to reduce the number of bits required to represent data by removing data redundancy.
  • JPEG is currently the most widely used image encoding format. In the traditional approach to further compressing a JPEG image, the JPEG image must first be decoded into an RGB image before the relevant compression process is performed.
  • the embodiments of the present application provide an image processing method and a device thereof, which are used to further compress JPEG images: Huffman decoding is performed on first JPEG image data to obtain quantized coefficients, entropy estimation is then performed on the quantized coefficients to obtain an entropy estimation result, and arithmetic coding is performed according to the entropy estimation result and the quantized coefficients to obtain first compressed data.
  • a first aspect of the embodiments of the present application provides an image processing method.
  • the image collected by the camera or the image acquisition device is the original image
  • the JPEG image is the compressed image data of the original image.
  • the terminal device or the server will perform Huffman decoding on the first JPEG image data through the Huffman lookup table to obtain quantization coefficients.
  • entropy estimation is performed on the quantized coefficient to obtain an entropy estimation result
  • the entropy estimation result is used to perform probability estimation on the quantized coefficient.
  • arithmetic coding is performed on the quantization coefficient and the entropy estimation result to obtain first compressed data.
  • the first compressed data is usually byte stream data composed of bit information.
  • the storage space occupied by the first compressed data is smaller than the storage space occupied by the first JPEG image data.
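The pipeline above stands or falls on the entropy estimation: an arithmetic coder spends roughly -log2 P(c) bits per quantized coefficient, so a sharper probability model directly shrinks the first compressed data. A minimal, self-contained sketch of this relationship (the Gaussian model and toy coefficients are illustrative assumptions, not the patent's learned networks):

```python
import math

def gaussian_cdf(x, mu, sigma):
    # Standard Gaussian CDF via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def coefficient_bits(coeffs, mu, sigma):
    # Theoretical cost of arithmetic-coding integer coefficients under a
    # Gaussian model: P(c) = CDF(c + 0.5) - CDF(c - 0.5), cost = -log2 P(c).
    total = 0.0
    for c in coeffs:
        p = gaussian_cdf(c + 0.5, mu, sigma) - gaussian_cdf(c - 0.5, mu, sigma)
        total += -math.log2(max(p, 1e-12))
    return total

coeffs = [0, 0, 1, -1, 0, 2, 0, 0]                    # toy quantized coefficients
tight = coefficient_bits(coeffs, mu=0.0, sigma=1.0)   # well-matched model
loose = coefficient_bits(coeffs, mu=0.0, sigma=8.0)   # poorly matched model
```

A sharper model (`sigma=1.0`) assigns higher probability to the values that actually occur, so `tight < loose`: better entropy estimation means fewer bits in the compressed stream.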
  • entropy estimation is performed on the quantization coefficient to obtain an entropy estimation result, and the quantization coefficient may be input into the entropy estimation network, to get the entropy estimation result.
  • the entropy estimation network is used to estimate the probability of the quantization coefficients; the entropy estimation network is a network model and may itself comprise several further network models.
  • the entropy estimation network includes a super-prior input model, a super-prior output model, a probability distribution model, and an entropy parameter model. The super-prior input model, the probability distribution model, and the super-prior output model form a super-prior network, which is used to compute priors on the quantization coefficients, and the entropy parameter model is used to convert the prior results into the mean and variance required for probability estimation.
  • the quantization coefficients are input into the super-a priori input model to obtain a first a priori value, where the first a priori value is a priori value related to the quantization coefficient.
  • the first prior value is quantized to obtain a priori quantized coefficient.
  • the priori quantization coefficients are input into the probability distribution model to obtain a priori estimation result, which is the result of probability estimation of the priori quantization coefficients.
  • the a priori estimation result and the a priori quantization coefficient are encoded to obtain second compressed data, and the second compressed data is used for decompressing the first compressed data.
  • the prior quantization coefficients are input into the super-prior output model to obtain first prior data, which is used to generate an entropy estimation result.
  • the first priori data is input into the entropy parameter model to obtain an entropy estimation result, where the entropy estimation result includes a mean value and a variance, and the entropy estimation result is used to perform probability estimation on the quantization coefficients.
  • the quantization coefficients are processed by each model in the super-prior network and by the entropy parameter model to obtain the entropy estimation result used for probability estimation, so that in the subsequent probability estimation, the probability of each pixel in the JPEG image can be estimated more accurately.
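The super-prior data flow described above can be sketched as plain function composition. The four stand-ins below (halving, rounding, doubling, and mean/std statistics) are toy assumptions in place of the learned models:

```python
def hyper_prior_entropy_estimate(quant_coeffs, hyper_encoder, quantize,
                                 hyper_decoder, entropy_params):
    z = hyper_encoder(quant_coeffs)         # first prior value
    z_hat = quantize(z)                     # prior quantized coefficients
    prior_data = hyper_decoder(z_hat)       # first prior data
    mu, sigma = entropy_params(prior_data)  # entropy estimation result
    return mu, sigma, z_hat

def toy_stats(d):
    # Toy entropy parameter model: empirical mean and std of the prior data.
    mean = sum(d) / len(d)
    var = sum((v - mean) ** 2 for v in d) / len(d)
    return mean, max(var ** 0.5, 1e-3)

mu, sigma, z_hat = hyper_prior_entropy_estimate(
    [4, -2, 0, 2],
    hyper_encoder=lambda y: [v / 2 for v in y],   # stand-in analysis transform
    quantize=lambda z: [round(v) for v in z],
    hyper_decoder=lambda z: [v * 2 for v in z],   # stand-in synthesis transform
    entropy_params=toy_stats,
)
```

Note that `z_hat` (the prior quantized coefficients) is exactly what gets encoded as the second compressed data, so the decoder can rerun the same `hyper_decoder` and `entropy_params` steps to reproduce the entropy estimation result.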
  • the entropy estimation network includes a context model and an entropy parameter model, and the context model is used to optimize the entropy estimation result, making the subsequent probability estimation more accurate.
  • the quantized coefficients are input into the context model to obtain context data, where the context data is used to optimize the entropy estimation result.
  • the context data is input into the entropy parameter model to obtain the entropy estimation result.
  • the entropy estimation result is obtained through the context data produced by the context model, so that in the subsequent probability estimation, the probability of each pixel in the JPEG image can be estimated more accurately.
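The context model's defining property is causality: the context data for a position may depend only on coefficients that are already known, so the encoder and decoder can derive identical context. A toy sketch (the neighbor-averaging feature is an illustrative assumption, not the patent's learned context model):

```python
def context_data(coeffs, i, window=3):
    # Causal context for position i: only coefficients before i are visible.
    past = coeffs[max(0, i - window):i]
    return sum(past) / len(past) if past else 0.0

# Context for each position of a toy coefficient sequence.
ctx = [context_data([4, 2, 0, 6, 1], i) for i in range(5)]
```

Because position `i` never sees `coeffs[i]` or anything after it, the decoder can recompute exactly the same `ctx` values as it decodes coefficients one by one.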
  • the entropy estimation network includes a super-a priori input model, a super-a priori output model, a probability distribution model, an entropy parameter model, and a context model
  • the first priori data and the context data are input into the entropy parameter model to obtain the entropy estimation result.
  • the entropy estimation result is obtained from the first prior data (produced by the super-prior input model, the super-prior output model, and the probability distribution model) together with the context data produced by the context model, so that the subsequent probability estimation for each pixel in the JPEG image is more accurate.
  • the first compressed data can be calculated according to the entropy estimation result to obtain a quantized coefficient, and the first JPEG image data can be obtained by performing Huffman coding on the quantized coefficient.
  • after decompressing the first compressed data to obtain the quantized coefficients and then performing Huffman coding on the quantized coefficients, the first compressed data can be restored to the first JPEG image data; this restoration process is lossless, which improves the performance of image restoration.
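The losslessness of the restoration rests on entropy coding being exactly invertible. A minimal prefix-code round trip illustrates the principle (the toy code table is an assumption for the sketch, not the JPEG Huffman tables):

```python
CODE = {0: "0", 1: "10", -1: "110", 2: "111"}   # toy prefix code
DECODE = {v: k for k, v in CODE.items()}

def encode(coeffs):
    # Concatenate codewords; prefix-freeness makes the result decodable.
    return "".join(CODE[c] for c in coeffs)

def decode(bits):
    # Greedily match codewords; no codeword is a prefix of another.
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    return out

coeffs = [0, 1, -1, 0, 2]
restored = decode(encode(coeffs))
```

Since decoding recovers exactly the coefficients that were encoded, re-encoding them with the original tables reproduces the original bytes bit for bit.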
  • a second aspect of the embodiments of the present application provides an image processing method.
  • the target compressed data is obtained, where the target compressed data is compressed data of the first JPEG image data.
  • an entropy estimation result is obtained according to the target compressed data, and the entropy estimation result is used to perform probability estimation on the target compressed data.
  • after decompressing the target compressed data to obtain the quantized coefficients and then performing Huffman coding on the quantized coefficients, the target compressed data can be restored to the first JPEG image data; this is a lossless restoration process, which improves the performance of image restoration.
  • the target compressed data includes first compressed data and second compressed data
  • the first compressed data is compressed data of the first JPEG image data
  • the second compressed data is used to decompress the first compressed data.
  • the second compressed data is decompressed to obtain intermediate data, and the intermediate data is calculated to obtain a priori quantization coefficient, where the priori quantization coefficient is a priori of the quantization coefficient.
  • the a priori quantized coefficients are input into the super-a priori output model to obtain first prior data, which is used to generate an entropy estimation result.
  • the first compressed data is calculated according to the entropy estimation result to obtain the quantization coefficient.
  • the quantized coefficient is obtained by using the second compressed data, and the first compressed data is decompressed according to the quantized coefficient, which improves the implementability of the solution.
  • the first compressed data is decompressed to obtain dimension information of the quantized coefficient.
  • an analog quantization coefficient is obtained according to the dimension information, and the analog quantization coefficient is input into the context model to obtain context data, and the context data is used to optimize the entropy estimation result.
  • the context data and the first prior data are input into the entropy parameter model to obtain the entropy estimation result.
  • the entropy estimation result is obtained through the context data, which improves the accuracy of the entropy estimation result.
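The decoding steps above can be sketched as a sequential loop: a zero-filled buffer built from the dimension information plays the role of the analog quantization coefficients, and each decoded value is written back so later context is computed from real data. Both `decode_one` and `context` below are toy stand-ins (assumptions) for arithmetic decoding and the learned context model:

```python
def sequential_decode(n, decode_one, context):
    buf = [0] * n                  # analog (simulated) coefficients, all zeros
    for i in range(n):
        ctx = context(buf, i)      # uses only already-decoded positions
        buf[i] = decode_one(i, ctx)
    return buf

payload = [5, 1, -2, 0]            # pretend compressed payload
decoded = sequential_decode(
    len(payload),
    decode_one=lambda i, ctx: payload[i],   # toy: just read the payload
    context=lambda buf, i: sum(buf[:i]),    # toy causal context feature
)
```

The key invariant is that `context` at step `i` never reads `buf[i]` or later entries, so the decoder's context matches the encoder's exactly at every step.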
  • a third aspect of the embodiments of the present application provides an image processing apparatus.
  • An image processing device comprising:
  • a decoding unit for performing Huffman decoding on the first JPEG image data to obtain quantized coefficients
  • a processing unit for performing entropy estimation on the quantized coefficients to obtain an entropy estimation result, and the entropy estimation result is used for probability estimation on the quantized coefficients;
  • the coding unit is configured to perform arithmetic coding on the quantization coefficient and the entropy estimation result to obtain first compressed data, and the storage space occupied by the first compressed data is smaller than the storage space occupied by the first JPEG image data.
  • the image processing apparatus further includes:
  • the input unit is used for inputting the quantization coefficient into the entropy estimation network to obtain the entropy estimation result, and the entropy estimation network is used for performing probability estimation on the quantization coefficient.
  • the entropy estimation network includes a super-prior input model, a super-prior output model, a probability distribution model, and an entropy parameter model, and the input unit is specifically used to input the quantization coefficients into the super-prior input model to obtain a first prior value, where the first prior value is the prior value of the quantization coefficients;
  • the processing unit is further configured to quantize the first prior value to obtain a priori quantized coefficient
  • the input unit is also used to input the prior quantization coefficient into the probability distribution model to obtain the prior estimation result;
  • the encoding unit is further configured to encode the prior estimation result and the prior quantization coefficient to obtain second compressed data, and the second compressed data is used to decompress the first compressed data;
  • the input unit is also used to input the a priori quantization coefficients into the super-a priori output model to obtain first a priori data, and the first a priori data is used to generate an entropy estimation result;
  • the input unit is also used for inputting the first prior data into the entropy parameter model to obtain an entropy estimation result.
  • the entropy estimation network includes a context model and an entropy parameter model, and the input unit is also used to input the quantization coefficients into the context model to obtain context data, and the context data is used to optimize the entropy estimation result;
  • the input unit is also used to input the context data into the entropy parameter model to obtain the entropy estimation result.
  • the input unit is also used to input the quantization coefficients into the context model to obtain the context.
  • the context data is used to optimize the entropy estimation result
  • the input unit is specifically used to input the first prior data and the context data into the entropy parameter model to obtain the entropy estimation result.
  • the processing unit is further configured to calculate the first compressed data according to the entropy estimation result to obtain a quantization coefficient
  • the coding unit is further configured to perform Huffman coding on the quantized coefficients to obtain first JPEG image data.
  • a fourth aspect of the embodiments of the present application provides a network device.
  • An image processing device comprising:
  • an acquisition unit used for acquiring target compressed data, and the target compressed data is the compressed data of the first JPEG image data;
  • a processing unit configured to obtain an entropy estimation result according to the target compressed data, and the entropy estimation result is used to perform probability estimation on the target compressed data;
  • the processing unit is also used to calculate the target compressed data according to the entropy estimation result to obtain a quantization coefficient
  • the coding unit is configured to perform Huffman coding on the quantized coefficients to obtain first JPEG image data.
  • the target compressed data includes first compressed data and second compressed data
  • the first compressed data is compressed data of the first JPEG image data
  • the second compressed data is used for decompressing the first compressed data
  • the processing unit is specifically configured to obtain a priori quantization coefficient according to the second compressed data, and the priori quantization coefficient is a priori of the quantization coefficient
  • the image processing device further includes:
  • an input unit configured to input the a priori quantization coefficient into a super-a priori output model to obtain first a priori data, where the first a priori data is used to generate the entropy estimation result;
  • the input unit is further configured to input the first prior data into the entropy parameter model to obtain an entropy estimation result
  • the processing unit is specifically configured to calculate the first compressed data according to the entropy estimation result to obtain a quantization coefficient.
  • the processing unit is further configured to decompress the first compressed data to obtain dimension information of the quantized coefficients
  • the processing unit is further configured to obtain an analog quantization coefficient according to the dimension information
  • the input unit is further configured to input the analog quantization coefficient into the context model to obtain context data, where the context data is used to optimize the entropy estimation result;
  • the input unit is specifically configured to input the first prior data and the context data into the entropy parameter model to obtain an entropy estimation result.
  • a fifth aspect of the present application provides a computer storage medium, where instructions are stored in the computer storage medium, and when the instructions are executed on a computer, the instructions cause the computer to execute the method according to the first aspect and/or the second aspect of the present application.
  • a sixth aspect of the present application provides a computer program product, which, when executed on a computer, causes the computer to execute the method according to the first aspect of the present application and/or the embodiments of the second aspect.
  • a seventh aspect of the present application provides a communication device, the communication device includes a processor and a memory, a computer program is stored in the memory, and the processor executes the computer program stored in the memory, so that the communication device executes the method according to the first aspect and/or the second aspect of the embodiments of the present application.
  • the embodiments of the present application have the following advantages:
  • FIG. 1 is a schematic diagram of a neural network framework provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of another neural network framework provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of another neural network framework provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of another neural network framework provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an image processing method framework provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of another image processing method framework provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 11 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 12 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 13 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 14 is a schematic flowchart of another image processing method provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of another image processing apparatus provided by an embodiment of the present application.
  • FIG. 17 is a schematic structural diagram of another image processing apparatus provided by an embodiment of the present application.
  • FIG. 18 is a schematic structural diagram of another image processing apparatus provided by an embodiment of the present application.
  • FIG. 19 is a schematic structural diagram of another image processing apparatus according to an embodiment of the present application.
  • the embodiment of the present application provides an image compression method for JPEG image compression.
  • the quantized coefficients are obtained by performing Huffman decoding on the image, a probability model is obtained by entropy estimation of the quantized coefficients, and the byte data obtained by decoding the image is re-encoded together with the probability model to obtain an image that occupies less storage space than the JPEG image before compression, which improves the performance of JPEG image compression.
  • Figure 1 shows a schematic diagram of an artificial intelligence main frame, which describes the overall workflow of an artificial intelligence system and is suitable for general artificial intelligence field requirements.
  • the "intelligent information chain” reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, data has gone through the process of "data-information-knowledge-wisdom".
  • the "IT value chain” reflects the value brought by artificial intelligence to the information technology industry from the underlying infrastructure of human intelligence, information (providing and processing technology implementation) to the industrial ecological process of the system.
  • the infrastructure provides computing power support for artificial intelligence systems, realizes communication with the outside world, and provides support through the basic platform. Communication with the outside world is achieved through sensors; computing power is provided by smart chips (hardware acceleration chips such as CPU, NPU, GPU, ASIC, and FPGA); the basic platform includes the distributed computing framework and network-related platform guarantees and support, which can include cloud storage and computing, interconnection networks, etc. For example, sensors communicate with external parties to obtain data, and these data are provided for calculation to the smart chips in the distributed computing system provided by the basic platform.
  • the data on the upper layer of the infrastructure is used to represent the data sources in the field of artificial intelligence.
  • the data involves graphics, images, voice, video, and text, as well as IoT data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making, etc.
  • machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, etc. on data.
  • Reasoning refers to the process of simulating human's intelligent reasoning method in a computer or intelligent system, using formalized information to carry out machine thinking and solving problems according to the reasoning control strategy, and the typical function is search and matching.
  • Decision-making refers to the process of making decisions after intelligent information is reasoned, usually providing functions such as classification, sorting, and prediction.
  • some general capabilities can be formed based on the results of data processing, such as algorithms or a general system, for example translation, text analysis, computer vision processing (such as image recognition and object detection), speech recognition, etc.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They are the encapsulation of the overall artificial intelligence solution and the productization of intelligent information decision-making and implementation of applications. Its application areas mainly include intelligent manufacturing, intelligent transportation, smart home, smart medical care, smart security, autonomous driving, safe city, smart terminals, etc.
  • an embodiment of the present application provides a system architecture 200 .
  • the system architecture includes a database 230 and a client device 240 .
  • the data collection device 260 is used to collect data and store it in the database 230 , and the training module 220 generates the target model/rule 201 based on the data maintained in the database 230 .
  • W is the weight vector, and each value in the vector represents the weight value of a neuron in the neural network of this layer.
  • This vector determines the spatial transformation from the input space to the output space described above, that is, the weights of each layer control how the space is transformed.
  • the purpose of training a deep neural network is to finally get the weight matrix of all layers of the trained neural network.
  • the weight vector of each layer of the neural network can be updated according to the difference between the predicted value and the target value of the current network (of course, there is usually an initialization process before the first update, that is, parameters are preconfigured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the values of the weights in the weight matrix are adjusted to reduce the predicted value, and after continuous adjustment, the value output by the neural network becomes close or equal to the target value. Therefore, it is necessary to pre-define "how to compare the difference between the predicted value and the target value", that is, the loss function or the objective function.
  • the loss function or objective function is an important equation used to measure the difference between the predicted value and the target value. Taking the loss function as an example, the higher its output value (loss), the greater the difference; the training of the neural network can then be understood as the process of reducing this loss as much as possible.
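The update rule described above can be illustrated with a single-weight model and a squared-error loss; the learning rate, input, and target below are arbitrary assumptions for the sketch:

```python
def train_step(w, x, target, lr=0.1):
    # One gradient-descent step for the model pred = w * x with
    # squared-error loss (pred - target)^2.
    pred = w * x
    loss = (pred - target) ** 2
    grad = 2 * (pred - target) * x     # d(loss)/d(w)
    return w - lr * grad, loss

w = 0.0                                # toy initialization
for _ in range(50):
    w, loss = train_step(w, x=2.0, target=6.0)
```

After repeated adjustment the weight converges to `w = 3.0`, where the predicted value `w * x = 6.0` equals the target and the loss approaches zero, which is exactly the "reduce the loss as much as possible" process described above.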
  • the computing module may include a training module 220, and the target model/rule obtained by the training module 220 may be applied to different systems or devices.
  • the execution device 210 is configured with a transceiver 212, which can be a wireless transceiver, an optical transceiver, or a wired interface (such as an I/O interface) for data interaction with external devices, and a "user" can input data to the transceiver 212 through the client device 240.
  • the client device 240 can send target tasks to the execution device 210, request the execution device to build a neural network, and send the execution device 210 a database for training.
  • the execution device 210 can call data, codes, etc. in the data storage system 250 , and can also store data, instructions, etc. in the data storage system 250 .
  • the calculation module 211 uses the target model/rule 201 to process the input data.
  • transceiver 212 returns the constructed neural network to client device 240 to deploy the neural network in client device 240 or other devices.
  • the training module 220 can obtain corresponding target models/rules 201 based on different data for different target tasks, so as to provide users with better results.
  • the user may manually specify input data in the execution device 210, e.g., by operating in an interface provided by the transceiver 212.
  • the client device 240 can automatically input data to the transceiver 212 and obtain the result. If automatic data entry by the client device 240 requires the user's authorization, the user can set the corresponding permission in the client device 240.
  • the user can view the result output by the execution device 210 on the client device 240, and the specific presentation form can be a specific manner such as display, sound, and action.
  • the client device 240 can also act as a data collection end to store the collected data associated with the target task into the database 230 .
  • FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship among the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • the data storage system 250 is an external memory relative to the execution device 210 . In other scenarios, the data storage system 250 may also be placed in the execution device 210 .
  • CNN Convolutional Neural Network
  • CNN is a deep neural network with a convolutional structure. It is a deep learning architecture.
  • a deep learning architecture refers to multiple levels of learning at different levels of abstraction through machine learning algorithms.
  • a CNN is a feed-forward artificial neural network in which each neuron responds to overlapping regions in images fed into it.
  • a convolutional neural network (CNN) 100 may include an input layer 110 , a convolutional/pooling layer 120 , where the pooling layer is optional, and a neural network layer 130 .
  • the convolutional/pooling layer 120 may include layers 121-126 as examples. In one implementation, layer 121 is a convolutional layer, layer 122 is a pooling layer, layer 123 is a convolutional layer, layer 124 is a pooling layer, layer 125 is a convolutional layer, and layer 126 is a pooling layer; in another implementation, layers 121 and 122 are convolutional layers, layer 123 is a pooling layer, layers 124 and 125 are convolutional layers, and layer 126 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 121 may include many convolution operators, which are also called kernels, and their role in image processing is equivalent to a filter that extracts specific information from the input image matrix.
  • the convolution operator can essentially be a weight matrix, which is usually predefined. During a convolution operation on an image, the weight matrix is usually slid across the input image in the horizontal direction one pixel at a time (or two pixels at a time, depending on the value of the stride), processing the image to extract specific features from it.
  • the size of this weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image.
  • the weight matrix extends to the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolutional output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices of the same dimensions are applied.
  • the outputs of the individual weight matrices are stacked to form the depth dimension of the convolutional output.
  • Different weight matrices can be used to extract different features from the image. For example, one weight matrix is used to extract image edge information, another is used to extract specific colors of the image, and yet another is used to blur unwanted noise in the image, and so on.
  • the multiple weight matrices have the same dimensions, so the feature maps extracted by them also have the same dimensions; the extracted feature maps of the same dimensions are then combined to form the output of the convolution operation.
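As an illustration of how several same-sized weight matrices yield a stacked output, the following sketch (plain Python, not the patent's implementation; the kernels are made-up examples) convolves one image with two kernels and stacks the resulting feature maps into a depth-2 output:

```python
def conv2d_single(image, kernel, stride=1):
    """Slide one kernel over a 2D image (list of lists), stride pixels at a time."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(0, len(image) - kh + 1, stride):
        row = []
        for j in range(0, len(image[0]) - kw + 1, stride):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

def conv2d_multi(image, kernels, stride=1):
    """Apply several same-sized kernels; stacking their outputs forms the depth dimension."""
    return [conv2d_single(image, k, stride) for k in kernels]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
edge_kernel = [[1, -1], [1, -1]]            # crude edge-detection kernel
blur_kernel = [[0.25, 0.25], [0.25, 0.25]]  # crude blurring (averaging) kernel
maps = conv2d_multi(image, [edge_kernel, blur_kernel])
# two kernels -> an output with a depth dimension of 2
```

Each kernel here plays the role of one weight matrix; adding a third kernel would simply add a third feature map to the stack.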
  • the weight values in these weight matrices need to be obtained through extensive training in practical applications, and each weight matrix formed by the trained weight values can extract information from the input image, thereby helping the convolutional neural network 100 to make correct predictions.
  • the initial convolutional layers (for example, 121) often extract more general features, which may also be referred to as low-level features; the features extracted by the later convolutional layers become more and more complex, such as high-level semantic features.
  • a pooling layer often needs to be periodically introduced after a convolutional layer. For the layers 121-126 exemplified by 120 in Figure 3, one convolutional layer may be followed by one pooling layer, or multiple convolutional layers may be followed by one or more pooling layers.
  • the pooling layer may include an average pooling operator and/or a max pooling operator for sampling the input image to obtain a smaller size image.
  • the average pooling operator can calculate the average value of the pixel values in the image within a certain range.
  • the max pooling operator can take the pixel with the largest value within a specific range as the result of max pooling. Also, just as the size of the weight matrix used in the convolutional layer should be related to the size of the image, the operators in the pooling layer should also be related to the size of the image.
  • the size of the output image after processing by the pooling layer can be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer.
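The average and max pooling operators described above can be sketched as follows (an illustrative example, not the patent's implementation; `pool2d` is a hypothetical helper that shrinks a 4x4 input to 2x2 over non-overlapping 2x2 sub-regions):

```python
def pool2d(image, size, mode="max"):
    """Reduce each non-overlapping size x size sub-region to one pixel."""
    out = []
    for i in range(0, len(image), size):
        row = []
        for j in range(0, len(image[0]), size):
            window = [image[i + di][j + dj] for di in range(size) for dj in range(size)]
            # max pooling keeps the largest pixel; average pooling keeps the mean
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(row)
    return out

img = [[1, 2, 5, 6],
       [3, 4, 7, 8],
       [9, 8, 1, 0],
       [7, 6, 3, 2]]
pooled_max = pool2d(img, 2, "max")  # each output pixel is the max of one 2x2 sub-region
pooled_avg = pool2d(img, 2, "avg")  # each output pixel is the average of one 2x2 sub-region
```

This mirrors the text: the output is smaller than the input, and each output pixel represents the maximum or average of the corresponding sub-region.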
  • after processing by the convolutional layer/pooling layer 120, the convolutional neural network 100 is not yet able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 120 only extracts features and reduces the number of parameters brought by the input image. To generate the final output information (the required class information or other related information), the convolutional neural network 100 uses the neural network layer 130 to generate one output, or a set of outputs whose number equals the number of required classes. Therefore, the neural network layer 130 may include multiple hidden layers (131, 132 to 13n as shown in FIG. 3) and an output layer 140.
  • the convolutional neural network can be obtained by searching a super unit, with the output of a delay prediction model as a constraint condition, to obtain at least one first building unit, and then stacking the at least one first building unit.
  • This convolutional neural network can be used for image recognition, image classification, image super-resolution reconstruction, and more.
  • after the multiple hidden layers in the neural network layer 130, the last layer of the entire convolutional neural network 100 is the output layer 140. The output layer 140 has a loss function similar to categorical cross-entropy and is specifically used to calculate the prediction error. Once the forward propagation of the entire convolutional neural network 100 (propagation from 110 to 140 in FIG. 3) is completed, back propagation (propagation from 140 to 110 in FIG. 3) starts to update the weight values and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 100 and the error between the result output through the output layer and the ideal result.
  • the convolutional neural network 100 shown in FIG. 3 is only used as an example of a convolutional neural network.
  • the convolutional neural network can also exist in the form of other network models; for example, the multiple convolutional layers/pooling layers shown in FIG. 4 are in parallel, and the separately extracted features are all input to the neural network layer 130 for processing.
  • FIG. 5 is a structural diagram of chip hardware provided by an embodiment of the present invention.
  • the neural network processor (NPU) 50 is mounted on the main CPU (Host CPU) as a co-processor, and tasks are assigned by the Host CPU.
  • the core part of the NPU is the arithmetic circuit 503, which is controlled by the controller 504 to extract the matrix data in the memory and perform multiplication operations.
  • the arithmetic circuit 503 includes multiple processing engines (Process Engine, PE). In some implementations, the arithmetic circuit 503 is a two-dimensional systolic array; it may also be a one-dimensional systolic array or other electronic circuitry capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 503 is a general-purpose matrix processor.
  • the arithmetic circuit fetches the data corresponding to matrix B from the weight memory 502 and buffers it on each PE in the arithmetic circuit. The arithmetic circuit then fetches the matrix A data from the input memory 501, performs a matrix operation with matrix B, and stores the obtained partial or final result of the matrix in the accumulator 508.
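The data flow just described (buffer matrix B, stream matrix A against it, and sum partial products in an accumulator) reduces mathematically to an ordinary matrix multiplication. The toy model below illustrates only that arithmetic, not the NPU microarchitecture:

```python
def matmul(a, b):
    """Multiply a (rows x inner) by b (inner x cols), accumulating partial products."""
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0                      # plays the role of the accumulator 508
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # one partial product per PE step
            result[i][j] = acc
    return result

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
product = matmul(A, B)
```

A systolic array performs the same accumulation, but distributes the multiply-accumulate steps across the PE grid so many of them happen in parallel.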
  • Unified memory 506 is used to store input data and output data.
  • the weight data is transferred to the weight memory 502 through the storage unit access controller (Direct Memory Access Controller, DMAC) 505.
  • Input data is also moved to unified memory 506 via the DMAC.
  • the BIU is the Bus Interface Unit, that is, the bus interface unit 510, which is used for the interaction between the AXI bus and the DMAC and the instruction fetch buffer 509.
  • the bus interface unit 510 (Bus Interface Unit, BIU for short) is used for the instruction fetch memory 509 to obtain instructions from the external memory, and also for the storage unit access controller 505 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 506 , the weight data to the weight memory 502 , or the input data to the input memory 501 .
  • the vector calculation unit 507 has multiple operation processing units, and if necessary, further processes the output of the operation circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison and so on.
  • the vector calculation unit 507 is mainly used for non-convolutional/FC layer network calculations in neural networks, such as Pooling (pooling), Batch Normalization (batch normalization), Local Response Normalization (local response normalization), and the like.
  • vector computation unit 507 can store the processed output vectors to unified buffer 506 .
  • the vector calculation unit 507 may apply a nonlinear function to the output of the arithmetic circuit 503, such as a vector of accumulated values, to generate activation values.
  • vector computation unit 507 generates normalized values, merged values, or both.
  • the vector of processed outputs can be used as activation input to the arithmetic circuit 503, eg, for use in subsequent layers in a neural network.
  • the instruction fetch memory (instruction fetch buffer) 509 connected to the controller 504 is used to store the instructions used by the controller 504;
  • the unified memory 506, the input memory 501, the weight memory 502 and the instruction fetch memory 509 are all on-chip memories, while the external memory is a memory outside the NPU hardware architecture.
  • each layer in the convolutional neural network shown in FIG. 3 and FIG. 4 may be performed by the matrix computing unit or the vector computing unit 507 .
  • an embodiment of the present application provides a system architecture 300 .
  • the execution device 210 is implemented by one or more servers and, optionally, cooperates with other computing devices, such as data storage, routers, and load balancers; the execution device 210 may be arranged on one physical site or distributed across multiple physical sites.
  • the execution device 210 may use the data in the data storage system 250, or call the program code in the data storage system 250 to implement the steps of the image compression method corresponding to FIG. 9 below in the present application.
  • a user may operate respective user devices (eg, local device 301 and local device 302 ) to interact with execution device 210 .
  • each local device may represent any computing device, such as a personal computer, computer workstation, smartphone, tablet, smart camera, smart car or other type of cellular device, media consumption device, wearable device, set-top box, game console, etc.
  • Each user's local device can interact with the execution device 210 through any communication mechanism/standard communication network, which can be a wide area network, a local area network, a point-to-point connection, etc., or any combination thereof.
  • the communication network may include a wireless network, a wired network, or a combination of a wireless network and a wired network, and the like.
  • the wireless network includes but is not limited to: the fifth generation mobile communication technology (5th-Generation, 5G) system, the long term evolution (long term evolution, LTE) system, the global system for mobile communication (global system for mobile communication, GSM) or code division Multiple access (code division multiple access, CDMA) network, wideband code division multiple access (wideband code division multiple access, WCDMA) network, wireless fidelity (wireless fidelity, WiFi), Bluetooth (bluetooth), Zigbee protocol (Zigbee), Any one or a combination of radio frequency identification technology (radio frequency identification, RFID), long range (Long Range, Lora) wireless communication, and near field communication (near field communication, NFC).
  • the wired network may include an optical fiber communication network or a network composed of coaxial cables, and the like.
  • one or more aspects of the execution device 210 may be implemented by each local device, for example, the local device 301 may provide the execution device 210 with local data or feedback calculation results.
  • the local device 301 implements the functions of the execution device 210 and provides services for its own users, or provides services for the users of the local device 302 .
  • Huffman coding refers to an entropy coding algorithm used for lossless data compression in computer data processing. Specifically, Huffman coding uses a variable-length encoding table to encode a source symbol (such as a letter in a file), where the variable-length encoding table is obtained by evaluating the occurrence probability of each source symbol: letters with a higher occurrence probability use shorter encodings, and conversely, letters with a lower occurrence probability use longer encodings. This reduces the expected average length of the encoded string, thereby achieving lossless compression of the data.
  • for example, suppose a text containing 1000 characters needs to be encoded, and the text contains 6 distinct characters a, b, c, d, e and f; Huffman coding assigns shorter codewords to the characters that appear more frequently in the text.
  • the Huffman decoder uses the same code table to restore the original code stream losslessly. In the Huffman encoding process when generating JPEG image data, the Huffman table is stored in the JPEG header part, so that the Huffman table can be obtained for decoding by parsing the JPEG header part.
  • Huffman coding of source symbols usually includes the following steps: in the first step, arrange the probabilities of the source symbols in descending order; in the second step, add the two smallest probabilities to form a new branch, and repeat this step, always placing the higher-probability branch on the right, until the final sum of probabilities is 1; in the third step, assign 0 to the left branch of each pair and 1 to the right branch (or vice versa); in the fourth step, trace the path from the probability-1 root to each source symbol, recording the 0s and 1s along the path in sequence; the result is the Huffman code word corresponding to that symbol.
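The steps above can be sketched as follows (a minimal illustration using a binary heap rather than the manual sorted-list procedure, and not the JPEG Huffman table construction; `huffman_codes` and the example probabilities are hypothetical):

```python
import heapq
import itertools

def huffman_codes(probs):
    """Build Huffman codewords by repeatedly merging the two least probable nodes."""
    counter = itertools.count()  # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)   # the two smallest probabilities...
        p2, _, right = heapq.heappop(heap)  # ...are merged into one branch
        merged = {}
        for sym, code in left.items():
            merged[sym] = "0" + code        # prepend 0 on one branch
        for sym, code in right.items():
            merged[sym] = "1" + code        # prepend 1 on the other
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]                       # root has probability summing to 1

codes = huffman_codes({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1})
# higher-probability symbols receive shorter codewords
```

The resulting code is prefix-free, which is what allows the decoder to walk the same tree and restore the stream losslessly.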
  • Arithmetic coding refers to an entropy coding algorithm used for lossless data compression in computer data processing, which can directly encode the input data as a decimal greater than or equal to 0 and less than 1.
  • an original interval is first selected, usually the original interval is [0, 1).
  • the original interval is divided into several segments according to the probability of each element of the data to be encoded, and each element corresponds to a certain interval.
  • as each element of the data to be encoded is processed, the current interval is reduced to the sub-interval corresponding to that element.
  • the interval is adjusted element by element until all elements of the data to be encoded have been encoded; at this point, any number in the current interval can be output as the encoding result.
  • for example, suppose the data to be encoded consists of three elements A, B and C, where the probability of A is 30%, the probability of B is 20%, and the probability of C is 50%; then A corresponds to 0-30%, B corresponds to 30%-50%, and C corresponds to 50%-100%.
  • to encode "ABC", the initial interval [0, 1) is first reduced to [0, 0.3) according to the range 0-30% corresponding to A; then, according to the range 30%-50% corresponding to B, 30%-50% of the current interval [0, 0.3) is taken, which is [0.09, 0.15); finally, the current interval is further reduced to [0.120, 0.150) according to the range 50%-100% corresponding to C.
  • the encoding result for "ABC" is any number arbitrarily selected from the current interval, such as 0.130.
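The interval narrowing in this example can be reproduced directly (an illustrative sketch only; `arithmetic_encode` is a hypothetical helper, and a practical coder would use finite-precision integer arithmetic rather than floats):

```python
def arithmetic_encode(message, ranges):
    """Shrink [low, high) to each symbol's sub-range in turn."""
    low, high = 0.0, 1.0
    for sym in message:
        r_low, r_high = ranges[sym]
        width = high - low
        low, high = low + width * r_low, low + width * r_high
    return low, high  # any number in [low, high) encodes the message

# sub-ranges from the example: A is 0-30%, B is 30%-50%, C is 50%-100%
ranges = {"A": (0.0, 0.3), "B": (0.3, 0.5), "C": (0.5, 1.0)}
low, high = arithmetic_encode("ABC", ranges)
# the final interval is [0.12, 0.15), so 0.130 is a valid encoding of "ABC"
```

The decoder reverses the process: starting from [0, 1), it checks which symbol's sub-range contains 0.130, emits that symbol, and narrows the interval the same way.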
  • arithmetic coding is often performed on binary data, so each bit of the data to be coded has only two cases, 0 and 1.
  • the coding principle is the same as the above description of arithmetic coding.
  • the coding unit, in this application, refers to the unit on which the discrete cosine transform (Discrete Cosine Transform, DCT) is performed on original image data and the unit on which Huffman decoding is performed on image data in JPEG format.
  • the original image data is composed of data corresponding to each pixel.
  • DCT transformation is usually performed in units of data corresponding to pixels in 8 rows and 8 columns or 16 rows and 16 columns.
  • the pixel points of 8 rows and 8 columns or 16 rows and 16 columns are called coding units.
  • the original DCT, and likewise the Huffman decoding, is performed in units of the data corresponding to the pixels in 8 rows and 8 columns or 16 rows and 16 columns; therefore, the coding unit in this application is both the unit for performing the DCT on original image data and the unit for performing Huffman decoding on image data in JPEG format.
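Partitioning image data into such coding units can be sketched as follows (illustrative only; `split_into_blocks` is a hypothetical helper, and the sample image is synthetic):

```python
def split_into_blocks(pixels, block=8):
    """Split a 2D pixel array into non-overlapping block x block coding units."""
    h, w = len(pixels), len(pixels[0])
    blocks = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            blocks.append([row[j:j + block] for row in pixels[i:i + block]])
    return blocks

# a synthetic 16x16 grayscale image
image = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
blocks = split_into_blocks(image)  # a 16x16 image yields four 8x8 coding units
```

Each 8x8 block is then the unit passed to the DCT during encoding, and the unit on which Huffman decoding operates here.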
  • Image compression refers to the lossy or lossless representation of the original image with fewer bits.
  • Image data can be compressed because there is redundancy in image data.
  • the redundancy of image data is mainly manifested as: spatial redundancy caused by the correlation between adjacent pixels in an image; temporal redundancy caused by the correlation between different frames in an image sequence; and spectral redundancy caused by the correlation between different color planes or spectral bands.
  • the purpose of image data compression is to reduce the number of bits required to represent the data by removing these data redundancies. Due to the huge amount of image data, it is very difficult to store, transmit and process, so the compression of image data is very important.
  • Compression is divided into two categories, lossy compression and lossless compression.
  • Lossy compression allows the image after decompression to differ somewhat from the image before compression.
  • in lossless compression, the image after decompression is identical to the image before compression.
  • the methods in the embodiments of the present application correspond to lossless compression.
  • the JPEG image is the most widely used image coding format at present, but the compression ratio of a JPEG image relative to the original image is not high; the embodiments of the present application can further improve the compression ratio of JPEG images.
  • the embodiment of the present application is applied to the process of JPEG image compression.
  • the compressed JPEG image data is obtained by performing lossless compression on the JPEG image data.
  • when decompression is required, the compressed data is restored to obtain the original JPEG image data losslessly.
  • the image processing device first obtains JPEG image data and performs Huffman decoding on the JPEG image data to obtain quantized coefficients. After that, the quantized coefficients are input into the entropy estimation network for probability estimation to obtain a probability model. Then, compressed JPEG image data is obtained through arithmetic coding according to the probability model.
  • the image processing method is divided into an image compression process and an image decompression process, and the image compression process and the image decompression process are respectively described in the embodiments of the present application.
  • FIG. 9 is a schematic flowchart of an image processing method provided by the present application.
  • Step 901 Perform Huffman decoding on the first JPEG image data to obtain quantized coefficients.
  • the image processing apparatus After acquiring the first JPEG image data, the image processing apparatus performs Huffman decoding on the first JPEG image to obtain quantization coefficients.
  • the image collected by the camera or the image acquisition device is the original image
  • the JPEG image is the compressed image data of the original image.
  • the image processing apparatus performs Huffman decoding on the first JPEG image data through a Huffman lookup table, where the Huffman lookup table is located in the header field part of the first JPEG image data.
  • the image processing apparatus acquires the first JPEG image.
  • the image processing apparatus acts as a cloud server and acquires the first JPEG image data from the terminal side.
  • alternatively, the image processing device obtains the first JPEG image data from a camera after the camera collects the image data, or, after obtaining image data collected by the camera, converts the image data into the first JPEG image data.
  • in practical applications, there may be more scenarios for obtaining the first JPEG image data, which are not specifically limited here.
  • after acquiring the first JPEG image data, the image processing apparatus performs subsequent processing on the first JPEG image data in units of blocks, where the size of a block is the same as the size of the coding unit; the block size can be 8 rows and 8 columns of pixels, or other sizes, such as 4 rows and 4 columns or 16 rows and 16 columns of pixels, which is not specifically limited here.
  • the image processing apparatus performs Huffman decoding in units of blocks to obtain discrete cosine coefficients corresponding to each block of the first JPEG image data, where the discrete cosine coefficients are quantization coefficients.
  • the quantized coefficients include three YUV components.
  • the dimension of the Y component is twice the dimension of the UV components, and the UV components are zero-padded in an interlaced manner to the dimension of the Y component.
  • the three YUV components are then combined as the quantized coefficients to be input into the subsequent entropy estimation network. It can be understood that, in practical applications, the dimensions of the three YUV components may also correspond in other manners, which are not specifically limited here.
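One plausible reading of the interlaced zero-padding (an assumption for illustration, not necessarily the patent's exact layout) keeps each chroma sample at an even row/column position and inserts zeros between samples, so a chroma plane of half the Y dimensions grows to the Y dimensions:

```python
def zero_interleave(plane):
    """Grow an (H, W) chroma plane to (2H, 2W) by interlaced zero insertion."""
    h, w = len(plane), len(plane[0])
    padded = [[0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            padded[2 * i][2 * j] = plane[i][j]  # sample kept at even positions
    return padded

u = [[10, 20],
     [30, 40]]
u_padded = zero_interleave(u)  # now 4x4, the same spatial size as the Y plane
```

After both U and V are padded this way, the three planes have matching dimensions and can be stacked as one input tensor for the entropy estimation network.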
  • Step 902 Input the quantized coefficients into an entropy estimation network to obtain an entropy estimation result.
  • after the image processing device performs Huffman decoding on the first JPEG image data to obtain the quantized coefficients, it inputs the quantized coefficients into an entropy estimation network to obtain an entropy estimation result, and the entropy estimation result is used for probability estimation of the quantized coefficients.
  • the entropy estimation network includes a super-prior model and an entropy parameter model; the super-prior model is used to extract a prior from the quantized coefficients, and the entropy parameter model is used to convert the prior result in order to obtain the mean and variance required for probability estimation, that is, the entropy estimation result.
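As a hedged sketch of how a mean/variance pair could drive probability estimation for arithmetic coding, the snippet below takes the probability of an integer quantized coefficient q to be the Gaussian probability mass of the bin [q - 0.5, q + 0.5); this is a common choice in learned entropy models and is assumed here rather than stated in the text:

```python
import math

def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def bin_probability(q, mean, variance):
    """Probability mass assigned to the quantization bin [q - 0.5, q + 0.5)."""
    std = math.sqrt(variance)
    return gaussian_cdf(q + 0.5, mean, std) - gaussian_cdf(q - 0.5, mean, std)

p = bin_probability(0, mean=0.0, variance=1.0)
# the bin centered on the predicted mean receives the largest probability mass
```

The arithmetic coder then uses these per-symbol probabilities as the sub-interval widths; the better the mean/variance predictions, the shorter the resulting code stream.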
  • the super-prior model includes a super-prior input module, a super-prior output module and a probability distribution module.
  • after obtaining the quantized coefficients, the image processing device inputs them into the previously trained super-prior input module to obtain a first prior value, where the first prior value is obtained after conversion by the super-prior input module.
  • the super-prior input module is composed of a multi-layer convolutional network and an activation function.
  • for example, the first input module is a convolutional layer with a kernel size of 3, 192 output channels, a stride of 1, and Leaky ReLU as the activation function; the second input module is a convolutional layer with a kernel size of 5, 192 output channels, and a stride of 2; and the third input module is a convolutional layer with a kernel size of 5, 192 output channels, and a stride of 2.
  • it can be understood that the super-prior input module may also be composed of other components, which are not specifically limited here.
  • the image processing apparatus quantizes the first prior value to obtain prior quantized coefficients, and inputs the prior quantized coefficients into the probability distribution module to obtain a prior estimation result.
  • the prior estimation result is used, together with the prior quantized coefficients, for arithmetic coding to obtain second compressed data.
  • for example, the first distribution module in the probability distribution model is a convolutional layer with a kernel size of 1, 640 output channels, a stride of 1, and Leaky ReLU as the activation function; the second distribution module is a convolutional layer with a kernel size of 1, 512 output channels, a stride of 1, and Leaky ReLU as the activation function; and the third distribution module is a convolutional layer with a kernel size of 1, 384 output channels, and a stride of 1.
  • it can be understood that the probability distribution model may also be composed of other structures, which are not specifically limited here.
  • the prior quantized coefficients include three components Y'U'V'.
  • the dimension of the Y' component is twice that of the U'V' components, and the U'V' components are zero-padded in an interlaced manner to the dimension of the Y' component.
  • the three Y'U'V' components are then combined as the quantized coefficients to be input into the prior probability distribution model. It can be understood that, in practical applications, the dimensions of the three Y'U'V' components may also correspond in other manners, which are not specifically limited here.
  • the second compressed data is stored and is used for decompressing the subsequently obtained first compressed data. Arithmetic decoding is then performed according to the second compressed data and the prior estimation result to obtain the prior quantized coefficients, and the prior quantized coefficients are input into the super-prior output module to output first prior data. The first prior data is used to generate the entropy estimation result; it can eliminate the correlation of the random variables at each pixel position in the quantized coefficients, thereby effectively improving the performance of arithmetic coding.
  • the first priori data is input into the entropy parameter model to obtain an entropy estimation result, where the entropy estimation result includes a mean value and a variance.
  • for example, the first parameter module in the entropy parameter model is a convolutional layer with a kernel size of 1, 640 output channels, a stride of 1, and Leaky ReLU as the activation function; the second parameter module is a convolutional layer with a kernel size of 1, 512 output channels, a stride of 1, and Leaky ReLU as the activation function; and the third parameter module is a convolutional layer with a kernel size of 1, 384 output channels, and a stride of 1.
  • the entropy parameter model may also be composed of other structures, which are not specifically limited here.
  • the entropy estimation network further includes a context model; in a preferred manner, the context model is an autoregressive model based on PixelCNN++.
  • the image processing apparatus inputs the quantized coefficients into the context model to obtain context data, and the context data is used to optimize the entropy estimation result.
  • for example, the context model includes a masked convolutional layer with a kernel size of 5, 384 output channels, and a stride of 1.
  • the context data and the first prior information are input into the entropy parameter model to obtain the entropy estimation result.
  • the entropy parameter model performs calculation according to the context data and the first prior information to obtain the mean and variance used for probability estimation, that is, the entropy estimation result.
  • when the entropy estimation network includes a context model but does not include the models of the super-prior network, only the context data obtained by the context model is input into the entropy parameter model to obtain the entropy estimation result.
  • when only the context model and the entropy parameter model are included in the entropy estimation network, the architecture of the entropy estimation network is simplified and the efficiency of the overall decompression process is improved.
  • the image processing apparatus Before using the entropy estimation network model, the image processing apparatus needs to train the entropy estimation network model.
  • the training data set is a plurality of JPEG images, for example, 1.2 million JPEG images.
  • Each JPEG image in the training dataset is subjected to Huffman decoding to obtain three YUV components of the quantized coefficients.
  • the dimension of the Y component is twice that of the UV component, and the UV is interlaced and zero-padded to the Y dimension, and then the three components of YUV are combined as the input of the entropy estimation network.
  • the super-prior input model is composed of a multi-layer convolutional network and an activation function.
  • the quantized coefficients are input into the super-prior input model to obtain the prior quantized coefficients related to the quantized coefficients, and coding is performed on the prior quantized coefficients.
  • the super-prior output model is composed of multiple layers of deconvolutional networks and activation functions.
  • the entropy estimation network also contains a context model, which is a mask convolutional layer to further model the prior distribution.
  • the output data of the context model and the super-prior output model are then combined through the entropy parameter model to obtain the probability estimation model. When the bit-rate loss function tends to converge, the training ends.
  • Step 903 Perform arithmetic coding on the quantization coefficient and the entropy estimation result to obtain first compressed data.
  • that is, arithmetic coding is performed on the quantized coefficients according to the entropy estimation result to obtain byte-stream data composed of bit information, and the storage space occupied by the first compressed data is smaller than the storage space occupied by the first JPEG image data.
  • step 902 may be performed by an image processing apparatus, or may be performed by another image processing apparatus, which is not specifically limited here.
  • when step 902 is executed by another image processing device, the image processing device sends the quantized coefficients to the other image processing device; the other image processing device performs calculation according to the quantized coefficients and the entropy estimation network, and after obtaining the corresponding entropy estimation result, sends the entropy estimation result to the image processing device.
  • the models included in the entropy estimation network are only examples and do not constitute limitations to the embodiments of the present application; in practical applications, more models may be included to optimize the entropy estimation result.
  • in this embodiment, the image processing apparatus obtains quantized coefficients after performing Huffman decoding on the first JPEG image data, and obtains the first compressed data according to the quantized coefficients and the entropy estimation result obtained by performing entropy estimation on the quantized coefficients. This process is therefore a lossless compression process, and it is not necessary to perform RGB conversion on the first JPEG image, which improves the performance of JPEG image compression.
  • when the first JPEG image data is decompressed, the decompression method used when the entropy estimation network includes each model of the super-prior network differs from the method used when the entropy estimation network includes only the context model; therefore, the two cases are described separately.
  • the entropy estimation network includes each model of the super-prior network.
  • FIG. 10 is another schematic flowchart of an image processing method according to an embodiment of the present application.
  • In step 1001, target compressed data is obtained.
  • The image processing apparatus obtains target compressed data, which includes first compressed data and second compressed data. The first compressed data is the compressed data of the first JPEG image and must be decompressed to recover the first JPEG image data.
  • The second compressed data is the compressed data obtained according to the super-prior network and is used for decompressing the first compressed data.
  • In step 1002, the image processing apparatus obtains an entropy estimation result according to the target compressed data.
  • After acquiring the target compressed data, the image processing apparatus performs decompression and calculation on it to obtain an entropy estimation result.
  • the image processing apparatus decompresses and calculates the second compressed data to obtain an entropy estimation result.
  • The image processing apparatus decompresses the second compressed data to obtain intermediate data, and applies the calculation function to the intermediate data to obtain the prior quantization coefficients. The prior quantization coefficients are then input into the super-prior output model to obtain the first prior data. The image processing apparatus also decompresses the first compressed data to obtain the dimension information of each quantization coefficient, where the dimension information indicates the matrix dimensions of the quantization coefficients.
  • The quantization coefficients are then initialized as an all-zero matrix according to the dimension information, yielding analog quantization coefficients, and the analog quantization coefficients are input into the context model to obtain context data.
  • The context data and the first prior data are input into the entropy parameter model to obtain the entropy estimation result, which is used to decompress the first compressed data.
  • In practice, the matrix holding the quantization coefficients may also be initialized with values other than zero; this is not specifically limited here.
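The construction of the analog quantization coefficients and the position-by-position decoding loop can be sketched as follows. The context model here is a hypothetical stand-in (a masked sum over already-decoded positions), not the learned model; only the control flow is illustrative.

```python
import numpy as np

def init_analog_coefficients(dims):
    """Build the analog quantization coefficients: an all-zero matrix of the
    dimensions recovered from the compressed stream (the text notes that
    other fill values would work as well)."""
    return np.zeros(dims, dtype=np.int32)

def toy_context_model(analog, i, j):
    """Hypothetical stand-in for the learned context model: it may only use
    positions already decoded, i.e. above and to the left of (i, j)."""
    causal = analog[:i + 1, :].copy()
    causal[i, j:] = 0  # mask the current and not-yet-decoded positions
    return int(causal.sum())

dims = (4, 4)  # dimension information decoded from the first compressed data
analog = init_analog_coefficients(dims)
assert analog.shape == dims and not analog.any()

# Decoding proceeds position by position: entropy parameters for (i, j) are
# derived from the context data (and the first prior data), the coefficient
# is arithmetic-decoded, and the analog matrix is updated before moving on.
for i in range(dims[0]):
    for j in range(dims[1]):
        _ctx = toy_context_model(analog, i, j)
        analog[i, j] = 1  # stand-in for the arithmetic-decoded value
assert analog.all()
```

The serial update is the essential point: because the context model sees only decoded positions, decompression with a context model is inherently sequential.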
  • Alternatively, the image processing apparatus decompresses the second compressed data to obtain intermediate data, and applies the calculation function to the intermediate data to obtain the prior quantization coefficients.
  • The prior quantization coefficients are input into the super-prior output model to obtain the first prior data.
  • The first prior data is input into the entropy parameter model to obtain the entropy estimation result, which is used to decompress the first compressed data.
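The decoding order in this super-prior-only variant can be sketched with toy stand-ins for the learned models. The names, shapes, and random weights below are illustrative assumptions, not the patent's networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the learned models; shapes and weights are
# illustrative only.
W_out = rng.standard_normal((8, 8))   # "super-prior output model" weights
W_ent = rng.standard_normal((8, 2))   # "entropy parameter model" weights

def super_prior_output_model(prior_q):
    # prior quantization coefficients -> first prior data
    return np.tanh(prior_q @ W_out)

def entropy_parameter_model(prior_data):
    # first prior data -> entropy estimation result (here: a mu/sigma pair)
    mu, log_sigma = (prior_data @ W_ent).T
    return mu, np.exp(log_sigma)

# Decoding order: the second compressed data is decoded first, yielding
# the prior quantization coefficients (a toy vector here) ...
prior_q = rng.integers(-2, 3, size=(1, 8)).astype(np.float64)

# ... which drive the two models to produce the entropy estimation result
# used to arithmetic-decode the first compressed data.
mu, sigma = entropy_parameter_model(super_prior_output_model(prior_q))
assert mu.shape == (1,) and sigma.shape == (1,)
assert sigma[0] > 0  # a valid scale parameter for the coder
```

Unlike the context-model variant, nothing here depends on already-decoded coefficients, so all entropy parameters can be computed before decoding begins.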
  • In step 1003, the image processing apparatus calculates the target compressed data according to the entropy estimation result to obtain the quantization coefficients.
  • After obtaining the entropy estimation result, the image processing apparatus calculates the target compressed data according to it to obtain the quantization coefficients.
  • multiple entropy estimation results are obtained, and the multiple entropy estimation results correspond to data of different pixel points.
  • In step 1004, Huffman coding is performed on the quantization coefficients to obtain the first JPEG image data.
  • Huffman coding is performed according to the plurality of quantization coefficients to obtain the first JPEG image data.
  • In this embodiment, the target compressed data is first decompressed to obtain the quantization coefficients, and Huffman coding is then performed on the quantization coefficients, so that the first compressed data is restored to the first JPEG image data. The process is a lossless restoration process, which improves the performance of image restoration.
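The lossless round-trip property can be sketched structurally. Here zlib stands in for the arithmetic coder driven by the entropy estimation result (an assumption made purely for illustration); the point is only that every step in the pipeline is invertible, so the original bytes come back bit-exact.

```python
import zlib

def recompress(huffman_decoded_bytes):
    """Stand-in for arithmetic coding of the quantization coefficients;
    zlib is used here only because it is invertible, like the real coder."""
    return zlib.compress(huffman_decoded_bytes, level=9)

def restore(first_compressed_data):
    return zlib.decompress(first_compressed_data)

# Stand-in for the serialized quantization coefficients of the first JPEG.
original = bytes(range(256)) * 4
compressed = recompress(original)
assert restore(compressed) == original  # bit-exact: lossless restoration
```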
  • FIG. 11 is another schematic flowchart of an image processing method according to an embodiment of the present application.
  • In step 1101, the image processing apparatus acquires first compressed data.
  • The image processing apparatus obtains the first compressed data; in this embodiment, the first compressed data is itself the target compressed data.
  • the first compressed data needs to be decompressed to obtain first JPEG image data, where the first compressed data is compressed data of the first JPEG image.
  • In step 1102, the image processing apparatus obtains an entropy estimation result according to the first compressed data.
  • After acquiring the first compressed data, the image processing apparatus performs decompression and calculation on it to obtain an entropy estimation result.
  • The image processing apparatus decompresses the first compressed data to obtain the dimension information of each quantization coefficient, where the dimension information indicates the matrix dimensions of the quantization coefficients.
  • The quantization coefficients are initialized as an all-zero matrix of the indicated dimensions, and the initialized coefficients are input into the context model to obtain context data.
  • the context data is input into the entropy parameter model to obtain an entropy estimation result, and the entropy estimation result is used to decompress the first compressed data.
  • In practice, the matrix may also be initialized with a value other than zero; this is not specifically limited here.
  • In step 1103, the image processing apparatus calculates the first compressed data according to the entropy estimation result to obtain the quantization coefficients.
  • After obtaining the entropy estimation result, the image processing apparatus calculates the first compressed data according to it to obtain the quantization coefficients.
  • the multiple entropy estimation results correspond to data of different pixel points.
  • In step 1104, the image processing apparatus performs Huffman coding on the quantization coefficients to obtain the first JPEG image data.
  • Huffman coding is performed according to the plurality of quantization coefficients to obtain the first JPEG image data.
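For reference, Huffman coding assigns prefix-free codewords whose lengths track symbol frequency. The sketch below builds a code from scratch for a toy coefficient stream; note that JPEG itself normally uses predefined canonical tables rather than a tree rebuilt from the data, so this is illustrative only.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Minimal Huffman code construction over the symbols' frequencies."""
    heap = [[freq, i, {sym: ""}]
            for i, (sym, freq) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)   # least frequent subtree
        hi = heapq.heappop(heap)   # next least frequent subtree
        code = {s: "0" + c for s, c in lo[2].items()}
        code.update({s: "1" + c for s, c in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], i, code])
        i += 1
    return heap[0][2]

coeffs = [0, 0, 0, 0, 1, 0, -1, 0, 2, 0]
code = huffman_code(coeffs)
words = list(code.values())

# Prefix-free: no codeword is a prefix of another codeword.
assert all(not a.startswith(b) for a in words for b in words if a != b)
# The most frequent symbol (0) gets the shortest codeword.
assert len(code[0]) == min(map(len, words))
```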
  • The method for compressing the first JPEG image may run independently on an image processing apparatus; that is, an apparatus may use only the compression method.
  • The image decompression method and the method for compressing the first JPEG image may be used on the same image processing apparatus or on different ones.
  • For example, a terminal device uses the above image compression method to compress the first JPEG image data and uploads the first compressed data to a server, which stores the first compressed data.
  • When the server needs the first JPEG image, it calls the decompression method to decompress the first compressed data and obtain the first JPEG image.
  • FIG. 15 is a schematic structural diagram of the image processing apparatus provided by the present application.
  • An image processing device comprising:
  • Decoding unit 1501 for performing Huffman decoding on the first JPEG image data to obtain quantized coefficients
  • a processing unit 1502, configured to perform entropy estimation on the quantization coefficients to obtain an entropy estimation result, where the entropy estimation result is used to perform probability estimation on the quantization coefficients;
  • the encoding unit 1503 is configured to perform arithmetic encoding on the quantization coefficient and the entropy estimation result to obtain first compressed data, where the storage space occupied by the first compressed data is smaller than the storage space occupied by the first JPEG image data.
  • each unit of the image processing apparatus is similar to the steps performed by the image processing apparatus in the aforementioned embodiment shown in FIG. 9 , and details are not repeated here.
  • FIG. 16 is another schematic structural diagram of the image processing apparatus provided in the present application.
  • An image processing device comprising:
  • Decoding unit 1601 for performing Huffman decoding on the first JPEG image data to obtain quantized coefficients
  • a processing unit 1602 configured to perform entropy estimation on the quantized coefficients to obtain an entropy estimation result, and the entropy estimation result is used to perform probability estimation on the quantized coefficients;
  • the encoding unit 1603 is configured to perform arithmetic encoding on the quantization coefficient and the entropy estimation result to obtain first compressed data, where the storage space occupied by the first compressed data is smaller than the storage space occupied by the first JPEG image data.
  • the image processing apparatus further includes:
  • the input unit 1604 is used for inputting the quantized coefficients into the entropy estimation network to obtain an entropy estimation result, and the entropy estimation network is used to perform probability estimation on the quantized coefficients.
  • In one implementation, the entropy estimation network includes a super-prior input model, a super-prior output model, a probability distribution model, and an entropy parameter model. The input unit is specifically configured to input the quantization coefficients into the super-prior input model to obtain a first prior value, where the first prior value is the prior of the quantization coefficients;
  • the processing unit 1602 is further configured to quantize the first prior value to obtain prior quantization coefficients;
  • the input unit 1604 is further configured to input the prior quantization coefficients into the probability distribution model to obtain a prior estimation result;
  • the encoding unit 1603 is further configured to encode the prior estimation result and the prior quantization coefficients to obtain second compressed data, where the second compressed data is used to decompress the first compressed data;
  • the input unit 1604 is further configured to input the prior quantization coefficients into the super-prior output model to obtain first prior data, where the first prior data is used to generate the entropy estimation result;
  • the input unit 1604 is further configured to input the first prior data into the entropy parameter model to obtain the entropy estimation result.
  • the entropy estimation network includes a context model and an entropy parameter model, and the input unit is also used to input the quantization coefficients into the context model to obtain context data, and the context data is used to optimize the entropy estimation result;
  • the input unit 1604 is also used for inputting the context data into the entropy parameter model to obtain an entropy estimation result.
  • the input unit 1604 is further configured to input the quantization coefficients into the context model to obtain context data, where the context data is used to optimize the entropy estimation result;
  • the input unit 1604 is specifically configured to input the first prior data and the context data into the entropy parameter model to obtain an entropy estimation result.
  • the processing unit 1602 is further configured to calculate the first compressed data according to the entropy estimation result to obtain a quantization coefficient
  • the encoding unit 1603 is further configured to perform Huffman encoding on the quantized coefficients to obtain first JPEG image data.
  • each unit of the image processing apparatus is similar to the steps performed by the image processing apparatus in the embodiment shown in FIG. 9 or FIG. 10 , and details are not repeated here.
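One common way an entropy parameter model's output pair (mu, sigma) is turned into the per-coefficient probabilities an arithmetic coder needs is a Gaussian discretized onto the integer grid. Whether this embodiment uses a Gaussian is an assumption made here for illustration; the text does not fix the distribution family.

```python
import math

def gaussian_cdf(x, mu, sigma):
    # Standard Gaussian CDF evaluated via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def coefficient_probability(q, mu, sigma):
    """Probability mass of integer coefficient q under a Gaussian discretized
    onto the integer grid: P(q) = CDF(q + 0.5) - CDF(q - 0.5)."""
    return gaussian_cdf(q + 0.5, mu, sigma) - gaussian_cdf(q - 0.5, mu, sigma)

mu, sigma = 0.2, 1.3  # illustrative entropy-parameter outputs
probs = {q: coefficient_probability(q, mu, sigma) for q in range(-6, 7)}

assert all(p > 0.0 for p in probs.values())   # every symbol stays codable
assert abs(sum(probs.values()) - 1.0) < 1e-3  # nearly all mass is covered
assert max(probs, key=probs.get) == 0         # mode is the integer nearest mu
```

These per-integer masses are exactly what drives the arithmetic coder in steps such as the arithmetic coding of the quantization coefficients described above.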
  • FIG. 17 is another schematic structural diagram of the image processing apparatus provided by the present application.
  • An image processing device comprising:
  • Obtaining unit 1701 configured to obtain target compressed data, where the target compressed data is the compressed data of the first JPEG image data;
  • a processing unit 1702 configured to obtain an entropy estimation result according to the target compressed data, and the entropy estimation result is used to perform probability estimation on the target compressed data;
  • the processing unit 1702 is further configured to calculate the target compressed data according to the entropy estimation result to obtain a quantization coefficient
  • the encoding unit 1703 is configured to perform Huffman encoding on the quantized coefficients to obtain first JPEG image data.
  • each unit of the image processing apparatus is similar to the steps performed by the image processing apparatus in the foregoing embodiment shown in FIG. 10 , and details are not repeated here.
  • FIG. 18 is another schematic structural diagram of the image processing apparatus provided by the present application.
  • An image processing device comprising:
  • an obtaining unit 1801, configured to obtain target compressed data, where the target compressed data is the compressed data of the first JPEG image data;
  • a processing unit 1802 configured to obtain an entropy estimation result according to the target compressed data, where the entropy estimation result is used to perform probability estimation on the first compressed data;
  • the processing unit 1802 is further configured to calculate the target compressed data according to the entropy estimation result to obtain a quantization coefficient
  • the encoding unit 1803 is configured to perform Huffman encoding on the quantized coefficients to obtain the first JPEG image data.
  • the target compressed data includes first compressed data and second compressed data
  • the first compressed data is compressed data of the first JPEG image data
  • the second compressed data is used to decompress the first compressed data;
  • the processing unit 1802 is specifically configured to obtain prior quantization coefficients according to the second compressed data, where the prior quantization coefficients are the prior of the quantization coefficients;
  • the image processing device further includes:
  • an input unit 1804, configured to input the prior quantization coefficients into a super-prior output model to obtain first prior data, where the first prior data is used to generate the entropy estimation result;
  • the input unit 1804 is further configured to input the first prior data into the entropy parameter model to obtain the entropy estimation result;
  • the processing unit 1802 is specifically configured to calculate the first compressed data according to the entropy estimation result to obtain a quantization coefficient.
  • processing unit 1802 is further configured to decompress the first compressed data to obtain dimension information of the quantized coefficients
  • the processing unit 1802 is further configured to obtain analog quantization coefficients according to the dimension information
  • the input unit 1804 is further configured to input the analog quantization coefficients into the context model to obtain context data, where the context data is used to optimize the entropy estimation result;
  • the input unit 1804 is specifically configured to input the first prior data and the context data into an entropy parameter model to obtain an entropy estimation result.
  • FIG. 19 is another schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
  • The image processing apparatus includes the processor 1901, the memory 1902, the bus 1905, and the interface 1904.
  • the processor 1901 is connected to the memory 1902 and the interface 1904.
  • the bus 1905 is respectively connected to the processor 1901, the memory 1902, and the interface 1904.
  • the interface 1904 is used to receive or send data.
  • The processor 1901 may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of this application.
  • the memory 1902 may be random access memory (RAM), or may be non-volatile memory (non-volatile memory), such as at least one hard disk memory.
  • The memory 1902 is used to store computer-executable instructions; specifically, the computer-executable instructions may include the program 1903.
  • When the processor 1901 invokes the program 1903, the network device in FIG. 19 can perform the operations performed by the image processing apparatus in the embodiments shown in FIG. 9, FIG. 10, or FIG. 11, which are not repeated here.
  • The processor mentioned in the image processing apparatus in the above embodiments of the present application may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The number of processors in the image processing apparatus in the above embodiments of the present application may be one or more and may be adjusted according to the actual application scenario; this is merely illustrative and not limiting.
  • The number of memories in this embodiment of the present application may be one or more and may be adjusted according to the actual application scenario; this is merely illustrative and not limiting.
  • The image processing apparatus includes a processor (or a processing unit) and a memory. The processor may be integrated with the memory, or the processor and the memory may be connected through an interface; this may be adjusted according to the actual application scenario and is not limited.
  • the present application provides a chip system, which includes a processor for supporting an image processing apparatus to implement the functions of the controller involved in the above method, such as processing data and/or information involved in the above method.
  • the chip system further includes a memory for storing necessary program instructions and data.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • When the chip system is a chip in a user equipment or an access network device, the chip includes a processing unit and a communication unit. The processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
  • the processing unit can execute the computer-executed instructions stored in the storage unit, so that the chip in the image processing apparatus or the like executes the steps performed by the image processing apparatus in any one of the embodiments in FIGS. 9-11 .
  • The storage unit may be a storage unit in the chip, such as a register or a cache, or a storage unit located outside the chip in the image processing apparatus, such as a read-only memory (ROM) or a random access memory (RAM).
  • Embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a computer, it implements the method flow performed by the controller of the image processing apparatus in any of the above method embodiments.
  • the computer may be the above-mentioned image processing apparatus.
  • The controller or processor mentioned in the above embodiments of the present application may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or various combinations thereof.
  • A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The number of processors or controllers in the image processing apparatus or chip system in the above embodiments of the present application may be one or more and may be adjusted according to the actual application scenario; this is merely illustrative and not limiting.
  • The number of memories in this embodiment of the present application may be one or more and may be adjusted according to the actual application scenario; this is merely illustrative and not limiting.
  • The memory or readable storage medium mentioned in the image processing apparatus in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • By way of example but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
  • The above processing unit or processor may be a central processing unit, a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • a computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center over a wire (e.g.
  • a computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, a data center, or the like that includes an integration of one or more available media.
  • The usable media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), semiconductor media, or the like.
  • The word "if" as used herein may be interpreted as "when", "upon", "in response to determining", or "in response to detecting".
  • Similarly, the phrases "if it is determined" or "if (the stated condition or event) is detected" may be interpreted, depending on the context, as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to artificial intelligence and an image compression technique. Disclosed is an image processing method for use in image compression and decompression. The method of the embodiments of the present invention comprises the steps of: performing Huffman decoding on first JPEG image data to obtain a quantization coefficient; performing entropy estimation on the quantization coefficient to obtain an entropy estimation result, the entropy estimation result being used to perform probability estimation on the quantization coefficient; and performing arithmetic coding on the quantization coefficient and the entropy estimation result to obtain first compressed data, the storage space occupied by the first compressed data being smaller than the storage space occupied by the first JPEG image data. In the embodiments of the present application, the first JPEG image data is Huffman-decoded to obtain the quantization coefficient and the first compressed data is then produced; during this process, there is no need to convert the first JPEG image data into an RGB image for entropy estimation, which reduces irrelevant information and improves JPEG image compression performance.
PCT/CN2021/105097 2020-08-06 2021-07-08 Procédé de traitement d'image et dispositif correspondant WO2022028197A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010784666.6A CN114071141A (zh) 2020-08-06 2020-08-06 一种图像处理方法及其设备
CN202010784666.6 2020-08-06

Publications (1)

Publication Number Publication Date
WO2022028197A1 true WO2022028197A1 (fr) 2022-02-10

Family

ID=80116900

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/105097 WO2022028197A1 (fr) 2020-08-06 2021-07-08 Procédé de traitement d'image et dispositif correspondant

Country Status (2)

Country Link
CN (1) CN114071141A (fr)
WO (1) WO2022028197A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114554226A (zh) * 2022-02-25 2022-05-27 北京市商汤科技开发有限公司 图像处理方法及装置、电子设备和存储介质
CN116778002A (zh) * 2022-03-10 2023-09-19 华为技术有限公司 编解码方法、装置、设备、存储介质及计算机程序产品
CN114820610B (zh) * 2022-06-29 2022-09-06 数聚(山东)医疗科技有限公司 基于图像处理的新材料医疗器械缺陷检测方法

Citations (6)

Publication number Priority date Publication date Assignee Title
CN1692375A (zh) * 2002-10-04 2005-11-02 国际商业机器公司 在对jpeg图像进行代码转换时增强压缩
US20140015698A1 (en) * 2012-07-14 2014-01-16 Alireza Shoa Hassani Lashdan System and method for fixed rate entropy coded scalar quantization
CN104902285A (zh) * 2015-05-21 2015-09-09 北京大学 一种图像编码方法
CN105376578A (zh) * 2015-10-28 2016-03-02 北京锐安科技有限公司 图像压缩方法及装置
CN110602494A (zh) * 2019-08-01 2019-12-20 杭州皮克皮克科技有限公司 基于深度学习的图像编码、解码系统及编码、解码方法
CN110769263A (zh) * 2019-11-01 2020-02-07 合肥图鸭信息科技有限公司 一种图像压缩方法、装置及终端设备

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US11412225B2 (en) * 2018-09-27 2022-08-09 Electronics And Telecommunications Research Institute Method and apparatus for image processing using context-adaptive entropy model

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
CN1692375A (zh) * 2002-10-04 2005-11-02 国际商业机器公司 在对jpeg图像进行代码转换时增强压缩
US20140015698A1 (en) * 2012-07-14 2014-01-16 Alireza Shoa Hassani Lashdan System and method for fixed rate entropy coded scalar quantization
CN104902285A (zh) * 2015-05-21 2015-09-09 北京大学 一种图像编码方法
CN105376578A (zh) * 2015-10-28 2016-03-02 北京锐安科技有限公司 图像压缩方法及装置
CN110602494A (zh) * 2019-08-01 2019-12-20 杭州皮克皮克科技有限公司 基于深度学习的图像编码、解码系统及编码、解码方法
CN110769263A (zh) * 2019-11-01 2020-02-07 合肥图鸭信息科技有限公司 一种图像压缩方法、装置及终端设备

Also Published As

Publication number Publication date
CN114071141A (zh) 2022-02-18

Similar Documents

Publication Publication Date Title
US20210125070A1 (en) Generating a compressed representation of a neural network with proficient inference speed and power consumption
US10834415B2 (en) Devices for compression/decompression, system, chip, and electronic device
WO2022028197A1 (fr) Procédé de traitement d'image et dispositif correspondant
CN113259665B (zh) 一种图像处理方法以及相关设备
US11983906B2 (en) Systems and methods for image compression at multiple, different bitrates
CN111641832A (zh) 编码方法、解码方法、装置、电子设备及存储介质
US20230401756A1 (en) Data Encoding Method and Related Device
WO2022022176A1 (fr) Procédé de traitement d'image et dispositif associé
WO2022246986A1 (fr) Procédé, appareil et dispositif de traitement de données, et support de stockage lisible par ordinateur
WO2023174256A1 (fr) Procédé de compression de données et dispositif associé
WO2023207836A1 (fr) Procédé et appareil de codage d'image, et procédé et appareil de décompression d'image
WO2022100140A1 (fr) Procédé et appareil de codage par compression, et procédé et appareil de décompression
WO2023051335A1 (fr) Procédé de codage de données, procédé de décodage de données et appareil de traitement de données
CN113554719B (zh) 一种图像编码方法、解码方法、存储介质及终端设备
CN115361559A (zh) 图像编码方法、图像解码方法、装置以及存储介质
EP4398487A1 (fr) Procédé de codage de données, procédé de décodage de données et appareil de traitement de données
Chakraborty et al. Leveraging Domain Knowledge using Machine Learning for Image Compression in Internet-of-Things
TW202345034A (zh) 使用條件權重操作神經網路
CN114693811A (zh) 一种图像处理方法以及相关设备
CN117376564A (zh) 数据编解码方法及相关设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21852378

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21852378

Country of ref document: EP

Kind code of ref document: A1