CN112714313A - Image processing method, device, equipment and storage medium - Google Patents

Image processing method, device, equipment and storage medium Download PDF

Info

Publication number
CN112714313A
Authority
CN
China
Prior art keywords
image
original image
original
model
encoded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011564553.1A
Other languages
Chinese (zh)
Inventor
秦永强
敖川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ainnovation Hefei Technology Co ltd
Original Assignee
Ainnovation Hefei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ainnovation Hefei Technology Co ltd filed Critical Ainnovation Hefei Technology Co ltd
Priority to CN202011564553.1A
Publication of CN112714313A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/124 Quantisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N 19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Discrete Mathematics (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application provides an image processing method, an apparatus, a device and a storage medium. The method comprises the following steps: acquiring an original image to be processed; encoding the original image to generate an encoded image of the original image, wherein the resolution of the encoded image is smaller than that of the original image; and sending the encoded image to a server. Because encoding the original image produces an encoded image with a lower resolution than the original, the file size of the encoded image is reduced; transmitting the encoded image to the server therefore reduces the amount of image data transmitted and improves image transmission efficiency.

Description

Image processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, an image processing device, and a storage medium.
Background
In the field of computer vision, image recognition algorithms based on deep learning are often deployed in the cloud: a user takes a picture on the mobile phone side, the picture is transmitted to a server hosting a deep learning recognition model, and the server returns the recognition result.
In this scheme, when the image resolution is high, problems such as high transmission delay, high traffic consumption and demanding network requirements arise. In the retail industry, for example, network speeds are low in some scenes, so transmitting a high-resolution picture takes a long time or even fails; this degrades the customer experience, limits the scenarios in which the product can be used, and hinders scaling up the deep learning algorithm.
Disclosure of Invention
An object of the embodiments of the present application is to provide an image processing method, apparatus, device and storage medium that reduce the amount of image data transmitted and improve image transmission efficiency.
A first aspect of an embodiment of the present application provides an image processing method, including: acquiring an original image to be processed; encoding the original image to generate an encoded image of the original image, wherein the resolution of the encoded image is smaller than that of the original image; and sending the coded image to a server.
In an embodiment, the encoding the original image to generate an encoded image of the original image, where a resolution of the encoded image is smaller than a resolution of the original image includes: inputting the original image into a preset compression model, and outputting a compressed image of the original image; and coding the compressed image according to a preset coding rule to generate the coded image.
In one embodiment, the compression model is a neural network-based model; the compression model includes: a plurality of first convolution layers, each of the first convolution layers being for down-sampling the original image.
In an embodiment, before encoding the compressed image according to a preset encoding rule and generating the encoded image, the method further includes: and carrying out frequency domain transformation and quantization processing on the compressed image.
In one embodiment, the method further comprises the step of establishing the compression model: inputting a sample image to a neural network model to be trained to obtain an output image; and determining a training loss function based on the pixel variances of the sample image and the output image, and training the neural network model based on the training loss function to obtain the compression model.
A second aspect of the embodiments of the present application provides an image processing method, including: receiving a coded image sent by a terminal, wherein the coded image is an image obtained by coding an original image by the terminal, and the resolution of the coded image is smaller than that of the original image; and decoding the coded image, and outputting a decoded image, wherein a recovery image of the original image is the decoded image.
In an embodiment, decoding the coded image and outputting a decoded image, wherein a recovery image of the original image is the decoded image, includes: decoding the coded image according to a preset decoding rule to generate a pre-decoded image; and inputting the pre-decoded image into a preset recovery model, and outputting the decoded image.
In one embodiment, the recovery model is a neural network-based model; the recovery model includes: a plurality of second convolutional layers, each of the second convolutional layers for up-sampling the pre-decoded image.
In one embodiment, the method further comprises the step of establishing the recovery model: inputting a sample image to a neural network model to be trained to obtain an output image; and determining a training loss function based on the pixel variances of the sample image and the output image, and training the neural network model based on the training loss function to obtain the recovery model.
A third aspect of the embodiments of the present application provides an image processing apparatus, including: the acquisition module is used for acquiring an original image to be processed; the encoding module is used for encoding the original image to generate an encoded image of the original image, and the resolution of the encoded image is smaller than that of the original image; and the sending module is used for sending the coded image to a server.
In one embodiment, the encoding module is configured to: inputting the original image into a preset compression model, and outputting a compressed image of the original image; and coding the compressed image according to a preset coding rule to generate the coded image.
In one embodiment, the compression model is a neural network-based model; the compression model includes: a plurality of first convolution layers, each of the first convolution layers being for down-sampling the original image.
In one embodiment, the method further comprises: and the processing module is used for carrying out frequency domain transformation and quantization processing on the compressed image before the compressed image is coded according to a preset coding rule and the coded image is generated.
In an embodiment, the apparatus further includes a first establishing module, configured to: inputting a sample image to a neural network model to be trained to obtain an output image; and determining a training loss function based on the pixel variances of the sample image and the output image, and training the neural network model based on the training loss function to obtain the compression model.
A fourth aspect of the embodiments of the present application provides an image processing apparatus, including: the terminal comprises a receiving module and a processing module, wherein the receiving module is used for receiving a coded image sent by the terminal, the coded image is an image obtained by coding an original image by the terminal, and the resolution of the coded image is smaller than that of the original image; and the decoding module is used for decoding the coded image and outputting a decoded image, wherein a recovery image of the original image is the decoded image.
In one embodiment, the decoding module is configured to: decoding the coded image according to a preset decoding rule to generate a pre-decoded image; and inputting the pre-decoding image into a preset recovery model, and outputting the decoding image.
In one embodiment, the recovery model is a neural network-based model; the recovery model includes: a plurality of second convolutional layers, each of the second convolutional layers for up-sampling the pre-decoded image.
In an embodiment, the apparatus further includes a second establishing module, configured to: inputting a sample image to a neural network model to be trained to obtain an output image; and determining a training loss function based on the pixel variances of the sample image and the output image, and training the neural network model based on the training loss function to obtain the recovery model.
A fifth aspect of the embodiments of the present application provides an electronic device, including: a memory to store a computer program; and a processor configured to perform the method of the first aspect of the embodiments of the present application and any embodiment thereof, so as to transmit the encoded image to the server.
A sixth aspect of the embodiments of the present application provides an electronic device, including: a memory to store a computer program; and a processor configured to perform the method of the second aspect of the embodiments of the present application and any embodiment thereof, so as to recover an original image from an encoded image.
A seventh aspect of embodiments of the present application provides a non-transitory electronic device-readable storage medium, including: a program which, when run by an electronic device, causes the electronic device to perform the method of the first aspect of an embodiment of the present application and any embodiment thereof.
An eighth aspect of embodiments of the present application provides a non-transitory electronic device-readable storage medium, including: a program which, when run by an electronic device, causes the electronic device to perform the method of the second aspect of embodiments of the present application and any of the embodiments thereof.
According to the image processing method, apparatus, device and storage medium, the original image to be processed is encoded so that the resolution of the encoded image is smaller than that of the original image, which reduces the file size of the encoded image; the encoded image is then transmitted to the server, which reduces the amount of image data transmitted and improves image transmission efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1A is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 1B is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 2 is a diagram illustrating an image transmission scenario according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating an image processing method according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating an image processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an encoder according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating an image processing method according to an embodiment of the present application;
FIG. 7 is a diagram of a decoder according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. In the description of the present application, the terms "first," "second," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
As shown in fig. 1A, the present embodiment provides an electronic apparatus 1 including: at least one processor 11 and a memory 12, one processor being exemplified in fig. 1A. The processor 11 and the memory 12 are connected by a bus 10. The memory 12 stores instructions executable by the processor 11, and the instructions are executed by the processor 11 to cause the electronic device 1 to perform all or part of the flow of the method in the embodiments described below to transmit the encoded image to the server.
In an embodiment, the electronic device 1 may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, or the like.
As shown in fig. 1B, the present embodiment provides an electronic device 2 including: at least one processor 21 and a memory 22, one processor being exemplified in fig. 1B. The processor 21 and the memory 22 are connected by a bus 20. The memory 22 stores instructions executable by the processor 21, and the instructions are executed by the processor 21, so that the electronic device 2 can execute all or part of the flow of the method in the embodiments described below, so as to recover the original image from the encoded image.
In an embodiment, the electronic device 2 may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, or a large computing server composed of multiple computers.
Please refer to fig. 2, which is an image transmission scenario according to an embodiment of the present application, including: a terminal 30 and a server 40, wherein:
the terminal 30 may be implemented by the electronic device 1, wherein a neural network-based encoder 31 is disposed in the terminal 30, and the raw image is input to the encoder 31, subjected to encoding processing, and output as an encoded image.
In a practical scenario, taking price tag recognition as an example, the terminal 30 may be a mobile phone or a tablet computer. A user first needs to acquire a price tag image and upload it to the server 40 through the mobile phone; the server 40 then performs image recognition on the price tag image to finally obtain the price tag information. The scheme of this embodiment addresses the low transmission rate during image transmission: the user can photograph a price tag on a shelf with a mobile phone as the original image, encode the price tag image, and transmit the encoded image to the server 40.
The server 40 may be implemented by the electronic device 2, and the server 40 may be deployed in the cloud. A decoder 41 based on a neural network is deployed in the server 40, and the server 40 receives the encoded image sent by the terminal 30 in real time and decodes the encoded image through the decoder 41 to generate a high-resolution decoded image, wherein the resolution of the decoded image is close to that of the original image and therefore can be used as a recovery image of the price tag image. The decoded image may be used for subsequent price tag identification.
The image processing method can reduce the size of the transmitted image and improve the image transmission efficiency by respectively introducing the neural network encoder 31 into the terminal 30 and introducing the decoder 41 into the server 40.
Please refer to fig. 3, which shows an image processing method according to an embodiment of the present application. The method may be executed by the electronic device 1 shown in fig. 1A and may be applied in the image transmission scenario shown in fig. 2 to transmit the encoded image to the server 40. The method comprises the following steps:
step 301: and acquiring an original image to be processed.
In this step, the original image of the object to be processed may be captured by the user's image pickup device and then input to the terminal 30, or the user may shoot the original image of the object directly with a camera-equipped mobile phone. The object to be processed may be a user-specified object, such as a goods price tag.
Step 302: and carrying out encoding processing on the original image to generate an encoded image of the original image, wherein the resolution of the encoded image is smaller than that of the original image.
In this step, the original image may be a high-definition image with a very high resolution. Encoding the original image reduces its resolution, so the encoded image occupies fewer network transmission resources during transmission. The resolution difference between the encoded image and the original image may be specified by the user based on actual requirements, taking into account the resolution the server 40 needs when restoring the decoded image.
Step 303: the encoded image is sent to the server 40.
In this step, the low-resolution encoded image is sent, which greatly reduces the network transmission resources required and improves image transmission efficiency.
According to the image processing method, the original image to be processed is encoded so that the resolution of the encoded image is smaller than that of the original image, which reduces the file size of the encoded image; the encoded image is then transmitted to the server 40, which reduces the amount of image data transmitted and improves image transmission efficiency.
Please refer to fig. 4, which shows an image processing method according to an embodiment of the present application. The method may be executed by the electronic device 1 shown in fig. 1A and may be applied in the image transmission scenario shown in fig. 2 to transmit the encoded image to the server 40. The method comprises the following steps:
step 401: and acquiring an original image to be processed. See the description of step 301 in the above embodiments for details.
Step 402: and inputting the original image into a preset compression model, and outputting a compressed image of the original image.
In this step, the compression model is a neural network-based model. The compression model may include a plurality of first convolution layers, each first convolution layer being used for down-sampling the original image.
In an embodiment, the image encoding in the present embodiment can be implemented by a neural network encoder 31 as shown in fig. 5, and the encoder 31 can be deployed in the terminal 30. The neural network encoder mainly comprises two parts: a neural-network-based compression model and a post-processing procedure.
Taking an original image (i.e., an original picture) with a resolution of 1280x720 as an example, the neural network portion of the compression model may include 16 first convolution layers, with the parameters of each first convolution layer as shown in fig. 5. Each box indicates the current image resolution and number of channels; for example, 1280x720x3 means the current image resolution is 1280x720 with 3 channels. The text at each arrow gives the first convolution layer parameters in the form: number of convolution layers x Conv kernel height x kernel width x number of kernels x stride. The neural network structure shown in fig. 5 is as follows:
The original image resolution is 1280x720x3. After a convolution layer with parameters 1x Conv 3x3x16x2, the output feature map resolution is 640x360x16. After convolution layers with parameters 4x Conv 3x3x32x1 and 1x Conv 3x3x32x2, the output feature map resolution is 320x180x32. After convolution layers with parameters 4x Conv 3x3x64x1 and 1x Conv 3x3x64x2, the output feature map resolution is 160x90x64. Finally, after convolution layers with parameters 4x Conv 3x3x128x1 and 1x Conv 3x3x128x2, the output intermediate feature map has a resolution of 80x45x128; this intermediate feature map is the compressed image.
In one embodiment, a ReLU activation function may be used after each first convolution layer, and a batch-norm layer may be added after each first convolution layer with a stride of 2. As shown in fig. 5, the output resolution after the final convolution layer is 80x45x128. The output values are then limited to the range [-1, 1] using a tanh activation function, and the result is finally scaled, for example by multiplying it by a positive number chosen according to the requirements of the subsequent processing; if the subsequent processing uses the DCT (Discrete Cosine Transform), this positive number may be 128.
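As a concrete illustration, the following is a minimal PyTorch sketch of the encoder-side compression network described above. The layer counts, kernel sizes, strides, channel widths, tanh limiting and scaling by 128 follow the description; the padding of 1, the exact placement of ReLU and batch-norm, and the omission of ReLU on the final layer (so that tanh can reach the full [-1, 1] range) are assumptions made to reproduce the stated 1280x720x3 to 80x45x128 shapes.

```python
# A minimal sketch of the compression model, assuming padding=1 and
# conv -> ReLU (-> batch-norm on stride-2 layers) ordering.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, stride):
    """3x3 convolution + ReLU; stride-2 layers additionally get batch-norm."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
              nn.ReLU(inplace=True)]
    if stride == 2:
        layers.append(nn.BatchNorm2d(out_ch))
    return layers


class CompressionModel(nn.Module):
    def __init__(self, scale=128.0):
        super().__init__()
        layers = conv_block(3, 16, stride=2)                    # 1280x720x3 -> 640x360x16
        for i in range(4):                                      # 4x Conv 3x3x32x1
            layers += conv_block(16 if i == 0 else 32, 32, stride=1)
        layers += conv_block(32, 32, stride=2)                  # -> 320x180x32
        for i in range(4):                                      # 4x Conv 3x3x64x1
            layers += conv_block(32 if i == 0 else 64, 64, stride=1)
        layers += conv_block(64, 64, stride=2)                  # -> 160x90x64
        for i in range(4):                                      # 4x Conv 3x3x128x1
            layers += conv_block(64 if i == 0 else 128, 128, stride=1)
        layers += [nn.Conv2d(128, 128, kernel_size=3, stride=2, padding=1),
                   nn.BatchNorm2d(128)]                         # -> 80x45x128, no ReLU before tanh
        self.features = nn.Sequential(*layers)
        self.scale = scale

    def forward(self, x):
        # tanh limits the output to [-1, 1]; scaling by 128 matches the DCT-oriented
        # post-processing described in the text.
        return torch.tanh(self.features(x)) * self.scale


if __name__ == "__main__":
    dummy = torch.randn(1, 3, 720, 1280)        # N x C x H x W
    compressed = CompressionModel()(dummy)
    print(compressed.shape)                     # torch.Size([1, 128, 45, 80])
```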
Step 403: and carrying out frequency domain transformation and quantization processing on the compressed image.
In this step, to facilitate quantization and encoding, the compressed image output in step 402 may be transformed from the spatial domain to the frequency domain and then quantized.
In one embodiment, as shown in fig. 5, step 403 may be implemented by a post-processing part. The post-processing part first applies a DCT to the compressed image to transform it from the spatial domain to the frequency domain, generating a frequency-domain feature map. It then quantizes this map: the frequency-domain values of the feature map are divided by preset weights and rounded, yielding a quantized feature map. The quantized image is sparse (the feature matrix contains many zeros).
In an embodiment, the preset weights may form a positive integer matrix of the same size as the frequency-domain feature map, where each element represents the quantization strength of one frequency; generally, low-frequency weights are smaller and high-frequency weights are larger.
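A sketch of this post-processing step is given below, using NumPy and SciPy: a 2-D DCT per channel followed by quantization against a per-frequency weight matrix, plus the inverse used later on the server side. The use of a full-map (rather than blockwise) DCT and the example weight values are assumptions; the description only requires that the weights form a positive integer matrix with smaller values at low frequencies.

```python
# A sketch of DCT-based post-processing and its inverse, under the
# assumptions stated above.
import numpy as np
from scipy.fft import dctn, idctn


def quantize(feature_map: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """feature_map: (C, H, W) encoder output; weights: (H, W) positive integers."""
    freq = dctn(feature_map, axes=(1, 2), norm="ortho")    # spatial -> frequency domain
    return np.rint(freq / weights).astype(np.int32)        # divide by weights, then round


def dequantize(quantized: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Inverse of quantize(): re-scale by the weights and apply the inverse DCT."""
    return idctn(quantized * weights, axes=(1, 2), norm="ortho")


if __name__ == "__main__":
    fmap = np.random.randn(128, 45, 80).astype(np.float32) * 128
    # Hypothetical weight matrix: the weight grows with the frequency index, so high
    # frequencies are quantized more coarsely and the result becomes sparse.
    rows = np.arange(45)[:, None]
    cols = np.arange(80)[None, :]
    weights = 1 + rows + cols
    q = quantize(fmap, weights)
    print("zero ratio:", (q == 0).mean())                  # sparsity after quantization
```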
Step 404: and coding the compressed image according to a preset coding rule to generate a coded image, wherein the resolution of the coded image is smaller than that of the original image.
In this step, the quantized feature map matrix is encoded according to a preset encoding rule. The preset encoding rule may be a combination of differential coding, run-length coding and Huffman coding, or any one of them, and the encoded image may use a format similar to JPEG. The preset encoding rule must be decodable, and the storage space occupied by the encoded image should be as small as possible.
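To make the idea concrete, here is a toy sketch of one admissible preset encoding rule: zero run-length coding of the flattened quantized feature map as JPEG-style (run, value) pairs. It shows only the run-length part; the differential and Huffman coding that the text also allows, and the exact bitstream layout, are left out, so this is an illustration rather than the actual format.

```python
# A toy run-length codec for the sparse quantized feature map.
from typing import List, Tuple
import numpy as np


def rle_encode(quantized: np.ndarray) -> List[Tuple[int, int]]:
    """Encode the flattened map as (zero_run_length, value) pairs."""
    pairs, run = [], 0
    for v in quantized.ravel():
        if v == 0:
            run += 1
        else:
            pairs.append((run, int(v)))
            run = 0
    pairs.append((run, 0))                                  # trailing-zeros terminator
    return pairs


def rle_decode(pairs: List[Tuple[int, int]], shape) -> np.ndarray:
    values = []
    for run, v in pairs:
        values.extend([0] * run)
        if v != 0:
            values.append(v)
    values.extend([0] * (np.prod(shape) - len(values)))     # pad remaining zeros
    return np.array(values, dtype=np.int32).reshape(shape)


if __name__ == "__main__":
    q = np.zeros((4, 8), dtype=np.int32)
    q[0, 0], q[2, 5] = 17, -3
    encoded = rle_encode(q)
    assert np.array_equal(rle_decode(encoded, q.shape), q)
```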
Step 405: the encoded image is sent to the server 40. See the description of step 303 in the above embodiments for details.
Please refer to fig. 6, which shows an image processing method according to an embodiment of the present application. The method can be executed by the electronic device 2 shown in fig. 1B acting as the server 40, and can be applied in the image transmission scenario shown in fig. 2 to recover the original image from the encoded image. The method comprises the following steps:
step 601: and receiving the coded image sent by the terminal 30, wherein the coded image is an image obtained by coding the original image by the terminal 30, and the resolution of the coded image is smaller than that of the original image.
In this step, the server 40 receives, in real time, the network byte stream transmitted from the terminal 30, which includes the encoded image obtained by encoding the original image by the terminal 30, and the encoded image is compressed by the terminal 30 and has a resolution smaller than that of the original image.
In one embodiment, as shown in fig. 7, the decoding process of the encoded image can be implemented by a decoder 41 disposed in the server 40, and the decoded image is output. The neural network decoder 41 mainly includes: a preprocessing portion and a neural network based recovery model. The decoding process may include:
step 602: and decoding the coded image according to a preset decoding rule to generate a pre-decoded image.
In this step, the server 40 first decodes the encoded image based on a preset decoding rule. As shown in fig. 7, the network byte stream received in step 601 is input to the preprocessing portion, which decodes it according to the preset decoding rule. This decoding is the reverse of the encoding performed by the terminal 30 on the original image in steps 403 to 404 above. An inverse DCT is then applied to the decoded data to obtain a pre-decoded image that approximates the feature map of the compressed image.
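Continuing the earlier sketches on the server side, this pre-processing can be expressed as the reverse chain: run-length decoding followed by de-quantization and the inverse DCT. The helpers rle_decode() and dequantize() are the hypothetical functions from the encoder-side sketches above, and the feature-map shape of (128, 45, 80) is only the example from fig. 5, not a fixed requirement.

```python
# Server-side pre-processing sketch, reusing the hypothetical helpers above.
import numpy as np


def preprocess(encoded_pairs, weights: np.ndarray, shape=(128, 45, 80)) -> np.ndarray:
    """Reverse of the terminal-side post-processing: decode, de-quantize, inverse DCT."""
    quantized = rle_decode(encoded_pairs, shape)    # preset decoding rule (run-length here)
    return dequantize(quantized, weights)           # re-scale by the weights + inverse DCT
```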
Step 603: and inputting the pre-decoded image into a preset recovery model, and outputting the decoded image.
In this step, the recovery model is a neural network-based model. The recovery model includes a plurality of second convolution layers, each second convolution layer being used for up-sampling the pre-decoded image.
In an embodiment, as shown in fig. 7, taking an original image with a resolution of 1280x720 as an example, the neural-network-based recovery model may consist mainly of 15 convolution layers, with the network structure parameters annotated in fig. 7. Each box indicates the current image resolution and number of channels; for example, 1280x720x3 means the current image resolution is 1280x720 with 3 channels. The text at each arrow gives the second convolution layer parameters in the form: number of convolution layers x Conv kernel height x kernel width x number of kernels x stride. The neural network structure shown in fig. 7 is as follows:
The pre-decoded image resolution is 80x45x128. After convolution layers with parameters 4x Conv 3x3x64x1, the output feature map resolution is 160x90x64. After convolution layers with parameters 4x Conv 3x3x32x1, the output feature map resolution is 320x180x32. After convolution layers with parameters 4x Conv 3x3x16x1, the output feature map resolution is 640x360x16. Finally, after convolution layers with parameters 2x Conv 3x3x3x1 and 1x Conv 3x3x3x1, the output feature map resolution is 1280x720x3, which is the decoded image.
In fig. 7, bilinear resize denotes bilinear-interpolation upsampling. A ReLU activation function is used after each second convolution layer, and a batch-norm layer is used before each bilinear resize. After the last convolution layer, the output values are limited to [0, 1] using a sigmoid function and then multiplied by 256 and rounded, which recovers a decoded image close to the original image; this decoded image can be used as the restored image of the original image for subsequent processing.
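The following is a minimal PyTorch sketch of the decoder-side recovery network described above, a counterpart to the compression-model sketch. The 15-convolution layout, bilinear upsampling, batch-norm before each resize, and the final sigmoid scaled by 256 follow the description; the exact placement of each upsample within a stage and the padding of 1 are assumptions chosen to reproduce the stated 80x45x128 to 1280x720x3 shapes.

```python
# A minimal sketch of the recovery model, assuming each stage is
# convs -> batch-norm -> 2x bilinear resize.
import torch
import torch.nn as nn


def up_stage(in_ch, out_ch, n_convs):
    """n_convs stride-1 3x3 convolutions + ReLU, then batch-norm and 2x bilinear resize."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, stride=1, padding=1),
                   nn.ReLU(inplace=True)]
    layers += [nn.BatchNorm2d(out_ch),
               nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)]
    return layers


class RecoveryModel(nn.Module):
    def __init__(self):
        super().__init__()
        layers = []
        layers += up_stage(128, 64, 4)      # 80x45x128 -> 160x90x64
        layers += up_stage(64, 32, 4)       # -> 320x180x32
        layers += up_stage(32, 16, 4)       # -> 640x360x16
        layers += up_stage(16, 3, 2)        # -> 1280x720x3
        layers += [nn.Conv2d(3, 3, 3, stride=1, padding=1)]   # final 1x Conv 3x3x3x1
        self.features = nn.Sequential(*layers)

    def forward(self, x):
        # sigmoid limits the output to [0, 1]; x256 maps it back to the pixel range.
        # At inference time the result is additionally rounded to integer pixel values.
        return torch.sigmoid(self.features(x)) * 256.0


if __name__ == "__main__":
    pre_decoded = torch.randn(1, 128, 45, 80)   # N x C x H x W
    restored = RecoveryModel()(pre_decoded)
    print(restored.shape)                       # torch.Size([1, 3, 720, 1280])
```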
In one embodiment, before step 402 and before step 603, the method further comprises the steps of establishing the compression model and establishing the recovery model, including: inputting a sample image to the neural network model to be trained to obtain an output image; determining a training loss function based on the pixel variances of the sample image and the output image; and training the neural network model based on the training loss function to obtain the compression model and the recovery model.
In this step, an unsupervised training mode may be adopted: the original picture is input to the neural network to be trained, and the neural network model is trained based on the training loss function to obtain the compression model and the recovery model. During training, the compression model and the recovery model together output an image whose resolution matches that of the original image, and the pixel differences at corresponding positions of the two images are compared to determine the training loss function. Denote the original image as I_s and the model output image as I_d; the training loss function L is:
L = \frac{1}{w \cdot h} \sum_{x=1}^{w} \sum_{y=1}^{h} \left( I_s(x, y) - I_d(x, y) \right)^2
where w represents the width of the original image (or the output image), h represents the height of the original image (or the output image), I_s(x, y) represents the pixel value of the original image at (x, y), and I_d(x, y) represents the pixel value of the output image at (x, y).
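Written out in code, this loss is simply the mean squared pixel difference between the two images; a minimal PyTorch version (assuming both tensors share the same shape and pixel range) could look like this:

```python
import torch


def pixel_loss(i_s: torch.Tensor, i_d: torch.Tensor) -> torch.Tensor:
    """Mean squared pixel difference L between sample image i_s and model output i_d."""
    return torch.mean((i_s - i_d) ** 2)
```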
In an embodiment, the above loss function requires the original image and the output image to have the same resolution, so the compression model and the recovery model are trained together. Because the encoding and decoding parts shown in fig. 5 and fig. 7 are lossless compression used only to further reduce the transmission size, they can be removed during joint training: the quantized output of the encoder 31 shown in fig. 5 is directly subjected to an inverse DCT, and the transformed result is fed into the neural network part of the recovery model in the neural network decoder 41 shown in fig. 7. The framework after this pruning is the model to be trained.
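A sketch of such a joint training step is shown below, reusing the hypothetical CompressionModel and RecoveryModel classes from the earlier sketches and chaining them end to end under the pixel loss. For simplicity the DCT/quantization round-trip is omitted here (rounding is not differentiable as written); keeping it, as the text describes, would require something like a straight-through estimator, which is an implementation choice the patent does not specify.

```python
# A sketch of one unsupervised, end-to-end training step.
import torch
import torch.nn.functional as F


def train_step(compressor, restorer, optimizer, batch):
    """One training step; batch holds original images of shape (N, 3, 720, 1280)."""
    optimizer.zero_grad()
    restored = restorer(compressor(batch))      # encode then decode, end to end
    loss = F.mse_loss(restored, batch)          # the mean squared pixel difference L
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    compressor, restorer = CompressionModel(), RecoveryModel()
    params = list(compressor.parameters()) + list(restorer.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-4)
    sample = torch.rand(1, 3, 720, 1280) * 256.0    # random stand-in for a sample image
    print(train_step(compressor, restorer, optimizer, sample))
```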
The above-mentioned model training process may be executed by the electronic device 1 or the electronic device 2, and after obtaining the model, the compression model is deployed on the terminal 30, and the recovery model is deployed on the server 40.
Please refer to fig. 8, which shows an image processing apparatus 800 according to an embodiment of the present application. The apparatus is applied to the electronic device 1 shown in fig. 1A and can be used in the image transmission scenario shown in fig. 2 to transmit the encoded image to the server 40. The apparatus comprises an acquisition module 801, an encoding module 802 and a sending module 803, which are related as follows:
an obtaining module 801, configured to obtain an original image to be processed. The encoding module 802 is configured to perform encoding processing on the original image to generate an encoded image of the original image, where a resolution of the encoded image is smaller than a resolution of the original image. A sending module 803, configured to send the encoded image to the server 40.
In one embodiment, the encoding module 802 is configured to: and inputting the original image into a preset compression model, and outputting a compressed image of the original image. And coding the compressed image according to a preset coding rule to generate a coded image.
In one embodiment, the compression model is a neural network-based model. The compression model includes: and a plurality of first convolution layers, each first convolution layer being used for down-sampling the original image.
In one embodiment, the method further comprises: the processing module 804 is configured to perform frequency domain transformation and quantization processing on the compressed image before encoding the compressed image according to a preset encoding rule and generating an encoded image.
In an embodiment, the system further includes a first establishing module 805 configured to: and inputting the sample image to a neural network model to be trained to obtain an output image. And determining a training loss function based on the pixel variance of the sample image and the output image, and training a neural network model based on the training loss function to obtain a compression model.
For a detailed description of the image processing apparatus 800, please refer to the description of the related method steps in the embodiments shown in fig. 3 to fig. 5.
Please refer to fig. 9, which shows an image processing apparatus 900 according to an embodiment of the present application. The apparatus is applied to the electronic device 2 shown in fig. 1B and can be used in the image transmission scenario shown in fig. 2 to recover the original image from the encoded image. The apparatus comprises a receiving module 901 and a decoding module 902, which are related as follows:
the receiving module 901 is configured to receive an encoded image sent by the terminal 30, where the encoded image is an image obtained by encoding an original image by the terminal 30, and a resolution of the encoded image is smaller than a resolution of the original image. The decoding module 902 is configured to perform decoding processing on the encoded image and output a decoded image, where a restored image of the original image is the decoded image.
In one embodiment, the decoding module 902 is configured to: and decoding the coded image according to a preset decoding rule to generate a pre-decoded image. And inputting the pre-decoded image into a preset recovery model, and outputting the decoded image.
In one embodiment, the recovery model is a neural network-based model. The recovery model includes: and a plurality of second convolution layers, each second convolution layer being used for performing an upsampling process on the pre-decoded image.
In an embodiment, the apparatus further includes a second establishing module 903, configured to: and inputting the sample image to a neural network model to be trained to obtain an output image. And determining a training loss function based on the pixel variance of the sample image and the output image, and training a neural network model based on the training loss function to obtain a recovery model.
For a detailed description of the image processing apparatus 900, please refer to the description of the related method steps in the embodiments shown in fig. 6 to fig. 7.
An embodiment of the present invention further provides a non-transitory electronic device readable storage medium, including: a program that, when run on an electronic device, causes the electronic device to perform all or part of the procedures of the methods in the above-described embodiments. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD), a Solid State Drive (SSD), or the like. The storage medium may also comprise a combination of memories of the kind described above.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (15)

1. An image processing method, comprising:
acquiring an original image to be processed;
encoding the original image to generate an encoded image of the original image, wherein the resolution of the encoded image is smaller than that of the original image;
and sending the coded image to a server.
2. The method according to claim 1, wherein said encoding said original image to generate an encoded image of said original image, said encoded image having a resolution smaller than a resolution of said original image comprises:
inputting the original image into a preset compression model, and outputting a compressed image of the original image;
and coding the compressed image according to a preset coding rule to generate the coded image.
3. The method of claim 2, wherein the compression model is a neural network-based model; the compression model includes:
a plurality of first convolution layers, each of the first convolution layers being for down-sampling the original image.
4. The method according to claim 2, wherein before encoding the compressed image according to a preset encoding rule to generate the encoded image, the method further comprises:
and carrying out frequency domain transformation and quantization processing on the compressed image.
5. The method of claim 2, further comprising the step of building the compression model:
inputting a sample image to a neural network model to be trained to obtain an output image;
and determining a training loss function based on the pixel variances of the sample image and the output image, and training the neural network model based on the training loss function to obtain the compression model.
6. An image processing method, comprising:
receiving a coded image sent by a terminal, wherein the coded image is an image obtained by coding an original image by the terminal, and the resolution of the coded image is smaller than that of the original image;
and decoding the coded image, and outputting a decoded image, wherein a recovery image of the original image is the decoded image.
7. The method according to claim 6, wherein said decoding the encoded image to output a decoded image, and wherein the restoring of the original image to the decoded image comprises:
decoding the coded image according to a preset decoding rule to generate a pre-decoded image;
and inputting the pre-decoding image into a preset recovery model, and outputting the decoding image.
8. The method of claim 7, wherein the recovery model is a neural network-based model; the recovery model includes:
a plurality of second convolutional layers, each of the second convolutional layers for up-sampling the pre-decoded image.
9. The method of claim 7, further comprising the step of establishing the recovery model by:
inputting a sample image to a neural network model to be trained to obtain an output image;
and determining a training loss function based on the pixel variances of the sample image and the output image, and training the neural network model based on the training loss function to obtain the recovery model.
10. An image processing apparatus characterized by comprising:
the acquisition module is used for acquiring an original image to be processed;
the encoding module is used for encoding the original image to generate an encoded image of the original image, and the resolution of the encoded image is smaller than that of the original image;
and the sending module is used for sending the coded image to a server.
11. An image processing apparatus characterized by comprising:
the terminal comprises a receiving module and a processing module, wherein the receiving module is used for receiving a coded image sent by the terminal, the coded image is an image obtained by coding an original image by the terminal, and the resolution of the coded image is smaller than that of the original image;
and the decoding module is used for decoding the coded image and outputting a decoded image, wherein a recovery image of the original image is the decoded image.
12. An electronic device, comprising:
a memory to store a computer program;
a processor to perform the method of any one of claims 1 to 5 to transmit the encoded image to a server.
13. An electronic device, comprising:
a memory to store a computer program;
a processor arranged to perform the method of any one of claims 6 to 9 to recover an original picture from an encoded picture.
14. A non-transitory electronic device readable storage medium, comprising: program which, when run by an electronic device, causes the electronic device to perform the method of any one of claims 1 to 5.
15. A non-transitory electronic device readable storage medium, comprising: program which, when run by an electronic device, causes the electronic device to perform the method of any one of claims 6 to 9.
CN202011564553.1A 2020-12-25 2020-12-25 Image processing method, device, equipment and storage medium Pending CN112714313A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011564553.1A CN112714313A (en) 2020-12-25 2020-12-25 Image processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112714313A (en) 2021-04-27

Family

ID=75546577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011564553.1A Pending CN112714313A (en) 2020-12-25 2020-12-25 Image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112714313A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200193647A1 (en) * 2018-10-19 2020-06-18 Samsung Electronics Co., Ltd. Artificial intelligence encoding and artificial intelligence decoding methods and apparatuses using deep neural network
CN109413434A (en) * 2018-11-08 2019-03-01 腾讯科技(深圳)有限公司 Image processing method, device, system, storage medium and computer equipment
CN111355965A (en) * 2020-02-28 2020-06-30 中国工商银行股份有限公司 Image compression and restoration method and device based on deep learning
CN111970513A (en) * 2020-08-14 2020-11-20 成都数字天空科技有限公司 Image processing method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240953A (en) * 2021-12-16 2022-03-25 北京数码视讯技术有限公司 Method and device for transmitting ultrahigh-resolution image
WO2024007977A1 (en) * 2022-07-07 2024-01-11 维沃移动通信有限公司 Image processing method and apparatus, and device

Similar Documents

Publication Publication Date Title
US11423310B2 (en) Deep learning based adaptive arithmetic coding and codelength regularization
US20240078712A1 (en) Data compression using conditional entropy models
US10834425B2 (en) Image compression/decompression method and device, and image processing system
US10909728B1 (en) Learned lossy image compression codec
EP3354030B1 (en) Methods and apparatuses for encoding and decoding digital images through superpixels
CN104838653B (en) Lossless image compression using differential transmission
US11983906B2 (en) Systems and methods for image compression at multiple, different bitrates
CN101689297B (en) Efficient image representation by edges and low-resolution signal
WO2021169408A1 (en) Image processing method and apparatus, and electronic device and storage medium
CN111630570A (en) Image processing method, apparatus and computer-readable storage medium
CN112714313A (en) Image processing method, device, equipment and storage medium
CN116547969A (en) Processing method of chroma subsampling format in image decoding based on machine learning
US11212527B2 (en) Entropy-inspired directional filtering for image coding
CN113724136A (en) Video restoration method, device and medium
US9106925B2 (en) WEAV video compression system
CN111432213B (en) Method and apparatus for tile data size coding for video and image compression
CN108182712B (en) Image processing method, device and system
CN113256744B (en) Image coding and decoding method and system
CN116508320A (en) Chroma subsampling format processing method in image decoding based on machine learning
Zhuang et al. A robustness and low bit-rate image compression network for underwater acoustic communication
JP5110304B2 (en) Screen data transmitting apparatus, screen data transmitting method, and screen data transmitting program
CN113949867B (en) Image processing method and device
Patel et al. An analytical study on comparison of different image compression formats
Baluja et al. Learning to render better image previews
EP4294015A1 (en) Method for image encoding

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210427

RJ01 Rejection of invention patent application after publication