CN115150370B - Image processing method - Google Patents

Image processing method

Info

Publication number
CN115150370B
CN115150370B (Application CN202210782328.8A)
Authority
CN
China
Prior art keywords
image
network model
chroma format
deep learning
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210782328.8A
Other languages
Chinese (zh)
Other versions
CN115150370A (en)
Inventor
刘凯航
方华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Meishi Technology Co ltd
Original Assignee
Guangdong Meishi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Meishi Technology Co ltd filed Critical Guangdong Meishi Technology Co ltd
Priority to CN202210782328.8A
Publication of CN115150370A
Application granted
Publication of CN115150370B
Legal status: Active
Anticipated expiration

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component

Abstract

The invention discloses an image processing method, which includes: acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded image through a communication channel; decoding the 4:2:0 chroma format image into a first image, inputting the first image into a target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image; and sending the second image to a display device. The method keeps the bandwidth occupied during image transmission low, improves how the subsampled image is displayed on the output device after passing through the codec, and improves the picture quality and definition of the image when it is displayed on the display device.

Description

Image processing method
Technical Field
The invention relates to the technical field of electronic information, in particular to an image processing method.
Background
YCbCr is a color coding scheme commonly used in consumer video products such as DVD players, video cameras, and digital television, where Y is the luminance (luma) component and Cb and Cr are the blue-difference and red-difference chroma components. The human eye is more sensitive to the Y component, so most image compression methods compress the Cb and Cr components, reducing the amount of image data while preserving perceived image quality.
The main sampling formats of YCbCr are YCbCr 4:2:0 and YCbCr 4:4:4. In the YCbCr 4:4:4 sampling format the three channels have the same sampling rate, so in the resulting image the three channel components of every pixel are complete and image quality is good; the YCbCr 4:2:0 sampling format keeps four luminance samples but only two chrominance samples (one Cb and one Cr) for every four pixels, so image quality is poorer.
When applied to an image codec, the YCbCr 4:2:0 sampling format occupies little bandwidth during image transmission and is therefore cost-effective, but part of the image information is lost during sampling, which degrades image quality; the degradation is most noticeable in solid red or blue colors and in images with red and blue backgrounds. The YCbCr 4:4:4 sampling format gives very high image quality, but it requires high bandwidth, and the hardware platform required for a codec that handles this sampling format is expensive.
Disclosure of Invention
To solve at least one of the foregoing technical problems, the present disclosure proposes, in a first aspect, a method of image processing, including: acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded image through a communication channel; decoding the 4:2:0 chroma format image into a first image, inputting the first image into a target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image; the second image is sent to the display device.
Preferably, the method for obtaining the target neural network model includes the following steps: constructing training samples of images, wherein the training samples comprise an original 4:4:4 chroma format image set and a corresponding 4:2:0 chroma format image set, the resolution of each image in the original 4:4:4 chroma format image set is the same as the resolution of the corresponding image in the 4:2:0 chroma format image set, and each image in the 4:2:0 chroma format image set is obtained from the corresponding original 4:4:4 chroma format image by encoding, communication channel transmission and decoding through a codec; and inputting the training samples of the images into a deep learning network model for training to obtain the target neural network model.
Preferably, the training samples of the images comprise training data sets and test data sets, wherein the step of inputting the training samples of the images into the deep learning network model for training to obtain the target neural network model comprises the steps of inputting the training data sets into the deep learning network model for training to obtain a trained deep learning network model; inputting the test data set into the trained deep learning network model for verification, and obtaining a verified deep learning network model; and compressing the verified deep learning network model to obtain the target neural network model.
Preferably, "inputting the training data set into the deep learning network model for training, obtaining the trained deep learning network model" includes: inputting the 4:2:0 chroma format image in the training data set into a deep learning network model, and performing two-branch processing, wherein one branch performs feature extraction to obtain a feature map, and the other branch performs convolution operation to obtain a convolution result; performing series operation on the feature map and the convolution result, and obtaining an output image through 1x1 convolution operation; and calculating the difference value of the original 4:4:4 chroma format image paired with the output image and the training data set, and updating the parameters of the deep learning network model according to the difference value until the difference value is smaller than a given threshold value, so as to obtain the trained deep learning network model.
Preferably, "inputting the test data set into the trained deep learning network model for verification, obtaining the verified deep learning network model" includes: inputting the 4:2:0 chroma format image in the test data set into the trained deep learning network model, and outputting a reference image; and calculating an average index of the reference image and the paired original 4:4:4 chroma format image, and verifying the trained deep learning network model.
Preferably, "compressing the verified deep learning network model to obtain the target neural network model" includes: compressing the convolution layer of the branches into a single 3x3 convolution operation through structural reparameterization; loading the verified deep learning network model, and carrying out model quantization by utilizing QAT quantized perception training to obtain a quantized compressed deep learning network model; and converting the quantized and compressed deep learning network model into an RKNN model to obtain the target neural network model.
Preferably, the method for obtaining the original 4:4:4 chroma format image set includes: image capture with an imaging device, frame capture from video, and image synthesis; the original 4:4:4 chroma format image set includes: landscape images, character images, face images, animal images, text images and industrial images, wherein the text images and the industrial images are augmented by a data augmentation method, the data augmentation method including image synthesis, stitching, color inversion and image resizing.
The present disclosure proposes in a second aspect an apparatus for image processing, comprising: the acquisition module is used for acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded 4:2:0 chroma format image through a communication channel; the decoding module is used for decoding the 4:2:0 chroma format image into a first image, inputting the first image into the target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image; and the sending module is used for sending the second image to the display device.
The present disclosure proposes in a third aspect an image decoding device comprising processing circuitry for performing a method as described in any of the above.
The present disclosure proposes in a fourth aspect an image decoding apparatus including a memory and a processor; the memory is used for storing a computer program; the processor is used for implementing any of the methods described above when executing the computer program.
Some technical effects of the present disclosure are as follows: a 4:2:0 chroma format image is acquired, the 4:2:0 chroma format image being obtained by encoding an original 4:4:4 chroma format image with an encoder and transmitting the encoded image through a communication channel; the 4:2:0 chroma format image is decoded into a first image, the first image is input into a target neural network model, and a second image output by the target neural network model is obtained, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image; the second image is sent to the display device. This keeps the bandwidth occupied during image transmission small, improves the display effect of the subsampled image on the display output device after it has passed through the codec, and improves the picture quality and definition of the image when it is displayed on the display device.
Drawings
For a better understanding of the technical solutions of the present disclosure, reference may be made to the following drawings for aiding in the description of the prior art or embodiments. The drawings will selectively illustrate products or methods involved in the prior art or some embodiments of the present disclosure. The basic information of these figures is as follows:
FIG. 1 is an exemplary device architecture diagram in which embodiments of the present application may be applied;
fig. 2 is a flow chart of one embodiment of a method of image processing of the present application.
Detailed Description
Further technical means and technical effects of the present disclosure are described below. It should be apparent that the examples (or embodiments) provided are only some, not all, of the embodiments intended to be covered by the present disclosure. All other embodiments that can be derived by those skilled in the art, without inventive effort, from the embodiments in this disclosure and from what the drawings explicitly or implicitly present fall within the scope of protection of the present disclosure.
An encoder is a device or program that compresses image data or the like, and a decoder is a device or program that decompresses image data or the like. A codec is composed of an encoder-decoder pair. Each codec has its own fixed coding format, and a code stream compressed by an encoder can be decoded by the corresponding decoder.
When transmitting an image from one end to the other, the image first needs to be encoded by the encoder, the compressed image code stream is then transmitted from the encoding end to the decoding end through the transmission system, and finally the decoding end decodes the compressed image code stream. An uncompressed image has a huge data volume and much spatial redundancy, and requires high bandwidth from the transmission system. There is a strong correlation between adjacent pixels of an image, as well as temporal redundancy (content similarity between adjacent pictures of a video sequence), coding redundancy (different pixel values occur with different probabilities), visual redundancy (the human visual system is insensitive to certain details), and knowledge redundancy (regular structure can be derived from prior knowledge and background knowledge). These correlations can be exploited in the encoding of image data, so that unimportant or repeated image information is omitted, the amount of encoded data is reduced, and the bandwidth required for data transmission is lowered.
YCbCr is a common color coding scheme, where Y is the luminance component of the color and Cb and Cr are the blue-difference and red-difference chroma components, respectively. The human eye is more sensitive to the Y component, so most image compression methods compress the Cb and Cr components to compress the image.
The main sampling formats of YCbCr are YCbCr 4:2:0 and YCbCr 4:4:4. In the YCbCr 4:4:4 sampling format the three channels have the same sampling rate, so in the resulting image the three channel components of every pixel carry complete information and image quality is good. Because the YCbCr 4:4:4 sampling format keeps complete sample information, it requires high bandwidth for transmission, and the hardware platform required for a codec that handles this sampling format is expensive. The YCbCr 4:2:0 sampling format keeps four luminance samples but only two chrominance samples for every four pixels, and part of the image information is lost during sampling, so image quality is degraded, especially in solid red or blue colors and in images with red and blue backgrounds.
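As an illustration of the quality gap between the two sampling formats, the following sketch simulates 4:2:0 chroma subsampling followed by the bilinear upsampling a conventional decoder performs. It is not the codec of this disclosure; the numpy/OpenCV calls and the area-averaging choice are assumptions made for the example.

```python
# Illustrative sketch only: simulate YCbCr 4:2:0 chroma subsampling and the
# naive upsampling a decoder performs. Assumes numpy and opencv-python;
# the averaging/bilinear kernels are assumptions, not the patent's codec.
import numpy as np
import cv2

def to_420_and_back(ycbcr444: np.ndarray) -> np.ndarray:
    """ycbcr444: HxWx3 uint8 image with full-resolution Y, Cr, Cb planes."""
    y, cr, cb = cv2.split(ycbcr444)
    h, w = y.shape
    # 4:2:0 keeps one Cb and one Cr sample per 2x2 block of pixels.
    cr_sub = cv2.resize(cr, (w // 2, h // 2), interpolation=cv2.INTER_AREA)
    cb_sub = cv2.resize(cb, (w // 2, h // 2), interpolation=cv2.INTER_AREA)
    # Restoring resolution by bilinear interpolation blurs sharp chroma edges
    # (red/blue text, boundaries of solid red or blue regions).
    cr_up = cv2.resize(cr_sub, (w, h), interpolation=cv2.INTER_LINEAR)
    cb_up = cv2.resize(cb_sub, (w, h), interpolation=cv2.INTER_LINEAR)
    return cv2.merge([y, cr_up, cb_up])

bgr = cv2.imread("sample.png")                      # hypothetical input file
ycbcr = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)      # OpenCV plane order: Y, Cr, Cb
restored = to_420_and_back(ycbcr)
print("mean chroma error:",
      np.abs(ycbcr[..., 1:].astype(int) - restored[..., 1:].astype(int)).mean())
```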
In the following embodiments, as shown in fig. 1, an image processing apparatus includes a first end apparatus 100, a second end apparatus 200, and a communication channel for transmitting information between the first end apparatus and the second end apparatus. The first end device comprises an image source, a preprocessor, an encoder and a first communication module. The second terminal device comprises a second communication module, a decoder, a post-processor and a display device.
The image source is used to acquire, or itself stores, various types of original images, such as landscape images, character images, face images, animal images, text images, industrial images, and the like. The ways in which the image source acquires images may include capture with an imaging device, frame capture from video, image synthesis, and the like.
The preprocessor is used for receiving the original image and preprocessing the original image to provide preprocessed image data. Such as clipping, color format conversion, etc. Alternatively, the preprocessor may be omitted.
The encoder is used for receiving the preprocessed image data, encoding the image data according to a certain encoding format and providing the encoded image data.
The first communication module is used for receiving the encoded image data and transmitting the encoded image data to the second end device through a communication channel. The first communication module may receive the encoded image data provided by the encoder or may receive the encoded image data provided by a storage device storing the encoded image data.
The second communication module is used for receiving the coded image data and providing the coded image data to the decoder. The second communication module may be configured to receive the encoded image data transmitted by any other device when receiving the encoded image data.
The decoder is configured to receive the encoded image data, decode the encoded image data, and provide decoded image data.
The post-processor is used for receiving the decoded image data, carrying out post-processing on the decoded image data and providing post-processed image data. Post-processing includes color format conversion of the image, etc., to provide post-processed data to a display device for displaying the image.
The display device is used for receiving the decoded image data and displaying the decoded image.
In the following embodiment, as shown in fig. 2, an image processing method includes:
S10: acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded image through a communication channel;
S20: decoding the 4:2:0 chroma format image into a first image, inputting the first image into a target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image;
S30: the second image is sent to the display device.
Regarding S10: acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded image through a communication channel. In an embodiment, this step may be performed by the second end device, which obtains the 4:2:0 chroma format image by receiving the encoded 4:2:0 chroma format image transmitted through the communication channel. The encoded 4:2:0 chroma format image may be received from the first end device, or from a storage device that stores the encoded 4:2:0 chroma format image.
In particular, the 4:2:0 chroma format image may be obtained by receiving the encoded 4:2:0 chroma format image transmitted by the first end device. First, the encoder of the first end device receives the original 4:4:4 chroma format image, subsamples it into a 4:2:0 chroma format image, and transmits the image to the second end device through the communication channel so that the second end device can acquire the 4:2:0 chroma format image.
The 4:2:0 chroma format image may also be obtained by receiving, through the communication channel, the encoded 4:2:0 chroma format image transmitted by a storage device. The 4:2:0 chroma format image is obtained by encoding the original 4:4:4 chroma format image with an encoder. Specifically, the original 4:4:4 chroma format image is input to the encoder and subsampled into a 4:2:0 chroma format image, and is then transmitted to the second end device through the communication channel so that the second end device can obtain the 4:2:0 chroma format image. The image therefore occupies little bandwidth during transmission, which keeps the transmission speed of the image high.
A communication channel is a path for data transmission; in a computer network, channels are divided into physical channels and logical channels. A physical channel is the physical path over which data signals are transmitted, consisting of a transmission medium and related communication equipment; a logical channel is a logical path, built on top of physical channels, that the sending and receiving sides realize through intermediate nodes for transmitting data signals. Logical channels may be connection-oriented or connectionless. Physical channels can further be divided into wired channels and wireless channels according to the transmission medium, and into digital channels and analog channels according to the type of data transmitted.
Therefore, the images transmitted through the communication channel are all 4:2:0 chroma format images, which reduces the bandwidth requirement as far as possible. This ensures efficient and cost-effective image transmission and keeps the hardware requirements for transmission low. Bandwidth refers to the data transmission rate of a digital system, expressed in bits per second: the larger the bandwidth, the more digital information can flow per unit time, and vice versa.
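A back-of-envelope calculation illustrates the saving: at 8 bits per sample, 4:4:4 carries 24 bits per pixel while 4:2:0 carries 12 (four Y, one Cb and one Cr sample per four pixels), halving the raw bit rate. The figures below (uncompressed 1080p at 60 frames per second) are illustrative assumptions, not measurements from this disclosure.

```python
# Back-of-envelope illustration; 8-bit samples and 1080p60 are assumptions.
width, height, fps = 1920, 1080, 60

bpp_444 = 8 * 3          # every pixel carries full Y, Cb, Cr -> 24 bits/pixel
bpp_420 = 8 * 1.5        # 4 Y + 1 Cb + 1 Cr per 4 pixels    -> 12 bits/pixel

for name, bpp in [("4:4:4", bpp_444), ("4:2:0", bpp_420)]:
    gbps = width * height * fps * bpp / 1e9
    print(f"{name}: {bpp:.0f} bits/pixel, {gbps:.2f} Gbit/s raw")
# 4:4:4: 24 bits/pixel, 2.99 Gbit/s raw
# 4:2:0: 12 bits/pixel, 1.49 Gbit/s raw
```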
S20: decoding the 4:2:0 chroma format image into a first image, inputting the first image into a target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image. In one embodiment, the second end device may receive the 4:2:0 chroma format image transmitted over the communication channel through the second communication module. The decoder of the second end device decodes the received 4:2:0 chroma format image into a first image. The first image is then input into the target neural network model, and an optimized second image output by the target neural network model is obtained. The target neural network model is deployed in the second end device. The resolution of the second image is the same as the resolution of the first image, and the chroma of the second image is closer to the chroma of the original 4:4:4 chroma format image than the chroma of the first image. Here, the chroma of an image refers to its chroma values, which represent the color of each pixel; if the chroma value is 0, the picture is black and white.
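A minimal sketch of step S20 at the second end device is given below, assuming a PyTorch export of the target neural network model; the function and file names are hypothetical, and the deployment described later in this disclosure actually converts the model to an RKNN model for the target hardware.

```python
# Minimal sketch of step S20, assuming a PyTorch export of the target model;
# in the described deployment the model is converted to RKNN instead.
import numpy as np
import torch

def enhance_decoded_frame(first_image, model):
    """first_image: HxWx3 uint8 array produced by the decoder (the first image)."""
    x = torch.from_numpy(first_image).float().div(255.0)   # HWC uint8 -> [0,1]
    x = x.permute(2, 0, 1).unsqueeze(0)                     # -> 1x3xHxW
    with torch.no_grad():
        y = model(x)                                        # same resolution out
    y = y.clamp(0, 1).squeeze(0).permute(1, 2, 0)
    return (y * 255.0).round().byte().numpy()               # the second image

first_image = np.zeros((1080, 1920, 3), dtype=np.uint8)     # stands in for decoder output
model = torch.jit.load("target_model.pt").eval()            # hypothetical model file
second_image = enhance_decoded_frame(first_image, model)    # then sent to the display
```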
The method for obtaining the target neural network model includes the following steps: constructing training samples of images, wherein the training samples comprise an original 4:4:4 chroma format image set and a corresponding 4:2:0 chroma format image set, the resolution of each image in the original 4:4:4 chroma format image set is the same as the resolution of the corresponding image in the 4:2:0 chroma format image set, and each image in the 4:2:0 chroma format image set is obtained from the corresponding original 4:4:4 chroma format image by encoding, communication channel transmission and decoding through a codec; and inputting the training samples of the images into the deep learning network model for training to obtain the target neural network model.
The training samples consist of images of multiple types and styles. Optionally, the training samples are obtained through the image source. The image categories of the training samples include: landscape images, character images, face images, animal images, text images, and industrial images. Optionally, data augmentation is applied to the collected text images and industrial images to overcome the difficulty of collecting such images; the augmentation methods include image synthesis, stitching, color inversion, image resizing, cropping, and scaling.
The training samples comprise an original 4:4:4 chroma format image set and a corresponding 4:2:0 chroma format image set; the 4:2:0 chroma format image set is obtained by encoding, communication channel transmission and decoding each image of the original 4:4:4 chroma format image set through a codec. The original 4:4:4 chroma format image set consists of YCbCr 4:4:4 sampling format versions of the various images, and the corresponding 4:2:0 chroma format image set consists of YCbCr 4:2:0 sampling format versions of the same images after encoding, decoding and channel transmission. The YCbCr 4:2:0 sampling format images can be produced by down- and up-sampling with a blur kernel and bilinear interpolation, adding various kinds of noise, and the like. An original 4:4:4 chroma format image and its corresponding 4:2:0 chroma format image are obtained by sampling the same image in different ways, so the two have the same resolution and form a one-to-one pair. The original 4:4:4 chroma format images form the original 4:4:4 chroma format image set, and the 4:2:0 chroma format images form the 4:2:0 chroma format image set.
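As an illustration of how one paired training sample can be produced in this way, the sketch below applies a blur kernel, bilinear down- and up-sampling of the chroma planes, and additive noise to a 4:4:4 image; the kernel size and noise level are assumptions chosen for the example.

```python
# Sketch of building one (degraded, original) training pair as described above:
# blur kernel, bilinear down/up-sampling of the chroma planes, added noise.
# The 3x3 kernel and the noise sigma are illustrative assumptions.
import numpy as np
import cv2

def make_training_pair(ycbcr444: np.ndarray, noise_sigma: float = 2.0):
    h, w, _ = ycbcr444.shape
    degraded = ycbcr444.astype(np.float32).copy()
    for c in (1, 2):                                    # chroma planes only
        plane = cv2.GaussianBlur(degraded[..., c], (3, 3), 0)             # blur kernel
        plane = cv2.resize(plane, (w // 2, h // 2), interpolation=cv2.INTER_LINEAR)
        plane = cv2.resize(plane, (w, h), interpolation=cv2.INTER_LINEAR)
        degraded[..., c] = plane
    degraded += np.random.normal(0.0, noise_sigma, degraded.shape)        # added noise
    degraded = np.clip(degraded, 0, 255).astype(np.uint8)
    return degraded, ycbcr444            # (4:2:0-like input image, 4:4:4 label)
```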
After the training sample images are obtained, the paired 4:4:4 chroma format images and 4:2:0 chroma format images, i.e. the original 4:4:4 chroma format image set and the corresponding 4:2:0 chroma format image set, are input into the preprocessor. Preprocessing includes dividing each image of the different types into a number of 128 x 128 image blocks, applying operations such as horizontal flipping, mirroring and rotation to the image blocks, then converting all image block data into RGB format, normalizing it and expanding the data dimensions; the data preprocessing may also include vertical flipping and PCA whitening. The different types of image data are randomly divided into a training data set and a test data set in a ratio of 9:1 and saved in a fixed data storage format, such as h5; the specific storage format is not limited, and the data are stored in a storage device whose specific type is likewise not limited.
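A sketch of this preprocessing is given below, assuming the h5py library for the .h5 file; the helper names, the exact augmentation probabilities and the dataset layout are assumptions made for the example.

```python
# Sketch of the preprocessing described above; h5py, the helper names and the
# augmentation choices are assumptions, not a verbatim implementation.
import random
import numpy as np
import cv2
import h5py

def to_patches(pair, size=128):
    lr, hr = pair                                      # degraded / original image pair
    patches = []
    for y in range(0, lr.shape[0] - size + 1, size):
        for x in range(0, lr.shape[1] - size + 1, size):
            a, b = lr[y:y + size, x:x + size], hr[y:y + size, x:x + size]
            if random.random() < 0.5:                  # horizontal flip / mirror
                a, b = a[:, ::-1], b[:, ::-1]
            k = random.randint(0, 3)                   # rotation in 90-degree steps
            a, b = np.rot90(a, k), np.rot90(b, k)
            a = cv2.cvtColor(np.ascontiguousarray(a), cv2.COLOR_YCrCb2RGB)
            b = cv2.cvtColor(np.ascontiguousarray(b), cv2.COLOR_YCrCb2RGB)
            patches.append((a / 255.0, b / 255.0))     # convert to RGB, normalize
    return patches

def save_split(patches, path="dataset.h5", train_ratio=0.9):
    random.shuffle(patches)
    cut = int(len(patches) * train_ratio)              # 9:1 train/test split
    with h5py.File(path, "w") as f:
        for name, subset in [("train", patches[:cut]), ("test", patches[cut:])]:
            f.create_dataset(f"{name}/input", data=np.stack([p[0] for p in subset]))
            f.create_dataset(f"{name}/label", data=np.stack([p[1] for p in subset]))
```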
The training samples of the images comprise training data sets and test data sets, wherein the step of inputting the training samples of the images into the deep learning network model for training to obtain a target neural network model comprises the steps of inputting the training data sets into the deep learning network model for training to obtain a trained deep learning network model; inputting the test data set into the trained deep learning network model for verification, and obtaining a verified deep learning network model; and compressing the verified deep learning network model to obtain the target neural network model. The resolution of each image in the training dataset and the test dataset is the same.
The training image samples obtained in the above steps are input into the network model, and processed target image samples are obtained after operations such as channel separation, grouped convolution, 3x3 convolution, concatenation (concat), 1x1 convolution and dimensionality reduction. A loss value between the target image sample and the corresponding original YCbCr 4:4:4 format image sample is computed, and the parameters of the network model are updated according to the loss value, with the aim of reducing the loss value and improving the quality of the target image sample. The step of inputting the training data set into the deep learning network model for training and obtaining the trained deep learning network model includes: inputting the 4:2:0 chroma format images in the training data set into the deep learning network model for two-branch processing, wherein one branch performs feature extraction to obtain a feature map and the other branch performs a convolution operation to obtain a convolution result; concatenating the feature map and the convolution result and obtaining an output image through a 1x1 convolution operation; and calculating the difference between the output image and the paired original 4:4:4 chroma format image in the training data set, and updating the parameters of the deep learning network model according to the difference until the difference is smaller than a given threshold, thereby obtaining the trained deep learning network model.
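A minimal PyTorch sketch of such a two-branch network and training loop follows; the channel counts, the grouped-convolution layout, the L1 loss, the optimizer and the stopping threshold are illustrative assumptions rather than values taken from this disclosure.

```python
# Minimal PyTorch sketch of the two-branch network and training loop described
# above. Channel counts, the L1 loss, Adam, and the stopping threshold are all
# illustrative assumptions, not values taken from the patent.
import torch
import torch.nn as nn

class TwoBranchNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.feature_branch = nn.Sequential(            # feature-extraction branch
            nn.Conv2d(3, ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1, groups=4),   # grouped convolution
            nn.ReLU(inplace=True),
        )
        self.conv_branch = nn.Conv2d(3, ch, 3, padding=1)  # plain convolution branch
        self.fuse = nn.Conv2d(2 * ch, 3, 1)              # concat -> 1x1 convolution

    def forward(self, x):
        feat = self.feature_branch(x)
        conv = self.conv_branch(x)
        return self.fuse(torch.cat([feat, conv], dim=1))  # series (concat) operation

def train(model, loader, threshold=1e-3, max_epochs=100):
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.L1Loss()                                # the "difference value"
    for _ in range(max_epochs):
        for lr_img, hr_img in loader:                    # 4:2:0-derived / 4:4:4 pair
            loss = loss_fn(model(lr_img), hr_img)
            opt.zero_grad()
            loss.backward()
            opt.step()
        if loss.item() < threshold:                      # stop once below threshold
            break
    return model
```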
The step of inputting the test data set into the trained deep learning network model for verification and obtaining the verified deep learning network model includes: inputting the 4:2:0 chroma format images in the test data set into the trained deep learning network model and outputting reference images; and calculating an average index between the reference images and the paired original 4:4:4 chroma format images to verify the trained deep learning network model. The step of compressing the verified deep learning network model to obtain the target neural network model includes: compressing the convolution layers of the branches into a single 3x3 convolution operation through structural re-parameterization; loading the verified deep learning network model and quantizing it with quantization-aware training (QAT) to obtain a quantized, compressed deep learning network model; and converting the quantized and compressed deep learning network model into an RKNN model to obtain the target neural network model.
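The sketch below illustrates the structural re-parameterization idea: RepVGG-style folding of two parallel branches into a single equivalent 3x3 convolution. The particular branch layout, and the QAT and RKNN steps noted in the comments, are assumptions about the surrounding tooling rather than the exact procedure of this disclosure.

```python
# Sketch of RepVGG-style structural re-parameterization: fold a 3x3 branch and
# a parallel 1x1 branch into one equivalent 3x3 convolution. The branch layout,
# and the QAT / RKNN steps noted at the end, are assumptions about the tooling.
import torch
import torch.nn as nn
import torch.nn.functional as F

def merge_branches(conv3x3: nn.Conv2d, conv1x1: nn.Conv2d) -> nn.Conv2d:
    """Return a single 3x3 conv whose output equals conv3x3(x) + conv1x1(x)."""
    merged = nn.Conv2d(conv3x3.in_channels, conv3x3.out_channels, 3, padding=1)
    # Pad the 1x1 kernel to 3x3 so the two weight tensors can be summed.
    w1x1 = F.pad(conv1x1.weight, [1, 1, 1, 1])
    merged.weight.data = conv3x3.weight.data + w1x1
    merged.bias.data = conv3x3.bias.data + conv1x1.bias.data
    return merged

# Quick equivalence check on random input.
a, b = nn.Conv2d(8, 8, 3, padding=1), nn.Conv2d(8, 8, 1)
x = torch.randn(1, 8, 16, 16)
print(torch.allclose(a(x) + b(x), merge_branches(a, b)(x), atol=1e-5))

# After re-parameterization, the workflow described above applies
# quantization-aware training (e.g. via torch.ao.quantization) and then
# converts the compressed model to an RKNN model with the Rockchip toolkit.
```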
A second image is obtained through the target neural network model, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than the chroma of the first image. In practice, the chroma of the second image is almost identical to that of the original 4:4:4 chroma format image, whereas the chroma of the first image differs from it considerably. The chroma of the second image output at the decoding end is therefore good, so that the picture quality of the second image is substantially the same as that of the original image.
S30: the second image is sent to the display device.
The decoding end sends the second image to the display device, so that the display device can display the second image. At this time, the chromaticity of the second image is close to that of the original image, so that the look and feel of the image is improved.
According to the embodiment of the application, the target neural network model is deployed at the decoder end, the decoded image is subjected to image quality improvement operation, and the image is directly output to the display equipment for display by the display equipment. The display device may be a television screen or the like, without limitation. The display device is electrically connected with the second terminal device.
The embodiment of the application also provides an image processing device, which comprises: the acquisition module is used for acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded 4:2:0 chroma format image through a communication channel; the decoding module is used for decoding the 4:2:0 chroma format image into a first image, inputting the first image into the target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image; and the sending module is used for sending the second image to the display device.
An embodiment of the present application provides an image decoding apparatus, including a processing circuit configured to perform a method as described in any one of the above.
The embodiment of the application provides an image decoding device, which comprises a memory and a processor; a memory for storing a computer program; a processor for implementing any of the methods as described above when executing the computer program.
Embodiments of the present application provide a computer readable medium in which a computer program is stored, the computer program being loaded and executed by a processing module to implement the steps of the method described above. It will be appreciated by those skilled in the art that all or part of the steps in the embodiments may be implemented by a computer program instructing related hardware, and the program may be stored in a computer readable medium; the readable medium may include various media that can store program code, such as a flash disk, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disk.
It is within the knowledge and ability of those skilled in the art to combine the various embodiments or features mentioned herein with one another, where they do not conflict, to form additional alternative embodiments; such alternative embodiments, formed from a limited number of combinations of features and not listed one by one, still fall within the scope of the present disclosure, as would be understood or inferred by those skilled in the art from the drawings and the foregoing description.
In addition, the descriptions of the various embodiments each focus on different aspects; where an embodiment is not described in detail, reasonable inferences can be made from the prior art, from other related descriptions herein, or from the inventive concept.
It is emphasized that the above-described embodiments are typical and preferred embodiments of the disclosure, set forth only to explain the disclosed technology in detail for the reader's understanding, and are not intended to limit the scope of protection of the disclosure. Any modifications, equivalent substitutions or improvements made within the spirit and principles of the present disclosure shall be included within the scope of protection of the present disclosure.

Claims (9)

1. A method of image processing, comprising:
acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded image through a communication channel;
decoding the 4:2:0 chroma format image into a first image, inputting the first image into a target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image;
the acquisition method of the target neural network model comprises the following steps: constructing a training sample of an image, wherein the training sample of the image comprises an original 4:4:4 chroma format image set and a corresponding 4:2:0 chroma format image set, the resolution of each image in the original 4:4:4 chroma format image set is the same as the resolution of the corresponding image in the 4:2:0 chroma format image set, and each image in the original 4:4:4 chroma format image set is obtained by encoding, communication channel transmission and decoding through a coder-decoder; inputting a training sample of the image into a deep learning network model for training to obtain a target neural network model;
the second image is sent to the display device.
2. The method of claim 1, wherein the training samples of the image comprise a training data set and a test data set, and wherein inputting the training samples of the image into the deep learning network model for training to obtain the target neural network model comprises:
inputting the training data set into a deep learning network model for training to obtain a trained deep learning network model;
inputting the test data set into the trained deep learning network model for verification, and obtaining a verified deep learning network model;
and compressing the verified deep learning network model to obtain the target neural network model.
3. The method of claim 2, wherein inputting the training data set into the deep learning network model for training to obtain the trained deep learning network model comprises:
inputting the 4:2:0 chroma format image in the training data set into a deep learning network model, and performing two-branch processing, wherein one branch performs feature extraction to obtain a feature map, and the other branch performs convolution operation to obtain a convolution result;
performing series operation on the feature map and the convolution result, and obtaining an output image through 1x1 convolution operation;
and calculating the difference value of the original 4:4:4 chroma format image paired with the output image and the training data set, and updating the parameters of the deep learning network model according to the difference value until the difference value is smaller than a given threshold value, so as to obtain the trained deep learning network model.
4. The method of claim 2, wherein inputting the test data set into the trained deep learning network model for verification to obtain the verified deep learning network model comprises:
inputting the 4:2:0 chroma format image in the test data set into the trained deep learning network model, and outputting a reference image;
and calculating an average index of the reference image and the paired original 4:4:4 chroma format image, and verifying the trained deep learning network model.
5. The method of claim 2, wherein compressing the verified deep learning network model to obtain the target neural network model comprises:
compressing the convolution layer of the branches into a single 3x3 convolution operation through structural reparameterization;
loading the verified deep learning network model, and carrying out model quantization by utilizing QAT quantized perception training to obtain a quantized compressed deep learning network model;
and converting the quantized and compressed deep learning network model into an RKNN model to obtain the target neural network model.
6. The method of claim 1, wherein obtaining the original 4:4:4 chroma format image set comprises: image capture with an imaging device, frame capture from video, and image synthesis;
the original 4:4:4 chroma format image set includes: landscape images, character images, face images, animal images, text images and industrial images, wherein the text images and the industrial images are augmented by a data augmentation method, the data augmentation method including image synthesis, stitching, color inversion and image resizing.
7. An apparatus for image processing, comprising:
the acquisition module is used for acquiring a 4:2:0 chroma format image, wherein the 4:2:0 chroma format image is obtained by encoding an original 4:4:4 chroma format image through an encoder and transmitting the encoded 4:2:0 chroma format image through a communication channel;
the decoding module is used for decoding the 4:2:0 chroma format image into a first image, inputting the first image into the target neural network model, and obtaining a second image output by the target neural network model, wherein the resolution of the second image is the same as that of the first image, and the chroma of the second image is closer to that of the original 4:4:4 chroma format image than that of the first image;
the objective neural network model obtaining module comprises: the method comprises the steps that training samples of images are used for constructing the training samples of the images, the training samples of the images comprise an original 4:4:4 chroma format image set and a corresponding 4:2:0 chroma format image set, the resolution of each image in the original 4:4:4 chroma format image set is the same as the resolution of the corresponding image in the 4:2:0 chroma format image set, and each image in the original 4:4:4 chroma format image set is obtained through encoding, communication channel transmission and decoding by a coder-decoder; the training sample of the image is input into a deep learning network model for training to obtain a target neural network model;
and the sending module is used for sending the second image to the display device.
8. An image decoding apparatus characterized by: comprising processing circuitry for performing the method of any of claims 1 to 6.
9. An image decoding apparatus comprising a memory and a processor;
a memory for storing a computer program;
a processor for implementing the method according to any of claims 1 to 6 when executing a computer program.
CN202210782328.8A 2022-07-05 2022-07-05 Image processing method Active CN115150370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210782328.8A CN115150370B (en) 2022-07-05 2022-07-05 Image processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210782328.8A CN115150370B (en) 2022-07-05 2022-07-05 Image processing method

Publications (2)

Publication Number Publication Date
CN115150370A CN115150370A (en) 2022-10-04
CN115150370B (en) 2023-08-01

Family

ID=83410356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210782328.8A Active CN115150370B (en) 2022-07-05 2022-07-05 Image processing method

Country Status (1)

Country Link
CN (1) CN115150370B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523458B (en) * 2012-01-12 2014-06-04 山东大学 Encoding and decoding method for wireless transmission of high-definition image and video
CN109978764B (en) * 2019-03-11 2021-03-02 厦门美图之家科技有限公司 Image processing method and computing device
CN110719484B (en) * 2019-09-17 2020-08-04 广州魅视电子科技有限公司 Image processing method
CN111275128B (en) * 2020-02-13 2023-08-25 平安科技(深圳)有限公司 Image recognition model training method and system and image recognition method
CN113674144A (en) * 2020-05-14 2021-11-19 Tcl科技集团股份有限公司 Image processing method, terminal equipment and readable storage medium
CN111953977A (en) * 2020-07-09 2020-11-17 西安万像电子科技有限公司 Image transmission method, system and device

Also Published As

Publication number Publication date
CN115150370A (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN108696761B (en) Picture file processing method, device and system
CN109348226B (en) Picture file processing method and intelligent terminal
KR101366091B1 (en) Method and apparatus for encoding and decoding image
US9552652B2 (en) Image encoder and image decoder
WO2019210822A1 (en) Video encoding and decoding method, device, and system, and storage medium
CN107547907B (en) Method and device for coding and decoding
US8873625B2 (en) Enhanced compression in representing non-frame-edge blocks of image frames
US20180352207A1 (en) Methods and Systems for Transmitting Data in a Virtual Reality System
CN107534768B (en) Method and apparatus for compressing image based on photographing information
CN111429357B (en) Training data determining method, video processing method, device, equipment and medium
US20190045237A1 (en) Custom data indicating nominal range of samples of media content
CN107483942B (en) Decoding method of video data compressed code stream, encoding method and device of video data
TWI626841B (en) Adaptive processing of video streams with reduced color resolution
US20240105193A1 (en) Feature Data Encoding and Decoding Method and Apparatus
CN109151503B (en) Picture file processing method and equipment
JP2003188733A (en) Encoding method and arrangement
JP3462867B2 (en) Image compression method and apparatus, image compression program, and image processing apparatus
CN112261417B (en) Video pushing method and system, equipment and readable storage medium
CN115150370B (en) Image processing method
WO2021057676A1 (en) Video coding method and apparatus, video decoding method and apparatus, electronic device and readable storage medium
CN114079823A (en) Video rendering method, device, equipment and medium based on Flutter
CN112929703A (en) Method and device for processing code stream data
EP4300976A1 (en) Audio/video or image layered compression method and apparatus
US20230262210A1 (en) Visual lossless image/video fixed-rate compression
JPS5840989A (en) Coding processing method and transmission controlling method of picture information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant