CN114240749A - Image processing method, image processing device, computer equipment and storage medium - Google Patents

Publication number
CN114240749A
CN114240749A (application CN202111523337.7A)
Authority
CN
China
Prior art keywords
image
channel
channel map
processed
layer
Prior art date
Legal status (assumed; Google has not performed a legal analysis)
Pending
Application number
CN202111523337.7A
Other languages
Chinese (zh)
Inventor
彭云波
张泽昕
李承乾
林悦
Current Assignee (the listed assignee may be inaccurate; Google has not performed a legal analysis)
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date (assumed; Google has not performed a legal analysis)
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd filed Critical Netease Hangzhou Network Co Ltd
Priority to CN202111523337.7A
Publication of CN114240749A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Abstract

The embodiments of the present application disclose an image processing method, an image processing apparatus, a computer device and a storage medium. The method includes the following steps: acquiring an image to be processed with a first resolution, where the format of the image to be processed is a YUV channel format; acquiring a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the image to be processed; performing super-resolution reconstruction processing on the to-be-processed luminance channel map to generate a target luminance channel map; processing the to-be-processed chrominance channel map according to a specified amplification parameter to obtain a target chrominance channel map; and generating a target image based on the target luminance channel map and the target chrominance channel map. By constructing a simplified super-resolution reconstruction model in advance, the model can rapidly upscale an input low-resolution image into a high-resolution image, which increases the reconstruction speed of super-resolution reconstruction and improves image display quality.

Description

Image processing method, image processing device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, a computer device, and a storage medium.
Background
With the continuous development of computer communication technology, terminals such as smart phones, tablet computers and notebook computers have become widely popularized. These terminals are developing in diversified and personalized directions and have increasingly become indispensable in life and work, and people's requirements for the definition and realism of terminal display pictures are getting higher and higher. At present, super-resolution reconstruction is adopted to process images or videos during the transmission of pictures and video streams. However, in the prior art, the super-resolution reconstruction of each frame of an image or video is time-consuming, so the image rendering cost is high and the display effect is poor.
Disclosure of Invention
The embodiments of the present application provide an image processing method, an image processing apparatus, a computer device and a storage medium. A simplified super-resolution reconstruction model is constructed in advance; the model can rapidly upscale an input low-resolution image to obtain a high-resolution image, which increases the reconstruction speed of super-resolution reconstruction and improves image display quality.
An embodiment of the present application provides an image processing method, including:
acquiring an image to be processed with a first resolution, wherein the format of the image to be processed is a YUV channel format;
acquiring a to-be-processed brightness channel map and a to-be-processed chrominance channel map of the to-be-processed image, wherein the resolution of the to-be-processed brightness channel map is a first resolution, and the resolution of the to-be-processed chrominance channel map is a first resolution;
performing super-resolution reconstruction processing on the brightness channel map to be processed to generate a target brightness channel map, wherein the super-resolution reconstruction processing comprises performing channel conversion processing and/or image feature extraction processing on the brightness channel map to be processed;
processing the chrominance channel map to be processed according to the specified amplification parameters to obtain a target chrominance channel map, wherein the resolution of the target chrominance channel map is the target resolution;
and generating a target image based on the target brightness channel map and the target chroma channel map.
Correspondingly, an embodiment of the present application further provides an image processing apparatus, including:
a first acquiring unit, configured to acquire an image to be processed with a first resolution, where the format of the image to be processed is a YUV channel format;
a second obtaining unit, configured to obtain a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the to-be-processed image, where a resolution of the to-be-processed luminance channel map is a first resolution, and a resolution of the to-be-processed chrominance channel map is a first resolution;
the first generation unit is used for performing super-resolution reconstruction processing on the brightness channel map to be processed to generate a target brightness channel map, wherein the super-resolution reconstruction processing comprises channel conversion processing and/or image feature extraction processing on the brightness channel map to be processed;
the processing unit is used for processing the chrominance channel map to be processed according to the specified amplification parameter to obtain a target chrominance channel map, wherein the resolution of the target chrominance channel map is the target resolution;
a second generating unit configured to generate a target image based on the target luminance channel map and the target chrominance channel map.
In some embodiments, the apparatus further comprises a first processing subunit to:
inputting the luminance channel map to be processed into a normalization layer for normalization processing to obtain a first luminance channel map;
and performing super-resolution reconstruction processing on the first brightness channel map to generate a target brightness channel map.
In some embodiments, the apparatus further comprises a second processing subunit to:
inputting the first luminance channel map into a linear interpolation layer to obtain a second luminance channel map, and inputting the first luminance channel map into a first channel conversion convolutional layer for channel conversion processing to obtain a third luminance channel map, where a first channel value of the first luminance channel map is consistent with a second channel value of the second luminance channel map, a third channel value of the third luminance channel map is determined based on a first weight of the first channel conversion convolutional layer, and the third channel value is larger than the first channel value;
inputting the third luminance channel map into at least one feature extraction convolutional layer for image feature extraction processing to obtain a fourth luminance channel map, where a fourth channel value of the fourth luminance channel map is consistent with the third channel value;
inputting the fourth luminance channel map into a second channel conversion convolutional layer for channel conversion processing to obtain a fifth luminance channel map, where a fifth channel value of the fifth luminance channel map is smaller than the fourth channel value, and the fifth channel value is determined based on a second weight of the second channel conversion convolutional layer;
inputting the fifth luminance channel map into a pixel recombination layer to obtain a recombined luminance channel map;
and generating a target luminance channel map based on the second luminance channel map and the recombined luminance channel map.
In some embodiments, the apparatus further comprises:
the third processing subunit is used for inputting the third luminance channel map into at least one feature extraction convolution layer to carry out image feature extraction processing so as to obtain a luminance channel map to be superimposed;
a third generating unit, configured to generate a fourth luminance channel map based on the luminance channel map to be superimposed and the third luminance channel map.
In some embodiments, the apparatus further comprises a fourth processing subunit to:
inputting the first luminance channel map and a first specified parameter into a pixel splitting layer for processing to obtain a first luminance feature map set, where the first luminance feature map set includes a first specified number of luminance feature maps;
and inputting all the luminance feature maps in the first luminance feature map set into a first channel conversion convolution layer for channel conversion processing to generate a first to-be-processed luminance feature map set, wherein the first to-be-processed luminance feature map set comprises a plurality of third luminance channel maps.
In some embodiments, the apparatus further comprises:
a fourth generating unit, configured to sequentially input the third luminance channel map into the first feature extraction convolutional layer, the second feature extraction convolutional layer, and the third feature extraction convolutional layer, and generate a second to-be-processed luminance feature map set, where the second to-be-processed luminance feature map set includes multiple fourth luminance channel maps.
In some embodiments, the apparatus further comprises:
a fifth processing subunit, configured to input the first to-be-processed luminance feature map set into a second channel conversion convolutional layer for channel conversion processing, so as to obtain a third to-be-processed luminance feature map set, where the third to-be-processed luminance feature map set includes a plurality of sixth luminance channel maps, and a sixth channel numerical value of the sixth luminance channel map is the same as the third channel numerical value;
the input unit is used for inputting the third to-be-processed brightness feature map set into a pixel recombination layer to obtain a recombined brightness channel map;
a fifth generating unit, configured to generate a target luminance channel map based on the second luminance channel map and the recombined luminance channel map.
In some embodiments, the apparatus further comprises:
the training unit is used for training a first preset network structure by adopting a sample set to obtain the trained first preset network structure and a first target parameter, wherein the first preset network structure consists of a first preset convolutional layer and a second preset convolutional layer;
a sixth generating unit, configured to generate a first channel conversion convolutional layer based on the first preset network structure and the first target parameter.
In some embodiments, the apparatus further comprises a sixth processing subunit for:
performing fusion processing on the second luminance channel map and the recombined luminance channel map to obtain a target combined luminance channel map;
and performing inverse normalization processing on the target combined luminance channel map to generate a target luminance channel map.
In some embodiments, the apparatus further comprises:
a sending unit, configured to send an image acquisition request to a server;
the receiving unit is used for receiving an image to be decompressed returned by the server, wherein the image to be decompressed is obtained by compressing a specified image by the server, the specified image is an image rendered by the server according to the image acquisition request, and the resolution of the specified image is a first resolution;
and the seventh processing subunit is used for decompressing the image to be decompressed to obtain the image to be processed.
In some embodiments, the apparatus further comprises:
a second acquiring unit, configured to acquire a picture display instruction and instruct a game engine to render the image to be processed according to the picture display instruction, where the resolution of the image to be processed is the first resolution;
a third acquiring unit, configured to acquire, by a simulator, the image to be processed rendered by the game engine from the game engine when the simulator detects that the picture display instruction is triggered in the game engine.
In some embodiments, the apparatus further comprises:
the judging unit is used for judging whether the format of the image to be processed is a YUV channel format or not;
if so, acquiring a to-be-processed brightness channel map and a to-be-processed chrominance channel map of the to-be-processed image;
if not, converting the current format of the image to be processed into a YUV channel format.
In some embodiments, the apparatus further comprises:
the eighth processing subunit is configured to perform linear interpolation processing on the to-be-processed chrominance channel map according to the specified amplification parameter to obtain a target chrominance channel map;
a seventh generating unit, configured to generate a target image based on the target luminance channel map and the target chrominance channel map.
In some embodiments, the apparatus further comprises:
the ninth processing subunit is configured to perform image superposition processing on the target luminance channel map and the target chrominance channel map to obtain an image to be converted;
and the conversion unit is used for converting the format of the image to be converted into an RGB channel format to obtain a target image.
Accordingly, embodiments of the present application further provide a computer device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of any of the image processing methods described above.
Furthermore, an embodiment of the present application further provides a storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the image processing methods described above.
The embodiment of the application provides an image processing method, an image processing device, computer equipment and a storage medium, wherein a simplified super-resolution reconstruction model is constructed in advance, the super-resolution reconstruction model can rapidly amplify an input low-resolution image to obtain a high-resolution image, the reconstruction speed of super-resolution reconstruction can be increased, and the image display quality is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required for the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an image processing method provided in an embodiment of the present application;
fig. 2 is a schematic structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 3 is a scene schematic diagram of an image processing method provided in an embodiment of the present application;
fig. 4 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 5 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 6 is a schematic view of another scene of an image processing method provided in an embodiment of the present application;
fig. 7 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 8 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 9 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 10 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 11 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 12 is another structural diagram of a super-resolution reconstruction network structure provided by an embodiment of the present application;
fig. 13 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 14 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiments of the present application provide an image processing method, an image processing apparatus, a storage medium and a computer device. Specifically, the image processing method of the embodiments of the present application may be executed by a computer device, where the computer device may be a terminal or a server. The terminal may be a smart phone, a tablet computer, a notebook computer, a touch screen, a Personal Computer (PC), a Personal Digital Assistant (PDA), or another terminal device. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
For example, the computer device may be a terminal, and the terminal may acquire an image to be processed with a first resolution, where a format of the image to be processed is a YUV channel format; acquiring a to-be-processed brightness channel map and a to-be-processed chrominance channel map of the to-be-processed image, wherein the resolution of the to-be-processed brightness channel map is a first resolution, and the resolution of the to-be-processed chrominance channel map is a first resolution; performing super-resolution reconstruction processing on the brightness channel map to be processed to generate a target brightness channel map, wherein the super-resolution reconstruction processing comprises performing channel conversion processing and/or image feature extraction processing on the brightness channel map to be processed; processing the chrominance channel map to be processed according to the specified amplification parameters to obtain a target chrominance channel map, wherein the resolution of the target chrominance channel map is the target resolution; and generating a target image based on the target brightness channel map and the target chroma channel map.
The following are detailed below. It should be noted that the following description of the embodiments is not intended to limit the preferred order of the embodiments.
An image processing method provided in an embodiment of the present application may be executed by a processor of a terminal, as shown in fig. 1, a specific flow of the image processing method mainly includes steps 101 to 105, which are described in detail as follows:
101, acquiring an image to be processed with a first resolution, wherein the format of the image to be processed is a YUV channel format.
In an embodiment, before the step of "acquiring an image to be processed at a first resolution", the method may comprise:
sending an image acquisition request to a server;
receiving an image to be decompressed returned by the server, wherein the image to be decompressed is obtained by compressing a specified image by the server, the specified image is an image rendered by the server according to the image acquisition request, and the resolution of the specified image is a first resolution;
and decompressing the image to be decompressed to obtain the image to be processed.
Specifically, the method can be applied to cloud game scenarios. First, the image of the game picture to be displayed is rendered at low resolution on the server side; the rendered low-resolution image is then compressed and transmitted to the terminal device over the network. After receiving the compressed low-resolution image, the terminal device decompresses it in the GPU to obtain a decompressed low-resolution image, performs fast super-resolution reconstruction on the decompressed image (for example, in the GPU or the CPU), and finally displays the image of the game screen on the terminal device.
Cloud gaming, which may also be called gaming on demand, is an online gaming technology based on cloud computing. Cloud gaming technology enables thin clients with relatively limited graphics processing and data computing capabilities to run high-quality games. In a cloud gaming scenario, the game software does not run on the player's game terminal but on a cloud server; the player's terminal does not need strong graphics and data processing capabilities, only basic streaming media playback capability and the ability to acquire the player's input instructions and send them to the cloud server. The cloud gaming server captures the audio and video stream of the game software through an audio/video capturer, encodes it through an audio/video encoder, and sends the encoded stream to the game client through an internet communication protocol such as the Real Time Streaming Protocol (RTSP). After receiving the encoded audio and video stream, the game client decodes it through an audio/video decoder and plays the picture and sound of the game through an audio/video player, so that the user can see the game picture and hear the game sound on the terminal. The game client also monitors the user's input instructions (including input from devices such as a mouse, a keyboard and a touch screen), encodes the events according to the instructions input by the user, and sends the encoded input events to the cloud gaming server through a custom communication protocol. The cloud gaming server decodes the encoded input events after receiving them and then reproduces the user's input.
In another embodiment, before the step of "acquiring an image to be processed at a first resolution", the method may comprise:
acquiring a picture display instruction, and indicating a to-be-processed image rendered by a game engine according to the picture display instruction, wherein the resolution of the to-be-processed image is a first resolution;
when detecting that the simulator detects a trigger picture display instruction in the game engine, the simulator acquires an image to be processed rendered by the game engine from the game engine.
Specifically, the method can be applied to a mobile phone simulator. First, a game engine running on the terminal renders the image of the game picture at a low resolution based on a target rendering size; the mobile phone simulator then obtains the low-resolution picture, performs fast super-resolution reconstruction on it using the GPU or CPU of the terminal, and finally displays the image of the game picture on the terminal device.
A simulator is a software program that, running on one computer platform, can simulate a specific hardware platform and the programs on it. The simulator has a corresponding simulated system; for example, an android simulator corresponds to the android system, so the simulator can run on a computer and simulate an android mobile phone system, in which android applications can be installed, used and uninstalled, allowing users to experience android games and applications on a computer. The device running the simulator may be a computer, a mobile phone, a tablet computer, a game device, and the like.
In order to enable super-resolution reconstruction of a luminance channel map of an image to be processed, after the step "acquiring an image to be processed at a first resolution", the method may comprise:
judging whether the format of the image to be processed is a YUV channel format or not;
if so, acquiring a to-be-processed brightness channel map and a to-be-processed chrominance channel map of the to-be-processed image;
if not, converting the current format of the image to be processed into a YUV channel format.
In order to perform super-resolution reconstruction on the luminance channel map of the image to be processed, if the format of the received image to be processed is RGB format, the format of the image to be processed needs to be converted into YUV channel format.
Here, R is the R (Red) value of each pixel in the image to be processed, G is the G (Green) value, and B is the B (Blue) value. Y denotes the luminance signal, and U and V denote the two color difference signals B-Y (i.e., U) and R-Y (i.e., V), respectively.
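For illustration, the following Python sketch performs such an RGB-to-YUV conversion; the analog BT.601-style coefficients are an assumption, as the patent does not specify which YUV variant the terminal uses.
```python
import numpy as np

def rgb_to_yuv(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image (float values in [0, 255]) to YUV.

    The coefficients below are the classic analog-YUV (BT.601-style)
    definitions and are an assumption; the patent does not name them.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance signal Y
    u = 0.492 * (b - y)                     # color difference B - Y
    v = 0.877 * (r - y)                     # color difference R - Y
    return np.stack([y, u, v], axis=-1)
```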
102, obtaining a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the to-be-processed image, wherein the resolution of the to-be-processed luminance channel map is a first resolution, and the resolution of the to-be-processed chrominance channel map is a first resolution.
103, performing super-resolution reconstruction processing on the luminance channel map to be processed to generate a target luminance channel map, where the super-resolution reconstruction processing includes performing channel conversion processing and/or image feature extraction processing on the luminance channel map to be processed.
To facilitate the training of the neural network, before the step of performing super-resolution reconstruction processing on the luminance channel map to be processed, the method may include:
inputting the luminance channel map to be processed into a normalization layer for normalization processing to obtain a first luminance channel map;
and performing super-resolution reconstruction processing on the first brightness channel map to generate a target brightness channel map.
In an embodiment, the super-resolution reconstruction processing on the first luminance channel map to generate the target luminance channel map may include:
inputting the first luminance channel map into a linear interpolation layer to obtain a second luminance channel map, inputting the first luminance channel map into a first channel conversion convolution layer to perform channel conversion processing to obtain a third luminance channel map, wherein a first channel numerical value of the first luminance channel map is consistent with a second channel numerical value of the second luminance channel map, a third channel numerical value of the third luminance channel map is determined based on a first weight of the first channel conversion convolution layer, and the third channel numerical value is larger than the first channel numerical value;
inputting the third luminance channel map into at least one feature extraction convolution layer for image feature extraction processing to obtain a fourth luminance channel map, wherein a fourth channel numerical value of the fourth luminance channel map is consistent with the third channel numerical value;
inputting the fourth luminance channel map into a second channel conversion convolutional layer for channel conversion processing to obtain a fifth luminance channel map, wherein a fifth channel value of the fifth luminance channel map is smaller than the fourth channel value, and the fifth channel value is determined based on a second weight of the second channel conversion convolutional layer;
inputting the fifth luminance channel map into a pixel recombination layer to obtain a recombined luminance channel map;
and generating a target luminance channel map based on the second luminance channel map and the recombined luminance channel map.
In an embodiment, the step of inputting the third luminance channel map into at least one feature extraction convolution layer for image feature extraction processing to obtain a fourth luminance channel map may include:
inputting the third brightness channel diagram into at least one characteristic extraction convolution layer for image characteristic extraction processing to obtain a brightness channel diagram to be superposed;
and generating a fourth brightness channel map based on the brightness channel map to be superposed and the third brightness channel map.
In order to achieve faster calculation of a picture to be processed in a super-resolution reconstruction model, after the step of inputting the first luminance channel map into a linear interpolation layer and before inputting the first luminance channel map into a first channel conversion convolution layer for channel conversion processing, the method may include:
inputting the first luminance channel map and a first specified parameter into a pixel splitting layer for processing to obtain a first luminance feature map set, where the first luminance feature map set includes a first specified number of luminance feature maps;
and inputting all the luminance feature maps in the first luminance feature map set into a first channel conversion convolution layer for channel conversion processing to generate a first to-be-processed luminance feature map set, wherein the first to-be-processed luminance feature map set comprises a plurality of third luminance channel maps.
In one embodiment, the feature extraction convolutional layers include a first feature extraction convolutional layer, a second feature extraction convolutional layer and a third feature extraction convolutional layer, where the weights of the three layers have the same dimensions. After the step of inputting all the luminance feature maps in the first luminance feature map set into the first channel conversion convolutional layer for channel conversion processing, the method may include:
and sequentially inputting the third luminance channel map into the first feature extraction convolution layer, the second feature extraction convolution layer and the third feature extraction convolution layer to generate a second to-be-processed luminance feature map set, wherein the second to-be-processed luminance feature map set comprises a plurality of fourth luminance channel maps.
In another embodiment, after the step of "inputting all the luminance feature maps in the first luminance feature map set into the first channel conversion convolutional layer for channel conversion processing to generate the first to-be-processed luminance feature map set", the method may include:
inputting the first to-be-processed brightness feature map set into a second channel conversion convolution layer for channel conversion processing to obtain a third to-be-processed brightness feature map set, wherein the third to-be-processed brightness feature map set comprises a plurality of sixth brightness channel maps, and the sixth channel numerical value of the sixth brightness channel map is the same as the third channel numerical value;
inputting the third to-be-processed brightness feature map set into a pixel recombination layer to obtain a recombined brightness channel map;
and generating a target brightness channel map based on the second brightness channel map and the recombined brightness channel map.
Optionally, a first preset network structure may be trained by using a sample set to obtain a trained first preset network structure and a first target parameter, where the first preset network structure is composed of a first preset convolutional layer and a second preset convolutional layer. Then, a first channel conversion convolutional layer is generated based on the first preset network structure and the first target parameter.
Specifically, the step of "generating the target luminance channel map based on the second luminance channel map and the recombined luminance channel map" may include:
performing fusion processing on the second luminance channel map and the recombined luminance channel map to obtain a target combined luminance channel map;
and performing inverse normalization processing on the target combined luminance channel map to generate the target luminance channel map.
In the embodiment of the application, image characteristic information is mainly processed, and the super-resolution reconstruction network model provided by the embodiment of the application can be a convolutional neural network model.
In machine learning, a Convolutional Neural Network (CNN) is a feed-forward Neural Network whose artificial neurons can respond to a part of the surrounding cells within the coverage range, and performs well for large-scale image processing. It includes: a Convolutional layer (Convolutional layer) and a pooling layer (pooling layer).
In general, the basic structure of a CNN includes two kinds of layers. One is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, and the local features are extracted; once a local feature is extracted, its positional relation to other features is also determined. The other is the feature mapping layer: each computation layer of the network is composed of multiple feature maps, each feature map is a plane, and all neurons on the plane share equal weights. The feature mapping structure uses a sigmoid function (a threshold function that maps variables to between 0 and 1), whose influence function kernel is small, as the activation function of the convolutional network, giving the feature mapping shift invariance. In addition, since the neurons on one mapping plane share weights, the number of free parameters of the network is reduced. Each convolutional layer in the convolutional neural network is followed by a computation layer for local averaging and secondary extraction, which reduces the feature resolution.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a super-resolution reconstruction network structure according to an embodiment of the present application. In this embodiment, the input size of each layer in the convolutional neural network is N × C × H × W, where N is the number of pictures input at a time (the batch size), C is the number of channels (for example, C is 3 for an RGB three-channel picture and 1 for a single-channel luminance channel picture), and H and W represent the height and the width, respectively. For example, when a 540p image is input into the super-resolution reconstruction network structure, the size of the input is 1 × 1 × 960 × 540. The super-resolution reconstruction network structure provided by the embodiment of the application specifically includes the following layers:
(1) normalization layer (Div layer): the role of the normalization layer is mainly normalization. Specifically, since the value range of the pixel point in the Y channel is [0,255], in order to simplify the processing step of the luminance channel image in the convolutional neural network and increase the processing speed, the original value range [0,255] may be divided by 255 to normalize the image data of the luminance channel image to the range between [0,1] so as to facilitate the training of the neural network.
(2) Pixel splitting layer (Unshuffle layer): the pixel splitting layer converts adjacent pixel points of the luminance channel image into different channels. For example, when the image reduction ratio (downscale) is 4, the pixel points in each 4 × 4 region of the original luminance channel image are distributed to 16 channels, so a luminance channel image with a resolution of 1 × 1 × 960 × 540 becomes a luminance channel image with a resolution of 1 × 16 × 240 × 135.
For example, as shown in fig. 3, the embodiment of the present application splits a relatively high-resolution (HR) image into r² channels, where r is the downscale factor. This step ensures that every pixel of the input luminance channel image is utilized, and since it greatly reduces the resolution of the original input, the amount of computation in the subsequent Conv convolution processing is also greatly reduced.
Since the originally input luminance channel image is processed by the Unshuffle layer, each point (x, y) in the resulting feature map corresponds to the 16 pixels in a 4 × 4 region of the original image. When a 3 × 3 convolution is then applied to this feature map, a 12 × 12 receptive field (measured in original-image pixels) is obtained. Therefore, even though the super-resolution reconstruction network model provided by the present application is small, it maintains a balance between the speed and the quality of image processing.
In addition, because the embodiment of the present application uses the Unshuffle layer to process the luminance channel image, besides 2x upsampling, equal-proportion upsampling by 3/2 and 4/3 can also be supported. Specifically, this is achieved by setting the downscale coefficient of the Unshuffle layer and the image magnification (upscale) coefficient of the pixel recombination layer (Shuffle layer) according to the actual situation. For example, if the low-resolution (LR) image is upsampled 2 times to the high-resolution (HR) image, a downscale coefficient of 4 and an upscale coefficient of 8 realize 2x super-resolution reconstruction with a good balance of speed and effect. If the upsampling from LR to HR is 4/3 times, a downscale coefficient of 3 and an upscale coefficient of 4 realize 4/3x super-resolution. If the upsampling from LR to HR is 3/2 times, a downscale coefficient of 4 and an upscale coefficient of 6 realize 3/2x super-resolution.
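As a concrete illustration of the Unshuffle step and of the downscale/upscale coefficients, the following PyTorch sketch (an assumed implementation, not the patent's code) reproduces the 1 × 1 × 960 × 540 to 1 × 16 × 240 × 135 shape change described above.
```python
import torch
import torch.nn as nn

y = torch.rand(1, 1, 960, 540)            # normalized luminance channel map

# downscale = 4: each 4 x 4 pixel block becomes 16 channels, so a 3 x 3
# convolution on the split map sees a 12 x 12 receptive field in the
# original image, as noted above.
unshuffle = nn.PixelUnshuffle(downscale_factor=4)
print(unshuffle(y).shape)                 # torch.Size([1, 16, 240, 135])

# Ratio = upscale / downscale: 8/4 gives 2x super-resolution,
# 4/3 gives 4/3x, and 6/4 gives 3/2x, matching the coefficients above.
```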
(3) Convolutional layers: the convolutional layer can be used for channel transformation and can also be used for extracting the features of the feature map. The convolutional layers in the embodiment of the present application include a conv1 (convolution 1) layer, a conv2 (convolution 2) layer, a conv3 (convolution 3) layer, and a conv4 (convolution 4) layer.
Specifically, the Conv1 layer is a 3 × 3 Conv layer without an Add (residual) connection; its weight is 3 × 3 × 16 × 20, and the input image can be channel-transformed through this layer. It should be noted that the number of channels can be modified according to requirements: the 20 in the weight can be changed to 24, 40, 60 or 80. For example, an image with a size of 1 × 16 × 240 × 135 is input into the conv1 layer, and after processing, an image with a size of 1 × 20 × 240 × 135 is output, i.e., the image is converted from 16 channels to 20 channels.
The Conv2 and Conv3 layers are both 3 × 3 Conv layers with an Add (residual) connection; the weight of each is 3 × 3 × 20 × 20, and the features of the feature map are extracted through these layers. For example, an image with a size of 1 × 20 × 240 × 135 is input into the conv2 layer, and after processing, an image with a size of 1 × 20 × 240 × 135 is output.
The Conv4 layer is a 1 × 1 Conv layer without an Add (residual) connection; its weight is 1 × 1 × 20 × 64, and the input image can be channel-transformed through this layer. An image with a size of 1 × 20 × 240 × 135 is input into the conv4 layer, and after processing, an image with a size of 1 × 64 × 240 × 135 is output, i.e., the image is converted from 20 channels to 64 channels.
It should be noted that an activation function (Relu) is provided between adjacent convolutional layers, and by adding the activation function (Relu) between adjacent convolutional layers, the nonlinear relationship between the layers of the neural network can be increased.
In addition, the Conv2 layer and/or the Conv3 layer may be deleted according to the actual application. For example, both the Conv2 and Conv3 layers may be deleted, leaving only the Conv1 and Conv4 layers; or only the Conv2 layer may be deleted, leaving the Conv1, Conv3 and Conv4 layers; or only the Conv3 layer may be deleted, leaving the Conv1, Conv2 and Conv4 layers. Models of corresponding sizes can be configured according to the computing power of specific devices and distributed to the corresponding terminals; for a terminal with low computing power, a convolutional stack containing only the Conv1 and Conv4 layers may be used.
Optionally, in the process of training the super-resolution reconstruction network model, a re-parameterization technique may be adopted to improve the processing efficiency of the model. For example, referring to fig. 4, the leftmost structure in fig. 4 is the original 3 × 3 conv structure; in the middle structure, the original 3 × 3 conv is trained as a re-parameterized branch consisting of one 3 × 3 conv followed by one 1 × 1 conv, where the 3 × 3 conv outputs 256 channels during training to obtain the parameters. When the super-resolution reconstruction network model is constructed for deployment, the 3 × 3 conv and the 1 × 1 conv are merged into a single 3 × 3 conv, which convolves the input image using the parameters obtained during training. For another example, referring to fig. 5, the leftmost structure in fig. 5 is the original 3 × 3 conv structure; in the middle structure, the original 3 × 3 conv is trained as a re-parameterized branch consisting of one 3 × 3 conv, one 1 × 1 conv and a shortcut connection, with the 3 × 3 conv again outputting 256 channels during training. When the super-resolution reconstruction network model is constructed, the 3 × 3 conv, the 1 × 1 conv and the shortcut connection are merged into a single 3 × 3 conv, which convolves the input image using the parameters obtained during training.
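As an illustration of this re-parameterization, the following Python sketch merges a trained 3 × 3 conv followed by a 1 × 1 conv into a single, mathematically equivalent 3 × 3 conv. The function name is hypothetical and the 256-channel training width is taken from the example above; the merging rule is the standard linear-fusion identity, shown here as a sketch rather than the patent's exact procedure.
```python
import torch
import torch.nn as nn

def merge_3x3_then_1x1(conv3: nn.Conv2d, conv1: nn.Conv2d) -> nn.Conv2d:
    """Fuse conv3 (3x3, Cin -> Cmid) followed by conv1 (1x1, Cmid -> Cout)
    into one 3x3 convolution with identical output (no nonlinearity
    between the two layers is assumed)."""
    w3, b3 = conv3.weight, conv3.bias        # [Cmid, Cin, 3, 3], [Cmid]
    w1, b1 = conv1.weight, conv1.bias        # [Cout, Cmid, 1, 1], [Cout]
    fused = nn.Conv2d(conv3.in_channels, conv1.out_channels, 3, padding=1)
    with torch.no_grad():
        # New kernel: linearly recombine the 3x3 filters via the 1x1 weights.
        fused.weight.copy_(torch.einsum('om,mikl->oikl', w1[:, :, 0, 0], w3))
        fused.bias.copy_(w1[:, :, 0, 0] @ b3 + b1)
    return fused

# Sanity check: the merged layer matches the two-layer training branch.
c3, c1 = nn.Conv2d(16, 256, 3, padding=1), nn.Conv2d(256, 20, 1)
x = torch.rand(1, 16, 240, 135)
merged = merge_3x3_then_1x1(c3, c1)
assert torch.allclose(merged(x), c1(c3(x)), atol=1e-4)
```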
(4) Pixel recombination layer (Shuffle layer): as the inverse process of the Unshuffle layer, the pixel recombination layer restores the result at the enlarged resolution. Specifically, the pixel recombination layer converts feature maps of r² channels into a new up-sampled result of N × C × rH × rW: following a fixed rule, the r² channel values at each position are rearranged into a corresponding r × r image block. For example, as shown in fig. 6, after the r² channels pass through the pixel recombination layer, the pixels at the same position of each channel are put together to form an r × r image block.
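Continuing the sketch above, the inverse rearrangement can be reproduced with PyTorch's PixelShuffle (again an illustrative implementation choice).
```python
import torch
import torch.nn as nn

# upscale = 8: 64 channels are rearranged into an 8x-enlarged single
# channel, matching the 1 x 64 x 240 x 135 feature map in the example.
features = torch.rand(1, 64, 240, 135)
shuffle = nn.PixelShuffle(upscale_factor=8)
print(shuffle(features).shape)            # torch.Size([1, 1, 1920, 1080])
```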
(5) Linear interpolation layer (Bilinear layer): the linear interpolation layer directly performs linear interpolation on the input luminance channel image to enlarge it, so that the main branch of the super-resolution reconstruction network model only needs to learn detail content richer than what linear interpolation provides. This layer can also be replaced with a nearest-neighbor (Nearest) interpolation layer for faster data processing.
(6) Addition layer (Add layer): the addition layer enables the main branch of the super-resolution reconstruction network model to learn the residual between the input image and the output image, and the residual content is exactly the more critical detail content. Specifically, the image obtained at the pixel recombination layer and the image obtained at the linear interpolation layer are superimposed to obtain a combined image with more details.
(7) Inverse normalization layer (Mul layer): the image output from the Add layer is subjected to inverse normalization. Specifically, the value range of the pixel points in the image output from the Add layer is restored from [0,1] to [0,255].
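Assembling the layers above, the following PyTorch sketch reproduces the fig. 2 pipeline end to end. The class name is hypothetical, and the exact placement of the ReLU activations and of the Add (residual) connections inside conv2/conv3 is an assumption where the text is not explicit.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FastSRNet(nn.Module):
    """Sketch of the fig. 2 structure: Div -> Unshuffle -> conv1..conv4
    -> Shuffle, with a Bilinear skip branch merged by Add, then Mul."""

    def __init__(self, downscale=4, upscale=8, mid_channels=20):
        super().__init__()
        self.scale = upscale / downscale             # overall magnification
        self.unshuffle = nn.PixelUnshuffle(downscale)
        self.conv1 = nn.Conv2d(downscale ** 2, mid_channels, 3, padding=1)  # channel transform
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, 3, padding=1)    # feature extraction
        self.conv3 = nn.Conv2d(mid_channels, mid_channels, 3, padding=1)    # feature extraction
        self.conv4 = nn.Conv2d(mid_channels, upscale ** 2, 1)               # channel transform
        self.shuffle = nn.PixelShuffle(upscale)

    def forward(self, y):                            # y: N x 1 x H x W in [0, 255]
        y = y / 255.0                                # Div layer
        skip = F.interpolate(y, scale_factor=self.scale,
                             mode='bilinear', align_corners=False)  # Bilinear layer
        x = F.relu(self.conv1(self.unshuffle(y)))
        x = x + F.relu(self.conv2(x))                # "Conv layer with Add" (assumed placement)
        x = x + F.relu(self.conv3(x))
        x = self.shuffle(self.conv4(x))              # Shuffle layer
        return (x + skip) * 255.0                    # Add layer, then Mul layer

net = FastSRNet()
print(net(torch.rand(1, 1, 960, 540) * 255).shape)  # torch.Size([1, 1, 1920, 1080])
```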
It should be noted that, in the embodiment of the present application, before using the super-resolution reconstruction network structure, the constructed super-resolution reconstruction network structure is trained, and the specific training process is as follows:
(1) Pre-construct a picture training set of high-resolution pictures, perform downsampling on each high-resolution picture using bicubic interpolation to obtain a plurality of downsampled pictures, and use the downsampled pictures as low-resolution pictures;
(2) Randomly take a first high-resolution picture from the plurality of high-resolution pictures, and take a fixed-size region (such as N × N) of the picture as a first high-resolution sample picture; meanwhile, on the low-resolution picture obtained by downsampling the first high-resolution picture, intercept the co-located region of corresponding size as a first low-resolution sample picture, and use the first high-resolution sample picture and the first low-resolution sample picture as a picture sample pair;
(3) Use the picture sample pairs as the input and the expected output of the constructed super-resolution reconstruction network structure to train its parameter configuration.
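Under the same assumptions, one training step of the procedure above might look as follows; the L1 loss, the Adam optimizer and the patch size are assumptions (the patent specifies only bicubic downsampling and co-located HR/LR sample regions, which this sketch simplifies by cropping first and downsampling the crop).
```python
import torch
import torch.nn.functional as F

def make_sample_pair(hr, patch=64, scale=2):
    """Random HR patch plus the co-located bicubic-downsampled LR patch."""
    _, _, h, w = hr.shape
    top = torch.randint(0, h - patch * scale + 1, (1,)).item()
    left = torch.randint(0, w - patch * scale + 1, (1,)).item()
    hr_patch = hr[:, :, top:top + patch * scale, left:left + patch * scale]
    lr_patch = F.interpolate(hr_patch, scale_factor=1 / scale,
                             mode='bicubic', align_corners=False)
    return lr_patch, hr_patch

model = FastSRNet()                        # network sketch from above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lr_patch, hr_patch = make_sample_pair(torch.rand(1, 1, 960, 540) * 255)
loss = F.l1_loss(model(lr_patch), hr_patch)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```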
104, processing the chrominance channel map to be processed according to the specified amplification parameter to obtain a target chrominance channel map, where the resolution of the target chrominance channel map is the target resolution.
Specifically, the step of processing the chrominance channel map to be processed according to the specified amplification parameter to obtain the target chrominance channel map may include:
performing linear interpolation processing on the chrominance channel map to be processed according to the specified amplification parameter to obtain a target chrominance channel map;
and generating a target image based on the target brightness channel map and the target chroma channel map.
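A minimal sketch of this chrominance branch, assuming an amplification parameter of 2 and bilinear interpolation (tensor layout as elsewhere in this document):
```python
import torch
import torch.nn.functional as F

uv = torch.rand(1, 2, 960, 540)              # to-be-processed chrominance map (U, V)
target_uv = F.interpolate(uv, scale_factor=2, mode='bilinear',
                          align_corners=False)
print(target_uv.shape)                       # torch.Size([1, 2, 1920, 1080])
```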
105, generating a target image based on the target luminance channel map and the target chrominance channel map.
In one embodiment, the step of "generating a target image based on the target luminance channel map and the target chrominance channel map" may comprise:
performing image superposition processing on the target brightness channel image and the target chromaticity channel image to obtain an image to be converted;
and converting the format of the image to be converted into an RGB channel format to obtain a target image.
Optionally, in the embodiment of the present application, a shader (shader) configured by a GPU in the terminal may perform image merging and superimposing processing on the target luminance channel map and the target chrominance channel map.
After the Y channel and the UV channels are merged, the shader multiplies the YUV-format image by a color conversion matrix that converts the YUV format into the RGB format. The channel merging of the Y channel and the UV channels is implemented by taking Y and UV to construct a float3 vector (i.e., three float values).
In order to display the brightness channel image subjected to super-resolution reconstruction on the terminal, the image in YUV format needs to be converted into RGB channel format.
Here, R is the R (Red) value of each pixel in the image to be processed, G is the G (Green) value, and B is the B (Blue) value. Y denotes the luminance signal, and U and V denote the two color difference signals B-Y (i.e., U) and R-Y (i.e., V), respectively.
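To illustrate the merging and conversion of steps 104 and 105, the sketch below stacks the reconstructed Y with the enlarged U and V (the float3 built in the shader) and applies a YUV-to-RGB conversion; the matrix simply inverts the BT.601-style coefficients assumed earlier and is not taken from the patent.
```python
import numpy as np

def yuv_to_rgb(yuv: np.ndarray) -> np.ndarray:
    """Inverse of the rgb_to_yuv sketch above (assumed coefficients)."""
    y, u, v = yuv[..., 0], yuv[..., 1], yuv[..., 2]
    r = y + 1.140 * v
    g = y - 0.395 * u - 0.581 * v
    b = y + 2.032 * u
    return np.stack([r, g, b], axis=-1)

h, w = 1080, 1920
y_sr = np.random.rand(h, w) * 255            # super-resolved luminance channel
u_up = np.random.rand(h, w) * 50 - 25        # enlarged chrominance channels
v_up = np.random.rand(h, w) * 50 - 25
target = np.clip(yuv_to_rgb(np.stack([y_sr, u_up, v_up], axis=-1)), 0, 255)
print(target.shape)                          # (1080, 1920, 3)
```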
The embodiment of the application provides an image processing method, a simplified super-resolution reconstruction model is constructed in advance, the super-resolution reconstruction model can rapidly enlarge the resolution of an input low-resolution image to obtain a high-resolution image, the reconstruction speed of super-resolution reconstruction can be increased, and the image display quality is improved.
Based on the above description, the image processing method of the present application will be further described below by way of example. For example, a specific application example flow of the super-resolution reconstruction network structure according to the embodiment of the present application is as follows:
(1) a 540p size luminance channel image with a size of 1 × 1 × 960 × 540 is input to the normalization layer (Div layer), and the data in the luminance channel image is normalized to a range between [0,1] by the normalization layer.
(2) The luminance channel image normalized by the normalization layer is input into the pixel splitting layer (Unshuffle layer) with a predetermined image reduction ratio; here the image reduction ratio (downscale) is 4. Since the pixel points in each 4 × 4 region of the initial luminance channel image are distributed to 16 channels, the luminance channel image with a resolution of 1 × 1 × 960 × 540 becomes a luminance channel image with a resolution of 1 × 16 × 240 × 135.
Meanwhile, the luminance channel image subjected to the normalization processing by the normalization layer is input to a linear interpolation layer (Bilinear layer) so that the luminance channel image having a resolution of 1 × 1 × 960 × 540 is changed to a luminance channel image having a resolution of 1 × 1 × 1920 × 1080.
(3) The image after pixel splitting is input into the conv1 layer (a 3 × 3 Conv layer) for channel transformation; the weight of the conv1 layer is 3 × 3 × 16 × 20. An image with a size of 1 × 16 × 240 × 135 is input into the conv1 layer, and after processing, an image with a size of 1 × 20 × 240 × 135 is output, i.e., the image is converted from 16 channels to 20 channels.
(4) The image passing through the conv1 layer is input into the conv2 layer of the 3 × 3conv layer to extract the features of the feature map, the weight of the conv2 layer is 3 × 3 × 20 × 20, and the image with the size of 1 × 20 × 240 × 135 can be output.
(5) The image passing through the conv2 layer is input into the conv3 layer of the 3 × 3conv layer to extract the features of the feature map, the weight of the conv3 layer is 3 × 3 × 20 × 20, and the image with the size of 1 × 20 × 240 × 135 can be output.
(6) The image passing through the conv3 layer is input into the conv4 layer of the 1 × 1conv layer to be subjected to channel transformation, the weight of the conv4 layer is 1 × 1 × 20 × 64, and after the image is subjected to the processing of the conv4 layer, the image with the size of 1 × 64 × 240 × 135 can be output, namely, the image channel can be converted from 20 channels to 64 channels.
(7) The image passing through the conv4 layer is input into the pixel recombination layer (Shuffle layer), which restores the result at the enlarged resolution. After the processing of the pixel recombination layer, an image with a size of 1 × 1 × 1920 × 1080 is output, i.e., the image is converted from 64 channels to 1 channel.
(8) The luminance channel image with the resolution of 1 × 1 × 1920 × 1080 which passes through the linear interpolation layer (Bilinear layer) and the luminance channel image which passes through the pixel reconstruction layer are input to the addition layer (Add layer), so that the image obtained at the pixel reconstruction layer and the image obtained at the linear interpolation layer are subjected to superposition processing to obtain a combined image with more details.
(9) The combined image from the addition layer is input to the inverse normalization layer (Mul layer), which inverse-normalizes the output of the Add layer. Specifically, the pixel values of the Add layer output are restored from the range [0, 1] to the range [0, 255], yielding the target luminance channel image.
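For illustration only, the following is a minimal PyTorch sketch of steps (1) to (9). The class name and the absence of activation functions are assumptions (the steps above do not mention any nonlinearity), and PyTorch tensors are laid out N × C × H × W, whereas the sizes above read as N × C × W × H:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRNetSketch(nn.Module):
    """Hypothetical reconstruction of steps (1)-(9); not the patent's code."""
    def __init__(self):
        super().__init__()
        self.unshuffle = nn.PixelUnshuffle(4)         # (2) down-scale 4: 1 ch -> 16 ch
        self.conv1 = nn.Conv2d(16, 20, 3, padding=1)  # (3) weight 3x3x16x20
        self.conv2 = nn.Conv2d(20, 20, 3, padding=1)  # (4) weight 3x3x20x20
        self.conv3 = nn.Conv2d(20, 20, 3, padding=1)  # (5) weight 3x3x20x20
        self.conv4 = nn.Conv2d(20, 64, 1)             # (6) weight 1x1x20x64
        self.shuffle = nn.PixelShuffle(8)             # (7) up-scale 8: 64 ch -> 1 ch

    def forward(self, y):                             # y: 1x1x540x960 in [0, 255]
        x = y / 255.0                                 # (1) Div layer
        skip = F.interpolate(x, scale_factor=2, mode='bilinear',
                             align_corners=False)     # Bilinear layer, 2x upscale
        x = self.unshuffle(x)                         # (2) 1x16x135x240
        x = self.conv1(x)                             # (3) 1x20x135x240
        x = self.conv3(self.conv2(x))                 # (4)-(5) feature extraction
        x = self.conv4(x)                             # (6) 1x64x135x240
        x = self.shuffle(x)                           # (7) 1x1x1080x1920
        return (x + skip) * 255.0                     # (8) Add layer, (9) Mul layer

out = SRNetSketch()(torch.rand(1, 1, 540, 960) * 255)
print(out.shape)  # torch.Size([1, 1, 1080, 1920])
```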
Based on the above description, the image processing method of the present application will be further described below by way of example. For example, referring to fig. 7, a super-resolution reconstruction network structure according to an embodiment of the present application is as follows:
the super-resolution reconstruction network structure provided by this embodiment of the application comprises, in order, a normalization layer (Div layer), a conv1 (convolution 1) layer, a conv2 (convolution 2) layer, a conv3 (convolution 3) layer, a conv4 (convolution 4) layer, a conv5 (convolution 5) layer, a pixel reconstruction layer (Shuffle layer), a linear interpolation layer (Bilinear layer), an addition layer (Add layer) and an inverse normalization layer (Mul layer). The weight of the conv1 layer is 3 × 3 × 1 × 20, the weights of the conv2, conv3 and conv4 layers are each 3 × 3 × 20 × 20, and the weight of the conv5 layer is 1 × 1 × 20 × 4. In addition, the specified parameter (up-scale) of the Shuffle layer is 2. This network is the basic network: with a picture magnification of 2 times it converts the image resolution from 540p to 1080p at a Peak Signal to Noise Ratio (PSNR) of 34.90 dB and a frame rate of 5 frames per second (FPS); although the speed is slow, this structure achieves a good image processing effect.

Specifically, a 540p luminance channel image of size 1 × 1 × 960 × 540 is input to the normalization layer (Div layer), which normalizes the data in the luminance channel image to the range [0, 1]. The normalized luminance channel image is then input to the conv1 layer (weight 3 × 3 × 1 × 20) for channel conversion, which outputs a luminance channel map of size 1 × 20 × 960 × 540. The conv1 output is then passed in sequence through the conv2, conv3 and conv4 layers (each with weight 3 × 3 × 20 × 20) for image feature extraction, yielding a luminance channel map of size 1 × 20 × 960 × 540. That map is input to the conv5 layer (weight 1 × 1 × 20 × 4) for channel conversion, which outputs a luminance channel map of size 1 × 4 × 960 × 540. The conv5 output is input to the pixel reconstruction layer (Shuffle layer) with up-scale 2, which rearranges the channels into the enlarged resolution; after the pixel reconstruction layer, an image of size 1 × 1 × 1920 × 1080 is output, that is, the image is converted from 4 channels to 1 channel. Then, the luminance channel image of resolution 1 × 1 × 1920 × 1080 from the linear interpolation layer (Bilinear layer) and the luminance channel image from the pixel reconstruction layer are input to the addition layer (Add layer), which superimposes the two images to obtain a combined image with more detail.
The combined image from the addition layer is input to the inverse normalization layer (Mul layer), which inverse-normalizes the output of the Add layer: the pixel values are restored from the range [0, 1] to the range [0, 255] to obtain the target luminance channel image.
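As a rough illustration of why this variant is slow, here is a hedged PyTorch sketch of the fig. 7 structure under the same assumptions as the previous sketch (placeholder names, no activation functions): every convolution runs at the full 960 × 540 input resolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BaseSRNetSketch(nn.Module):
    # Fig. 7 basic network (sketch): no pixel splitting, so every
    # conv operates on the full 960x540 plane, hence only ~5 FPS.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 3, padding=1)   # 3x3x1x20
        self.body = nn.Sequential(                    # conv2-conv4, 3x3x20x20 each
            nn.Conv2d(20, 20, 3, padding=1),
            nn.Conv2d(20, 20, 3, padding=1),
            nn.Conv2d(20, 20, 3, padding=1),
        )
        self.conv5 = nn.Conv2d(20, 4, 1)              # 1x1x20x4
        self.shuffle = nn.PixelShuffle(2)             # up-scale 2: 4 ch -> 1 ch

    def forward(self, y):
        x = y / 255.0                                 # Div layer
        skip = F.interpolate(x, scale_factor=2, mode='bilinear',
                             align_corners=False)     # Bilinear layer
        x = self.shuffle(self.conv5(self.body(self.conv1(x))))
        return (x + skip) * 255.0                     # Add + Mul layers

out = BaseSRNetSketch()(torch.rand(1, 1, 540, 960) * 255)
print(out.shape)  # torch.Size([1, 1, 1080, 1920])
```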
Based on the above description, the image processing method of the present application will be further described below by way of example. For example, referring to fig. 8, a super-resolution reconstruction network structure according to an embodiment of the present application is as follows:
the super-resolution reconstruction network structure provided by this embodiment of the application comprises, in order, a normalization layer (Div layer), a pixel splitting layer (Unshuffle layer), a conv1 (convolution 1) layer, a conv2 (convolution 2) layer, a conv3 (convolution 3) layer, a conv4 (convolution 4) layer, a conv5 (convolution 5) layer, a pixel reconstruction layer (Shuffle layer), a linear interpolation layer (Bilinear layer), an addition layer (Add layer) and an inverse normalization layer (Mul layer). The weight of the conv1 layer is 3 × 3 × 16 × 20, the weights of the conv2, conv3 and conv4 layers are each 3 × 3 × 20 × 20, and the weight of the conv5 layer is 1 × 1 × 20 × 64. The specified parameter (down-scale) of the Unshuffle layer is 4, and the specified parameter (up-scale) of the Shuffle layer is 8. With a picture magnification of 2 times, this network converts the image resolution from 540p to 1080p at a PSNR of 33.46 dB and 30 FPS; although the image processing effect is slightly reduced compared with the network structure of fig. 7, the processing speed of super-resolution reconstruction is greatly improved.

Specifically, a 540p luminance channel image of size 1 × 1 × 960 × 540 is input to the normalization layer (Div layer), which normalizes the data to the range [0, 1]. The normalized luminance channel image is input to the pixel splitting layer (Unshuffle layer) with an image reduction ratio (down-scale) of 4; since the pixel points of each 4 × 4 region of the initial luminance channel image are distributed to 16 channels, the luminance channel image of resolution 1 × 1 × 960 × 540 becomes a luminance channel image of resolution 1 × 16 × 240 × 135. The pixel-split image is then input to the conv1 layer (weight 3 × 3 × 16 × 20) for channel conversion, which outputs a luminance channel map of size 1 × 20 × 240 × 135. The conv1 output is passed in sequence through the conv2, conv3 and conv4 layers (each with weight 3 × 3 × 20 × 20) for image feature extraction, yielding a luminance channel map of size 1 × 20 × 240 × 135. That map is input to the conv5 layer (weight 1 × 1 × 20 × 64) for channel conversion, which outputs a luminance channel map of size 1 × 64 × 240 × 135.
The conv5 output is input to the pixel reconstruction layer (Shuffle layer) with up-scale 8, which rearranges the channels into the enlarged resolution; after the pixel reconstruction layer, an image of size 1 × 1 × 1920 × 1080 is output, that is, the image is converted from 64 channels to 1 channel. Then, the luminance channel image of resolution 1 × 1 × 1920 × 1080 from the linear interpolation layer (Bilinear layer) and the luminance channel image from the pixel reconstruction layer are input to the addition layer (Add layer), which superimposes the two images to obtain a combined image with more detail. The combined image from the addition layer is input to the inverse normalization layer (Mul layer), which restores the pixel values of the Add layer output from the range [0, 1] to the range [0, 255] to obtain the target luminance channel image.
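Note how the two scale parameters interact: with a down-scale d on the Unshuffle layer and an up-scale u on the Shuffle layer, the net magnification is u/d (here 8/4 = 2), and the layer feeding the Shuffle must carry u² channels per reconstructed channel (here 64). A quick shape check, purely illustrative:

```python
import torch
import torch.nn as nn

d, u = 4, 8                        # Unshuffle down-scale, Shuffle up-scale
x = torch.rand(1, 1, 540, 960)     # 540p luminance plane (N x C x H x W)

low = nn.PixelUnshuffle(d)(x)          # 1 x 16 x 135 x 240: d*d channels
feat = torch.rand(1, u * u, 135, 240)  # stand-in for the conv5 output (u*u = 64)
high = nn.PixelShuffle(u)(feat)        # 1 x 1 x 1080 x 1920

print(low.shape, high.shape)           # net magnification: u / d = 2x
```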
Based on the above description, the image processing method of the present application will be further described below by way of example. For example, referring to fig. 9, a super-resolution reconstruction network structure according to an embodiment of the present application is as follows:
the super-resolution reconstruction network structure provided by this embodiment of the application comprises, in order, a normalization layer (Div layer), a pixel splitting layer (Unshuffle layer), a conv1 (convolution 1) layer, a conv2 (convolution 2) layer, a conv3 (convolution 3) layer, a conv4 (convolution 4) layer, a conv5 (convolution 5) layer, a pixel reconstruction layer (Shuffle layer), a linear interpolation layer (Bilinear layer), an addition layer (Add layer) and an inverse normalization layer (Mul layer). The weight of the conv1 layer is 3 × 3 × 16 × 20, the weights of the conv2, conv3 and conv4 layers are each 3 × 3 × 20 × 20, and the weight of the conv5 layer is 1 × 1 × 20 × 64. In this embodiment, the Add (skip) connections attached to the conv2, conv3 and conv4 layers are deleted. The specified parameter (down-scale) of the Unshuffle layer is 4, and the specified parameter (up-scale) of the Shuffle layer is 8. With a picture magnification of 2 times, this network converts the image resolution from 540p to 1080p at a PSNR of 33.22 dB and 30 FPS; although the image processing effect is slightly reduced compared with the network structure of fig. 7, the processing speed of super-resolution reconstruction is greatly improved.
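The text implies that in the fig. 8 variant each feature extraction conv carried its own Add (skip) connection, which is removed here. Purely as an assumption about what that means, the per-layer difference would look like this sketch:

```python
import torch.nn as nn

class ResidualConvBlock(nn.Module):
    # Fig. 8-style feature extraction layer (assumed): the 3x3 conv
    # output is added back to its input through an Add connection.
    def __init__(self, ch=20):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        return x + self.conv(x)      # Add connection kept

# Fig. 9-style feature extraction layer: the Add connection is deleted,
# leaving the bare convolution (one fewer elementwise addition per
# layer, at a small cost in PSNR).
plain_block = nn.Conv2d(20, 20, 3, padding=1)
```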
Based on the above description, the image processing method of the present application will be further described below by way of example. For example, referring to fig. 10, a super-resolution reconstruction network structure according to an embodiment of the present application is as follows:
the super-resolution reconstruction network structure provided by this embodiment of the application comprises, in order, a normalization layer (Div layer), a pixel splitting layer (Unshuffle layer), a conv1 (convolution 1) layer, a conv5 (convolution 5) layer, a pixel reconstruction layer (Shuffle layer), a linear interpolation layer (Bilinear layer), an addition layer (Add layer) and an inverse normalization layer (Mul layer). The weight of the conv1 layer is 3 × 3 × 16 × 20, and the weight of the conv5 layer is 1 × 1 × 20 × 64. In this embodiment, the conv2, conv3 and conv4 layers provided in the above embodiments are deleted. The specified parameter (down-scale) of the Unshuffle layer is 4, and the specified parameter (up-scale) of the Shuffle layer is 8. With a picture magnification of 2 times, this network converts the image resolution from 540p to 1080p at a PSNR of 32.82 dB and 42 FPS; although the image processing effect is slightly reduced compared with the network structure of fig. 7, the processing speed of super-resolution reconstruction is greatly improved.

Specifically, a 540p luminance channel image of size 1 × 1 × 960 × 540 is input to the normalization layer (Div layer), which normalizes the data to the range [0, 1]. The normalized luminance channel image is input to the pixel splitting layer (Unshuffle layer) with an image reduction ratio (down-scale) of 4; since the pixel points of each 4 × 4 region of the initial luminance channel image are distributed to 16 channels, the luminance channel image of resolution 1 × 1 × 960 × 540 becomes a luminance channel image of resolution 1 × 16 × 240 × 135. The pixel-split image is then input to the conv1 layer (weight 3 × 3 × 16 × 20) for channel conversion, which outputs a luminance channel map of size 1 × 20 × 240 × 135. That map is input to the conv5 layer (weight 1 × 1 × 20 × 64) for channel conversion, which outputs a luminance channel map of size 1 × 64 × 240 × 135. The conv5 output is input to the pixel reconstruction layer (Shuffle layer) with up-scale 8, which rearranges the channels into the enlarged resolution; after the pixel reconstruction layer, an image of size 1 × 1 × 1920 × 1080 is output, that is, the image is converted from 64 channels to 1 channel.
Then, the luminance channel image of resolution 1 × 1 × 1920 × 1080 from the linear interpolation layer (Bilinear layer) and the luminance channel image from the pixel reconstruction layer are input to the addition layer (Add layer), which superimposes the two images to obtain a combined image with more detail. The combined image from the addition layer is input to the inverse normalization layer (Mul layer), which restores the pixel values of the Add layer output from the range [0, 1] to the range [0, 255] to obtain the target luminance channel image.
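A hedged sketch of this two-conv variant, under the same assumptions as the earlier sketches (placeholder names, no activation functions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FastSRSketch(nn.Module):
    # Fig. 10 variant: only the two channel-conversion convs remain,
    # which is why throughput rises to ~42 FPS in the patent's figures.
    def __init__(self):
        super().__init__()
        self.unshuffle = nn.PixelUnshuffle(4)         # down-scale 4
        self.conv1 = nn.Conv2d(16, 20, 3, padding=1)  # 3x3x16x20
        self.conv5 = nn.Conv2d(20, 64, 1)             # 1x1x20x64
        self.shuffle = nn.PixelShuffle(8)             # up-scale 8

    def forward(self, y):
        x = y / 255.0                                 # Div layer
        skip = F.interpolate(x, scale_factor=2, mode='bilinear',
                             align_corners=False)     # Bilinear layer
        x = self.shuffle(self.conv5(self.conv1(self.unshuffle(x))))
        return (x + skip) * 255.0                     # Add + Mul layers

out = FastSRSketch()(torch.rand(1, 1, 540, 960) * 255)
print(out.shape)  # torch.Size([1, 1, 1080, 1920])
```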
Based on the above description, the image processing method of the present application will be further described below by way of example. For example, referring to fig. 11, a super-resolution reconstruction network structure according to an embodiment of the present application is as follows:
the super-resolution reconstruction network structure provided by this embodiment of the application comprises, in order, a normalization layer (Div layer), a pixel splitting layer (Unshuffle layer), a conv1 (convolution 1) layer, a conv5 (convolution 5) layer, a pixel reconstruction layer (Shuffle layer), a linear interpolation layer (Bilinear layer), an addition layer (Add layer) and an inverse normalization layer (Mul layer). The weight of the conv1 layer is 3 × 3 × 16 × 20, and the weight of the conv5 layer is 1 × 1 × 20 × 64. In this embodiment, the conv2, conv3 and conv4 layers provided in the above embodiments are deleted. The specified parameter (down-scale) of the Unshuffle layer is 4, and the specified parameter (up-scale) of the Shuffle layer is 8. Specifically, the conv1 layer (shown by dashed box a in the figure) of this embodiment is obtained by training a first preset model (shown by dashed box A in the figure) and then merging the trained model into a single convolution through a re-parameterization technique. With a picture magnification of 2 times, this network converts the image resolution from 540p to 1080p at a PSNR of 33.65 dB and 42 FPS; although the image processing effect is slightly reduced compared with the network structure of fig. 7, the processing speed of super-resolution reconstruction is greatly improved.
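The exact branch topology of the first preset model is not given here (a later passage describes it as composed of a first and a second preset convolutional layer). Purely as an illustration of the merging idea, the sketch below folds a trained 3 × 3 conv followed by a 1 × 1 conv into one equivalent 3 × 3 conv; all names are placeholders:

```python
import torch
import torch.nn as nn

# Hypothetical re-parameterization: a trained two-conv branch is merged
# into a single 3x3 conv at deployment time, so inference pays for one
# convolution while training enjoyed the larger structure.
conv_a = nn.Conv2d(16, 32, 3, padding=1)   # trained 3x3 stage (assumed)
conv_b = nn.Conv2d(32, 20, 1)              # trained 1x1 stage (assumed)

merged = nn.Conv2d(16, 20, 3, padding=1)
with torch.no_grad():
    wb = conv_b.weight[:, :, 0, 0]                                  # [20, 32]
    merged.weight.copy_(torch.einsum('om,mikl->oikl', wb, conv_a.weight))
    merged.bias.copy_(wb @ conv_a.bias + conv_b.bias)

# The merged conv reproduces the two-stage branch exactly (up to
# floating-point error), including at the zero-padded borders.
x = torch.rand(1, 16, 135, 240)
print(torch.allclose(conv_b(conv_a(x)), merged(x), atol=1e-4))  # True
```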
Based on the above description, the image processing method of the present application will be further described below by way of example. For example, referring to fig. 12, a super-resolution reconstruction network structure according to an embodiment of the present application is as follows:
the super-resolution reconstruction network structure provided by this embodiment of the application comprises, in order, a normalization layer (Div layer), a pixel splitting layer (Unshuffle layer), a conv1 (convolution 1) layer, a conv2 (convolution 2) layer, a conv5 (convolution 5) layer, a pixel reconstruction layer (Shuffle layer), a linear interpolation layer (Bilinear layer), an addition layer (Add layer) and an inverse normalization layer (Mul layer). The weight of the conv1 layer is 3 × 3 × 16 × 20, the weight of the conv2 layer is 3 × 3 × 20 × 20, and the weight of the conv5 layer is 1 × 1 × 20 × 64. In this embodiment, the conv3 and conv4 layers provided in the above embodiments are deleted. The specified parameter (down-scale) of the Unshuffle layer is 4, and the specified parameter (up-scale) of the Shuffle layer is 8. Specifically, the conv1 layer (shown by dashed box a in the figure) of this embodiment is obtained by training a first preset model (shown by dashed box A in the figure) through a re-parameterization technique and then merging the trained first preset model; the conv2 layer (shown by dashed box b in the figure) is likewise obtained by training a second preset model (shown by dashed box B in the figure) and then merging the trained second preset model. With a picture magnification of 4/3 times, this network converts the image resolution from 540p to 720p at a PSNR of 38.21 dB and 50 FPS; compared with the network structure of fig. 7, both the image processing effect and the processing speed of super-resolution reconstruction are greatly improved.
In order to better implement the image processing method provided by the embodiments of the present application, an embodiment of the present application further provides an image processing apparatus corresponding to the image processing method. The meanings of the terms are the same as in the image processing method above; for implementation details, refer to the description in the method embodiments.
Referring to fig. 13, fig. 13 is a block diagram of an image processing apparatus according to an embodiment of the present application, the apparatus including:
a first obtaining unit 201, configured to obtain an image to be processed with a first resolution, where a format of the image to be processed is a YUV channel format;
a second obtaining unit 202, configured to obtain a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the to-be-processed image, where a resolution of the to-be-processed luminance channel map is a first resolution, and a resolution of the to-be-processed chrominance channel map is a first resolution;
a first generating unit 203, configured to perform super-resolution reconstruction processing on the luminance channel map to be processed to generate a target luminance channel map, where the super-resolution reconstruction processing includes performing channel conversion processing and/or image feature extraction processing on the luminance channel map to be processed;
the processing unit 204 is configured to process the to-be-processed chrominance channel map according to a specified amplification parameter to obtain a target chrominance channel map, where a resolution of the target chrominance channel map is the target resolution;
a second generating unit 205, configured to generate a target image based on the target luminance channel map and the target chrominance channel map.
In some embodiments, the apparatus further comprises a first processing subunit to:
inputting the luminance channel map to be processed into a normalization layer for normalization processing to obtain a first luminance channel map;
and performing super-resolution reconstruction processing on the first luminance channel map to generate a target luminance channel map.
In some embodiments, the apparatus further comprises a second processing subunit to:
inputting the first luminance channel map into a linear interpolation layer to obtain a second luminance channel map, inputting the first luminance channel map into a first channel conversion convolution layer to perform channel conversion processing to obtain a third luminance channel map, wherein a first channel numerical value of the first luminance channel map is consistent with a second channel numerical value of the second luminance channel map, a third channel numerical value of the third luminance channel map is determined based on a first weight of the first channel conversion convolution layer, and the third channel numerical value is larger than the first channel numerical value;
inputting the third luminance channel map into at least one feature extraction convolution layer for image feature extraction processing to obtain a fourth luminance channel map, wherein a fourth channel numerical value of the fourth luminance channel map is consistent with the third channel numerical value;
inputting the fourth luminance channel map into a second channel conversion convolutional layer for channel conversion processing to obtain a fifth luminance channel map, wherein a fifth channel value of the fifth luminance channel map is smaller than the fourth channel value, and the fifth channel value is determined based on a second weight of the second channel conversion convolutional layer;
inputting the fifth luminance channel map into a pixel reconstruction layer for pixel recombination;
and generating a target luminance channel map based on the second luminance channel map and the fifth luminance channel map.
In some embodiments, the apparatus further comprises:
the third processing subunit is used for inputting the third luminance channel map into at least one feature extraction convolution layer to carry out image feature extraction processing so as to obtain a luminance channel map to be superimposed;
a third generating unit, configured to generate a fourth luminance channel map based on the luminance channel map to be superimposed and the third luminance channel map.
In some embodiments, the apparatus further comprises a fourth processing subunit to:
inputting the first luminance channel map and a first specified parameter into a pixel splitting layer to obtain a first luminance feature map set, wherein the first luminance feature map set comprises a first specified number of luminance feature maps;
and inputting all the luminance feature maps in the first luminance feature map set into a first channel conversion convolution layer for channel conversion processing to generate a first to-be-processed luminance feature map set, wherein the first to-be-processed luminance feature map set comprises a plurality of third luminance channel maps.
In some embodiments, the apparatus further comprises:
a fourth generating unit, configured to sequentially input the third luminance channel map into the first feature extraction convolutional layer, the second feature extraction convolutional layer, and the third feature extraction convolutional layer, and generate a second to-be-processed luminance feature map set, where the second to-be-processed luminance feature map set includes multiple fourth luminance channel maps.
In some embodiments, the apparatus further comprises:
a fifth processing subunit, configured to input the first to-be-processed luminance feature map set into a second channel conversion convolutional layer for channel conversion processing, so as to obtain a third to-be-processed luminance feature map set, where the third to-be-processed luminance feature map set includes a plurality of sixth luminance channel maps, and a sixth channel numerical value of the sixth luminance channel map is the same as the third channel numerical value;
the input unit is used for inputting the third to-be-processed luminance feature map set into a pixel recombination layer to obtain a recombined luminance channel map;
and a fifth generating unit configured to generate a target luminance channel map based on the second luminance channel map and the recombined luminance channel map.
In some embodiments, the apparatus further comprises:
the training unit is used for training a first preset network structure by adopting a sample set to obtain the trained first preset network structure and a first target parameter, wherein the first preset network structure consists of a first preset convolutional layer and a second preset convolutional layer;
a sixth generating unit, configured to generate a first channel conversion convolutional layer based on the first preset network structure and the first target parameter.
In some embodiments, the apparatus further comprises a sixth processing subunit for:
performing fusion processing on the second luminance channel map and the fifth luminance channel map to obtain a target combined luminance channel map;
and performing inverse normalization processing on the target combined luminance channel map to generate a target luminance channel map.
In some embodiments, the apparatus further comprises:
a first acquisition unit for sending an image acquisition request to a server;
the receiving unit is used for receiving an image to be decompressed returned by the server, wherein the image to be decompressed is obtained by compressing a specified image by the server, the specified image is an image rendered by the server according to the image acquisition request, and the resolution of the specified image is a first resolution;
and the seventh processing subunit is used for decompressing the image to be decompressed to obtain the image to be processed.
In some embodiments, the apparatus further comprises:
the second acquisition unit is used for acquiring a picture display instruction and instructing, according to the picture display instruction, a game engine to render an image to be processed, wherein the resolution of the image to be processed is the first resolution;
and the third acquisition unit is used for acquiring, by a simulator, the image to be processed rendered by the game engine from the game engine when the simulator detects that the picture display instruction is triggered in the game engine.
In some embodiments, the apparatus further comprises:
the judging unit is used for judging whether the format of the image to be processed is the YUV channel format;
if so, acquiring the to-be-processed luminance channel map and the to-be-processed chrominance channel map of the image to be processed;
and if not, converting the current format of the image to be processed into the YUV channel format, as in the conversion sketch below.
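Where the input arrives as RGB, the conversion might look like the following; the BT.601 full-range matrix is an assumption, since the text does not specify which YUV variant is used, and the function name is a placeholder:

```python
import torch

def rgb_to_yuv(rgb):
    # rgb: 1x3xHxW tensor in [0, 255]. BT.601 full-range coefficients
    # (assumed; the patent does not specify the conversion matrix).
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, u, v  # separate planes for the luminance/chrominance paths
```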
In some embodiments, the apparatus further comprises:
the eighth processing subunit is configured to perform linear interpolation processing on the to-be-processed chrominance channel map according to the specified amplification parameter to obtain a target chrominance channel map;
a seventh generating unit, configured to generate a target image based on the target luminance channel map and the target chrominance channel map.
In some embodiments, the apparatus further comprises:
the ninth processing subunit is configured to perform image superposition processing on the target luminance channel map and the target chrominance channel map to obtain an image to be converted;
and the conversion unit is used for converting the format of the image to be converted into an RGB channel format to obtain a target image.
An embodiment of the present application provides an image processing apparatus in which the first obtaining unit 201 obtains an image to be processed with a first resolution, the format of the image to be processed being the YUV channel format; the second obtaining unit 202 obtains a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the image to be processed, the resolution of each being the first resolution; the first generating unit 203 performs super-resolution reconstruction processing on the to-be-processed luminance channel map to generate a target luminance channel map, where the super-resolution reconstruction processing includes channel conversion processing and/or image feature extraction processing; the processing unit 204 processes the to-be-processed chrominance channel map according to the specified amplification parameter to obtain a target chrominance channel map whose resolution is the target resolution; and the second generating unit 205 generates a target image based on the target luminance channel map and the target chrominance channel map. In this way, the resolution of an input low-resolution image can be rapidly amplified to obtain a high-resolution image, which increases the reconstruction speed of super-resolution reconstruction and improves image display quality.
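Putting the units together, the overall flow might be sketched as follows. The plane layout (full-resolution U and V, since the method states all channel maps share the first resolution), the function name, and the BT.601 full-range YUV-to-RGB matrix are assumptions; sr_model stands for any of the luminance networks sketched earlier:

```python
import torch
import torch.nn.functional as F

def upscale_yuv(y, u, v, sr_model, scale=2):
    # y, u, v: 1x1xHxW planes in [0, 255].
    y_hr = sr_model(y)                                  # super-resolution on Y
    u_hr = F.interpolate(u, scale_factor=scale, mode='bilinear',
                         align_corners=False)           # chroma upscaled with the
    v_hr = F.interpolate(v, scale_factor=scale, mode='bilinear',
                         align_corners=False)           # specified amplification parameter

    # Superimpose the planes and convert YUV -> RGB (BT.601 full range, assumed).
    r = y_hr + 1.402 * (v_hr - 128.0)
    g = y_hr - 0.344136 * (u_hr - 128.0) - 0.714136 * (v_hr - 128.0)
    b = y_hr + 1.772 * (u_hr - 128.0)
    return torch.cat([r, g, b], dim=1).clamp(0, 255)    # target RGB image
```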
Correspondingly, an embodiment of the present application further provides a computer device, which may be a terminal or a server; the terminal may be a smart phone, a tablet computer, a notebook computer, a touch screen, a game machine, a Personal Computer (PC), a Personal Digital Assistant (PDA), or a similar terminal device. As shown in fig. 14, fig. 14 is a schematic structural diagram of a computer device according to an embodiment of the present application. The computer device 300 includes a processor 301 with one or more processing cores, a memory 302 including one or more storage media, and a computer program stored on the memory 302 and executable on the processor. The processor 301 is electrically connected to the memory 302. Those skilled in the art will appreciate that the structure illustrated in the figure does not limit the computer device, which may include more or fewer components than those illustrated, combine some components, or arrange the components differently.
The processor 301 is the control center of the computer device 300; it connects various parts of the entire computer device 300 through various interfaces and lines, and performs the various functions of the computer device 300 and processes data by running or loading software programs and/or modules stored in the memory 302 and calling data stored in the memory 302, thereby monitoring the computer device 300 as a whole.
In the embodiment of the present application, the processor 301 in the computer device 300 loads instructions corresponding to processes of one or more application programs into the memory 302, and the processor 301 executes the application programs stored in the memory 302 according to the following steps, so as to implement various functions:
acquiring an image to be processed with a first resolution, wherein the format of the image to be processed is a YUV channel format;
acquiring a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the image to be processed, wherein the resolution of the to-be-processed luminance channel map is the first resolution, and the resolution of the to-be-processed chrominance channel map is the first resolution;
performing super-resolution reconstruction processing on the to-be-processed luminance channel map to generate a target luminance channel map, wherein the super-resolution reconstruction processing includes performing channel conversion processing and/or image feature extraction processing on the to-be-processed luminance channel map;
processing the to-be-processed chrominance channel map according to the specified amplification parameter to obtain a target chrominance channel map, wherein the resolution of the target chrominance channel map is the target resolution;
and generating a target image based on the target luminance channel map and the target chrominance channel map.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Optionally, as shown in fig. 14, the computer device 300 further includes: a touch display 303, a radio frequency circuit 304, an audio circuit 305, an input unit 306, and a power source 307. The processor 301 is electrically connected to the touch display 303, the radio frequency circuit 304, the audio circuit 305, the input unit 306, and the power source 307. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 14 is not intended to be limiting of computer devices and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components.
The touch display screen 303 may be used for displaying a graphical user interface and receiving operation instructions generated by a user acting on the graphical user interface. The touch display screen 303 may include a display panel and a touch panel. The display panel may be used to display information entered by or provided to the user, as well as the various graphical user interfaces of the computer device, which may be composed of graphics, text, icons, video, and any combination thereof. Alternatively, the display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like. The touch panel may be used to collect touch operations of the user on or near it (for example, operations performed by the user on or near the touch panel using a finger, a stylus, or any other suitable object or accessory) and to generate corresponding operation instructions that execute corresponding programs. Alternatively, the touch panel may include two parts: a touch detection device and a touch controller. The touch detection device detects the position touched by the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, and sends the coordinates to the processor 301; it can also receive and execute commands sent by the processor 301. The touch panel may overlay the display panel; when the touch panel detects a touch operation on or near it, the touch panel transmits the operation to the processor 301 to determine the type of the touch event, and the processor 301 then provides a corresponding visual output on the display panel according to the type of the touch event. In the embodiment of the present application, the touch panel and the display panel may be integrated into the touch display screen 303 to realize the input and output functions. However, in some embodiments, the touch panel and the display panel may be implemented as two separate components to perform the input and output functions respectively. That is, the touch display screen 303 may also serve as a part of the input unit 306 to implement an input function.
In the present embodiment, a graphical user interface is generated on the touch-sensitive display screen 303 by the processor 301 executing a game application. The touch display screen 303 is used for presenting a graphical user interface and receiving an operation instruction generated by a user acting on the graphical user interface.
The radio frequency circuit 304 may be used to transmit and receive radio frequency signals, so as to establish wireless communication with a network device or another computer device and to exchange signals with that device.
The audio circuit 305 may be used to provide an audio interface between the user and the computer device through a speaker and a microphone. The audio circuit 305 may transmit the electrical signal converted from received audio data to the speaker, which converts it into a sound signal for output; conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 305 receives and converts into audio data; the audio data is processed by the processor 301 and then sent, for example, to another computer device via the radio frequency circuit 304, or output to the memory 302 for further processing. The audio circuit 305 may also include an earbud jack to provide communication between a peripheral headset and the computer device.
The input unit 306 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint, iris, facial information, etc.), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
The power supply 307 is used to supply power to the various components of the computer device 300. Optionally, the power supply 307 may be logically connected to the processor 301 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system. The power supply 307 may also include one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, or any other such components.
Although not shown in fig. 14, the computer device 300 may further include a camera, a sensor, a wireless fidelity module, a bluetooth module, etc., which are not described in detail herein.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
As can be seen from the above, the computer device provided in this embodiment obtains an image to be processed with a first resolution, the format of the image to be processed being the YUV channel format; acquires a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the image to be processed, the resolution of each being the first resolution; performs super-resolution reconstruction processing on the to-be-processed luminance channel map to generate a target luminance channel map, where the super-resolution reconstruction processing includes channel conversion processing and/or image feature extraction processing; processes the to-be-processed chrominance channel map according to the specified amplification parameter to obtain a target chrominance channel map whose resolution is the target resolution; and generates a target image based on the target luminance channel map and the target chrominance channel map. By constructing a simplified super-resolution reconstruction model in advance, the model can rapidly amplify the resolution of an input low-resolution image to obtain a high-resolution image, which increases the reconstruction speed of super-resolution reconstruction and improves image display quality.
It will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be completed by instructions, or by instructions controlling associated hardware; the instructions may be stored in a storage medium and loaded and executed by a processor.
To this end, the present application provides a storage medium, in which a plurality of computer programs are stored, and the computer programs can be loaded by a processor to execute the steps in any one of the image processing methods provided by the embodiments of the present application. For example, the computer program may perform the steps of:
acquiring an image to be processed with a first resolution, wherein the format of the image to be processed is a YUV channel format;
acquiring a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the image to be processed, wherein the resolution of the to-be-processed luminance channel map is the first resolution, and the resolution of the to-be-processed chrominance channel map is the first resolution;
performing super-resolution reconstruction processing on the to-be-processed luminance channel map to generate a target luminance channel map, wherein the super-resolution reconstruction processing includes performing channel conversion processing and/or image feature extraction processing on the to-be-processed luminance channel map;
processing the to-be-processed chrominance channel map according to the specified amplification parameter to obtain a target chrominance channel map, wherein the resolution of the target chrominance channel map is the target resolution;
and generating a target image based on the target luminance channel map and the target chrominance channel map.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the computer program stored in the storage medium can execute the steps in any image processing method provided in the embodiments of the present application, the beneficial effects that can be achieved by any image processing method provided in the embodiments of the present application can be achieved, and detailed descriptions are omitted here for the foregoing embodiments.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The image processing method, image processing apparatus, computer device and storage medium provided in the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementation of the present application, and the description of the above embodiments is only intended to help understand the technical solutions and core ideas of the present application. Those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (18)

1. An image processing method, comprising:
acquiring an image to be processed with a first resolution, wherein the format of the image to be processed is a YUV channel format;
acquiring a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the image to be processed, wherein the resolution of the to-be-processed luminance channel map is the first resolution, and the resolution of the to-be-processed chrominance channel map is the first resolution;
performing super-resolution reconstruction processing on the to-be-processed luminance channel map to generate a target luminance channel map, wherein the super-resolution reconstruction processing includes performing channel conversion processing and/or image feature extraction processing on the to-be-processed luminance channel map, the resolution of the target luminance channel map is a second resolution, and the second resolution is higher than the first resolution;
processing the to-be-processed chrominance channel map according to a specified amplification parameter to obtain a target chrominance channel map, wherein the resolution of the target chrominance channel map is the second resolution;
and generating a target image based on the target luminance channel map and the target chrominance channel map.
2. The method according to claim 1, further comprising, before performing super-resolution reconstruction processing on the luminance channel map to be processed:
inputting the luminance channel map to be processed into a normalization layer for normalization processing to obtain a first luminance channel map;
and performing super-resolution reconstruction processing on the first luminance channel map to generate a target luminance channel map.
3. The method according to claim 2, wherein the performing super-resolution reconstruction processing on the first luminance channel map to generate a target luminance channel map comprises:
inputting the first luminance channel map into a linear interpolation layer to obtain a second luminance channel map, inputting the first luminance channel map into a first channel conversion convolution layer to perform channel conversion processing to obtain a third luminance channel map, wherein a first channel numerical value of the first luminance channel map is consistent with a second channel numerical value of the second luminance channel map, a third channel numerical value of the third luminance channel map is determined based on a first weight of the first channel conversion convolution layer, and the third channel numerical value is larger than the first channel numerical value;
inputting the third luminance channel map into at least one feature extraction convolution layer for image feature extraction processing to obtain a fourth luminance channel map, wherein a fourth channel numerical value of the fourth luminance channel map is consistent with the third channel numerical value;
inputting the fourth luminance channel map into a second channel conversion convolutional layer for channel conversion processing to obtain a fifth luminance channel map, wherein a fifth channel value of the fifth luminance channel map is smaller than the fourth channel value, and the fifth channel value is determined based on a second weight of the second channel conversion convolutional layer;
inputting the fourth luminance channel map into a pixel reconstruction layer to obtain a fifth luminance channel map;
and generating a target brightness channel map based on the second brightness channel map and the fifth brightness channel map.
4. The method according to claim 3, wherein the inputting the third luminance channel map into at least one feature extraction convolution layer for image feature extraction processing to obtain a fourth luminance channel map comprises:
inputting the third luminance channel map into at least one feature extraction convolution layer for image feature extraction processing to obtain a luminance channel map to be superimposed;
and generating a fourth luminance channel map based on the luminance channel map to be superimposed and the third luminance channel map.
5. The method according to claim 3, wherein after inputting the first luminance channel map into the linear interpolation layer, before inputting the first luminance channel map into the first channel conversion convolution layer for channel conversion processing, the method comprises:
inputting the first luminance channel map and a first specified parameter into a pixel splitting layer to obtain a first luminance feature map set, wherein the first luminance feature map set comprises a first specified number of luminance feature maps;
and inputting all the luminance feature maps in the first luminance feature map set into a first channel conversion convolution layer for channel conversion processing to generate a first to-be-processed luminance feature map set, wherein the first to-be-processed luminance feature map set comprises a plurality of third luminance channel maps.
6. The method of claim 5, wherein the inputting all the luminance feature maps in the first luminance feature map set into a first channel conversion convolution layer for channel conversion processing to generate a first to-be-processed luminance feature map set comprises:
inputting all the third luminance channel maps in the first to-be-processed luminance feature map set into at least one feature extraction convolution layer for image feature extraction processing to obtain a luminance channel map to be superimposed;
and generating a fourth luminance channel map based on the luminance channel map to be superimposed and the third luminance channel map corresponding to the luminance channel map to be superimposed.
7. The method of claim 5, wherein the feature extraction convolution layers comprise a first feature extraction convolution layer, a second feature extraction convolution layer, and a third feature extraction convolution layer, and the weights of the first feature extraction convolution layer, the second feature extraction convolution layer, and the third feature extraction convolution layer are the same;
after inputting all the luminance feature maps in the first luminance feature map set into the first channel conversion convolution layer for channel conversion processing, the method further includes:
and sequentially inputting the third luminance channel map into the first feature extraction convolution layer, the second feature extraction convolution layer and the third feature extraction convolution layer to generate a second to-be-processed luminance feature map set, wherein the second to-be-processed luminance feature map set comprises a plurality of fourth luminance channel maps.
8. The method according to claim 5, wherein after inputting all the luminance feature maps in the first luminance feature map set into the first channel conversion convolution layer for channel conversion processing to generate the first to-be-processed luminance feature map set, the method further comprises:
inputting the first to-be-processed luminance feature map set into a second channel conversion convolution layer for channel conversion processing to obtain a third to-be-processed luminance feature map set, wherein the third to-be-processed luminance feature map set comprises a plurality of sixth luminance channel maps, and the sixth channel numerical value of the sixth luminance channel maps is the same as the third channel numerical value;
inputting the third to-be-processed luminance feature map set into a pixel recombination layer to obtain a recombined luminance channel map;
and generating a target luminance channel map based on the second luminance channel map and the recombined luminance channel map.
9. The method of claim 5, further comprising:
training a first preset network structure by adopting a sample set to obtain the trained first preset network structure and a first target parameter, wherein the first preset network structure consists of a first preset convolutional layer and a second preset convolutional layer;
and generating a first channel conversion convolution layer based on the first preset network structure and the first target parameter.
10. The method of claim 3, wherein generating a target luminance channel map based on the second luminance channel map and the fifth luminance channel map comprises:
performing fusion processing on the second luminance channel map and the fifth luminance channel map to obtain a target combined luminance channel map;
and performing inverse normalization processing on the target combined luminance channel map to generate a target luminance channel map.
11. The method of claim 1, further comprising, prior to acquiring the image to be processed at the first resolution:
sending an image acquisition request to a server;
receiving an image to be decompressed returned by the server, wherein the image to be decompressed is obtained by compressing a specified image by the server, the specified image is an image rendered by the server according to the image acquisition request, and the resolution of the specified image is a first resolution;
and decompressing the image to be decompressed to obtain the image to be processed.
12. The method of claim 1, further comprising, prior to acquiring the image to be processed at the first resolution:
acquiring a picture display instruction, and instructing, according to the picture display instruction, a game engine to render an image to be processed, wherein the resolution of the image to be processed is the first resolution;
and when a simulator detects that the picture display instruction is triggered in the game engine, acquiring, by the simulator, the image to be processed rendered by the game engine from the game engine.
13. The method of claim 1, further comprising, after acquiring the image to be processed at the first resolution:
judging whether the format of the image to be processed is a YUV channel format or not;
if so, acquiring the to-be-processed luminance channel map and the to-be-processed chrominance channel map of the image to be processed;
if not, converting the current format of the image to be processed into a YUV channel format.
14. The method according to claim 1, wherein the processing the to-be-processed chroma channel map according to the specified amplification parameter to obtain a target chroma channel map comprises:
performing linear interpolation processing on the chrominance channel map to be processed according to the specified amplification parameter to obtain a target chrominance channel map;
and generating a target image based on the target luminance channel map and the target chrominance channel map.
15. The method of claim 14, wherein generating a target image based on the target luminance channel map and the target chrominance channel map comprises:
performing image superposition processing on the target luminance channel map and the target chrominance channel map to obtain an image to be converted;
and converting the format of the image to be converted into an RGB channel format to obtain a target image.
16. An image processing apparatus, characterized by comprising:
a first acquisition unit, configured to acquire an image to be processed having a first resolution, wherein the format of the image to be processed is a YUV channel format;
a second acquisition unit, configured to acquire a to-be-processed luminance channel map and a to-be-processed chrominance channel map of the image to be processed, wherein the resolution of the to-be-processed luminance channel map is the first resolution, and the resolution of the to-be-processed chrominance channel map is the first resolution;
a first generation unit, configured to perform super-resolution reconstruction processing on the to-be-processed luminance channel map to generate a target luminance channel map, wherein the super-resolution reconstruction processing comprises channel conversion processing and/or image feature extraction processing on the to-be-processed luminance channel map;
a processing unit, configured to process the to-be-processed chrominance channel map according to a specified amplification parameter to obtain a target chrominance channel map, wherein the resolution of the target chrominance channel map is a target resolution;
and a second generation unit, configured to generate a target image based on the target luminance channel map and the target chrominance channel map.
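(Illustrative note, not part of the claims: a composition of the units in claim 16, reusing the hypothetical helpers sketched above, `upscale_chroma` and `merge_to_rgb`; `sr_model` stands in for the patent's super-resolution network, whose internals are not reproduced here.)

```python
import cv2

class ImageProcessingApparatus:
    """Wires the claim-16 units together for a single YUV frame."""

    def __init__(self, sr_model, scale: int):
        self.sr_model = sr_model  # super-resolution network for the Y plane
        self.scale = scale        # specified amplification parameter

    def process(self, yuv_image):
        y, u, v = cv2.split(yuv_image)            # second acquisition unit
        # Assumes sr_model enlarges Y by the same factor as `scale`.
        target_y = self.sr_model(y)               # first generation unit
        target_u = upscale_chroma(u, self.scale)  # processing unit
        target_v = upscale_chroma(v, self.scale)
        return merge_to_rgb(target_y, target_u, target_v)  # second generation unit
```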
17. A computer device, characterized by comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the image processing method according to any one of claims 1 to 15.
18. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 15.
CN202111523337.7A 2021-12-13 2021-12-13 Image processing method, image processing device, computer equipment and storage medium Pending CN114240749A (en)

Priority Applications (1)

Application Number: CN202111523337.7A · Priority Date: 2021-12-13 · Filing Date: 2021-12-13 · Title: Image processing method, image processing device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number: CN202111523337.7A · Priority Date: 2021-12-13 · Filing Date: 2021-12-13 · Title: Image processing method, image processing device, computer equipment and storage medium

Publications (1)

Publication Number: CN114240749A

Family ID: 80755570

Family Applications (1)

Application Number: CN202111523337.7A (Pending) · Title: Image processing method, image processing device, computer equipment and storage medium · Priority Date: 2021-12-13 · Filing Date: 2021-12-13

Country Status (1)

Country: CN

Cited By (3)

* Cited by examiner, † Cited by third party

WO2023045649A1 * · Priority date: 2021-09-26 · Publication date: 2023-03-30 · Assignee: Tencent Technology (Shenzhen) Co., Ltd. · Video frame playing method and apparatus, and device, storage medium and program product
CN116681788A * · Priority date: 2023-06-02 · Publication date: 2023-09-01 · Assignee: Xuanwei (Beijing) Biotechnology Co., Ltd. · Image electronic dyeing method, device, medium and computing equipment
CN116681788B * · Priority date: 2023-06-02 · Publication date: 2024-04-02 · Assignee: Xuanwei (Beijing) Biotechnology Co., Ltd. · Image electronic dyeing method, device, medium and computing equipment

Similar Documents

Publication Number · Title
WO2021233008A1 (en) Super-resolution reconstruction method and related device
CN111681167B (en) Image quality adjusting method and device, storage medium and electronic equipment
CN110136066B (en) Video-oriented super-resolution method, device, equipment and storage medium
WO2019192316A1 (en) Image related processing method and apparatus, device and storage medium
CN114240749A (en) Image processing method, image processing device, computer equipment and storage medium
EP3920131A1 (en) Re-projecting flat projections of pictures of panoramic video for rendering by application
US11538136B2 (en) System and method to process images of a video stream
CN112799891B (en) iOS device testing method, device, system, storage medium and computer device
CN112788235B (en) Image processing method, image processing device, terminal equipment and computer readable storage medium
CN111932463B (en) Image processing method, device, equipment and storage medium
CN112138386A (en) Volume rendering method and device, storage medium and computer equipment
WO2023284401A1 (en) Image beautification processing method and apparatus, storage medium, and electronic device
CN113645476B (en) Picture processing method and device, electronic equipment and storage medium
CN113939845A (en) Method, system and computer readable medium for improving image color quality
WO2021258530A1 (en) Image resolution processing method, device, apparatus, and readable storage medium
CN113822803A (en) Image super-resolution processing method, device, equipment and computer readable storage medium
CN112053416B (en) Image processing method, device, storage medium and computer equipment
CN113018856A (en) Image processing method, image processing device, electronic equipment and storage medium
CN112967193A (en) Image calibration method and device, computer readable medium and electronic equipment
JP2015527818A (en) Video display changes for video conferencing environments
CN113014960A (en) Method, device and storage medium for online video production
CN113487524B (en) Image format conversion method, apparatus, device, storage medium, and program product
CN110415187B (en) Image processing method and image processing system
CN107872683A Video data processing method, device, equipment and storage medium
CN115131198B (en) Model training method, image processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination