CN116433491A - Image processing method, device, equipment, storage medium and product

Info

Publication number: CN116433491A
Application number: CN202310430826.0A
Authority: CN
Inventors: 潘翔
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2023-04-11
Filing date: 2023-04-11
Publication date: 2023-07-14

Abstract

The embodiments of the present application disclose an image processing method, apparatus, device, storage medium, and product. The method includes: acquiring a cascade tensor of a target tile; performing feature extraction on the cascade tensor to obtain a feature map of the target tile; acquiring a normalization scaling factor for each feature subgraph in the feature map; normalizing each feature subgraph based on its own normalization scaling factor to obtain a normalization result of the feature map; and using the normalization result of the feature map to generate a reconstructed tile of the target tile. Normalizing each feature subgraph with its own normalization scaling factor reduces the noise introduced during normalization and thereby improves the quality of the reconstructed tile.

Description

Image processing method, device, equipment, storage medium and product

Technical Field

The present invention relates to the field of computer technology, and in particular, to an image processing method, an image processing apparatus, a computer device, a computer readable storage medium, and an image processing product.

Background

With the progress of scientific research, end-to-end image codec schemes based on deep learning are widely used in the field of image transmission. Research has found that the image encoding and decoding process involves filtering the feature information of the display components (e.g., luminance components and chrominance components) of each tile in an image. How to improve the quality of a reconstructed tile by filtering the feature information of the tile's display components is an active problem in current research.

Disclosure of Invention

The embodiments of the present application provide an image processing method, an image processing apparatus, a computer device, a computer readable storage medium, and a computer program product, which can improve the quality of a reconstructed tile.

In one aspect, an embodiment of the present application provides an image processing method, including:

acquiring a cascade tensor of the target image block, wherein the cascade tensor is obtained by splicing independent tensors of each display component of the target image block;

performing feature extraction processing on the cascade tensor to obtain a feature map of the cascade tensor, wherein the feature map comprises N feature subgraphs, and N is a positive integer;

acquiring a normalization scaling factor corresponding to each feature subgraph;

respectively carrying out normalization processing on the corresponding feature subgraphs based on the normalization scaling factors corresponding to each feature subgraph to obtain the normalization result of the feature map; the normalization result of the feature map is used to generate a reconstructed tile of the target tile.

In one aspect, an embodiment of the present application provides an image processing apparatus, including:

the acquisition unit is used for acquiring the cascade tensor of the target image block, wherein the cascade tensor is obtained by splicing independent tensors of each display component of the target image block;

the processing unit is used for carrying out feature extraction processing on the cascade tensor to obtain a feature map of the cascade tensor, wherein the feature map comprises N feature subgraphs, and N is a positive integer;

the acquisition unit is further used for acquiring the normalization scaling factor corresponding to each feature subgraph;

the processing unit is used for respectively carrying out normalization processing on the corresponding feature subgraphs based on the normalization scaling factors corresponding to each feature subgraph to obtain the normalization result of the feature graph; the normalized results of the feature map are used to generate a reconstructed tile of the target tile.

In one embodiment, the processing unit is configured to obtain a normalized scaling factor corresponding to each feature subgraph, specifically configured to:

obtaining norm values of N feature subgraphs;

calculating the ratio of the norm value of the ith feature subgraph to the sum of the norm values of the N feature subgraphs to obtain the normalization scaling factor corresponding to the ith feature subgraph, wherein i is a positive integer less than or equal to N.

In one embodiment, each feature sub-graph includes at least one feature element; the processing unit is used for acquiring the norm values of the N feature subgraphs, and is specifically used for:

and calculating the norm value of the corresponding feature subgraph according to the feature elements included in each feature subgraph.
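As a worked form of the two embodiments above, the ratio can be written out explicitly. This is a sketch under assumptions: the p-norm is taken as the norm (the embodiments only require some norm), and the symbols F_i (the ith feature subgraph), f_{i,k} (its feature elements) and s_i (its normalization scaling factor) are introduced here for illustration only.

```latex
\|F_i\|_p = \Bigl(\sum_{k} |f_{i,k}|^{p}\Bigr)^{1/p},
\qquad
s_i = \frac{\|F_i\|_p}{\sum_{j=1}^{N} \|F_j\|_p},
\qquad i = 1, \dots, N.
```

Under this reading each s_i lies in [0, 1], and the N factors sum to 1 whenever at least one feature subgraph is nonzero.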

In one embodiment, the processing unit is configured to perform feature extraction processing on the cascade tensor of the target tile to obtain a feature map of the target tile, and specifically is configured to:

performing feature extraction processing on the cascade tensor of the target block through a convolution layer to obtain a feature map of the target block;

the number of channels of the convolution layer is N, and the N feature subgraphs are in one-to-one correspondence with the N channels.

In one embodiment, the target tile is any tile in the image to be processed, and the processing unit is configured to obtain a cascade tensor of the target tile, specifically configured to:

acquiring code stream data of an image to be processed, wherein the code stream data comprises residual information of a target image block, and the residual information of the target image block comprises residual values of various display components of the target image block;

determining characteristic information of each display component of the target image block based on residual values of each display component of the target image block;

performing first discrete wavelet transformation on the characteristic information of each display component of the target block to obtain independent tensors of each display component of the target block;

and splicing independent tensors of each display component of the target block to obtain a cascade tensor of the target block.

In one embodiment, the processing unit is configured to determine, based on the residual values of the respective display components of the target tile, feature information of the respective display components of the target tile, specifically configured to:

obtaining predicted values of all display components of a target block;

calculating a transformed value of each display component of the target tile based on the residual value of the display component and the predicted value of the display component;

and carrying out synthesis transformation processing on the transformation value of each display component of the target block to obtain the characteristic information of each display component.

In one implementation, the display component of the target tile includes a luminance component and a chrominance component; the processing unit is configured to perform a first discrete wavelet transform on the feature information of each display component of the target tile to obtain an independent tensor of each display component of the target tile, and specifically configured to:

performing interpolation processing on the chrominance components of the target block to obtain interpolation results of the chrominance components of the target block;

and respectively carrying out first discrete wavelet transform on interpolation results of the luminance component of the target block and the chrominance component of the target block to obtain an independent tensor of the luminance component of the target block and an independent tensor of the chrominance component of the target block.

In one embodiment, the processing unit is further configured to:

determining a filtering result of an independent tensor of each display component of the target block based on the normalization result of the feature map;

performing second discrete wavelet transform on the filtering result of the independent tensor of each display component of the target image block to obtain the display component of the target image block;

a reconstructed tile of the target tile is generated from the display component of the target tile.

In one embodiment, the processing unit is configured to determine, based on the normalized result of the feature map, a filtering result of an independent tensor for each display component of the target tile, specifically configured to:

activating the normalized result of the feature map to obtain an activated result of the feature map;

filtering the activation result of the feature map through at least one residual block and a convolution network to obtain a filtering result of the feature map;

sampling the filtering result of the feature map to obtain a sampling result of the feature map;

and combining the sampling result of the feature map with the independent tensor of each display component of the target block respectively to obtain the filtering result of the independent tensor of each display component of the target block.

Accordingly, the present application provides a computer device comprising:

a memory in which a computer program is stored;

and the processor is used for loading a computer program to realize the image processing method.

Accordingly, the present application provides a computer readable storage medium storing a computer program adapted to be loaded by a processor and to perform the above-described image processing method.

Accordingly, the present application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the computer device performs the above-described image processing method.

In the embodiments of the present application, the cascade tensor of the target tile is acquired; feature extraction is performed on the cascade tensor to obtain a feature map of the target tile; a normalization scaling factor is acquired for each feature subgraph in the feature map; each feature subgraph is normalized based on its own normalization scaling factor to obtain a normalization result of the feature map; and the normalization result of the feature map is used to generate a reconstructed tile of the target tile. Normalizing each feature subgraph with its own normalization scaling factor reduces the noise introduced during normalization and thereby improves the quality of the reconstructed tile.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1a is a schematic view of an end-to-end image codec according to an embodiment of the present application;

Fig. 1b is a schematic diagram of an image processing scheme according to an embodiment of the present application;

Fig. 2 is a flowchart of an image processing method according to an embodiment of the present application;

Fig. 3 is a flowchart of another image processing method according to an embodiment of the present application;

Fig. 4 is a schematic diagram of a channel filtering module according to an embodiment of the present application;

Fig. 5 is a schematic diagram of a residual block according to an embodiment of the present application;

Fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;

Fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.

The present application relates to artificial intelligence and machine learning techniques, and is briefly described below:

Artificial Intelligence (AI): AI is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines have the functions of perception, reasoning, and decision-making. The present application mainly relates to normalizing each feature subgraph in a feature map through a trained nonlinear normalization module.

AI technology is a comprehensive discipline, and relates to a wide range of technologies, both hardware and software. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, processing technology for large applications, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

Machine Learning (ML) is a multidisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence; it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration. In the embodiments of the present application, the parameters alpha and beta of the nonlinear normalization module are trained mainly through the difference between the normalization result of the training data and the labeling data of the training data.

Furthermore, the present application relates to a scenario of end-to-end image encoding and decoding based on deep learning. Fig. 1a is a schematic view of an end-to-end image codec according to an embodiment of the present application. As shown in Fig. 1a, the end-to-end image codec scenario mainly includes an analysis transform module (Analysis Transform Net), a context model network (Context Model Net), a hyper-encoder module (Hyper Encoder Net), a prediction fusion module (Prediction Fusion Net), a hyper-decoder module (Hyper Decoder Net), a hyper-scale decoder module (Hyper Scale Decoder Net), a synthesis transform module (Synthesis Transform Net), and a channel filtering module (Content Adaptive Filter). The analysis transform module performs a nonlinear transform on the luminance component ($x_Y$) and the chrominance component ($x_{UV}$) to obtain the transform result of the luminance component ($y_Y$) and the transform result of the chrominance component ($y_{UV}$). The prediction fusion module predicts the luminance component transform value ($\mu_Y$) and the chrominance component transform value ($\mu_{UV}$); its input data is the output data of the context model network and of the hyper-decoder module. The input data of the context model network is the luminance component transform value output by the adaptive quantization module, and the input data of the hyper-decoder module is a hyperparameter based on the code stream data. The hyper-scale decoder module determines a Gaussian distribution based on the code stream data. The synthesis transform module performs synthesis transform processing on the luminance component transform value and the chrominance component transform value of the target tile to obtain the feature information of the luminance component of the target tile and the feature information of the chrominance component of the target tile. The channel filtering module filters the feature information of the luminance component and the feature information of the chrominance component of the target tile to obtain the luminance component and the chrominance component of the target tile.

In the scenario of end-to-end image encoding and decoding based on deep learning, an embodiment of the present application provides an image processing scheme to improve the quality of reconstructed tiles. The image processing scheme may be applied in the channel filtering module of Fig. 1a. Fig. 1b is a schematic diagram of an image processing scheme provided in an embodiment of the present application. As shown in Fig. 1b, the image processing scheme may be executed by a computer device 101, which may be a terminal device or a server. The terminal device may include, but is not limited to: smartphones (such as Android phones and iOS phones), tablet computers, portable personal computers, mobile internet devices (MID), vehicle-mounted terminals, smart home appliances, unmanned aerial vehicles, wearable devices, and the like, which is not limited in the embodiments of the present application. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms, which is not limited in the embodiments of the present application.

It should be noted that the number of computer devices in fig. 1b is only for example, and does not constitute a practical limitation of the present application; for example, fig. 1b may further include a computer device 102, where the computer device 102 may be configured to communicate code stream data of the image to be processed to the computer device 101. The computer device 101 and the computer device 102 may be connected by wired or wireless means, which is not limited in this application.

In a specific implementation, the general principle of this image processing scheme is as follows:

(1) The computer device 101 obtains the cascade tensor of the target tile. The cascade tensor is obtained by splicing independent tensors of each display component of the target block. In one embodiment, the target tile may be any one of the tiles in the image to be processed, and the computer device 101 obtains the code stream data of the image to be processed, the code stream data including residual information of the target tile, the residual information of the target tile including residual values of respective display components of the target tile. In one embodiment, the display component is determined based on the format of the image to be processed; when the pixel format of the image to be processed is RGB, the display component may include an R component, a G component, and a B component, and when the pixel format of the image to be processed is YUV, the display component may include a luminance (Y) component and a chrominance (UV) component. After obtaining the residual values of the respective display components of the target tile, the computer device 101 determines feature information of the respective display components of the target tile based on the residual values of the respective display components of the target tile; in one embodiment, the computer device 101 may obtain the predicted values of the respective display components of the target tile, calculate the transformed values of the display components based on the residual values of each display component of the target tile and the predicted values of the display components, and perform the synthetic transformation on the transformed values of each display component of the target tile to obtain the feature information of each display component. 
Further, the computer device 101 performs a first discrete wavelet transform on the feature information of each display component of the target tile to obtain an independent tensor of each display component of the target tile, and splices the independent tensors of each display component of the target tile to obtain a cascade tensor of the target tile.
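The transform-and-splice part of step (1) can be sketched as follows. This is an illustration under assumptions, not the implementation of the embodiment: it assumes a one-level 2D Haar transform, planes of even size that already share the same resolution, and splicing along a channel axis. The helper names `haar_dwt2` and `cascade_tensor` are hypothetical.

```python
def haar_dwt2(plane):
    """One-level 2D Haar transform of an even-sized plane (list of rows).

    Returns four half-resolution sub-bands:
    [average, left-right detail, top-bottom detail, diagonal detail].
    """
    h, w = len(plane), len(plane[0])
    bands = [[[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4)]
    for r in range(0, h, 2):
        for c in range(0, w, 2):
            a, b = plane[r][c], plane[r][c + 1]
            d, e = plane[r + 1][c], plane[r + 1][c + 1]
            bands[0][r // 2][c // 2] = (a + b + d + e) / 2.0  # average
            bands[1][r // 2][c // 2] = (a - b + d - e) / 2.0  # left-right detail
            bands[2][r // 2][c // 2] = (a + b - d - e) / 2.0  # top-bottom detail
            bands[3][r // 2][c // 2] = (a - b - d + e) / 2.0  # diagonal detail
    return bands


def cascade_tensor(components):
    """Splice the independent tensors of all components along the channel axis."""
    cascade = []
    for plane in components:
        cascade.extend(haar_dwt2(plane))  # 4 channels per display component
    return cascade


# Toy 2x2 planes standing in for the feature information of Y, U and V.
y = [[1.0, 2.0], [3.0, 4.0]]
u = [[0.5, 0.5], [0.5, 0.5]]
v = [[0.0, 1.0], [1.0, 0.0]]
tensor = cascade_tensor([y, u, v])  # 12 channels, each 1x1
```

With three display components and four sub-bands each, the cascade tensor in this sketch has twelve channels.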

(2) The computer device 101 performs feature extraction processing on the cascade tensor of the target tile, and obtains a feature map of the target tile. The feature map of the target block comprises N feature subgraphs, wherein N is a positive integer. In one embodiment, the computer device 101 may perform convolution processing on the cascade tensor of the target tile through a convolution layer with N channels to obtain a feature map of the target tile, where N feature subgraphs in the feature map of the target tile are in one-to-one correspondence with N channels in the convolution layer.

(3) The computer device 101 obtains a respective normalized scaling factor for each feature sub-graph. And each normalized scaling factor is used for carrying out normalization processing on the feature subgraph corresponding to the normalized scaling factor. The normalized scaling factors corresponding to the different feature subgraphs may be the same or different, and this application is not limited thereto.

In one embodiment, each feature subgraph includes at least one feature element, and the computer device 101 calculates a norm (e.g., a p-norm) of each feature subgraph based on the feature elements included in that feature subgraph, thereby obtaining the norm values of the N feature subgraphs. After deriving the norm value of each feature subgraph, the computer device 101 may calculate the normalization scaling factor corresponding to each feature subgraph from these norm values. In one embodiment, the computer device 101 calculates the ratio of the norm value of the ith feature subgraph to the sum of the norm values of the N feature subgraphs to obtain the normalization scaling factor corresponding to the ith feature subgraph, where i is a positive integer less than or equal to N.
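The ratio described above can be sketched in a few lines. The function names `p_norm` and `scaling_factors`, the default p = 2, and the handling of an all-zero feature map are assumptions made for illustration; the text only specifies the ratio of each norm value to the sum of the norm values.

```python
def p_norm(sub_map, p=2.0):
    """p-norm over all feature elements of one feature subgraph (list of rows)."""
    return sum(abs(x) ** p for row in sub_map for x in row) ** (1.0 / p)


def scaling_factors(feature_map, p=2.0):
    """Ratio of each subgraph's norm value to the sum of all N norm values."""
    norms = [p_norm(sub, p) for sub in feature_map]
    total = sum(norms)
    if total == 0.0:
        # Degenerate all-zero feature map: return zero factors (an assumption).
        return [0.0] * len(norms)
    return [n / total for n in norms]


feature_map = [
    [[3.0, 4.0]],  # 2-norm is 5
    [[0.0, 5.0]],  # 2-norm is 5
    [[6.0, 8.0]],  # 2-norm is 10
]
factors = scaling_factors(feature_map)  # approximately [0.25, 0.25, 0.5]
```

Two feature subgraphs with equal norm values receive the same factor, which matches the remark that factors of different feature subgraphs may be the same or different.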

(4) The computer equipment 101 respectively performs normalization processing on the corresponding feature subgraphs based on the normalization scaling factors corresponding to each feature subgraph to obtain the normalization result of the feature graph; the normalized result of the feature map includes normalized results of the N feature subgraphs. In one embodiment, the normalization result of the feature map is made up of the normalization results of the N feature subgraphs. The normalized results of the feature map are used to generate a reconstructed tile of the target tile.

In one implementation, the computer device 101 determines the filtering result of the independent tensor for each display component of the target tile based on the normalized result of the feature map. After obtaining the filtering result of the independent tensor of each display component, the computer device 101 performs a second discrete wavelet transform on the filtering result of the independent tensor of each display component of the target tile to obtain the display component of the target tile, and then generates a reconstructed tile of the target tile through the display component of the target tile. Wherein the first discrete wavelet transform and the second discrete wavelet transform are inverse transforms to each other.
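The statement that the two discrete wavelet transforms are inverses of each other can be checked on a small example. The sketch below uses a one-level, one-dimensional orthonormal Haar pair as a stand-in for the transforms of the embodiment; `haar_forward` and `haar_inverse` are hypothetical names, and the signal is assumed to have even length.

```python
import math


def haar_forward(signal):
    """One-level 1D Haar transform: returns (averages, details)."""
    s = math.sqrt(2.0)
    avg = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    det = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return avg, det


def haar_inverse(avg, det):
    """Inverse of haar_forward: rebuilds the original even-length signal."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(avg, det):
        out.extend([(a + d) / s, (a - d) / s])
    return out


signal = [4.0, 2.0, 5.0, 9.0]
avg, det = haar_forward(signal)
restored = haar_inverse(avg, det)  # equals signal up to rounding error
```

If the sub-bands are left unfiltered, applying one transform after the other returns the input, so any change in the reconstructed tile comes from the filtering performed between the two transforms.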

Based on the above image processing scheme, the embodiment of the present application proposes a more detailed image processing method, and the image processing method proposed by the embodiment of the present application will be described in detail below with reference to the accompanying drawings.

Referring to fig. 2, fig. 2 is a flowchart of an image processing method provided in an embodiment of the present application, where the image processing method may be performed by a computer device, and the computer device may be a terminal device or a server. As shown in fig. 2, the image processing method may include the following steps S201 to S204:

S201, acquiring a cascade tensor of the target tile.

The image to be processed can be divided into one or more tiles based on actual conditions (or division rules) in the encoding and decoding processes, and the target tile is any tile in the image to be processed. The cascade tensor (concatenated tensor) of the target tile is obtained by stitching independent tensors (color tensors) of each display component of the target tile, and the independent tensors of each display component are obtained by performing a first discrete wavelet transform (Haar wavelet transform) on characteristic information of the display component. In one embodiment, the display component is determined based on the pixel format of the image to be processed. For example, when the pixel format of the image to be processed is RGB, the display component may include an R component, a G component, and a B component, and when the pixel format of the image to be processed is YUV, the display component may include a luminance (Y) component and a chrominance (UV) component.

In one embodiment, a computer device obtains code stream data of an image to be processed, the code stream data including residual information of a target tile, the residual information of the target tile including residual values of respective display components of the target tile. After obtaining the residual values of the display components of the target block, the computer equipment obtains the predicted values of the display components of the target block, calculates the transformation values of the display components based on the residual values of the display components of the target block and the predicted values of the display components, and then synthesizes and transforms the transformation values of the display components of the target block to obtain the characteristic information of each display component.
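The text does not give the arithmetic that combines the residual values and the predicted values. A common reading in residual coding is element-wise addition, and the sketch below adopts that reading as an explicit assumption. The function name `transformed_values` and the sample numbers are hypothetical.

```python
def transformed_values(residual, predicted):
    """Element-wise sum of residual and predicted values for one display component.

    Addition is an assumption; the source only says the transformed value is
    calculated "based on" the residual value and the predicted value.
    """
    if len(residual) != len(predicted):
        raise ValueError("residual and predicted values must have the same length")
    return [r + m for r, m in zip(residual, predicted)]


residual_y = [0.5, -1.0, 0.0]    # decoded from the code stream (toy values)
predicted_y = [10.0, 12.0, 8.0]  # predicted values (toy values)
transform_y = transformed_values(residual_y, predicted_y)
```

The resulting transformed values would then be passed to the synthesis transform to obtain the feature information of the display component.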

Further, the computer device performs a first discrete wavelet transform on the characteristic information of each display component of the target tile to obtain an independent tensor of each display component of the target tile. In one embodiment, the display component of the target tile includes a luminance component and a chrominance component; the computer device performs a first discrete wavelet transform on the characteristic information of each display component of the target tile, and the process of obtaining an independent tensor of each display component of the target tile includes: performing (bi-cubic) interpolation processing on the chrominance components of the target block to obtain interpolation results of the chrominance components of the target block, and performing first discrete wavelet transform on the luminance components of the target block and the interpolation results of the chrominance components of the target block to obtain independent tensors of the luminance components of the target block and independent tensors of the chrominance components of the target block. Specifically, the interpolation result of the chrominance component may include the interpolation result of the U component and the interpolation result of the V component; the computer equipment performs first discrete wavelet transformation on the brightness component to obtain an independent tensor of the brightness component; similarly, performing first discrete wavelet transformation on the U component to obtain an independent tensor of the U component; and performing first discrete wavelet transformation on the V component to obtain independent tensors of the V component. After the independent tensors of the display components are obtained, the independent tensors of each display component of the target image block are spliced, and the cascade tensors of the target image block are obtained.
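The purpose of the interpolation step is to bring the chrominance planes to a resolution at which they can be transformed and spliced together with the luminance plane. The sketch below assumes chrominance planes at half the luminance resolution in each direction. It deliberately swaps in nearest-neighbour repetition for the bicubic interpolation named in the text, purely to keep the example short; `upsample_nearest` is a hypothetical helper.

```python
def upsample_nearest(plane, factor=2):
    """Repeat every sample `factor` times horizontally and vertically.

    Nearest-neighbour stand-in for the bicubic interpolation in the text.
    """
    out = []
    for row in plane:
        wide = [x for x in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


y_plane = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # 4x4 luminance
u_plane = [[1.0, 2.0], [3.0, 4.0]]                                  # 2x2 chrominance
u_up = upsample_nearest(u_plane)                                    # 4x4 after interpolation
```

After this step the U and V planes have the same height and width as the Y plane, so the same wavelet transform yields independent tensors of matching shape.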

S202, performing feature extraction processing on the cascade tensor of the target image block to obtain a feature map of the target image block.

The feature map of the target block comprises N feature subgraphs, wherein N is a positive integer. The manner in which features are extracted may include, but is not limited to: the characteristics of the cascade tensor of the target block are extracted through the convolution layer, the characteristics of the cascade tensor of the target block are extracted through the characteristic extraction model, and the characteristics of the cascade tensor of the target block are extracted through the filter, which is not limited in this application.

In one embodiment, the computer device may perform convolution processing on the cascade tensor through a convolution layer with the number of channels being N to obtain a feature map of the cascade tensor, where N feature subgraphs are in one-to-one correspondence with N channels; that is, after the convolution processing is performed on the cascade tensor by the convolution layer, a feature subgraph corresponding to each channel is obtained based on each channel of the convolution layer.
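The one-to-one correspondence between output channels and feature subgraphs can be illustrated with a pointwise convolution. The 1x1 kernel size, the weights and the biases below are assumptions chosen for brevity; the text only fixes the number of output channels at N. `conv1x1` is a hypothetical helper.

```python
def conv1x1(cascade, weights, biases):
    """Pointwise convolution: cascade is C planes, weights is N rows of C values.

    Returns N feature subgraphs, one per output channel.
    """
    h, w = len(cascade[0]), len(cascade[0][0])
    feature_map = []
    for w_row, b in zip(weights, biases):
        sub_map = [[b + sum(wc * plane[r][c] for wc, plane in zip(w_row, cascade))
                    for c in range(w)] for r in range(h)]
        feature_map.append(sub_map)
    return feature_map


cascade = [[[1.0, 2.0]], [[3.0, 4.0]]]          # C = 2 input channels, each 1x2
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # N = 3 output channels
biases = [0.0, 0.0, 0.5]
feature_map = conv1x1(cascade, weights, biases)  # 3 feature subgraphs
```

Here `feature_map[i]` is the feature subgraph produced by the ith output channel, so N = 3 output channels give exactly three feature subgraphs.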

S203, acquiring the normalization scaling factors corresponding to the feature subgraphs.

Each normalization scaling factor is used to normalize the feature subgraph corresponding to that normalization scaling factor. The normalization scaling factors corresponding to different feature subgraphs may be the same or different, which is not limited in this application.

In one embodiment, each feature sub-graph includes at least one feature element, and the computer device calculates a norm (e.g., p-norm) of each feature sub-graph based on the feature elements included in the feature sub-graph, to obtain a norm value of N feature sub-graphs. After obtaining the norm value of each feature sub-graph, the computer device may calculate the normalized scaling factor corresponding to each feature sub-graph through the norm value of each feature sub-graph. In one embodiment, the computer device calculates a ratio of the norm value of the ith feature sub-graph to the sum of the norms of the N feature sub-graphs to obtain a normalized scaling factor corresponding to the ith feature sub-graph, where i is a positive integer less than or equal to N.

S204, respectively carrying out normalization processing on the corresponding feature subgraphs based on the normalization scaling factors corresponding to the feature subgraphs to obtain the normalization result of the feature map.

The normalized result of the feature map includes normalized results of the N feature subgraphs. In one embodiment, the normalization result of the feature map is made up of the normalization results of the N feature subgraphs. The normalized results of the feature map are used to generate a reconstructed tile of the target tile.

In one embodiment, the computer device determines a filtering result of an independent tensor for each display component of the target tile based on the normalized result of the feature map. After the filtering result of the independent tensor of each display component is obtained, the computer equipment performs second discrete wavelet transformation on the filtering result of the independent tensor of each display component of the target block to obtain the display component of the target block, and then the reconstructed block of the target block is generated through the display component of the target block.

Referring to fig. 3, fig. 3 is a flowchart of another image processing method provided in an embodiment of the present application, where the image processing method may be performed by a computer device, and the computer device may be a terminal device or a server. As shown in fig. 3, the image processing method may include the following steps S301 to S306:

S301, acquiring a cascading tensor of the target block.

The specific embodiment of step S301 may refer to the embodiment of step S201 in fig. 2. Taking the YUV pixel format as an example, the following outlines the codec process:

Let the luminance component of the target tile be x_Y and the chrominance component be x_UV. The packaging process of the code stream data comprises the following steps: a neural network performs a nonlinear transformation on the luminance component and the chrominance component of the target tile to obtain the transform result y_Y of the luminance component and the transform result y_UV of the chrominance component. Then, based on the transform results y_Y and y_UV and on the luminance component transform value μ_Y and chrominance component transform value μ_UV of the target tile predicted by the prediction fusion module, the residual value r_Y of the luminance component and the residual value r_UV of the chrominance component are calculated. After the residual values r_Y and r_UV are obtained, they are quantized to obtain the quantized residual value of the luminance component and the quantized residual value of the chrominance component, respectively. The code stream data of the target tile is generated from the quantized residual values of the luminance component and the chrominance component.

Further, the computer device obtains the code stream data of the target tile and decodes the code stream data to obtain the quantized residual values of the luminance component and the chrominance component of the target tile. Based on these quantized residual values and on the luminance component transform value μ_Y and chrominance component transform value μ_UV of the target tile predicted by the prediction fusion module, the computer device calculates the luminance component transform value and the chrominance component transform value of the target tile. After the luminance component transform value and the chrominance component transform value of the target tile are obtained, synthesis transform processing is performed on them to obtain the feature information of the luminance component and the feature information of the chrominance component of the target tile, and a first discrete wavelet transform is then performed on this feature information to obtain the independent tensor of each display component of the target tile.
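The residual round trip outlined above can be sketched as follows. This is a minimal illustration, not the patented codec: it assumes that the residual is the element-wise difference between the transform result and the predicted transform value, and that quantization is rounding to the nearest integer. Neither detail is stated in the text, and the function names `encode_residual` and `decode_transform` are hypothetical.

```python
def encode_residual(y, mu):
    """Encoder side: residual r = y - mu, then quantize by rounding (assumed)."""
    return [round(a - b) for a, b in zip(y, mu)]


def decode_transform(r_hat, mu):
    """Decoder side: add the predicted value back to the quantized residual."""
    return [r + m for r, m in zip(r_hat, mu)]


# Toy transform result and prediction for one display component.
y_Y = [2.4, -1.2]
mu_Y = [1.0, 0.5]
r_hat_Y = encode_residual(y_Y, mu_Y)
y_hat_Y = decode_transform(r_hat_Y, mu_Y)
```

Under these assumptions the decoder recovers the transform value only up to the quantization error, which is one reason a filtering stage after synthesis is useful.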

In one embodiment, the display component of the target tile includes a luminance component and a chrominance component; the computer device performs a first discrete wavelet transform on the characteristic information of each display component of the target tile, and the process of obtaining an independent tensor of each display component of the target tile includes: performing (bi-cubic) interpolation processing on the chrominance components of the target block to obtain interpolation results of the chrominance components of the target block, and performing first discrete wavelet transform on the luminance components of the target block and the interpolation results of the chrominance components of the target block to obtain independent tensors of the luminance components of the target block and independent tensors of the chrominance components of the target block. Specifically, the interpolation result of the chrominance component may include the interpolation result of the U component and the interpolation result of the V component; the computer equipment performs first discrete wavelet transformation on the brightness component to obtain an independent tensor of the brightness component; similarly, performing first discrete wavelet transformation on the U component to obtain an independent tensor of the U component; and performing first discrete wavelet transformation on the V component to obtain independent tensors of the V component.

Still further, the computer device may stitch the independent tensors of each display component of the target tile to obtain the cascade tensors of the target tile. Wherein the chrominance components of the target tile may be further divided into a U component and a V component, which is not limited in this application.
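As an illustration of the transform-and-stitch step, the sketch below applies a one-level 2D Haar wavelet transform to each display component and concatenates the resulting subbands along the channel axis. The choice of the Haar wavelet, the single decomposition level, the subband ordering, and the names `haar_dwt2` and `cascade_tensor` are assumptions made for this example; the text does not specify them. The chrominance planes are assumed to have already been interpolated to the luminance resolution.

```python
def haar_dwt2(img):
    """One-level 2D Haar DWT of an even-sized image (list of rows).

    Each 2x2 block (a b / d e) yields one sample in each of four
    half-resolution subbands, returned as [LL, LH, HL, HH].
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 2)  # local average
            row_lh.append((a - b + d - e) / 2)  # left/right difference
            row_hl.append((a + b - d - e) / 2)  # top/bottom difference
            row_hh.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return [ll, lh, hl, hh]


def cascade_tensor(y, u, v):
    """Stitch the independent tensors of Y, U and V along the channel axis."""
    return haar_dwt2(y) + haar_dwt2(u) + haar_dwt2(v)
```

With three components and four subbands each, the cascade tensor in this sketch has 12 channels at half the spatial resolution of the input planes.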

S302, performing feature extraction processing on the cascade tensor of the target block through the convolution layer to obtain a feature map of the target block.

The feature map of the target block comprises N feature subgraphs, wherein N is a positive integer. The computer equipment can carry out convolution processing on the cascade tensor through a convolution layer with the number of channels of N to obtain a feature map of the cascade tensor, wherein N feature subgraphs are in one-to-one correspondence with N channels (such as N=48); that is, after the convolution processing is performed on the cascade tensor by the convolution layer, a feature subgraph corresponding to each channel is obtained based on each channel of the convolution layer. The scale of the convolution kernel in the convolution layer may be 3*3.
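The one-to-one correspondence between output channels and feature subgraphs can be made concrete with a small sketch. The function below is a plain-Python 3*3 convolution (computed as cross-correlation, as is usual in deep learning) with zero padding and no bias. The function name `conv3x3`, the padding mode, and the weight layout `weights[n][c][row][col]` are assumptions for illustration only.

```python
def conv3x3(x, weights):
    """Apply N filters to a C-channel input; return N feature subgraphs.

    x:       list of C channels, each an H x W list of rows
    weights: list of N filters, each a list of C 3x3 kernels
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    feature_map = []
    for filt in weights:  # one filter per output channel
        sub = [[0.0] * w for _ in range(h)]
        for r in range(h):
            for col in range(w):
                acc = 0.0
                for c in range(c_in):
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, col + dc
                            # Positions outside the image count as zero.
                            if 0 <= rr < h and 0 <= cc < w:
                                acc += filt[c][dr + 1][dc + 1] * x[c][rr][cc]
                sub[r][col] = acc
        feature_map.append(sub)  # the n-th filter gives the n-th subgraph
    return feature_map
```

Passing N filters therefore yields exactly N feature subgraphs, each with the same height and width as the input.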

S303, calculating the norm value of the corresponding feature subgraph according to the feature elements contained in each feature subgraph.

Each feature subgraph includes at least one feature element, and the computer device calculates a norm of each feature subgraph based on the feature elements included in that feature subgraph; the norm may include, but is not limited to, the Euclidean norm and the p-norm. The p-norm of the i-th feature subgraph can be expressed as follows:

$$\|x_i\|_p = \left( \sum_{j=1}^{Q} |v_j|^p \right)^{1/p}$$

where ‖x_i‖_p represents the p-norm of the i-th feature subgraph, Q is the number of feature elements contained in the i-th feature subgraph, and v_j represents the j-th feature element in the i-th feature subgraph.
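A short sketch of the p-norm of one feature subgraph, following the standard definition of the p-norm over the Q feature elements. The function name `p_norm` and the default p = 2 (the Euclidean norm) are choices made for this example.

```python
def p_norm(subgraph, p=2.0):
    """p-norm over all feature elements of one feature subgraph.

    subgraph: H x W list of rows; p: a real number >= 1.
    """
    total = sum(abs(v) ** p for row in subgraph for v in row)
    return total ** (1.0 / p)


euclidean = p_norm([[3, 4]])              # p = 2
manhattan = p_norm([[1, -2], [3, -4]], 1)  # p = 1
```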

S304, calculating a normalization scaling factor corresponding to each feature subgraph based on the norm value of each feature subgraph.

In one embodiment, the computer device calculates the ratio of the norm value of the i-th feature subgraph to the sum of the norm values of the N feature subgraphs to obtain the normalization scaling factor corresponding to the i-th feature subgraph, where i is a positive integer less than or equal to N. Specifically, this can be expressed as:

$$g_i = \frac{\|x_i\|_p}{\sum_{k=1}^{N} \|x_k\|_p}$$

where g_i represents the normalization scaling factor corresponding to the i-th feature subgraph, and ‖x_i‖_p represents the p-norm of the i-th feature subgraph.
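The ratio described above can be sketched directly. The function name `scaling_factors` and the handling of the all-zero case are assumptions for this example; the text does not say what happens when every norm value is zero.

```python
def scaling_factors(norms):
    """Return g_i = norm_i / sum of all N norm values, for each subgraph."""
    total = sum(norms)
    if total == 0:
        # Assumed behaviour: the ratio is undefined for an all-zero feature map.
        raise ValueError("sum of norm values is zero")
    return [n / total for n in norms]


g = scaling_factors([1.0, 3.0])
```

Because each factor is a share of the total, the N factors are non-negative and sum to 1, so a subgraph with a larger norm receives a proportionally larger factor.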

S305, respectively carrying out normalization processing on the corresponding feature subgraphs based on the normalization scaling factors corresponding to each feature subgraph to obtain the normalization result of the feature graphs.

Based on the normalization scaling factor corresponding to the i-th feature subgraph, the computer device performs normalization processing on the i-th feature subgraph through the nonlinear normalization module. The normalization result of the i-th feature subgraph is expressed in terms of g_i, x_i, α and β, where g_i represents the normalization scaling factor corresponding to the i-th feature subgraph, x_i represents the i-th feature subgraph, and α and β are obtained by training the nonlinear normalization module. The training process of the nonlinear normalization module is as follows: the nonlinear normalization module is used to normalize the training data to obtain a normalization processing result of the training data; the initial values of α and β are then optimized based on the difference between the normalization processing result of the training data and the labeling data of the training data, to obtain the optimized α and β.

In the above manner, the computer device can obtain the normalization results of the N feature subgraphs, and thereby the normalization result of the feature map.

S306, generating a reconstructed block of the target block through the normalization result of the feature map.

In one embodiment, the computer device determines a filtering result of the independent tensor of each display component of the target tile based on the normalization result of the feature map. Specifically, the computer device performs activation processing on the normalization result of the feature map to obtain an activation result of the feature map; the activation functions used in the activation processing may include, but are not limited to, the sigmoid activation function, the tanh activation function, and the ReLU activation function. The computer device then filters the activation result of the feature map through at least one residual block and a convolution network to obtain a filtering result of the feature map.

Further, the computer equipment samples the filtering result of the feature map to obtain a sampling result of the feature map. In one embodiment, the computer device may sample the filtering result of the feature map through a sampling step s to obtain a sampling result of the feature map. And combining the sampling result of the feature map with the independent tensor of each display component of the target block respectively to obtain the filtering result of the independent tensor of each display component of the target block.

Fig. 4 is a schematic diagram of a channel filtering module according to an embodiment of the present application. As shown in fig. 4, the channel filtering module includes a convolution layer, a nonlinear normalization module, an activation layer, and residual blocks. The filtering process of the channel filtering module is as follows: first, feature extraction is performed on the cascade tensor of the target tile through a convolution layer (the scale of the convolution kernel may be 3*3, and the number of channels may be 48) to obtain a feature map of the target tile; each feature subgraph in the feature map is then normalized through the nonlinear normalization module to obtain the normalization result of the feature map, for which reference may be made to the embodiments of steps S303 to S305, and details are not repeated here. The normalization result of the feature map is then activated through the activation layer, and the activation result is filtered by K residual blocks and a convolution layer (the scale of the convolution kernel may be 3*3, and the number of channels may be 4) to obtain a filtering result of the feature map. The residual block is used to reduce the computational complexity and may be a one-dimensional residual block, where K is a positive integer.

Fig. 5 is a schematic diagram of a residual block according to an embodiment of the present application. As shown in fig. 5, the residual block includes a first convolution layer (the scale of the convolution kernel may be 1*3, and the number of channels may be 48), a batch normalization module (batch norm), an activation layer, and a second convolution layer (the scale of the convolution kernel may be 3*1, and the number of channels may be 48). The sampling step size used in the sampling process may be denoted as s_n.
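To see why one-dimensional kernels lower the computational cost, the weight counts can be compared. The sketch below counts the weights of the 1*3 and 3*1 convolution pair with 48 channels described above against a single 3*3 convolution with the same channel count. The 3*3 baseline is chosen here only for comparison and is not part of the residual block; biases are ignored, and the helper name `conv_weights` is hypothetical.

```python
def conv_weights(kernel_h, kernel_w, c_in, c_out):
    """Number of weights in a convolution layer, ignoring biases."""
    return kernel_h * kernel_w * c_in * c_out


# The two one-dimensional layers of the residual block.
one_dimensional_pair = conv_weights(1, 3, 48, 48) + conv_weights(3, 1, 48, 48)
# A single two-dimensional layer covering the same 3x3 neighbourhood.
two_dimensional = conv_weights(3, 3, 48, 48)
```

In this comparison the pair uses 13824 weights against 20736, two thirds of the baseline, and the number of multiply-accumulate operations per output position scales the same way.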

After the filtering result of the feature map is obtained, the filtering result scaled by the sampling step s is respectively combined with the independent tensor of each display component of the target image block, so that the filtering result of the independent tensor of each display component of the target image block is obtained.
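The combination step can be sketched as below. The text does not state how the scaled filtering result is combined with each independent tensor; this example assumes element-wise addition, and assumes that the filtering result has the same channel count and spatial size as each independent tensor. The function name `combine` is hypothetical.

```python
def combine(filtered, component_tensors, s):
    """Add the s-scaled filtering result to each component's tensor.

    filtered:          list of channels, each an H x W list of rows
    component_tensors: one tensor per display component, each shaped
                       like `filtered`
    s:                 sampling step used as a scalar scale factor
    """
    return [
        [
            [
                [t + s * f for t, f in zip(t_row, f_row)]
                for t_row, f_row in zip(t_ch, f_ch)
            ]
            for t_ch, f_ch in zip(tensor, filtered)
        ]
        for tensor in component_tensors
    ]
```

With s = 0 the independent tensors pass through unchanged, so under this assumption s controls how strongly the filter corrects each display component.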

Further, the computer equipment performs second discrete wavelet transform on the filtering result of the independent tensor of each display component of the target image block to obtain the display component of the target image block; wherein the second discrete wavelet transform is an inverse of the first discrete wavelet transform. After obtaining the display components of the target image block, generating a reconstructed image block of the target image block through the display components of the target image block.
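A minimal sketch of such an inverse transform, assuming a one-level 2D Haar wavelet (the text does not name the wavelet). It inverts a forward transform that maps each 2x2 block (a b / d e) to LL = (a+b+d+e)/2, LH = (a-b+d-e)/2, HL = (a+b-d-e)/2 and HH = (a-b-d+e)/2; the function name `haar_idwt2` and the subband ordering are hypothetical.

```python
def haar_idwt2(ll, lh, hl, hh):
    """Rebuild a 2H x 2W image from four H x W Haar subbands."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for c in range(w):
            s0, s1, s2, s3 = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            img[2 * r][2 * c] = (s0 + s1 + s2 + s3) / 2          # a
            img[2 * r][2 * c + 1] = (s0 - s1 + s2 - s3) / 2      # b
            img[2 * r + 1][2 * c] = (s0 + s1 - s2 - s3) / 2      # d
            img[2 * r + 1][2 * c + 1] = (s0 - s1 - s2 + s3) / 2  # e
    return img


# Subbands of the 2x2 block (1 2 / 3 4) under the forward transform above.
plane = haar_idwt2([[5.0]], [[-1.0]], [[-2.0]], [[0.0]])
```

Applying this to the filtered subbands of each display component yields the component planes from which the reconstructed tile is assembled.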

In the embodiment of the application, the cascade tensor of the target image block is obtained, the cascade tensor of the target image block is subjected to feature extraction processing to obtain a feature image of the target image block, the normalization scaling factors corresponding to each feature sub-image in the feature image are obtained, the normalization processing is performed on the corresponding feature sub-image based on the normalization scaling factors corresponding to each feature sub-image to obtain a normalization result of the feature image, and the normalization result of the feature image is used for generating a reconstruction image block of the target image block. Therefore, the normalization processing is carried out on the corresponding feature subgraphs through the normalization scaling factors corresponding to the feature subgraphs, so that the noise introduced in the normalization process can be reduced, and the quality of the reconstructed image blocks is improved. In addition, the normalization processing is carried out on the corresponding feature subgraphs through the normalization scaling factors corresponding to the feature subgraphs, so that the decoding performance of the code stream data can be improved; the residual block can reduce the calculation complexity and improve the decoding efficiency.

The method of the embodiments of the present application has been described in detail above. To facilitate better implementation of the foregoing aspects, the apparatus of the embodiments of the present application is provided below accordingly.

Referring to fig. 6, fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application, where the image processing apparatus shown in fig. 6 may be mounted in a computer device, and the computer device may be a terminal device or a server. The image processing device shown in fig. 6 may be used to perform some or all of the functions of the method embodiments described above with respect to fig. 2 and 3. Referring to fig. 6, the image processing apparatus includes:

an obtaining unit 601, configured to obtain a cascade tensor of the target tile, where the cascade tensor is obtained by stitching independent tensors of each display component of the target tile;

the processing unit 602 is configured to perform feature extraction processing on the cascade tensor of the target image block to obtain a feature map of the target image block, where the feature map includes N feature subgraphs, and N is a positive integer;

the obtaining unit 601 is further configured to obtain a normalized scaling factor corresponding to each feature subgraph;

the processing unit 602 is further configured to perform normalization processing on the corresponding feature subgraphs based on the normalization scaling factors corresponding to each feature subgraph, so as to obtain a normalization result of the feature graph; the normalized results of the feature map are used to generate a reconstructed tile of the target tile.

In one embodiment, the processing unit 602 is configured to obtain a normalized scaling factor corresponding to each feature sub-graph, specifically configured to:

obtaining norm values of N feature subgraphs;

In one embodiment, each feature sub-graph includes at least one feature element; the processing unit 602 is configured to obtain norm values of the N feature subgraphs, specifically configured to:

In one embodiment, the processing unit 602 is configured to perform feature extraction processing on the cascade tensor of the target tile to obtain a feature map of the target tile, specifically configured to:

In one embodiment, the target tile is any tile in the image to be processed, and the processing unit 602 is configured to obtain a cascade tensor of the target tile, specifically configured to:

In one embodiment, the processing unit 602 is configured to determine, based on the residual values of the respective display components of the target tile, feature information of the respective display components of the target tile, specifically configured to:

obtaining predicted values of all display components of a target block;

In one implementation, the display component of the target tile includes a luminance component and a chrominance component; the processing unit 602 is configured to perform a first discrete wavelet transform on the feature information of each display component of the target tile to obtain an independent tensor of each display component of the target tile, specifically configured to:

In one embodiment, the processing unit 602 is further configured to:

In one embodiment, the processing unit 602 is configured to determine, based on the normalized result of the feature map, a filtering result of an independent tensor of each display component of the target tile, specifically configured to:

According to one embodiment of the present application, part of the steps involved in the image processing methods shown in fig. 2 and 3 may be performed by respective units in the image processing apparatus shown in fig. 6. For example, step S201 and step S203 shown in fig. 2 may be performed by the acquisition unit 601 shown in fig. 6, and step S202 and step S204 may be performed by the processing unit 602 shown in fig. 6; step S301 shown in fig. 3 may be performed by the acquisition unit 601 shown in fig. 6, and steps S302 to S306 may be performed by the processing unit 602 shown in fig. 6. The respective units in the image processing apparatus shown in fig. 6 may be individually or collectively combined into one or several additional units, or some unit(s) thereof may be further split into a plurality of units smaller in function, which can achieve the same operation without affecting the achievement of the technical effects of the embodiments of the present application. The above units are divided based on logic functions, and in practical applications, the functions of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the present application, the image processing apparatus may also include other units, and in practical applications, these functions may also be realized with assistance of other units, and may be realized by cooperation of a plurality of units.

According to another embodiment of the present application, an image processing apparatus as shown in fig. 6 may be constructed by running a computer program (including program code) capable of executing the steps involved in the respective methods as shown in fig. 2 and 3 on a general-purpose computing apparatus such as a computer device including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read only storage medium (ROM), and the like, and a storage element, and the image processing method of the present application is implemented. The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and run in the above-described computing device through the computer-readable recording medium.

Based on the same inventive concept, the principle by which the image processing apparatus provided in the embodiments of the present application solves the problem, and its beneficial effects, are similar to those of the image processing method in the method embodiments of the present application. Reference may be made to the principle and beneficial effects of the implementation of the method, which are not repeated here for brevity.

Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application, where the computer device may be a terminal device or a server. As shown in fig. 7, the computer device includes at least a processor 701, a communication interface 702, and a memory 703, which may be connected by a bus or by other means. The processor 701 (or central processing unit (Central Processing Unit, CPU)) is the computing core and control core of the computer device; it can parse various instructions in the computer device and process various data of the computer device. For example, the CPU can parse a startup or shutdown instruction sent by an object to the computer device and control the computer device to perform the startup or shutdown operation; as another example, the CPU may transmit various types of interaction data between internal structures of the computer device, and so on. The communication interface 702 may optionally include a standard wired interface and a wireless interface (e.g., Wi-Fi, a mobile communication interface, etc.), and may be controlled by the processor 701 to receive and transmit data; the communication interface 702 may also be used for transmission and interaction of data within the computer device. The memory 703 (Memory) is a memory device in the computer device for storing programs and data. It will be appreciated that the memory 703 here may include both a built-in memory of the computer device and an extended memory supported by the computer device. The memory 703 provides storage space that stores the operating system of the computer device, which may include, but is not limited to, an Android system, an iOS system, a Windows Phone system, etc., which is not limited in this application.

The embodiments of the present application also provide a computer-readable storage medium (Memory), which is a Memory device in a computer device, for storing programs and data. It is understood that the computer readable storage medium herein may include both built-in storage media in a computer device and extended storage media supported by the computer device. The computer readable storage medium provides storage space that stores a processing system of a computer device. In this memory space, a computer program suitable for being loaded and executed by the processor 701 is stored. Note that the computer readable storage medium can be either a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one magnetic disk memory; alternatively, it may be at least one computer-readable storage medium located remotely from the aforementioned processor.

In one embodiment, the processor 701 performs the following operations by running a computer program in the memory 703:

Performing feature extraction processing on the cascade tensor of the target image block to obtain a feature image of the target image block, wherein the feature image comprises N feature subgraphs, and N is a positive integer;

As an alternative embodiment, the specific embodiment of obtaining the normalized scaling factor corresponding to each feature sub-graph by the processor 701 is:

obtaining norm values of N feature subgraphs;

As an alternative embodiment, each feature sub-graph comprises at least one feature element; a specific embodiment of the processor 701 obtaining the norm values of the N feature subgraphs is:

As an alternative embodiment, the processor 701 performs feature extraction processing on the cascade tensor of the target tile, and a specific embodiment of obtaining the feature map of the target tile is as follows:

As an alternative embodiment, the target tile is any tile in the image to be processed, and the specific embodiment of the processor 701 obtaining the cascade tensor of the target tile is:

As an alternative embodiment, the processor 701 determines, based on the residual values of the respective display components of the target tile, the specific embodiment of the feature information of the respective display components of the target tile is:

Obtaining predicted values of all display components of a target block;

As an alternative embodiment, the display component of the target tile includes a luminance component and a chrominance component; the processor 701 performs a first discrete wavelet transform on the feature information of each display component of the target tile, and a specific embodiment for obtaining an independent tensor of each display component of the target tile is:

As an alternative embodiment, the processor 701 further performs the following operations by running a computer program in the memory 703:

As an alternative embodiment, the specific embodiment of the processor 701 determining the filtering result of the independent tensor of each display component of the target tile based on the normalized result of the feature map is:

Based on the same inventive concept, the principle by which the computer device provided in the embodiments of the present application solves the problem, and its beneficial effects, are similar to those of the image processing method in the method embodiments of the present application. Reference may be made to the principle and beneficial effects of the implementation of the method, which are not repeated here for brevity.

The present application also provides a computer readable storage medium having a computer program stored therein, the computer program being adapted to be loaded by a processor and to perform the image processing method of the above method embodiments.

Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the computer device performs the image processing method described above.

The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs.

The modules in the device of the embodiment of the application can be combined, divided and deleted according to actual needs.

Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the readable storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random-access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, and the like.

The foregoing disclosure is only a preferred embodiment of the present application and is not intended to limit the scope of the claims. One of ordinary skill in the art will understand that implementations of all or part of the processes of the above embodiments, and equivalent changes made in accordance with the claims of the present application, still fall within the scope of the claims.

Claims

1. An image processing method, the method comprising:

obtaining a cascade tensor of a target image block, wherein the cascade tensor is obtained by splicing independent tensors of each display component of the target image block;

respectively carrying out normalization processing on the corresponding feature subgraphs based on the normalization scaling factors corresponding to each feature subgraph to obtain the normalization result of the feature graph; the normalized results of the feature map are used to generate a reconstructed tile of the target tile.

2. The method of claim 1, wherein the obtaining the normalized scaling factor corresponding to each feature sub-graph comprises:

Obtaining the norm values of the N feature subgraphs;

3. The method of claim 2, wherein each feature sub-graph includes at least one feature element; the obtaining the norm values of the N feature subgraphs includes:

4. The method as set forth in claim 1, wherein the performing feature extraction processing on the cascade tensor of the target tile to obtain a feature map of the target tile includes:

5. The method of claim 1, wherein the target tile is any one of the tiles in the image to be processed, the obtaining a cascade tensor of the target tile comprising:

Acquiring code stream data of the image to be processed, wherein the code stream data comprises residual information of the target image block, and the residual information of the target image block comprises residual values of various display components of the target image block;

determining characteristic information of each display component of the target tile based on residual values of each display component of the target tile;

performing first discrete wavelet transformation on the characteristic information of each display component of the target image block to obtain independent tensors of each display component of the target image block;

6. The method of claim 5, wherein the determining feature information for each display component of the target tile based on residual values for each display component of the target tile comprises:

obtaining predicted values of each display component of the target tile;

7. The method of claim 5, wherein the display components of the target tile include a luminance component and a chrominance component; the performing a first discrete wavelet transform on the characteristic information of each display component of the target tile to obtain an independent tensor of each display component of the target tile includes:

performing interpolation processing on the chrominance components of the target image block to obtain interpolation results of the chrominance components of the target image block;
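One possible form of the interpolation step is sketched below. It assumes the chrominance plane is subsampled by a factor of two in each direction (as in 4:2:0 content) and uses nearest-neighbour interpolation to bring it to the luminance resolution; both the subsampling ratio and the interpolation kernel are illustrative assumptions, and the function name is hypothetical.

```python
def upsample_nearest_2x(plane):
    """Double the height and width of a 2-D plane by repeating each sample."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]  # repeat each sample horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the widened row vertically
    return out

# Hypothetical 1x2 chrominance plane brought to 2x4.
chroma_full = upsample_nearest_2x([[1.0, 2.0]])
```

After interpolation the chrominance and luminance planes share the same size, so their wavelet sub-bands can be spliced along the channel axis.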

8. The method of any one of claims 1-7, wherein the method further comprises:

determining a filtering result of an independent tensor of each display component of the target tile based on the normalization result of the feature map;

and generating a reconstructed tile of the target tile from the display components of the target tile.

9. The method of claim 8, wherein the determining a filtering result of an independent tensor of each display component of the target tile based on the normalization result of the feature map comprises:

activating the normalization result of the feature map to obtain an activation result of the feature map;

and combining the sampling result of the feature map with the independent tensor of each display component of the target image block respectively to obtain the filtering result of the independent tensor of each display component of the target image block.

10. An image processing apparatus, characterized in that the image processing apparatus comprises:

the acquisition unit is further configured to acquire the normalization scaling factor corresponding to each feature subgraph;

the processing unit is further configured to normalize each feature subgraph based on its corresponding normalization scaling factor to obtain a normalization result of the feature map, wherein the normalization result of the feature map is used to generate a reconstructed tile of the target tile.

11. A computer device, comprising: a memory and a processor;

the memory storing a computer program;

the processor being configured to load the computer program to implement the image processing method according to any one of claims 1-9.

12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded by a processor to perform the image processing method according to any one of claims 1-9.

13. A computer program product, characterized in that the computer program product comprises a computer program adapted to be loaded by a processor to perform the image processing method according to any one of claims 1-9.