CN116012243A - Real scene-oriented dim light image enhancement denoising method, system and storage medium - Google Patents

Real scene-oriented dim light image enhancement denoising method, system and storage medium

Info

Publication number
CN116012243A
Authority
CN
China
Prior art keywords
image
dim light
neural network
noise
light image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211704340.3A
Other languages
Chinese (zh)
Inventor
张召
任加欢
洪日昌
汪萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202211704340.3A priority Critical patent/CN116012243A/en
Publication of CN116012243A publication Critical patent/CN116012243A/en
Pending legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The invention discloses a real-scene-oriented dim light image enhancement and denoising method, system and storage medium. The method comprises the following steps: step 1, obtaining a dim light image and performing potential subspace reconstruction; step 2, performing feature extraction and restoration on the image reconstructed in step 1 based on a cross channel-window attention mechanism to obtain an enhanced, noise-reduced image; step 3, inputting the enhanced, noise-reduced image into a restoration neural network to obtain a normally illuminated, noise-free image. The invention can restore images collected in real dim light scenes into normally illuminated, noise-free images, improving the representation capability of the image and benefiting the performance of other tasks.

Description

Real scene-oriented dim light image enhancement denoising method, system and storage medium
Technical Field
The invention relates to the field of image processing methods, in particular to a method, a system and a storage medium for real-scene-oriented dim light image enhancement and denoising.
Background
Because visibility in real dim light scenes is poor, low-light image enhancement (LLIE) plays an important role in many low-light tasks such as object detection, recognition and segmentation. However, in a real low-light environment, images often contain heavy noise caused by lighting conditions and sensor interference, which leads to inaccurate representations, yet most existing methods consider only noise-free low-light image enhancement. Therefore, to better handle dim light images of real scenes, combining dim light image enhancement with denoising is a better solution. Dim light image enhancement aims to obtain a normally illuminated image by estimating the image illumination, so that more information can be captured in a dim light environment. Although existing dim light enhancement methods perform well on the LLIE task, they still face two challenges.
In recent years, the most representative class of dim light enhancement methods, those based on the Retinex theory, decomposes a dim light image into an illumination map and a reflectance map. The theory assumes that the reflectance maps of a dim light image and its normally illuminated counterpart are consistent and that image brightness depends only on the illumination map, so these methods mainly adjust the illumination map to restore the dim light image. However, for dim light images of real scenes, the image contains considerable noise, most of which resides in the reflectance component; the assumed consistency between the reflectance maps of the dim light image and the normally illuminated image therefore no longer holds, and the restored image is often heavily disturbed and exhibits blotchy blur. Another class of dim light enhancement methods uses end-to-end enhancement networks, most of which do not explicitly take noise into account. A few methods do consider the noise contained in the image, but they usually operate on data in its original RAW format; RAW data contains fewer interfering factors and more useful information, so a network trained on RAW data cannot be applied well to RGB images and generalizes poorly.
Therefore, providing a real-scene-oriented dim light enhancement and denoising image system that enhances and denoises dim light images collected in real scenes, improves generalization capability, and improves the performance of dim light images on other tasks is a problem to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a method, a system and a storage medium for real-scene-oriented dim light image enhancement and denoising, so as to solve the poor generalization of the prior art, which does not properly address the joint enhancement and denoising task.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
A real-scene-oriented dim light image enhancement and denoising method comprises the following steps:

Step 1, obtaining a noisy dim light image in a real scene, and performing potential subspace reconstruction on the noisy dim light image to obtain a reconstructed image;

Step 2, performing feature extraction and restoration on the image reconstructed in step 1 based on a cross channel-window attention mechanism to obtain an enhanced, noise-reduced image;

Step 3, using a restoration neural network, inputting the enhanced, noise-reduced image obtained in step 2 into the restoration neural network for feature extraction and reconstruction, so as to obtain a normally illuminated, noise-free image.
The process of step 1 is further as follows:

Step 1.1, a convolution $f_{3\times 3}$ with kernel size 3×3 is used to extract the shallow feature $F_{shallow} \in \mathbb{R}^{c\times h\times w}$ of the dim light image, where $\mathbb{R}$ is the real number field, c is the number of channels, h is the height of the dim light image, and w is the width of the dim light image;

Step 1.2, potential subspace reconstruction is performed on the shallow feature $F_{shallow}$ obtained in step 1.1, and the low-rank latent representation feature $F_{Latent}$ with a reversible process is obtained through matrix decomposition, as shown in the following formulas:

$$F_{shallow} = f_{3\times 3}(X) \in \mathbb{R}^{c\times h\times w},$$
$$\hat{F} = \mathrm{Trans}(U) \times V \in \mathbb{R}^{hw\times c},$$
$$F_{Latent} = \mathrm{Reshape}^{T}(\hat{F}) \in \mathbb{R}^{c\times h\times w},$$

wherein: X denotes the input dim light noisy image; $\hat{F}$ denotes the intermediate data matrix; U and V are the low-rank representation matrices used for feature reconstruction; the low-rank representation matrix V can be regarded as a low-dimensional representation of the input feature with dimension r, with $U \in \mathbb{R}^{r\times hw}$ and $V \in \mathbb{R}^{r\times c}$, where r denotes the rank of the matrix and $r \ll hw$; Trans() is the transpose operation; $\mathrm{Reshape}^{T}()$ is a matrix conversion operation that converts a two-dimensional matrix into a three-dimensional matrix.

The noise-reduced low-rank latent representation feature $F_{Latent}$, whose dimensions are consistent with the shallow feature $F_{shallow}$, is thereby obtained through the low-rank representation matrices U and V, and the reconstructed image is obtained in turn.
The process of step 2 is further as follows:

Step 2.1, a deep feature reconstruction module that restores local and global information is constructed based on the cross channel-window attention mechanism and is used to mine the deep information of the latent feature reconstructed in step 1, and an image refinement module is constructed to adjust the detail information of the reconstructed image. The deep feature reconstruction module comprises 3 encoding blocks and 3 decoding blocks, and each encoding block and each decoding block comprises 4 cross channel-window attention layers; the image refinement module consists of 4 cross channel-window attention layers.

Step 2.2, the input latent feature $F_{Latent}$ is restored by the deep feature reconstruction module. The specific process is as follows:

$$F_s, F_c = \mathrm{Divide}(F_{Latent}),$$
$$F_{sout} = \mathrm{Swin\_Transformer}(F_s),$$
$$F_{cout} = \mathrm{Crossed\_Transformer}(F_c),$$
$$F_{out} = \mathrm{Concat}(F_{sout}, F_{cout}),$$

wherein: Divide() denotes the channel splitting operation that decomposes the input low-rank latent representation feature $F_{Latent}$ into features $F_s \in \mathbb{R}^{m\times h\times w}$ and $F_c \in \mathbb{R}^{k\times h\times w}$, where m and k are channel numbers with c = m + k; Swin_Transformer() denotes the window attention branch, which obtains more accurate local features; Crossed_Transformer() denotes the cross channel attention branch, which captures the global structure that contributes to global brightness adjustment and color differences; $F_{sout}$ and $F_{cout}$ denote the output features of the window attention branch and the cross channel attention branch, respectively; $\mathrm{Concat}(F_{sout}, F_{cout})$ denotes splicing the obtained features $F_{sout}$ and $F_{cout}$ along the channel dimension to form the output of the encoding/decoding blocks, $F_{out} \in \mathbb{R}^{2c\times h\times w}$.

Step 2.3, the output $F_{out}$ of the encoding/decoding blocks obtained in step 2.2 is input to the image refinement module formed by cross channel-window attention layers; shallow feature refinement is performed and the channel number is adjusted to form the enhanced, noise-reduced image. The calculation formulas are as follows:

$$F_r = \mathrm{CST}_{n_5}(\cdots \mathrm{CST}_1(F_{out}) \cdots),$$
$$\hat{Y} = f_{3\times 3}(F_r),$$

wherein: $F_r$ denotes the output feature of the image refinement module; CST() denotes a cross channel-window attention layer; $n_5$ denotes the number of cross channel-window attention layers in the feature refinement module; $\hat{Y}$ denotes the final output image, i.e. the enhanced, noise-reduced image.
Further, step 3 comprises the following steps:

Step 3.1, a network based on the cross channel-window attention mechanism is adopted as the restoration neural network, and supervised learning is performed on the restoration neural network based on a stochastic gradient descent algorithm to optimize its weight parameters, obtaining the optimized restoration neural network;

Step 3.2, the enhanced, noise-reduced image obtained in step 2 is input into the optimized restoration neural network to obtain a normally illuminated, noise-free image.

The process of step 3.1 is further as follows:

First, dim light images for training and the corresponding normally illuminated images are acquired, and the training dim light images are input into the restoration neural network for forward propagation to obtain predicted images;

then the normally illuminated images are input into the restoration neural network for supervised learning, the loss between the supervision target and the predicted image is calculated based on the loss function, and the weight parameters of the restoration neural network are optimized and updated according to the loss to obtain the optimized restoration neural network.

Further, the loss function $L_{total}$ of step 3.1 is formulated as follows:

$$L_{total} = l_1(\hat{Y}, Y) + \lambda\,\big(l_{ssim}(\hat{Y}, Y) + l_{tv}(\hat{Y})\big),$$

wherein $l_1()$ denotes the L1 loss, $l_{ssim}()$ denotes the structural consistency loss, $l_{tv}()$ denotes the total variation loss, λ is a positive regularization parameter, Y denotes the normally illuminated noise-free image, and $\hat{Y}$ denotes the final output image of the network.
The real-scene-oriented dim light image enhancement and denoising system comprises a processor and a memory in which program instructions are stored; when the processor runs the program instructions, the steps of the above method are executed.

A storage medium stores readable and executable program instructions; when the program instructions are read and executed, the steps of the above method are performed.
Compared with the prior art, the invention constructs different network modules based on a deep neural network: the designed potential subspace reconstruction module suppresses, to a certain extent, the noise present in the dim light image, while the cross channel and window attention mechanisms preserve the global and local information of the denoised image features as well as the detail features. Through this process, the invention enhances and denoises dim light images in real scenes, better retains image detail, generalizes well, and handles enhancement and denoising tasks in different scenes. Based on the trained network model, the invention restores images collected in real dim light scenes into normally illuminated, noise-free images, improving their representation capability and benefiting the performance of other tasks. By introducing the potential subspace reconstruction module together with the cross channel and window attention mechanisms, the invention can effectively enhance and denoise dim light images in real scenes.
Drawings
Fig. 1 is a flowchart of a real scene-oriented dim light enhancement denoising method according to an embodiment of the present invention.
Fig. 2 is a comparison chart of the application effect of an embodiment of the present invention, in which: (a) is a dim light image, and (b) is the restored, normally illuminated noise-free image.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
This embodiment discloses a real-scene-oriented dim light enhancement and denoising image method that applies subspace reconstruction and cross channel-window attention feature extraction to a noisy dim light image and then feeds the result through a neural network model to obtain the final enhanced, denoised image. By introducing the matrix decomposition idea and the attention mechanism, the method automatically removes the noise that is hard to perceive in a dim light image while enhancing dim light images collected in real scenes, effectively improving their performance on other tasks.
Referring to fig. 1, a flowchart of a method according to an embodiment of the invention is shown. The specific implementation steps are as follows:
step 1, obtaining a noise-containing dim light image in a real scene, and performing potential subspace reconstruction on the noise-containing dim light image to obtain a reconstructed image.
Specifically, a noisy dim light image $X \in \mathbb{R}^{3\times h\times w}$ and its corresponding clean, normally illuminated image $Y \in \mathbb{R}^{3\times h\times w}$ are obtained, where 3 is the number of color image channels, h is the height of the dim light image, and w is the width of the dim light image; X is the input dim light noisy image, and Y is the clean, normally illuminated image to be recovered by the network, used for the supervised learning in the subsequent training of the restoration neural network.

The shallow feature of the potential subspace reconstruction of the dim light image is computed to obtain a multichannel feature map $F_{shallow} \in \mathbb{R}^{c\times h\times w}$, where $\mathbb{R}$ is the real number field and c is the number of channels. $F_{shallow}$ is input into the subspace decomposition part of the subspace reconstruction module to obtain the latent low-rank feature $F_{Latent} \in \mathbb{R}^{c\times h\times w}$, which contains less noise, and the reconstructed image is thereby obtained.

The specific process of step 1 is as follows:
step 1.1, convolution f with convolution kernel size 3×3 is utilized 3×3 Extracting the shallow layer characteristic F of the dim light image shallow ∈R c×h×w The calculation formula is as follows:
E shallow =f 3×3 (X)∈R c×h×w
where R is the real number field, c is the number of channels, h is the height of the darkness image, and w is the width of the darkness image.
Step 1.2, the shallow feature $F_{shallow}$ is then input to the subspace reconstruction module to learn two low-rank representation matrices, the coefficient matrix U and the basis matrix V, which can be used to reconstruct the feature. The subspace reconstruction module is composed of a series of ordinary convolutions, and the coefficient matrix U can be obtained by the following formulas:

$$U_M = \mathrm{GELU}(f_{1\times 1}(\mathrm{GELU}(f_{1\times 1}(F_{shallow})))) \in \mathbb{R}^{r\times h\times w},$$
$$U = \mathrm{Reshape}(U_M) \in \mathbb{R}^{r\times hw},$$

in which $f_{1\times 1}$ is an ordinary convolution with kernel size 1×1; GELU() is the activation function; Reshape() is a matrix conversion operation that converts a three-dimensional matrix into a two-dimensional matrix; r is the rank of the matrix; $U_M$ denotes the feature with channel number r.
The basis matrix V can be obtained by the following formulas:

$$F_V = \mathrm{GELU}(f_{1\times 1}(\mathrm{GELU}(f_{1\times 1}(F_{shallow})))) \in \mathbb{R}^{c\times h\times w},$$
$$F = \mathrm{Trans}(\mathrm{Reshape}(F_V)) \in \mathbb{R}^{hw\times c},$$
$$V = U \times F \in \mathbb{R}^{r\times c},$$

in which Trans() is the transpose operation. The basis matrix V can be regarded as a low-dimensional representation of the input feature with dimension r, where the rank of the matrix satisfies $r \ll hw$.
Finally, the noise-reduced latent low-rank feature $F_{Latent} \in \mathbb{R}^{c\times h\times w}$, whose dimensions are consistent with the shallow feature $F_{shallow}$, can be obtained through the learned low-rank representation matrices U and V. The formulas are as follows:

$$\hat{F} = \mathrm{Trans}(U) \times V \in \mathbb{R}^{hw\times c},$$
$$F_{Latent} = \mathrm{Reshape}^{T}(\hat{F}) \in \mathbb{R}^{c\times h\times w},$$

wherein $\hat{F}$ denotes the intermediate representation feature and $\mathrm{Reshape}^{T}()$ is a matrix conversion operation that converts a two-dimensional matrix into a three-dimensional matrix.

The noise-reduced latent low-rank feature $F_{Latent}$ is thus obtained, and the reconstructed image is obtained in turn.
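To make the data flow of step 1 concrete, the following is a minimal PyTorch sketch of the subspace reconstruction described above; the class name, channel count c = 48 and rank r = 4 are illustrative assumptions rather than the patent's actual implementation.

```python
import torch
import torch.nn as nn

class SubspaceReconstruction(nn.Module):
    """Minimal sketch of the potential subspace reconstruction (step 1).
    Names and hyperparameters are assumptions for illustration."""

    def __init__(self, channels: int = 48, rank: int = 4):
        super().__init__()
        self.shallow = nn.Conv2d(3, channels, kernel_size=3, padding=1)  # f_3x3
        # Two stacked 1x1 conv + GELU pairs produce U_M and F_V respectively.
        self.u_branch = nn.Sequential(
            nn.Conv2d(channels, rank, 1), nn.GELU(),
            nn.Conv2d(rank, rank, 1), nn.GELU(),
        )
        self.v_branch = nn.Sequential(
            nn.Conv2d(channels, channels, 1), nn.GELU(),
            nn.Conv2d(channels, channels, 1), nn.GELU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        f_shallow = self.shallow(x)                    # (b, c, h, w)
        u = self.u_branch(f_shallow).flatten(2)        # U: (b, r, h*w)
        f_v = self.v_branch(f_shallow).flatten(2)      # (b, c, h*w)
        v = torch.bmm(u, f_v.transpose(1, 2))          # V = U x F: (b, r, c)
        f_hat = torch.bmm(u.transpose(1, 2), v)        # Trans(U) x V: (b, h*w, c)
        # Reshape^T back to (b, c, h, w); r << h*w bounds the rank of F_Latent.
        return f_hat.transpose(1, 2).reshape(b, -1, h, w)

x = torch.randn(1, 3, 64, 64)
f_latent = SubspaceReconstruction()(x)
print(f_latent.shape)  # torch.Size([1, 48, 64, 64])
```

Because every feature map is rebuilt from an r-dimensional basis, high-rank perturbations such as sensor noise are largely projected out, which is the intuition behind the noise suppression of this module.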
Step 2, carrying out feature extraction and restoration on the image reconstructed in the step 1 based on a cross channel window attention mechanism to obtain an enhanced noise-reduction image;
specifically, the potential low rank feature F obtained in step 1 is utilized Latent Inputting the potential characteristics into a deep-layer characteristic reconstruction module formed by a cross channel attention and window attention mechanism, and mining deep-layer information of the potential characteristics reconstructed in the step 1; and finally, inputting the detail information of the reconstructed image to an image refining module for adjusting. The process is as follows:
and 2.1, constructing a deep feature reconstruction module for restoring local and global information based on a cross channel window attention mechanism, and excavating deep information of the potential features reconstructed in the step 1, and constructing an image refinement module to adjust detail information of the reconstructed image. The deep characteristic reconstruction module comprises 3 coding blocks and 3 decoding blocks, wherein each coding block and each decoding block respectively comprise 4 layers of cross channel window attention layers; the image refinement module is composed of 4 layers of cross channel window attention layers.
Step 2.2, the latent low-rank feature $F_{Latent}$ is divided into two parts, $F_s \in \mathbb{R}^{m\times h\times w}$ and $F_c \in \mathbb{R}^{k\times h\times w}$, which are input to the window attention branch and the cross channel attention branch, respectively, where m and k are the channel numbers of the features and c = m + k.
The window attention branch mainly consists of shifted-window attention layers based on multi-head attention, multi-layer perceptron layers and layer normalization operations, connected by residuals. First, the feature $F_s$ is divided into multiple windows and flattened by linear embedding into the feature $F_{se}^{0}$. The calculation process can be expressed as follows:

$$\hat{F}_{se}^{l} = \mathrm{W\_MSA}(\mathrm{LN}(F_{se}^{l-1})) + F_{se}^{l-1},$$
$$F_{se}^{l} = \mathrm{MLP}(\mathrm{LN}(\hat{F}_{se}^{l})) + \hat{F}_{se}^{l},$$
$$\hat{F}_{se}^{l+1} = \mathrm{SW\_MSA}(\mathrm{LN}(F_{se}^{l})) + F_{se}^{l},$$
$$F_{se}^{l+1} = \mathrm{MLP}(\mathrm{LN}(\hat{F}_{se}^{l+1})) + \hat{F}_{se}^{l+1},$$

where MLP() denotes the multi-layer perceptron; LN() denotes the layer normalization operation; W_MSA denotes window attention based on multi-head attention; SW_MSA denotes shifted (variable) window attention; l denotes layer l; $\hat{F}_{se}^{l}$ denotes the multi-head attention feature of layer l, and $F_{se}^{l}$ denotes the multi-layer perceptron feature of layer l.

The finally obtained feature $F_{se}^{l+1}$ is flattened into a tensor consistent with the input dimensions.
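For illustration, the following is a minimal PyTorch sketch of one W_MSA/SW_MSA pair of the window attention branch; the window size, head count and the omission of the shifted-window boundary mask are simplifying assumptions, not the patent's implementation.

```python
import torch
import torch.nn as nn

class WindowAttentionBlock(nn.Module):
    """Minimal sketch of a W_MSA/SW_MSA pair (window attention branch).
    The boundary mask of SW_MSA is omitted for brevity."""

    def __init__(self, dim: int = 24, window: int = 8, heads: int = 4):
        super().__init__()
        self.window = window
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def _windows(self, x: torch.Tensor) -> torch.Tensor:
        # (b, c, h, w) -> (b * num_windows, window*window, c) token sequences
        b, c, h, w = x.shape
        x = x.view(b, c, h // self.window, self.window, w // self.window, self.window)
        return x.permute(0, 2, 4, 3, 5, 1).reshape(-1, self.window * self.window, c)

    def _unwindows(self, x, b, c, h, w) -> torch.Tensor:
        x = x.view(b, h // self.window, w // self.window, self.window, self.window, c)
        return x.permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)

    def _msa(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = self._windows(x)
        y = self.norm1(tokens)
        y, _ = self.attn(y, y, y)                       # MSA(LN(x)) ...
        tokens = tokens + y                             # ... + residual
        tokens = tokens + self.mlp(self.norm2(tokens))  # MLP(LN(x)) + residual
        return self._unwindows(tokens, b, c, h, w)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self._msa(x)                                # W_MSA on regular windows
        shift = self.window // 2                        # SW_MSA: shift, attend, unshift
        x = torch.roll(x, (-shift, -shift), dims=(2, 3))
        x = self._msa(x)
        return torch.roll(x, (shift, shift), dims=(2, 3))

f_s = torch.randn(1, 24, 64, 64)   # F_s with m = 24 channels
f_sout = WindowAttentionBlock()(f_s)
```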
For the cross channel attention branch, the feature $F_c$ is input into a 1×1 convolution for pixel-wise aggregation of the cross-channel context, followed by a 3×3 depth-wise convolution for encoding the channel spatial context. Based on the obtained context information, the three vectors used for attention computation, Query (Q), Key (K) and Value (V), can be calculated as follows:

$$Q, K, V = f_{3\times 3}^{dw}(f_{1\times 1}(F_c)), \quad Q \in \mathbb{R}^{k\times h\times w},\ K \in \mathbb{R}^{k\times h\times w},\ V \in \mathbb{R}^{k\times h\times w},$$

where $f_{3\times 3}^{dw}$ denotes a 3×3 depth-wise convolution.

The dimensions of Q, K and V are then changed to $Q \in \mathbb{R}^{k\times hw}$, $K \in \mathbb{R}^{hw\times k}$ and $V \in \mathbb{R}^{k\times hw}$. Q and K are used to calculate a cross-channel global context attention map $G \in \mathbb{R}^{k\times k}$ with $G = Q \times K$. The cross-channel-attention-based feature $F_{ca}$ can be obtained by the following formulas:

$$\mathrm{CA}(Q, K, V) = \mathrm{Softmax}(Q \cdot K / \alpha) \cdot V,$$
$$F_{ca} = \mathrm{Reshape}^{T}(\mathrm{CA}(Q, K, V)) \in \mathbb{R}^{k\times h\times w},$$

where α denotes a positive scaling parameter.
Then $F_{ca}$ is transformed through the forward propagation network, as shown in the following formulas:

$$F_{fd} = f_{3\times 3}^{dw}(f_{1\times 1}(\mathrm{LN}(F_{ca}))),$$
$$F_{cout} = \mathrm{GELU}(F_{fd}) \odot F_{fd} + F_{ca},$$

where $F_{fd}$ denotes the convolution feature, $F_{cout}$ denotes the output feature of the cross channel attention branch, and LN denotes layer normalization.
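As an illustration of this branch, the following is a minimal PyTorch sketch of the cross channel (transposed, channel-wise) attention computation; the layer sizes, the GroupNorm stand-in for LN on feature maps, and the gated feed-forward form are assumptions for illustration.

```python
import torch
import torch.nn as nn

class CrossChannelAttention(nn.Module):
    """Minimal sketch of the cross channel attention branch.
    Attention is computed between channels, giving a k x k map G."""

    def __init__(self, channels: int = 24):
        super().__init__()
        self.qkv = nn.Sequential(
            nn.Conv2d(channels, channels * 3, 1),            # 1x1: cross-channel pixel aggregation
            nn.Conv2d(channels * 3, channels * 3, 3,
                      padding=1, groups=channels * 3),       # 3x3 depth-wise: spatial context
        )
        self.alpha = nn.Parameter(torch.ones(1))             # positive scaling parameter
        self.norm = nn.GroupNorm(1, channels)                # stand-in for LN on (c, h, w) maps
        self.ff = nn.Sequential(                             # forward propagation network
            nn.Conv2d(channels, channels, 1),
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
        )

    def forward(self, f_c: torch.Tensor) -> torch.Tensor:
        b, c, h, w = f_c.shape
        q, k, v = self.qkv(f_c).chunk(3, dim=1)                       # each (b, c, h, w)
        q, k, v = q.flatten(2), k.flatten(2), v.flatten(2)            # (b, c, h*w)
        attn = torch.softmax(q @ k.transpose(1, 2) / self.alpha, -1)  # G: (b, c, c)
        f_ca = (attn @ v).reshape(b, c, h, w)                         # Reshape^T(CA(Q,K,V))
        f_fd = self.ff(self.norm(f_ca))                               # F_fd
        return torch.nn.functional.gelu(f_fd) * f_fd + f_ca           # gated output + residual

f_c = torch.randn(1, 24, 64, 64)   # F_c with k = 24 channels
f_cout = CrossChannelAttention()(f_c)
```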
Then, the obtained $F_{sout} \in \mathbb{R}^{m'\times h\times w}$ and $F_{cout} \in \mathbb{R}^{k'\times h\times w}$ are spliced together along the channel dimension, where 2c = m' + k', to form the output of the deep feature reconstruction module, $F_{out} \in \mathbb{R}^{2c\times h\times w}$; m' and k' denote channel numbers.
Step 2.3, finally, the output $F_{out}$ of the deep feature reconstruction module undergoes channel adjustment and shallow feature refinement to form the enhanced, noise-reduced image. $F_{out}$ is input into the feature refinement module, as shown in the following formulas:

$$F_r = \mathrm{CST}_{n_5}(\cdots \mathrm{CST}_1(F_{out}) \cdots),$$
$$\hat{Y} = f_{3\times 3}(F_r),$$

where $F_r$ denotes the output feature of the image refinement module, CST() denotes a cross channel-window attention layer, $n_5$ denotes the number of cross channel-window attention layers in the feature refinement module, and $\hat{Y}$ denotes the final output image, i.e. the enhanced, noise-reduced image.
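The overall wiring of step 2 can be summarized in the following minimal PyTorch sketch, in which the two attention branches are replaced by convolution stand-ins (in the full model they are the window attention and cross channel attention blocks sketched above); all names and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CSTLayer(nn.Module):
    """Sketch of one cross channel-window attention (CST) layer:
    Divide along channels, run two branches, Concat the outputs."""

    def __init__(self, channels: int = 48, split: int = 24):
        super().__init__()
        self.split = split
        self.window_branch = nn.Conv2d(split, split, 3, padding=1)            # stand-in: Swin_Transformer
        self.channel_branch = nn.Conv2d(channels - split, channels - split,   # stand-in: Crossed_Transformer
                                        3, padding=1)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        f_s, f_c = f[:, :self.split], f[:, self.split:]      # Divide, c = m + k
        return torch.cat([self.window_branch(f_s),
                          self.channel_branch(f_c)], dim=1)  # Concat(F_sout, F_cout)

class RefinementHead(nn.Module):
    """Image refinement module: n_5 stacked CST layers, then a 3x3
    convolution adjusting the channel count to a 3-channel image."""

    def __init__(self, channels: int = 48, n5: int = 4):
        super().__init__()
        self.cst = nn.Sequential(*[CSTLayer(channels) for _ in range(n5)])
        self.out = nn.Conv2d(channels, 3, 3, padding=1)      # f_3x3

    def forward(self, f_out: torch.Tensor) -> torch.Tensor:
        return self.out(self.cst(f_out))                     # Y_hat

f_out = torch.randn(1, 48, 64, 64)
y_hat = RefinementHead()(f_out)
print(y_hat.shape)  # torch.Size([1, 3, 64, 64])
```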
Step 3, using a restoration neural network, the enhanced, noise-reduced image obtained in step 2 is input into the restoration neural network for feature extraction and reconstruction, so as to obtain a normally illuminated, noise-free image. The process is as follows:
and 3.1, adopting a network based on a cross channel-window attention mechanism as a restoration neural network, and performing supervised learning on the restoration neural network based on a random gradient descent algorithm to optimize weight parameters of the restoration neural network so as to obtain the optimized restoration neural network.
Specifically, dim light images for training and the corresponding normally illuminated images are first collected, and the training dim light images are input into the restoration neural network for forward propagation to obtain the predicted images.
Then the normally illuminated images are input into the restoration neural network for supervised learning, and the loss between the supervision target and the predicted image is calculated based on the loss function, which is as follows:

$$L_{total} = l_1(\hat{Y}, Y) + \lambda\,\big(l_{ssim}(\hat{Y}, Y) + l_{tv}(\hat{Y})\big),$$

where $l_1()$ denotes the L1 loss, $l_{ssim}()$ denotes the structural consistency loss, $l_{tv}()$ denotes the total variation loss, λ is a positive regularization parameter, Y denotes the normally illuminated noise-free image, and $\hat{Y}$ denotes the final output image of the network, with:

$$l_1(\hat{Y}, Y) = \lVert \hat{Y} - Y \rVert_1,$$
$$l_{ssim}(\hat{Y}, Y) = 1 - \mathrm{SSIM}(\hat{Y}, Y),$$
$$l_{tv}(\hat{Y}) = \lVert \nabla_h \hat{Y} \rVert_1 + \lVert \nabla_v \hat{Y} \rVert_1,$$

where $\nabla_h$ and $\nabla_v$ denote the horizontal and vertical gradient operators.
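A minimal PyTorch sketch of this loss follows; the uniform-window SSIM approximation and the weight λ = 0.1 are assumptions for illustration, as the patent does not specify them.

```python
import torch
import torch.nn.functional as F

def total_loss(y_hat: torch.Tensor, y: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """Sketch of L_total = l1 + lambda * (l_ssim + l_tv)."""
    l1 = (y_hat - y).abs().mean()

    # Simplified SSIM with a uniform 8x8 window instead of a Gaussian one.
    mu_x, mu_y = F.avg_pool2d(y_hat, 8), F.avg_pool2d(y, 8)
    var_x = F.avg_pool2d(y_hat ** 2, 8) - mu_x ** 2
    var_y = F.avg_pool2d(y ** 2, 8) - mu_y ** 2
    cov = F.avg_pool2d(y_hat * y, 8) - mu_x * mu_y
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    l_ssim = 1.0 - ssim.mean()

    # Total variation: L1 norm of horizontal and vertical gradients.
    l_tv = (y_hat[..., :, 1:] - y_hat[..., :, :-1]).abs().mean() \
         + (y_hat[..., 1:, :] - y_hat[..., :-1, :]).abs().mean()

    return l1 + lam * (l_ssim + l_tv)
```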
and updating the weight parameters of the restored neural network according to the loss optimization to obtain the optimized restored neural network.
Step 3.2, the trained restoration neural network is loaded, the enhanced, noise-reduced image obtained in step 2 is input into the optimized restoration neural network, one forward propagation is performed, and the final result is output; the obtained restoration result is the normally illuminated, noise-free image.
The invention also discloses an image enhancement and denoising system comprising a processor and a memory in which program instructions are stored; when the processor runs the program instructions, steps 1-3 of the above method are executed.

The invention also discloses a storage medium storing readable and executable program instructions; when the program instructions are read and executed, steps 1-3 of the above method are performed.
The present invention was tested on three published dim light datasets: LOL, LRSW and VE-LOL-CAP. The LOL dataset consists mainly of 485 normal/dim light training pairs and 15 normal/dim light test pairs. The LRSW dataset contains 5650 normal/dim light pairs collected with a Nikon D7500 and a Huawei P40 Pro; the invention mainly uses 2480 pairs collected with the P40 Pro for training and 30 pairs for testing. The VE-LOL-CAP dataset contains 400 normal/dim light training pairs and 100 dim light test images. In addition, dim light images from some real scenes were also collected and restored. These datasets come from many sources, so the test results are broadly illustrative.
Referring to Table 1, which compares the restoration results of the proposed method with dim light enhancement algorithms such as RetinexNet, R2RNet, KinD++, ZeroDCE++, RUAS, SCI, EnlightenGAN, DCC-Net and SNR on the two public datasets VE-LOL-CAP and LRSW, the PSNR, SSIM and MAE indices of the images restored by each method are given. The experiments show that the proposed method is clearly superior to the other dim light enhancement methods and also generalizes better than the compared methods.
TABLE 1. Comparison of the restoration results of the proposed method and each algorithm on the VE-LOL-CAP and LRSW datasets
[Table 1: PSNR, SSIM and MAE values; table image not reproduced in this text.]
Referring to Table 2, which compares the restoration results of the proposed method and the dim light enhancement algorithms RetinexNet, R2RNet, KinD++, ZeroDCE++, RUAS, SCI, EnlightenGAN, DCC-Net and SNR on the LOL dataset with different levels of added noise, the PSNR, SSIM and MAE indices of the images restored by each method are given. The experiments show that the proposed method is clearly superior to the other dim light enhancement methods, is more robust to noise in the image, and handles noise better.
TABLE 2. Comparison of the restoration results of the proposed method and each algorithm on the LOL dataset with different added noise levels
[Table 2: PSNR, SSIM and MAE values; table image not reproduced in this text.]
Referring to fig. 2, a comparison chart of the application effect of an embodiment of the present invention, in which (a) is a dim light image and (b) is the restored, normally illuminated noise-free image: comparing (a) and (b) shows that the method of the present invention can effectively achieve noise-free restoration of a dim light image.
The experimental results on the three real datasets, LOL, VE-LOL-CAP and LRSW, and on dim light images shot with several mobile phones show that the method of the invention brings an obvious improvement in both visual and quantitative results, effectively enhancing image brightness and removing the noise that is hard to notice in the data.
The experimental results show that the real-scene-oriented dim light enhancement and denoising imaging system is clearly superior to related methods such as RetinexNet, R2RNet, KinD++, ZeroDCE++, RUAS, SCI, EnlightenGAN, DCC-Net and SNR, with stronger stability and clear advantages.
In summary, the invention discloses a real-scene-oriented dim light enhancement and denoising image system. In real dim light scenes, captured images are easily affected by factors such as lighting and sensors, which causes problems such as low contrast and visibility and severe noise interference. The system is based on a deep neural network and constructs different network modules: the designed potential subspace reconstruction module suppresses, to a certain extent, the noise present in the dim light image, while the cross channel and window attention mechanisms preserve the global and local information of the denoised image features as well as the detail features. Through these two modules, the system enhances and denoises dim light images in real scenes, better retains image detail, generalizes well, and handles enhancement and denoising tasks in different scenes. Based on the trained network model, the system restores images collected in real dim light scenes into normally illuminated, noise-free images, improving their representation capability and benefiting the performance of other tasks. By introducing the potential subspace reconstruction module together with the cross channel and window attention mechanisms, the system can effectively enhance and denoise dim light images in real scenes.
For the system disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A real-scene-oriented dim light image enhancement and denoising method, characterized by comprising the following steps:

step 1, obtaining a noisy dim light image in a real scene, and performing potential subspace reconstruction on the noisy dim light image to obtain a reconstructed image;

step 2, performing feature extraction and restoration on the image reconstructed in step 1 based on a cross channel-window attention mechanism to obtain an enhanced, noise-reduced image;

step 3, using a restoration neural network, inputting the enhanced, noise-reduced image obtained in step 2 into the restoration neural network for feature extraction and reconstruction, so as to obtain a normally illuminated, noise-free image.
2. The real-scene-oriented dim light image enhancement and denoising method according to claim 1, wherein the process of step 1 is as follows:

step 1.1, a convolution $f_{3\times 3}$ with kernel size 3×3 is used to extract the shallow feature $F_{shallow} \in \mathbb{R}^{c\times h\times w}$ of the dim light image, where $\mathbb{R}$ is the real number field, c is the number of channels, h is the height of the dim light image, and w is the width of the dim light image;

step 1.2, potential subspace reconstruction is performed on the shallow feature $F_{shallow}$ obtained in step 1.1, and the low-rank latent representation feature $F_{Latent}$ with a reversible process is obtained through matrix decomposition, as shown in the following formulas:

$$F_{shallow} = f_{3\times 3}(X) \in \mathbb{R}^{c\times h\times w},$$
$$\hat{F} = \mathrm{Trans}(U) \times V \in \mathbb{R}^{hw\times c},$$
$$F_{Latent} = \mathrm{Reshape}^{T}(\hat{F}) \in \mathbb{R}^{c\times h\times w},$$

wherein: X denotes the input dim light noisy image; $\hat{F}$ denotes the intermediate data matrix; U and V are the low-rank representation matrices used for feature reconstruction; the low-rank representation matrix V can be regarded as a low-dimensional representation of the input feature with dimension r, with $U \in \mathbb{R}^{r\times hw}$ and $V \in \mathbb{R}^{r\times c}$, where r denotes the rank of the matrix and $r \ll hw$; Trans() is the transpose operation; $\mathrm{Reshape}^{T}()$ is a matrix conversion operation that converts a two-dimensional matrix into a three-dimensional matrix;

the noise-reduced low-rank latent representation feature $F_{Latent}$, whose dimensions are consistent with the shallow feature $F_{shallow}$, is thereby obtained through the low-rank representation matrices U and V, and the reconstructed image is obtained in turn.
3. The real-scene-oriented dim light image enhancement and denoising method according to claim 2, wherein the process of step 2 is as follows:

step 2.1, a deep feature reconstruction module that restores local and global information is constructed based on the cross channel-window attention mechanism and is used to mine the deep information of the latent feature reconstructed in step 1, and an image refinement module is constructed to adjust the detail information of the reconstructed image; the deep feature reconstruction module comprises 3 encoding blocks and 3 decoding blocks, and each encoding block and each decoding block comprises 4 cross channel-window attention layers; the image refinement module consists of 4 cross channel-window attention layers;

step 2.2, the input latent feature $F_{Latent}$ is restored by the deep feature reconstruction module; the specific process is as follows:

$$F_s, F_c = \mathrm{Divide}(F_{Latent}),$$
$$F_{sout} = \mathrm{Swin\_Transformer}(F_s),$$
$$F_{cout} = \mathrm{Crossed\_Transformer}(F_c),$$
$$F_{out} = \mathrm{Concat}(F_{sout}, F_{cout}),$$

wherein: Divide() denotes the channel splitting operation that decomposes the input low-rank latent representation feature $F_{Latent}$ into features $F_s \in \mathbb{R}^{m\times h\times w}$ and $F_c \in \mathbb{R}^{k\times h\times w}$, where m and k are channel numbers with c = m + k; Swin_Transformer() denotes the window attention branch, which obtains more accurate local features; Crossed_Transformer() denotes the cross channel attention branch, which captures the global structure that contributes to global brightness adjustment and color differences; $F_{sout}$ and $F_{cout}$ denote the output features of the window attention branch and the cross channel attention branch, respectively; $\mathrm{Concat}(F_{sout}, F_{cout})$ denotes splicing the obtained features $F_{sout}$ and $F_{cout}$ along the channel dimension to form the output of the encoding/decoding blocks, $F_{out} \in \mathbb{R}^{2c\times h\times w}$;

step 2.3, the output $F_{out}$ of the encoding/decoding blocks obtained in step 2.2 is input to the image refinement module formed by cross channel-window attention layers, shallow feature refinement is performed, and the channel number is adjusted to form the enhanced, noise-reduced image; the calculation formulas are as follows:

$$F_r = \mathrm{CST}_{n_5}(\cdots \mathrm{CST}_1(F_{out}) \cdots),$$
$$\hat{Y} = f_{3\times 3}(F_r),$$

wherein: $F_r$ denotes the output feature of the image refinement module; CST() denotes a cross channel-window attention layer; $n_5$ denotes the number of cross channel-window attention layers in the feature refinement module; $\hat{Y}$ denotes the final output image, i.e. the enhanced, noise-reduced image.
4. The real-scene-oriented dim light image enhancement and denoising method according to claim 1, wherein step 3 comprises the following steps:

step 3.1, a network based on the cross channel-window attention mechanism is adopted as the restoration neural network, and supervised learning is performed on the restoration neural network based on a stochastic gradient descent algorithm to optimize its weight parameters, obtaining the optimized restoration neural network;

step 3.2, the enhanced, noise-reduced image obtained in step 2 is input into the optimized restoration neural network to obtain a normally illuminated, noise-free image.
5. The real-scene-oriented dim light image enhancement and denoising method according to claim 4, wherein the process of step 3.1 is as follows:

first, dim light images for training and the corresponding normally illuminated images are acquired, and the training dim light images are input into the restoration neural network for forward propagation to obtain predicted images;

then the normally illuminated images are input into the restoration neural network for supervised learning, the loss between the supervision target and the predicted image is calculated based on the loss function, and the weight parameters of the restoration neural network are optimized and updated according to the loss to obtain the optimized restoration neural network.
6. The real-scene-oriented dim light image enhancement and denoising method according to claim 5, wherein the loss function $L_{total}$ in step 3.1 is formulated as follows:

$$L_{total} = l_1(\hat{Y}, Y) + \lambda\,\big(l_{ssim}(\hat{Y}, Y) + l_{tv}(\hat{Y})\big),$$

wherein $l_1()$ denotes the L1 loss, $l_{ssim}()$ denotes the structural consistency loss, $l_{tv}()$ denotes the total variation loss, λ is a positive regularization parameter, Y is the normally illuminated noise-free image, and $\hat{Y}$ denotes the final output of the network.
7. A real scene oriented dim light image enhancement denoising system comprising a processor and a memory, wherein program instructions are stored in the memory, characterized in that the processor executes the steps of the method according to any one of claims 1-6 when running the program instructions.
8. A storage medium having stored therein program instructions that can be read and executed, characterized in that the program instructions, when read and executed, perform the steps of the method of any of claims 1-6.
CN202211704340.3A 2022-12-26 2022-12-26 Real scene-oriented dim light image enhancement denoising method, system and storage medium Pending CN116012243A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211704340.3A CN116012243A (en) 2022-12-26 2022-12-26 Real scene-oriented dim light image enhancement denoising method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211704340.3A CN116012243A (en) 2022-12-26 2022-12-26 Real scene-oriented dim light image enhancement denoising method, system and storage medium

Publications (1)

Publication Number Publication Date
CN116012243A true CN116012243A (en) 2023-04-25

Family

ID=86029312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211704340.3A Pending CN116012243A (en) 2022-12-26 2022-12-26 Real scene-oriented dim light image enhancement denoising method, system and storage medium

Country Status (1)

Country Link
CN (1) CN116012243A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117011183A (en) * 2023-08-15 2023-11-07 上海为旌科技有限公司 Dark scene image noise reduction method and system
CN117575943A (en) * 2023-12-13 2024-02-20 四川新视创伟超高清科技有限公司 4K dim light image enhancement method combining contrast enhancement and noise reduction
CN117635478A (en) * 2024-01-23 2024-03-01 中国科学技术大学 Low-light image enhancement method based on spatial channel attention
CN117635478B (en) * 2024-01-23 2024-05-17 中国科学技术大学 Low-light image enhancement method based on spatial channel attention


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination