CN115456891A - Under-screen camera image restoration method based on U-shaped dynamic network - Google Patents

Under-screen camera image restoration method based on U-shaped dynamic network

Info

Publication number
CN115456891A
Authority
CN
China
Prior art keywords
image
network
mapping
input image
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211053075.7A
Other languages
Chinese (zh)
Inventor
刘茜娜
胡锦帆
陈翔宇
董超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202211053075.7A priority Critical patent/CN115456891A/en
Publication of CN115456891A publication Critical patent/CN115456891A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/52 Scale-space analysis, e.g. wavelet analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20172 Image enhancement details
    • G06T 2207/20208 High dynamic range [HDR] image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an under-screen camera image restoration method based on a U-shaped dynamic network. The method comprises the following steps: collecting a target image; and inputting the target image into a trained image restoration model to obtain a reconstructed image. The image restoration model comprises a base network, a conditional branch and a kernel branch, wherein the base network is used for extracting multi-scale information of an input image; the conditional branch adaptively modulates the intermediate features extracted by the base network so as to generate conditional features with different spatial resolutions for the input image; the kernel branch generates dynamic convolution kernels with different spatial resolutions from the features of the input image concatenated with point spread function features along the channel dimension; and the conditional features and the dynamic convolution kernels are integrated at set positions during the forward propagation of the input image through the base network. The invention improves both the quantitative performance and the visual quality of image restoration.

Description

Under-screen camera image restoration method based on U-shaped dynamic network
Technical Field
The invention relates to the technical field of image processing, in particular to a method for restoring an image of an under-screen camera based on a U-shaped dynamic network.
Background
An under-display camera (UDC) system is a novel imaging system in which the camera is mounted beneath the display of a device, enabling a full-screen display without punch holes or notches and thus a better user experience; it has therefore received wide attention in the industry. The UDC exploits the high transmittance of an OLED screen: when no picture is being taken, the OLED displays the phone's content normally, and when a picture is taken, external light is imaged through the OLED screen.
However, it is relatively difficult to maintain the full functionality of an imaging sensor placed under the display, as the display screen inevitably affects light propagation. During imaging, light must travel through the screen covering the camera, producing various forms of optical diffraction and interference, so images captured by a UDC system usually contain flare, haze, blur and noise. Moreover, in real scenes UDC images are often shot under high dynamic range (HDR) conditions, where highlight regions suffer severe oversaturation; the resulting halo, blur and related phenomena in UDC images seriously degrade the user experience.
The goal of image restoration is to recover a clean, high-quality image from a degraded one; typical tasks include denoising, deblurring, super-resolution and HDR reconstruction. Similarly, UDC image restoration aims to reconstruct the degraded images produced by a UDC system. To model the complex degradation process of UDC systems, the prior art restores the image using a special diffraction blur kernel, the point spread function (PSF). For example, the UDC image restoration task has been treated as an inverse problem given an accurately measured PSF. As another example, the image formation process of the UDC system has been modeled, and the restoration problem solved with a deconvolution-based pipeline (DeP) and data-driven, learning-based methods. These UNet variants, however, do not account for HDR scenes in data generation and PSF measurement, so images captured by UDCs still exhibit noise, flare, haze and blur artifacts. Although existing methods have made some progress in image restoration, the quality of the restored images still leaves room for improvement.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an under-screen camera image restoration method based on a U-shaped dynamic network, which comprises the following steps:
collecting a target image;
inputting the target image into a trained image restoration model to obtain a reconstructed image;
the image restoration model comprises a basic network, a conditional branch and a kernel branch, wherein the basic network is used for extracting multi-scale information of an input image; the conditional branch is used for adaptively modulating the intermediate features extracted by the basic network so as to generate conditional features with different spatial resolutions aiming at the input image; the kernel branches generate dynamic convolution kernels with different spatial resolutions based on the combined characteristics of the input image and the point spread function characteristics in the channel dimension; and integrating the condition characteristics and the dynamic convolution kernels into a set position in the process of carrying out forward propagation on the input image by the basic network.
Compared with the prior art, the invention provides a new deep network model that addresses the image restoration problem of an under-screen camera system with a known point spread function (PSF) in HDR scenes. The proposed model comprises a base network that exploits multi-scale information, a conditional branch that performs spatially varying modulation, and a kernel branch that supplies prior knowledge of the given PSF. In addition, according to the characteristics of HDR data, a tone mapping loss is further designed for the model to stabilize its optimization and improve the visual quality of the restored images.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a flowchart of an under-screen camera image restoration method based on a U-shaped dynamic network according to an embodiment of the present invention;
FIG. 2 is a network architecture diagram of an image restoration model according to one embodiment of the present invention;
FIG. 3 is a diagram of image restoration effects according to one embodiment of the present invention;
In the figure: Conv: convolution; Down-sampling: down-sampling; Up-sampling: up-sampling; Residual Block: residual block; Residual SFT Block: residual spatial feature transform block; Dynamic Conv: dynamic convolution; Element-wise Sum: element-wise addition; Concatenate: channel-wise concatenation; Element-wise Multiplication: element-wise multiplication.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be discussed further in subsequent figures.
Referring to fig. 1, the provided under-screen camera image restoration method based on a U-shaped dynamic network includes the following steps:
in step S110, a data set and a point spread function PSF are acquired.
The point spread function (PSF) is the irradiance distribution that a point light source produces in image space after passing through an optical system; it describes the response of an imaging system to a point source or point object. Conventional image restoration methods typically estimate the PSF first and then the original sharp image, and prior knowledge of the PSF has proven effective for image restoration. In one embodiment of the invention, to reduce the cost of model training, a public dataset and a public PSF are used.
Step S120, an image restoration model is constructed, comprising a base network for extracting multi-scale information, a conditional branch for spatially varying modulation of different regions, and a kernel branch for generating dynamic convolution kernels.
Fig. 2 shows an example of the proposed new image restoration model, the UDC-UNet model, which can be used to restore images captured by a UDC in HDR scenes. From top to bottom, the framework comprises the conditional branch, the base network and the kernel branch, where C denotes the number of channels and K denotes the dynamic convolution kernel size. First, the base network is constructed to extract multi-scale information hierarchically. Then, to achieve spatially varying modulation of regions under different exposures, multiple spatial feature transform (SFT) layers are employed to build the conditional branch. In addition, a kernel branch is added that uses the PSF to refine the intermediate features of the base network. Preferably, considering the data characteristics of HDR images, a new tone mapping loss is designed that normalizes image values to [0, 1]; this balances the influence of pixels of different intensity and stabilizes the training process of the model.
In one embodiment, a U-shaped structure is adopted to build the base network, which extracts features of different scales, divided into shallow features and deep features: the shallow features are gradually extracted by a shallow network through an increasing receptive field, with the shallow network mapping the input image into a high-dimensional representation, while the deep features are learned during the decoding process. In addition, the skip connections of the U-shaped network effectively combine shallow and deep features. The U-shaped structure helps the network fully exploit the hierarchical multi-scale information of the input image.
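By way of illustration only, a minimal PyTorch sketch of such a U-shaped backbone is given below; the channel widths, block counts and plain residual blocks are simplifying assumptions and merely stand in for the Residual SFT blocks of the actual model:

    import torch.nn as nn

    class ResBlock(nn.Module):
        # plain residual block standing in for the Residual SFT blocks
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1))

        def forward(self, x):
            return x + self.body(x)

    class UBase(nn.Module):
        # encoder halves resolution and widens channels, the decoder mirrors
        # it, and skip connections merge shallow and deep features;
        # input height and width are assumed divisible by 4
        def __init__(self, c=32):
            super().__init__()
            self.head = nn.Conv2d(3, c, 3, padding=1)      # map image to high-dim features
            self.enc1, self.enc2 = ResBlock(c), ResBlock(2 * c)
            self.down1 = nn.Conv2d(c, 2 * c, 2, stride=2)  # down-sampling
            self.down2 = nn.Conv2d(2 * c, 4 * c, 2, stride=2)
            self.mid = ResBlock(4 * c)
            self.up2 = nn.ConvTranspose2d(4 * c, 2 * c, 2, stride=2)
            self.dec2 = ResBlock(2 * c)
            self.up1 = nn.ConvTranspose2d(2 * c, c, 2, stride=2)
            self.dec1 = ResBlock(c)
            self.tail = nn.Conv2d(c, 3, 3, padding=1)

        def forward(self, x):
            s1 = self.enc1(self.head(x))     # shallow features
            s2 = self.enc2(self.down1(s1))
            d = self.mid(self.down2(s2))     # deepest scale
            d = self.dec2(self.up2(d) + s2)  # skip connection
            d = self.dec1(self.up1(d) + s1)  # skip connection
            return self.tail(d)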
For example, for the base network, the feature channel dimension C can be set to 32 (setting C to 20 yields UDC-UNet_S, a simplified version of UDC-UNet), and the number of Residual SFT blocks in the base network is set to [2, 8, 2].
The key to UDC image restoration in HDR scenes is to handle blur in unsaturated regions (0 ≤ pixel value ≤ 1) and to remove the flare artifacts produced by oversaturated regions (1 < pixel value ≤ 500). A traditional convolution kernel applies the same weights over the entire image, offering only a fixed operation even though different spatial regions have different characteristics. In one embodiment of the invention, HDR reconstruction with denoising and dequantization is implemented using spatial feature transform (SFT) layers, and a spatially deformable feature module is used to build the conditional branch, which adaptively modulates the intermediate features extracted by the base network.
For example, the input image is first fed into the conditional branch, where a 3 × 3 convolution changes the channel number to C before the features enter a Residual Block layer (for example, two blocks). Four branches, each consisting of 1 × 1 convolutions and down-sampling layers, then generate spatial conditional features with different resolutions, with feature channel dimensions C, 2C, 4C and 8C in turn. Finally, these features are integrated into the forward propagation of the base network through SFT layers, whose operation can be expressed as:
SFT(x)=α⊙x+β (1)
wherein "" represents point-by-point multiplication, x ∈ R ^ (C × H × W) is an intermediate characteristic in the base network, and α ∈ R ^ (C × H × W), β ∈ R ^ (C × H × W) is a modulation coefficient characteristic diagram learned by the SFT layer from the output characteristics of the conditional branch, H denotes a height of the characteristic diagram, and W denotes a width of the characteristic diagram. Such a spatial feature transformation mechanism is used so that the proposed network model can easily map different regions spatially differently. Therefore, by introducing conditional branches, the network can perform spatial feature transformation on different characteristic regions, so that information extraction is performed on different regions distinctively.
Further, considering that the point spread function (PSF) of the UDC imaging system can serve as prior knowledge for UDC image restoration and improve the restoration result, a kernel branch is introduced to exploit the PSF and refine the intermediate features extracted by the base network.
In one embodiment, use of the PSF comprises the following steps. First, the most important information in the PSF is extracted by principal component analysis (PCA), and the extracted information is expanded to the same size as the input image to form the PSF feature. Then, this PSF feature and the input image are concatenated along the feature dimension as the input of the kernel branch; a 3 × 3 convolution transforms the channel number, and the result enters Residual Blocks (the number is set to 2). Finally, four branches output dynamic convolution kernels with different spatial resolutions. For example, for an intermediate feature of dimension c × h × w, the corresponding dynamic convolution kernel has dimension ck^2 × h × w, where k is the size of the dynamic convolution kernel (e.g., set to 3). Dynamic convolution is then performed at each pixel, which operates as follows:
F(i,j,c) = K(i,j,c) · M(i,j,c)   (2)
where, for spatial position (i, j) and channel c, K(i,j,c) is the dynamic kernel generated for that position, M(i,j,c) is the corresponding local feature patch, F(i,j,c) is the refined output, and "·" denotes the inner product.
the dynamic convolution kernel generated by the kernel branch enables the jumping connection of shallow layer features to be more flexible when being connected to deep layers, and the jumping connection is not directly added, so that the intermediate features extracted by the basic network are refined.
It should be noted that the kernel branch can accept four types of input: no input; only the image; only the point spread function; or both the image and the point spread function PSF. Experiments show that restoration is best when the image and the PSF are input together.
Step S130, training an image restoration model based on the set loss function.
In one embodiment, a new loss function is designed for the characteristics of UDC images, expressed as:
Mapping_L1(Y, X) = |Mapping(Y) - Mapping(X)|   (3)
where Y denotes the restored image generated by the network model, X denotes the corresponding ground-truth image, and Mapping is a tone mapping function that converts an image into a standard image, normalizing image values to [0, 1]. Preferably, Mapping(I) = I/(I + 0.25) is used to convert the HDR image into a standard image. The L1 norm of the difference between Y and X after tone mapping serves as the network's loss function; training under this constraint yields the parameters to be learned in the model, such as weights and biases. It should be understood that the loss function may alternatively take other forms, such as an L2 loss; experiments show, however, that the model trained with the Mapping_L1 loss performs best.
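As a sketch, the tone mapping and Mapping_L1 loss follow directly from Eq. (3) and the choice Mapping(I) = I/(I + 0.25) above; averaging over pixels is assumed as the norm reduction:

    import torch

    def tone_map(img):
        # Mapping(I) = I / (I + 0.25): compresses HDR intensities toward [0, 1)
        return img / (img + 0.25)

    def mapping_l1(pred, target):
        # Eq. (3): L1 distance between tone-mapped restored and ground-truth images
        return torch.abs(tone_map(pred) - tone_map(target)).mean()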
In actual model training, based on the collected training data set, a gradient descent algorithm can iteratively optimize the network model until convergence, e.g., until the designed Mapping_L1 loss no longer decreases or reaches a set threshold. For example, the initial learning rate is set to 2×10^-4; the learning rate schedule uses cosine annealing with minimum learning rate η_min = 1×10^-7 and maximum learning rate η_max = 2×10^-4, restarting after [5×10^4, 1.5×10^5, 3×10^5, 4.5×10^5] iterations.
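A sketch of the stated schedule, cosine annealing between η_max and η_min with restarts at the listed iteration counts, is given below; holding η_min after the last restart point is an assumption:

    import math

    def learning_rate(step, eta_min=1e-7, eta_max=2e-4,
                      restarts=(50_000, 150_000, 300_000, 450_000)):
        bounds = [0, *restarts]
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            if step < hi:                    # inside this annealing period
                t = (step - lo) / (hi - lo)  # progress within the period
                return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t))
        return eta_min                       # after the final restart point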
Specifically, the training process of the image restoration model comprises the following steps (a code sketch illustrating the whole procedure follows the list):
step S1, using a UDC image as input, obtaining a 5-dimensional vector by a PSF through Principal Component Analysis (PCA), then copying the vector, and expanding the spatial dimension of the vector to be consistent with that of the input image, thereby obtaining the PSF characteristics;
s2, inputting an image into a conditional branch to generate conditional features with different spatial resolutions;
s3, merging (concatenate) the input image and the PSF characteristics in a channel dimension, and entering an inner core branch after merging to generate dynamic convolution kernels with different spatial resolutions;
and S4, the input image propagates forward through the base network, the conditional features and dynamic convolution kernels obtained above are integrated at specific positions, and the network model is optimized through continued iteration.
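A sketch of steps S1 to S4 follows; the 5-component PCA projection matrix, the model's call signature and the optimizer choice are assumptions used purely for illustration, and mapping_l1 is the loss helper sketched earlier:

    import torch

    def psf_feature(psf, proj, h, w):
        # Step S1: project the vectorized PSF onto a precomputed 5-component
        # PCA basis (proj, shape (5, k*k), assumed fit offline), then replicate
        # the 5-d code over the spatial grid of the input image
        code = proj @ psf.reshape(-1)  # (5,)
        return code.view(1, 5, 1, 1).expand(1, 5, h, w)

    def training_step(model, optimizer, udc_img, gt_img, psf_feat):
        # Steps S2-S4: the model internally routes the image through the
        # conditional branch, the (image, PSF feature) concatenation through
        # the kernel branch, and the image through the base network
        pred = model(udc_img, psf_feat)  # assumed interface
        loss = mapping_l1(pred, gt_img)  # tone mapping loss, Eq. (3)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()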
In conclusion, based on multi-faceted analysis and observation of UDC data, the image restoration model of the invention adds a conditional branch and a kernel branch to a UNet base network, improving the network's representational capacity. Moreover, a Mapping_L1 loss suited to the training data is designed, which stabilizes network optimization and improves the visual quality of restored images. A conventional UNet performs markedly worse than the proposed structure; likewise, using only conventional training data while omitting the kernel branch designed by the invention fails to reach the expected effect. The designed Mapping_L1 loss achieves the best result and avoids the risk of a non-convergent training process.
Step S140, image reconstruction is performed on the collected target image using the trained image restoration model.
In practical application, the target image can be input directly into the trained image restoration model to obtain the reconstructed image.
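For illustration, inference under the hypothetical interface sketched above reduces to a single forward pass (model, udc_img and psf_feat are assumed to come from the earlier sketches):

    import torch

    model.eval()                      # trained image restoration model
    with torch.no_grad():
        restored = model(udc_img, psf_feat)
    restored = restored.clamp(min=0)  # HDR intensities are non-negative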
To further verify the effect of the invention, simulation experiments were performed. Fig. 3 compares visual effects, and Tables 1 to 4 present the quantitative results.
In Table 1, the parameter count (Params) is given in M (million); computational cost is given in G, short for giga floating-point operations. PSNR and SSIM denote peak signal-to-noise ratio and structural similarity, respectively, and reflect reconstruction quality: larger values mean better restoration. LPIPS measures perceptual image similarity: smaller values mean better restoration. Uformer, HDRUNet and DISCNet are representative existing neural network models applicable to UDC image restoration. UDC-UNet and UDC-UNet_S denote the UDC-UNet method of the invention and its simplified version, respectively.
Table 1: comparative test results of the existing image restoration model and the present invention
Method PSNR SSIM LPIPS Params
Uformer 37.97 0.9784 0.0285 20.0M
HDRUNet 40.23 0.9832 0.0240 1.7M
DISCNet 39.89 0.9864 0.0152 3.8M
UDC-UNet 47.18 0.9927 0.0100 14.0M
UDC-UNet S 45.98 0.9913 0.0128 5.7M
As Table 1 shows, the invention achieves superior restoration quality on all indexes (PSNR, SSIM and LPIPS). Furthermore, comparing parameter counts (Params) and computational cost (GMACs) shows that the model of the invention, even in its compressed form, outperforms the other models while saving further computation.
Table 2 presents a structure ablation experiment, where ✓ and × denote use and non-use of the corresponding structure, respectively. PSNR and SSIM denote peak signal-to-noise ratio and structural similarity and reflect reconstruction quality: larger is better. LPIPS measures image similarity: smaller is better.
Table 2: structure ablation contrast test structure
Model (model) (a) (b) (c) (d) (e)
U-shapedBasic network ×
Conditional branching × × ×
Kernel branching × × ×
PSNR 42.19 44.50 44.58 45.23 45.37
SSIM 0.9884 0.9897 0.9893 0.9897 0.9898
LPIPS 0.0164 0.0155 0.0157 0.0166 0.0162
As Table 2 shows, the U-shaped base network, the conditional branch and the kernel branch used in the invention each significantly improve reconstruction performance; the network performs best when the U-shaped base network is used and both the conditional and kernel branches are added. The experimental results in Tables 2 and 3 also demonstrate the effectiveness of the conditional branch.
Table 3 reflects the quantitative result variation resulting from using different inputs in the kernel branch. The inputs to the kernel branch include: (a) no input; (b) inputting only image information; (c) inputting only point spread function information; (d) simultaneously inputting the image and the point spread function PSF.
Table 3: quantitative result change caused by using input in kernel branch
Method Input device PSNR SSIM LPIPS
(a) None 45.23 0.9897 0.0166
(b) Image 45.17 0.9896 0.0162
(c) PSF 45.26 0.9895 0.0166
(d) Image+PSF 45.37 0.9898 0.0162
As Table 3 shows, the invention achieves a strong restoration effect by adding a separate kernel branch to the network and jointly exploiting information from the input image itself and from the point spread function PSF.
Table 4 presents a loss function comparison experiment. As Table 4 shows, replacing the L1 loss commonly used in the prior art with the Mapping_L1 loss further improves the restoration result. Moreover, experiments show that the Mapping_L1 loss is better suited to UDC image restoration than the Mapping_L2 loss and yields clearer visual quality.
Table 4: loss function comparison test results
Loss function PSNR SSIM LPIPS
L 1 41.30 0.9812 0.0301
Mapping_L 2 40.19 0.9838 0.0238
Mapping_L 1 45.37 0.9898 0.0162
Experimental results show that the invention surpasses the current state-of-the-art methods in both quantitative performance and visual quality, producing visually satisfying results without noticeable artifacts even in oversaturated regions.
In summary, compared with the prior art, the technical effects of the present invention are mainly reflected in the following aspects:
1) The prior art applies the same filter weights in all regions. The invention uses a spatial feature transform layer so that the network can attend to regions of different dynamic range to different degrees, improving model performance.
2) The prior art ignores blur kernel information and trains the network directly. The invention adds a separate kernel branch that injects the measured point spread function PSF of the UDC imaging system into network training as prior knowledge, improving model performance.
3) The prior art uses conventional L1 or L2 loss functions. The invention designs the tone mapping loss Mapping_L1 according to the characteristics of UDC images; this loss further improves model performance and stabilizes network optimization, yielding better visual quality.
4) The invention provides an end-to-end network that alleviates halo, haze, blur, noise and similar degradations in UDC images in HDR scenes, bringing users a better sensory experience and contributing to the further popularization and application of UDC systems.
5) The method applies not only to UDC image restoration but also to other low-level vision tasks, particularly image restoration in HDR scenes, for which it offers some guidance. In addition, the base network of the invention may also adopt other existing network architectures.
The method can be applied to electronic equipment, a server or the cloud, where the trained image restoration model reconstructs the acquired target image into a clear restored image. The electronic device can be a terminal device or a server; the terminal device includes any terminal such as a mobile phone, a tablet computer, a personal digital assistant (PDA), a point-of-sale (POS) terminal, or a smart wearable device (a smart watch, virtual reality glasses, a virtual reality helmet, etc.). The server includes but is not limited to an application server or a Web server, and may be a stand-alone server, a cluster server, a cloud server, or the like.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or Python, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (10)

1. An under-screen camera image restoration method based on a U-shaped dynamic network, comprising the following steps:
collecting a target image;
inputting the target image into a trained image restoration model to obtain a reconstructed image;
the image restoration model comprises a base network, a conditional branch and a kernel branch, wherein the base network is used for extracting multi-scale information of an input image; the conditional branch adaptively modulates the intermediate features extracted by the base network so as to generate conditional features with different spatial resolutions for the input image; the kernel branch generates dynamic convolution kernels with different spatial resolutions from the features of the input image concatenated with point spread function features along the channel dimension; and the conditional features and the dynamic convolution kernels are integrated at set positions during the forward propagation of the input image through the base network.
2. The method of claim 1, wherein the base network is a U-shaped network divided, according to the depth of the extracted features, into a shallow network and a deep network; the shallow network gradually extracts features through an increasing receptive field and maps the input image into a high-dimensional representation, the deep network learns deep features during the decoding process, and the U-shaped network combines the shallow features and the deep features via skip connections.
3. The method of claim 1, wherein the conditional branch comprises a first convolution layer, a first residual block layer, a plurality of down-sampling branches, and a spatial feature transform layer, and wherein an input image enters the first residual block layer after being convolved by the first convolution layer; generating spatial condition features with different resolutions through the plurality of downsampling branches; and integrating the obtained spatial condition characteristics into the forward propagation of the base network through the spatial characteristic transformation layer.
4. The method of claim 3, wherein the operation of the spatial feature transform layer SFT is represented as:
SFT(x)=α⊙x+β
wherein "" represents point-by-point multiplication, x is an intermediate characteristic in the base network, and α and β are modulation factor characteristic maps learned by the spatial characteristic transformation layer from the output characteristics of the conditional branches.
5. The method of claim 1, wherein the kernel branch comprises a second convolution layer, a second residual block layer, and a plurality of sampling branches; the second convolution layer takes as input the point spread function feature concatenated with the input image along the feature dimension, the result enters the second residual block layer after the channel number is transformed, and the plurality of sampling branches, each connected to the second residual block layer, generate dynamic convolution kernels with different spatial resolutions for performing dynamic convolution at each pixel.
6. The method of claim 1, wherein training the loss function of the image restoration model is set to:
Mapping_L1(Y, X) = |Mapping(Y) - Mapping(X)|
wherein Y denotes a restored image generated by the image restoration model, X denotes the corresponding ground-truth image, Mapping is a tone mapping function that converts an image into a standard image by normalizing image values to [0, 1], and Mapping_L1(Y, X) denotes the L1 norm of the difference between Y and X after tone mapping, used as the loss function.
7. The method of claim 6, wherein the tone mapping function is arranged to:
Mapping(I)=I/(I+0.25)
wherein I denotes an image, i.e., the generic input to the tone mapping function Mapping.
8. The method of claim 1, wherein the point spread function features are obtained according to the following steps:
performing principal component analysis on the point spread function to obtain a multi-dimensional vector;
and copying the multi-dimensional vector and expanding its spatial dimensions to be consistent with those of the input image, thereby obtaining the point spread function feature.
9. A computer-readable storage medium, on which a computer program is stored, wherein the computer program realizes the steps of the method according to any one of claims 1 to 8 when executed by a processor.
10. A computer device comprising a memory and a processor, on which memory a computer program is stored which is executable on the processor, characterized in that the processor realizes the steps of the method according to any one of claims 1 to 8 when executing the computer program.
CN202211053075.7A 2022-08-31 2022-08-31 Under-screen camera image restoration method based on U-shaped dynamic network Pending CN115456891A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211053075.7A CN115456891A (en) 2022-08-31 2022-08-31 Under-screen camera image restoration method based on U-shaped dynamic network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211053075.7A CN115456891A (en) 2022-08-31 2022-08-31 Under-screen camera image restoration method based on U-shaped dynamic network

Publications (1)

Publication Number Publication Date
CN115456891A true CN115456891A (en) 2022-12-09

Family

ID=84300219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211053075.7A Pending CN115456891A (en) 2022-08-31 2022-08-31 Under-screen camera image restoration method based on U-shaped dynamic network

Country Status (1)

Country Link
CN (1) CN115456891A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188612A (en) * 2023-02-20 2023-05-30 信扬科技(佛山)有限公司 Image reconstruction method, electronic device and storage medium

Similar Documents

Publication Publication Date Title
Zuo et al. Convolutional neural networks for image denoising and restoration
Ren et al. Multiscale structure guided diffusion for image deblurring
Hui et al. Image restoration of optical sparse aperture systems based on a dual target network
Chen et al. U-net like deep autoencoders for deblurring atmospheric turbulence
Jiang et al. Fast and high quality image denoising via malleable convolution
Liu et al. Research on super-resolution reconstruction of remote sensing images: A comprehensive review
Panikkasseril Sethumadhavan et al. Transform domain pyramidal dilated convolution networks for restoration of under display camera images
Witwit et al. Satellite image resolution enhancement using discrete wavelet transform and new edge-directed interpolation
Witwit et al. Global motion based video super-resolution reconstruction using discrete wavelet transform
CN115456891A (en) Under-screen camera image restoration method based on U-shaped dynamic network
Hui et al. Image restoration for optical synthetic aperture system via patched maximum–minimum intensity prior and unsupervised DenoiseNet
Kumar et al. Dynamic stochastic resonance and image fusion based model for quality enhancement of dark and hazy images
Yi et al. HCTIRdeblur: A hybrid convolution-transformer network for single infrared image deblurring
Kas et al. DLL-GAN: Degradation-level-based learnable adversarial loss for image enhancement
CN115272131B (en) Image mole pattern removing system and method based on self-adaptive multispectral coding
Liu et al. Directional fractional-order total variation hybrid regularization for image deblurring
Shahdoosti Two-stage image denoising considering interscale and intrascale dependencies
Wong et al. Regularization-based modulation transfer function compensation for optical satellite image restoration using joint statistical model in curvelet domain
Jeyaprakash et al. Linearly uncorrelated principal component and deep convolutional image deblurring for natural images
Su et al. Atmospheric turbulence degraded image restoration using a modified dilated convolutional network
Chen et al. A depth iterative illumination estimation network for low-light image enhancement based on retinex theory
Wu et al. VDIP-TGV: Blind image deconvolution via variational deep image prior empowered by total generalized variation
Jain et al. Evaluation of neural network algorithms for atmospheric turbulence mitigation
Zhao et al. Multi-frame super resolution via deep plug-and-play CNN regularization
Valli Bhasha et al. Image super resolution model enabled by wavelet lifting with optimized deep convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination