CN116385305A - Cross-region Transformer-based image shadow removal method and system for a neural radiance field - Google Patents

Cross-region Transformer-based image shadow removal method and system for a neural radiance field

Info

Publication number
CN116385305A
CN116385305A (application number CN202310378434.4A)
Authority
CN
China
Prior art keywords
shadow
image
module
network model
CRFormer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310378434.4A
Other languages
Chinese (zh)
Inventor
王波
国英龙
杨巨成
王伟
刘海涛
贾智洋
魏峰
徐振宇
孙笑
王嫄
陈亚瑞
张传雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siteng Heli Tianjin Technology Co ltd
Original Assignee
Siteng Heli Tianjin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siteng Heli Tianjin Technology Co ltd filed Critical Siteng Heli Tianjin Technology Co ltd
Priority to CN202310378434.4A priority Critical patent/CN116385305A/en
Publication of CN116385305A publication Critical patent/CN116385305A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Radiation-Therapy Devices (AREA)

Abstract

The invention provides an image shadow removal method and system for a neural radiance field based on a cross-region Transformer, comprising the following steps: acquiring the fern data set under nerf_llff_data; constructing a shadow removal network model that fuses an MLP neural network with a cross-region Transformer; initializing the shadow removal network model, selecting an optimizer, and setting the network training parameters; optimizing the shadow removal network model with a loss function and saving it; and loading the optimal shadow removal network model generated during training, acquiring a test set, inputting it into the model, and rendering a shadow-free image. The invention fuses a cross-region Transformer (CRFormer) into the MLP neural network of NeRF for high-quality shadow removal, rendering high-quality shadow-removed images.

Description

Cross-region Transformer-based image shadow removal method and system for a neural radiance field
Technical Field
The invention belongs to the field of computer vision, and particularly relates to an image shadow removal method and system for a neural radiance field based on a cross-region Transformer.
Background
Image-based viewpoint synthesis is an important problem shared by computer graphics and computer vision. It takes multiple images captured from known viewpoints as input, represents the geometry, appearance, illumination, and other properties of the photographed three-dimensional objects or scenes, and synthesizes images from new, uncaptured viewpoints, ultimately producing highly realistic renderings. Compared with the traditional pipeline of three-dimensional reconstruction followed by graphics rendering, this approach can achieve a photo-realistic synthesis effect.
While conventional computer graphics can generate high-quality, controllable scene images, all physical parameters of the scene, such as camera parameters, illumination, and object materials, must be provided as input. To generate controllable images of real-world scenes, these physical properties must be estimated from existing observations (such as images and video). This estimation task, known as inverse rendering, is very challenging, especially when the goal is photo-realistic synthesis. In contrast, neural rendering is a rapidly emerging field that represents scenes compactly and, by using neural networks, learns rendering from existing observations. The main idea of neural rendering is to combine the insights of classical computer graphics with recent advances in deep learning. Like classical computer graphics, neural rendering aims to generate photo-realistic images in a controllable manner.
With the rise of neural radiance field (NeRF) technology, similar methods have been extended to viewpoint synthesis: a three-dimensional scene or model is represented by a neural radiance field and combined with volume rendering. This representation has been successfully applied to viewpoint synthesis with high-quality results, and has since been optimized and extended. As an implicit representation, NeRF offers a new perspective for traditional graphics processing, namely handling images through an implicit neural representation, or neural field.
However, neural rendering results may contain unwanted shadows that reduce visual quality. Shadows also affect the feature representation of an image and can adversely affect subsequent image and video processing tasks.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an image shadow removal method and system for a neural radiance field based on a cross-region Transformer.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows:
An image shadow removal method for a neural radiance field based on a cross-region Transformer, comprising:
S1: acquiring the fern data set under nerf_llff_data;
S2: constructing a shadow removal network model that fuses an MLP neural network with a cross-region Transformer;
S3: initializing the shadow removal network model, selecting an optimizer, and setting the network training parameters;
S4: optimizing the shadow removal network model with a loss function and saving it;
S5: loading the optimal shadow removal network model generated during training, acquiring a test set, inputting it into the model, and rendering a shadow-free image.
Further, in step S2, a CRFormer module is added to the MLP neural network. The CRFormer module removes shadows from the image: a dual encoder in the CRFormer module first extracts features from a given shadow image, a cross-region alignment block then absorbs the shadow region, and finally the CRFormer module restores the shadow region. The MLP neural network renders the synthesized image.
Further, step S3 builds the shadow removal network model with the PyTorch framework, selects gradient back-propagation for training, and initializes the learning rate.
Further, in step S4, the loss function optimizes the reconstruction loss and the spatial loss, removes the shadow image, and the averages of local image regions are processed in the MLP network.
In another aspect, the invention further provides an image shadow removal system for a neural radiance field based on a cross-region Transformer, comprising:
a data set module, which acquires the fern data set under nerf_llff_data;
a model module, which constructs a shadow removal network model fusing an MLP neural network with a cross-region Transformer;
an initialization module, which initializes the shadow removal network model, selects an optimizer, and sets the network training parameters;
an optimization module, which optimizes the shadow removal network model with a loss function and saves it;
an optimal-model module, which loads the optimal shadow removal network model generated during training, acquires a test set, inputs it into the model, and renders a shadow-free image.
Further, the model module comprises an MLP neural network module and a CRFormer module. The CRFormer module removes shadows from the image: a dual encoder in the CRFormer module first extracts features from a given shadow image, a cross-region alignment block then absorbs the shadow region, and finally the CRFormer module restores the shadow region. The MLP neural network module renders the synthesized image.
Furthermore, the initialization module builds the shadow removal network model with the PyTorch framework, selects gradient back-propagation for training, and initializes the learning rate.
Further, the optimization module uses the loss function to optimize the reconstruction loss and the spatial loss, removes the shadow image, and processes the averages of local image regions in the MLP neural network.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention fuses a cross-region Transformer (CRFormer) into the MLP neural network of NeRF for high-quality shadow removal;
2. Through the new region-aware cross-attention (RCA) proposed in CRFormer, the invention aggregates pixel features of the non-shadow regions into the restored shadow-region features, rendering a higher-quality shadow-removed image than the original neural radiance field.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the present invention;
FIG. 2 is a diagram of a method of computing a cross-region alignment block according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an MLP neural network according to an embodiment of the present invention.
Detailed Description
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The cross-region Transformer (CRFormer) is used for high-quality image shadow removal. It considers all pixels from the non-shadow regions when restoring each shadow pixel, making full use of the potential context cues of the non-shadow regions to remove shadows.
To address the shadow problem in neural rendering results, the invention provides an image shadow removal method for a neural radiance field based on a cross-region Transformer, applying CRFormer to the neural radiance field to remove shadows from images.
The following describes the embodiments of the present invention with reference to the drawings.
FIG. 1 is a flowchart of the method for image shadow removal of a neural radiance field based on a cross-region Transformer according to the present invention, comprising:
step 1: and acquiring a fern data set under the nerf_llff_data.
And a fern data set under the nerf_llff_data in the NeRF official data set is adopted, the data set comprises 72 training pictures, 20 verification pictures and 20 test pictures, and the angles of the pictures are different.
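The train/validation/test partition described above can be sketched as a simple index split. The total of 112 images and the contiguous index layout are illustrative assumptions, not taken from the NeRF repository:

```python
# Hypothetical sketch of the 72/20/20 split of the fern data set
# (nerf_llff_data). Index layout is assumed for illustration only.
def split_fern_indices(n_images=112, n_train=72, n_val=20, n_test=20):
    """Return (train, val, test) image-index lists for the fern scene."""
    assert n_train + n_val + n_test == n_images
    indices = list(range(n_images))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

train_ids, val_ids, test_ids = split_fern_indices()
```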
Step 2: fuse the MLP neural network and the cross-region Transformer for shadow removal.
A model fusing an MLP neural network with a cross-region Transformer (CRFormer) is adopted: a CRFormer module is added to the MLP network. A dual encoder in the CRFormer module first extracts features from a given shadow image, a cross-region alignment block then absorbs the shadow region, and finally CRFormer restores the shadow region. The model mainly comprises a CRFormer module and an MLP neural network module: the CRFormer module removes shadows from the image, and the MLP neural network renders the synthesized image.
As shown in fig. 1, a new cross-region Transformer (CRFormer) is employed. In CRFormer, a dual-encoder architecture is designed to extract asymmetric features.
First, the dual encoder (NS path, S path) extracts asymmetric features along two paths from a given shadow image and its shadow mask. Then, the proposed Transformer layer with N cross-region alignment blocks absorbs the features of both the shadow and non-shadow regions and establishes a connection from the non-shadow region to the shadow region, which is achieved by the newly designed region-aware cross-attention. In this way, the proposed CRFormer can recover the intensity of each shadow pixel with sufficient context information from the non-shadow region. The outputs of the series of cross-region alignment blocks are then fed into a single decoder to obtain the de-shadowed result. Finally, a lightweight U-shaped network post-processes and refines the obtained shadow removal result.
To reduce the interference that convolutions cause between shadow and non-shadow pixels, features are extracted within each region so that the non-shadow region features of interest are available. The top encoder (non-shadow path) is built on a shallow sub-network with three convolutions: two 3 x 3 average-pooling convolutions downsample the feature map, and a 1 x 1 convolution adjusts the feature map's dimension to match the output of the bottom encoder. The bottom encoder (shadow path) is deeper, consisting of several convolutions and residual blocks, with the stride of two convolutions set to 2 to downsample the feature map. Image semantic segmentation mainly serves to refine the shadow-removal quality of the three pictures in fig. 1.
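A minimal numpy sketch of the spatial bookkeeping in the dual encoder described above: two downsampling stages on the non-shadow path and two stride-2 convolutions on the shadow path reduce both feature maps by the same overall factor of 4, so the 1 x 1 convolution only has to match channel dimensions. The 2 x 2 pooling window and the 64 x 64 input size are illustrative assumptions:

```python
import numpy as np

def avg_pool2(x):
    """Average-pool a (H, W) feature map by a factor of 2 in each dimension."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

feat = np.random.rand(64, 64)                # hypothetical single-channel map
top = avg_pool2(avg_pool2(feat))             # non-shadow path: two pooling stages
bottom_shape = (64 // 2 // 2, 64 // 2 // 2)  # shadow path: two stride-2 convs
# Both paths end at the same spatial resolution, as required for alignment.
```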
As shown in fig. 2, to recover shadow pixels it is essential to fully explore and exploit the potential context cues of the non-shadow regions. The invention therefore proposes a new Transformer layer with region-aware cross-attention (RCA) that transfers sufficient context information from the non-shadow regions to the shadow regions: within the layer, the N cross-region alignment blocks absorb the features of both regions and establish a connection from the non-shadow region to the shadow region through the region-aware cross-attention, so that each shadow pixel is restored with enough non-shadow context before decoding and post-processing.
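The region-aware cross-attention can be illustrated with a small numpy sketch, assuming single-head attention over flattened per-pixel features: queries come from the shadow region, keys and values from the non-shadow region, so every restored shadow pixel is a context-weighted mixture of non-shadow features. The projection matrices, shapes, and single-head form are hypothetical simplifications of the block in fig. 2:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def region_aware_cross_attention(feats, shadow_mask, wq, wk, wv):
    """Restore shadow-pixel features by attending only to non-shadow pixels.

    feats: (N, d) per-pixel features; shadow_mask: (N,) bool, True = shadow.
    """
    q = feats[shadow_mask] @ wq        # queries from the shadow region
    k = feats[~shadow_mask] @ wk       # keys from the non-shadow region
    v = feats[~shadow_mask] @ wv       # values from the non-shadow region
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]))
    out = feats.copy()
    out[shadow_mask] = attn @ v        # aggregate non-shadow context
    return out
```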
Step 3: initialize the network model, select an optimizer, and set the training parameters.
The network model is built with the PyTorch framework and trained by gradient back-propagation. From the data set, 20 pictures are selected as the test set and 72 pictures as the training set. The batch size is set to 64, and the learning rate decays dynamically from 0.001 to 0.00015.
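The dynamic learning-rate decay from 0.001 to 0.00015 might be scheduled as follows; the exponential shape is an assumption, since only the endpoints are stated in the text:

```python
def learning_rate(step, total_steps, lr_start=1e-3, lr_end=1.5e-4):
    """Exponentially interpolate the learning rate from lr_start to lr_end.

    The exponential schedule is an illustrative assumption; the patent only
    specifies the start (0.001) and end (0.00015) values.
    """
    t = step / total_steps
    return lr_start * (lr_end / lr_start) ** t
```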
Step 4: the network model is optimized and saved using the loss function.
CRFormer is trained in an end-to-end fashion. The total loss L is:
L = ω1·L_rec + ω2·L_spa (1)
where L_rec is the reconstruction loss, L_spa is the spatial loss, and ω1 and ω2 are the weights of the respective loss terms.
Specifically, a pixel-level L1 distance ensures that the pixel intensities of the shadow-removal result match those of the real image:
L_rec = ||Î - I_gt||1 + ||I_r - I_gt||1 (2)
where Î denotes the shadow-removed image, I_r the restored pixel intensities (the post-processed result), I_gt the ground-truth image intensities, and ||·||1 the pixel-level L1 distance.
In addition, the spatial consistency of the images is enhanced by preserving the variability between adjacent regions of the shadow-removed image and its corresponding ground-truth shadow-free version:
L_spa = Φ(Î, I_gt) + Φ(I_r, I_gt) (3)
where Î denotes the shadow-removed image, I_r the restored pixel intensities, I_gt the ground-truth image intensities, and Φ the spatial-consistency measure weighting the loss.
The loss function not only removes shadow images but also processes the averages of local image regions in the MLP network, improving the MLP network's feature-extraction effect.
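Putting Eqs. (1)-(3) together, a numpy sketch of the total loss might look like this. The weights ω1 and ω2, the mean-based L1, and the choice of Φ as an L1 distance between 4 x 4 local-mean maps (echoing the local-area averaging mentioned above) are all illustrative assumptions, not values given in the text:

```python
import numpy as np

def l1(a, b):
    """Mean pixel-level L1 distance."""
    return np.abs(a - b).mean()

def total_loss(pred, refined, gt, w1=1.0, w2=0.1):
    """Sketch of L = w1*L_rec + w2*L_spa following Eqs. (1)-(3).

    pred: shadow-removed image (I-hat); refined: post-processed result (I_r);
    gt: ground-truth shadow-free image (I_gt). Weights are hypothetical.
    """
    def local_means(x, s=4):
        # Average over s x s regions to compare adjacent-region statistics.
        h, w = x.shape
        return x[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

    def phi(a, b):
        # Assumed spatial-consistency measure: L1 between local-mean maps.
        return l1(local_means(a), local_means(b))

    l_rec = l1(pred, gt) + l1(refined, gt)
    l_spa = phi(pred, gt) + phi(refined, gt)
    return w1 * l_rec + w2 * l_spa
```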
Step 5: load the optimal network model generated during training, acquire the test set, input it into the network model, and render a shadow-free image.
The trained network model is loaded, and shadow-removed rendered images are generated from the pictures in the data set.
The foregoing description covers only preferred embodiments of the present application and is not intended to limit it; those skilled in the art may make various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its protection scope.

Claims (8)

1. An image shadow removal method for a neural radiance field based on a cross-region Transformer, comprising:
S1: acquiring the fern data set under nerf_llff_data;
S2: constructing a shadow removal network model that fuses an MLP neural network with a cross-region Transformer;
S3: initializing the shadow removal network model, selecting an optimizer, and setting the network training parameters;
S4: optimizing the shadow removal network model with a loss function and saving it;
S5: loading the optimal shadow removal network model generated during training, acquiring a test set, inputting it into the model, and rendering a shadow-free image.
2. The image shadow removal method for a neural radiance field based on a cross-region Transformer according to claim 1, wherein in step S2 a CRFormer module is added to the MLP neural network; the CRFormer module removes shadows from the image: a dual encoder in the CRFormer module first extracts features from a given shadow image, a cross-region alignment block then absorbs the shadow region, and finally the CRFormer module restores the shadow region; the MLP neural network renders the synthesized image.
3. The image shadow removal method for a neural radiance field based on a cross-region Transformer according to claim 1, wherein step S3 builds the shadow removal network model with the PyTorch framework, selects gradient back-propagation for training, and initializes the learning rate.
4. The image shadow removal method for a neural radiance field based on a cross-region Transformer according to claim 1, wherein step S4 uses the loss function to optimize the reconstruction loss and the spatial loss, removes the shadow image, and processes the averages of local image regions in the MLP network.
5. An image shadow removal system for a neural radiance field based on a cross-region Transformer, comprising:
a data set module, which acquires the fern data set under nerf_llff_data;
a model module, which constructs a shadow removal network model fusing an MLP neural network with a cross-region Transformer;
an initialization module, which initializes the shadow removal network model, selects an optimizer, and sets the network training parameters;
an optimization module, which optimizes the shadow removal network model with a loss function and saves it;
an optimal-model module, which loads the optimal shadow removal network model generated during training, acquires a test set, inputs it into the model, and renders a shadow-free image.
6. The image shadow removal system for a neural radiance field based on a cross-region Transformer according to claim 5, wherein the model module comprises an MLP neural network module and a CRFormer module; the CRFormer module removes shadows from the image: a dual encoder in the CRFormer module first extracts features from a given shadow image, a cross-region alignment block then absorbs the shadow region, and finally the CRFormer module restores the shadow region; the MLP neural network module renders the synthesized image.
7. The image shadow removal system for a neural radiance field based on a cross-region Transformer according to claim 5, wherein the initialization module builds the shadow removal network model with the PyTorch framework, selects gradient back-propagation for training, and initializes the learning rate.
8. The image shadow removal system for a neural radiance field based on a cross-region Transformer according to claim 5, wherein the optimization module uses the loss function to optimize the reconstruction loss and the spatial loss, removes the shadow image, and processes the averages of local image regions in the MLP neural network.
CN202310378434.4A 2023-04-11 2023-04-11 Cross-region Transformer-based image shadow removal method and system for a neural radiance field Pending CN116385305A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310378434.4A CN116385305A (en) 2023-04-11 2023-04-11 Cross-region Transformer-based image shadow removal method and system for a neural radiance field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310378434.4A CN116385305A (en) 2023-04-11 2023-04-11 Cross-region Transformer-based image shadow removal method and system for a neural radiance field

Publications (1)

Publication Number Publication Date
CN116385305A (en) 2023-07-04

Family

ID=86967309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310378434.4A Pending CN116385305A (en) 2023-04-11 2023-04-11 Cross-region Transformer-based image shadow removal method and system for a neural radiance field

Country Status (1)

Country Link
CN (1) CN116385305A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116883578A (en) * 2023-09-06 2023-10-13 腾讯科技(深圳)有限公司 Image processing method, device and related equipment
CN116883578B (en) * 2023-09-06 2023-12-19 腾讯科技(深圳)有限公司 Image processing method, device and related equipment
CN117292040A (en) * 2023-11-27 2023-12-26 北京渲光科技有限公司 Method, apparatus and storage medium for new view synthesis based on neural rendering
CN117292040B (en) * 2023-11-27 2024-03-08 北京渲光科技有限公司 Method, apparatus and storage medium for new view synthesis based on neural rendering

Similar Documents

Publication Publication Date Title
Jiang et al. Learning to see moving objects in the dark
Guo et al. Dense scene information estimation network for dehazing
CN116385305A (en) Cross-region Transformer-based image shadow removal method and system for a neural radiance field
Shih et al. Exemplar-based video inpainting without ghost shadow artifacts by maintaining temporal continuity
CN109462747B (en) DIBR system cavity filling method based on generation countermeasure network
KR102311796B1 (en) Method and Apparatus for Deblurring of Human Motion using Localized Body Prior
US11880935B2 (en) Multi-view neural human rendering
CN112991231B (en) Single-image super-image and perception image enhancement joint task learning system
CN108648264A (en) Underwater scene method for reconstructing based on exercise recovery and storage medium
CN113284061B (en) Underwater image enhancement method based on gradient network
CN115115516B (en) Real world video super-resolution construction method based on Raw domain
Li et al. Uphdr-gan: Generative adversarial network for high dynamic range imaging with unpaired data
CN107301662A (en) Compression restoration methods, device, equipment and the storage medium of depth image
Lv et al. Low-light image enhancement via deep Retinex decomposition and bilateral learning
CN114972134A (en) Low-light image enhancement method for extracting and fusing local and global features
Huang et al. Removing reflection from a single image with ghosting effect
CN106412560B (en) A kind of stereoscopic image generation method based on depth map
Chen et al. CERL: A unified optimization framework for light enhancement with realistic noise
CN111064905A (en) Video scene conversion method for automatic driving
CN114339030A (en) Network live broadcast video image stabilization method based on self-adaptive separable convolution
CN114862707A (en) Multi-scale feature recovery image enhancement method and device and storage medium
Kim et al. Light field angular super-resolution using convolutional neural network with residual network
Zhang et al. As-Deformable-As-Possible Single-image-based View Synthesis without Depth Prior
Xu et al. Direction-aware video demoireing with temporal-guided bilateral learning
CN112200756A (en) Intelligent bullet special effect short video generation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination