CN116934936A - Three-dimensional scene style migration method, device, equipment and storage medium - Google Patents

Three-dimensional scene style migration method, device, equipment and storage medium Download PDF

Info

Publication number
CN116934936A
Authority
CN
China
Prior art keywords
style
image
original
dimensional scene
style migration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311205617.2A
Other languages
Chinese (zh)
Inventor
陈尧森
刘跃根
罗天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sobey Digital Technology Co Ltd
Original Assignee
Chengdu Sobey Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Sobey Digital Technology Co Ltd filed Critical Chengdu Sobey Digital Technology Co Ltd
Priority to CN202311205617.2A priority Critical patent/CN116934936A/en
Publication of CN116934936A publication Critical patent/CN116934936A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/02 Non-photorealistic rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Graphics (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Geometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a three-dimensional scene style migration method, device, equipment and storage medium. First, RGB images from multiple view angles are collected as original images and preprocessed to obtain camera pose information. The original images and the camera pose information are then input into a neural radiance field model for training to construct an original three-dimensional scene. A style migration network performs style migration between the original images of the original three-dimensional scene and a style image to obtain style-migrated original images. Finally, the style-migrated original images are used as supervision data, and the style-migrated three-dimensional scene is obtained through optimization. Compared with the prior art, the method produces a better visual effect, does not require retraining the entire neural radiance field for each new style picture, and supports three-dimensional scene style migration for both artistic styles and realistic scene styles, giving it high practical value.

Description

Three-dimensional scene style migration method, device, equipment and storage medium
Technical Field
The application relates to the technical field of computer vision and machine learning, in particular to a three-dimensional scene style migration method, a device, equipment and a storage medium.
Background
In recent years, implicit three-dimensional representations based on Neural Radiance Fields (NeRF) have made great progress, and the three-dimensional scenes they reconstruct are highly photorealistic. Performing style migration on NeRF-based three-dimensional scenes therefore has strong application value, as it reduces the time spent on artistic creation and the level of expertise required.
Currently, some methods can transfer the artistic features of a single 2D image into a complete real 3D scene, thereby changing the style of the real scene. However, the style migration results obtained by these methods often suffer from blurring, inconsistent appearance and artifacts, and the migration result for each new style image must be trained from scratch, making it difficult to meet the requirements of practical applications.
Disclosure of Invention
The application aims to overcome the defects of the prior art by providing a three-dimensional scene style migration method, device, equipment and storage medium that address the poor migration quality and repeated retraining of existing approaches.
The aim of the application is achieved by the following technical scheme:
In a first aspect, the present application provides a three-dimensional scene style migration method, where the method includes:
collecting RGB images from multiple view angles as original images, and preprocessing the original images to obtain camera pose information;
inputting the original images and the camera pose information into a neural radiance field model for training, and constructing an original three-dimensional scene;
performing style migration between the original images of the original three-dimensional scene and a style image using a style migration network, to obtain style-migrated original images;
and using the style-migrated original images as supervision data, optimizing to obtain the style-migrated three-dimensional scene.
In one possible implementation, the step of preprocessing the original images to obtain camera pose information includes:
performing image screening and resolution adjustment on the original images to obtain adjusted original images;
and extracting image feature points from each adjusted original image, performing stereo matching on the extracted feature points across the multiple view angles to generate a sparse point cloud, and taking the sparse point cloud as the camera pose information.
In one possible implementation, the neural radiance field model includes a dense voxel grid and a feature voxel grid, and the step of inputting the original images and the camera pose information into the neural radiance field model for training and constructing an original three-dimensional scene includes:
inputting the original images and the camera pose information into the dense voxel grid and the feature voxel grid;
querying density information at spatial points by interpolating the dense voxel grid;
querying color information at spatial points by interpolating the feature voxel grid;
obtaining a rendered image from the density information and the color information using a rendering formula;
and computing the loss between the rendered image and the original image for back propagation.
In one possible embodiment, the density information is: σ(x) = softplus(interp(x, V_density)), where σ(·) is the volume density function, softplus is the activation function, x is the spatial point coordinate, V_density is the dense voxel grid, and interp(·) is the interpolation function;
the color information is: c(x) = interp(x, V_feature), where x is the spatial point coordinate and V_feature is the feature voxel grid;
the rendering formula is: C = Σ_{i=1}^{K} T_i α_i c_i + T_{K+1} c_bg, with T_i = Π_{j<i} (1 − α_j), where α_i is the attenuation parameter at the i-th sample point, K is the number of sample points along the ray, c_bg is the background color, and α_K is the attenuation parameter at the K-th point.
In one possible implementation, the step of performing style migration between the original images of the original three-dimensional scene and the style image using a style migration network, to obtain style-migrated original images, includes:
extracting style features and content features from the original image and the style image, respectively, using a pretrained VGG19 convolutional neural network;
fusing the style features and the content features using a feature pyramid network;
applying an image style transfer network that aligns the mean and variance of the fused style features and content features to obtain a stylized image;
filtering outliers introduced by feature transfer in the stylized image using a Gaussian filter to obtain a result image;
converting the result image to the YUV domain, and processing the result image and the style image in the YUV domain using the image style transfer network;
and splicing the Y channel obtained from the result image and the style image with the UV channels of the result image to obtain a style-migrated original image.
In one possible implementation, the step of using the style-migrated original images as supervision data and optimizing to obtain the style-migrated three-dimensional scene includes:
rendering the original three-dimensional scene in a stylized manner by volume rendering to obtain a stylized rendered image;
and computing the loss between the stylized rendered image and the style-migrated original image and back-propagating it.
In one possible implementation, the step of rendering the original three-dimensional scene in a stylized manner by volume rendering includes:
sampling the feature voxel grid to obtain original scene color information;
extracting style features of the style image using a pretrained style feature encoder;
processing the style features with a hyper-network to generate control parameters;
adjusting the weights of the color generation module using the control parameters;
and performing feature migration on the original color information to obtain the final rendering result.
In a second aspect, the present application proposes a three-dimensional scene style migration apparatus, the apparatus comprising:
a preprocessing module, configured to collect RGB images from multiple view angles as original images and preprocess the original images to obtain camera pose information;
a training module, configured to input the original images and the camera pose information into the neural radiance field model for training and construct the original three-dimensional scene;
a style migration module, configured to perform style migration between the original images of the original three-dimensional scene and the style image using a style migration network, to obtain style-migrated original images;
and a scene generation module, configured to use the style-migrated original images as supervision data and optimize to obtain the style-migrated three-dimensional scene.
In a third aspect, the present application also proposes a computer device comprising a processor and a memory, the memory having stored therein a computer program, the computer program being loaded and executed by the processor to implement the three-dimensional scene style migration method according to any of the first aspects.
In a fourth aspect, the present application also proposes a computer readable storage medium having stored therein a computer program, the computer program being loaded and executed by a processor to implement the three-dimensional scene style migration method according to any of the first aspects.
The main scheme of the application and its various further alternatives can be freely combined to form multiple schemes, all of which the application may adopt and claim; non-conflicting alternatives can likewise be freely combined with one another and with other options. Various such combinations will be apparent to those skilled in the art from this disclosure, and no attempt is made to enumerate them exhaustively here.
The application discloses a three-dimensional scene style migration method, device, equipment and storage medium. First, RGB images from multiple view angles are collected as original images and preprocessed to obtain camera pose information. The original images and the camera pose information are then input into a neural radiance field model for training to construct an original three-dimensional scene. A style migration network performs style migration between the original images of the original three-dimensional scene and a style image to obtain style-migrated original images. Finally, the style-migrated original images are used as supervision data, and the style-migrated three-dimensional scene is obtained through optimization. Compared with the prior art, the method produces a better visual effect, does not require retraining the entire neural radiance field for each new style picture, and supports three-dimensional scene style migration for both artistic styles and realistic scene styles, giving it high practical value.
Drawings
Fig. 1 shows a flow chart of a three-dimensional scene style migration method according to an embodiment of the present application.
Fig. 2 shows a schematic flow chart of style migration according to an embodiment of the present application.
Fig. 3 shows a schematic diagram of an embodiment of three-dimensional scene style migration proposed by an embodiment of the present application.
Detailed Description
Other advantages and effects of the present application will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present application with reference to specific examples. The application may also be practiced or carried out in other, different embodiments, and the details of this description may be modified or varied in various ways without departing from the spirit and scope of the application. It should be noted that the following embodiments, and the features within them, may be combined with each other provided they do not conflict.
All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
In the prior art, some methods can transfer the artistic features of a single 2D image into a complete real 3D scene, thereby changing the style of the real scene. However, the style migration results obtained by these methods often suffer from blurring, inconsistent appearance and artifacts, and the migration result for each new style image must be trained from scratch, making it difficult to meet the requirements of practical applications.
Therefore, in order to solve the above-mentioned problems, embodiments of the present application provide a three-dimensional scene style migration method, apparatus, device, and storage medium, which achieve a better visual effect than the prior art, do not require retraining the entire neural radiance field for each new style picture, and support three-dimensional scene style migration for both artistic and realistic scene styles, giving them high practical value. They are described in detail below.
3D scene style transfer: the appearance of a 3D scene can be edited through texture generation and semantic view synthesis. Changing a scene's style with a reference image is also a popular topic in 3D-aware style transfer research, where spatial consistency is one of the main problems to be solved.
Referring to fig. 1, fig. 1 shows a flow chart of a three-dimensional scene style migration method according to an embodiment of the present application, where the method includes the following steps:
s100, acquiring RGB images under a plurality of view angles as original images, and carrying out data preprocessing on the original images to obtain camera position and posture information.
In reverse engineering, the set of points measured on a product's surface by a measuring instrument is also called a point cloud. When a three-dimensional coordinate measuring machine is used, the number of points obtained is relatively small and the spacing between points is relatively large, so the result is called a sparse point cloud.
The step of preprocessing the original images to obtain the camera pose information includes the following sub-steps:
performing image screening and resolution adjustment on the original images to obtain adjusted original images;
and extracting image feature points from each adjusted original image, performing stereo matching on the extracted feature points across the multiple view angles to generate a sparse point cloud, and taking the sparse point cloud as the camera pose information.
First, RGB images from multiple view angles of the 3D scene are collected as original images. The original images are screened to remove blurred frames and keep a set of sharper, higher-quality images, and the resolution of this set is then adjusted (reduced or increased) to obtain the adjusted original images. Feature points are extracted from the adjusted original images, and stereo matching of the feature points across different view angles yields a set of image pairs and a sparse point cloud, which represents the camera pose information.
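This preprocessing step is essentially a structure-from-motion pipeline. The sketch below illustrates the feature-extraction, matching and pose-recovery idea for a single image pair using OpenCV; the library choice, SIFT features, the known intrinsic matrix K and all thresholds are illustrative assumptions rather than details taken from the patent, which in practice would typically rely on an incremental SfM tool that also produces the sparse point cloud for all views.

```python
import cv2
import numpy as np

def estimate_relative_pose(img_a, img_b, K):
    """Estimate the relative camera pose between two views.

    img_a, img_b: grayscale images of the same scene from different view angles.
    K:           3x3 camera intrinsic matrix (assumed known here).
    Returns (R, t): rotation and unit-norm translation of view b w.r.t. view a.
    """
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)

    # Match descriptors and keep the better correspondences (Lowe's ratio test).
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    pts_a = np.float32([kp_a[m.queryIdx].pt for m in good])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in good])

    # Essential matrix from the matched feature points, then recover R, t.
    E, mask = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K, mask=mask)
    return R, t
```

Triangulating the matched points with the recovered poses would then give the sparse point cloud; the snippet only shows the pairwise matching and pose-recovery step.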
S200, inputting the original images and the camera pose information into a neural radiance field model for training, and constructing an original three-dimensional scene.
The neural radiance field model is trained to represent the original three-dimensional scene using the RGB original images and the camera pose information; construction of the original three-dimensional scene is divided into two parts: appearance feature construction and geometry construction.
The neural radiance field model comprises a dense voxel grid and a feature voxel grid. The original images and camera pose information are input into the neural radiance field model for training, and the step of constructing an original three-dimensional scene includes:
inputting the original images and the camera pose information into the dense voxel grid and the feature voxel grid;
querying density information at spatial points by interpolating the dense voxel grid;
querying color information at spatial points by interpolating the feature voxel grid;
obtaining a rendered image from the density information and the color information using a rendering formula;
and computing the loss between the rendered image and the original image for back propagation.
Geometry construction is represented using a dense voxel grid, which aims to efficiently query the density information of any spatial point by interpolation: σ(x) = softplus(interp(x, V_density)), where σ(·) is the volume density function, softplus is the activation function, x is the spatial point coordinate, V_density is the dense voxel grid, and interp(·) is the interpolation function;
appearance feature construction is represented using a feature voxel grid, which aims to efficiently query the color information of any spatial point by interpolation: c(x) = interp(x, V_feature), where x is the spatial point coordinate and V_feature is the feature voxel grid;
the rendering formula is: C = Σ_{i=1}^{K} T_i α_i c_i + T_{K+1} c_bg, with T_i = Π_{j<i} (1 − α_j), where α_i is the attenuation parameter at the i-th sample point, K is the number of sample points along the ray, c_bg is the background color, and α_K is the attenuation parameter at the K-th point.
Once the density and color information of any point in the scene can be obtained, construction of the three-dimensional scene is complete. Whenever the scene needs to be rendered from a given view angle, it is computed according to the volume rendering formula above.
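A minimal PyTorch sketch of the two voxel grids and the volume-rendering accumulation described above; the grid resolution, feature dimension, the shallow RGB network and the use of grid_sample for trilinear interpolation are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn.functional as F

class VoxelRadianceField(torch.nn.Module):
    def __init__(self, res=128, feat_dim=12):
        super().__init__()
        # Dense voxel grid storing raw density, feature voxel grid storing colour features.
        self.density_grid = torch.nn.Parameter(torch.zeros(1, 1, res, res, res))
        self.feature_grid = torch.nn.Parameter(torch.zeros(1, feat_dim, res, res, res))
        # Shallow MLP turning interpolated features (+ view direction) into RGB.
        self.rgb_net = torch.nn.Sequential(
            torch.nn.Linear(feat_dim + 3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 3))

    def _interp(self, grid, x):
        # x: (N, 3) points normalised to [-1, 1]; trilinear interpolation into the grid.
        g = x.view(1, -1, 1, 1, 3)
        out = F.grid_sample(grid, g, align_corners=True)   # (1, C, N, 1, 1)
        return out.view(grid.shape[1], -1).t()             # (N, C)

    def forward(self, x, d):
        sigma = F.softplus(self._interp(self.density_grid, x)).squeeze(-1)  # density
        feat = self._interp(self.feature_grid, x)                           # colour features
        rgb = torch.sigmoid(self.rgb_net(torch.cat([feat, d], dim=-1)))     # per-point colour
        return sigma, rgb

def render_ray(sigma, rgb, delta, bg=1.0):
    """Accumulate K samples along one ray: C = sum_i T_i * alpha_i * c_i + T_{K+1} * c_bg.

    sigma: (K,) densities, rgb: (K, 3) colours, delta: (K,) distances between samples,
    bg: scalar (grey-level) background colour.
    """
    alpha = 1.0 - torch.exp(-sigma * delta)                                 # per-sample opacity
    trans = torch.cumprod(torch.cat([alpha.new_ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    color = (trans.unsqueeze(-1) * alpha.unsqueeze(-1) * rgb).sum(dim=0)
    color = color + trans[-1] * (1.0 - alpha[-1]) * bg                      # background term
    return color
```

Training the first stage then consists of rendering rays with this model and back-propagating the reconstruction loss against the original images.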
S300, performing style migration between the original images of the original three-dimensional scene and the style image using a style migration network, to obtain style-migrated original images.
Generating the supervision data for scene style migration: style migration is performed between the original images and the style image using a style migration network, and the results serve as supervision data. Supervision data generation comprises feature extraction from the original and style images, multi-scale feature fusion, preliminary style migration, and style migration optimization.
Specifically, the step of performing style migration between the original images of the original three-dimensional scene and the style image using a style migration network, to obtain style-migrated original images, includes:
extracting style features and content features from the original image and the style image, respectively, using a pretrained VGG19 convolutional neural network;
fusing the style features and the content features using a feature pyramid network;
applying an image style transfer network that aligns the mean and variance of the fused style features and content features to obtain a stylized image;
filtering outliers introduced by feature transfer in the stylized image using a Gaussian filter to obtain a result image;
converting the result image to the YUV domain, and processing the result image and the style image in the YUV domain using the image style transfer network;
and splicing the Y channel obtained from the result image and the style image with the UV channels of the result image to obtain a style-migrated original image.
Feature extraction from the original image and the style image uses a pretrained VGG19 convolutional neural network to extract style features and content features. Multi-scale feature fusion uses a feature pyramid network (FPN) to fuse image features at different resolutions and strengthen feature extraction. Preliminary style migration uses the common image style transfer network AdaIN to obtain an initial style migration result from the extracted style and content features. Style migration optimization comprises the following sub-steps: filtering the outliers introduced by feature transfer with a Gaussian filter so that the feature transfer is smoother; converting the result image and the style image to the YUV domain and further processing them with AdaIN; and splicing the Y channel of the obtained result with the UV channels of the original result and converting back to RGB to obtain the final result.
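The mean/variance alignment performed in the preliminary style migration step can be sketched as a generic adaptive instance normalization (AdaIN) operation on encoder feature maps; this is the standard AdaIN formulation rather than the patent's exact network.

```python
import torch

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalisation: re-scale the content features so their
    per-channel mean/variance match those of the style features.

    content_feat, style_feat: (N, C, H, W) feature maps from a VGG-style encoder.
    """
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    # Normalise the content statistics, then re-inject the style statistics.
    return s_std * (content_feat - c_mean) / c_std + s_mean
```

Decoding the aligned features back to image space then gives the preliminary stylized image that the subsequent Gaussian filtering and YUV processing refine.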
Referring to fig. 2, which shows the style migration flow of an embodiment of the present application: the processed style image and the full-resolution content are combined to obtain a low-resolution stylized image, which is filtered with a Gaussian filter and converted to the YUV domain, from which the UV channels are taken. The full-resolution content is passed through an RGB-to-YUV conversion, processed in the YUV domain by the style transfer network, and its Y channel is taken. The UV channels and the Y channel are then spliced and finally converted back to an RGB image, yielding the full-resolution result.
The two-dimensional photorealistic style transfer framework shown in fig. 2 accepts a full-resolution style image and a full-resolution content image and transfers the realistic style of the style image onto the content image. Within this framework the images are converted to YUV channels, and finally the generated stylized UV channels are fused with the Y channel derived from the original content image to obtain the final photorealistic stylized image.
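A sketch of the Gaussian filtering and Y/UV splicing described above, using OpenCV colour-space conversions; the kernel size, sigma and function names are illustrative assumptions.

```python
import cv2

def fuse_luminance(stylized_rgb, content_branch_rgb, ksize=5, sigma=1.0):
    """Keep the stylized chrominance (UV) but take the luminance (Y) from the
    full-resolution branch, as in the YUV splicing step.

    stylized_rgb:       stylized result (upsampled to the content resolution).
    content_branch_rgb: full-resolution image from the Y-channel branch.
    Both are uint8 RGB arrays of the same size.
    """
    # Smooth outliers introduced by feature transfer.
    stylized_rgb = cv2.GaussianBlur(stylized_rgb, (ksize, ksize), sigma)

    # Convert both images to the YUV domain.
    stylized_yuv = cv2.cvtColor(stylized_rgb, cv2.COLOR_RGB2YUV)
    content_yuv = cv2.cvtColor(content_branch_rgb, cv2.COLOR_RGB2YUV)

    # Splice: Y channel from the full-resolution branch, UV channels from the stylized branch.
    fused = stylized_yuv.copy()
    fused[..., 0] = content_yuv[..., 0]

    return cv2.cvtColor(fused, cv2.COLOR_YUV2RGB)
```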
S400, using the style-migrated original images as supervision data and optimizing to obtain the style-migrated three-dimensional scene.
Three-dimensional scene style migration: the style-migrated images are used as supervision, and the style-migrated three-dimensional scene is obtained through optimization and training.
Using the style-migrated original images as supervision data, the step of optimizing to obtain the style-migrated three-dimensional scene includes:
rendering the original three-dimensional scene in a stylized manner by volume rendering to obtain a stylized rendered image;
and computing the loss between the stylized rendered image and the style-migrated original image and back-propagating it.
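The optimization stage then amounts to a reconstruction loss between the stylized volume-rendered image and the style-transferred supervision image, back-propagated into the trainable style components. A minimal sketch, assuming a plain MSE loss (the patent does not specify the exact loss form):

```python
import torch
import torch.nn.functional as F

def style_stage_step(rendered, supervision, optimizer):
    """One optimisation step of the style-training stage.

    rendered:    stylised image produced by volume rendering the scene (requires grad).
    supervision: style-migrated original image from the 2D style transfer network.
    optimizer:   optimiser over the style components (hyper-network / colour module);
                 the geometry grids are assumed frozen at this stage.
    """
    loss = F.mse_loss(rendered, supervision)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```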
The stylized three-dimensional scene rendering specifically comprises the following sub-steps: sampling the feature voxel grid to obtain original scene color information;
extracting style features of the style image using a pretrained style feature encoder;
processing the style features with a hyper-network to generate control parameters;
adjusting the weights of the color generation module using the control parameters;
and performing feature migration on the original color information to obtain the final rendering result.
The style feature extraction uses a pretrained style feature encoder with a VGGNet structure, and the color generation module uses an MLP (multi-layer perceptron) to obtain the final color information C from the style features and the original color: C = MLP(c, x, d),
where c is the original color, x is the coordinates of the point, and d is the view direction; the weights of this MLP are modulated by the hyper-network according to the style features. Different style images can therefore drive style migration of the three-dimensional scene without retraining.
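A minimal sketch of the hyper-network idea: a style-feature vector is mapped to the weights of a small colour MLP that maps the original colour c, position x and view direction d to the final colour C. The layer sizes, the style-feature dimension and the single hidden layer are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HyperColorNet(nn.Module):
    """Predict the final colour C = MLP(c, x, d), with MLP weights generated by a hyper-network."""

    def __init__(self, style_dim=256, hidden=64):
        super().__init__()
        self.in_dim, self.hidden = 3 + 3 + 3, hidden            # c (3) + x (3) + d (3)
        n_params = (self.in_dim + 1) * hidden + (hidden + 1) * 3
        # Hyper-network: style feature vector -> flat parameter vector of the colour MLP.
        self.hyper = nn.Sequential(nn.Linear(style_dim, 256), nn.ReLU(),
                                   nn.Linear(256, n_params))

    def forward(self, c, x, d, style_feat):
        params = self.hyper(style_feat)                          # (n_params,)
        i = (self.in_dim + 1) * self.hidden
        # Unpack the generated parameters into two linear layers.
        w1 = params[: self.in_dim * self.hidden].view(self.hidden, self.in_dim)
        b1 = params[self.in_dim * self.hidden : i]
        w2 = params[i : i + self.hidden * 3].view(3, self.hidden)
        b2 = params[i + self.hidden * 3 :]

        h = torch.relu(torch.cat([c, x, d], dim=-1) @ w1.t() + b1)
        return torch.sigmoid(h @ w2.t() + b2)                    # final colour C in [0, 1]
```

Because only the hyper-network (and not the scene geometry) depends on the style image, swapping in a new style image changes the generated MLP weights without retraining the radiance field.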
In one possible embodiment, please refer to fig. 3, which shows a schematic diagram of an embodiment of three-dimensional scene style migration proposed by an embodiment of the present application. The original images are used for training: density information is obtained with the density voxel grid, and color information is obtained from the feature voxel grid and the view direction through a shallow MLP. The color information is style-encoded together with the style image, and a hyper-network updates the weights of this MLP. The style image and the content image are input into the YUV style network to obtain the stylized image.
Referring again to fig. 3, in this framework, training of photorealistic style transfer in a 3D scene is divided into two stages. The first stage is geometric training of a single scene: the density voxel grid and the feature voxel grid directly represent the scene, the density voxel grid outputs the density, and the feature voxel grid together with the shallow MLP of RGBNet predicts the color. The second stage is style training: the parameters of the density voxel grid and the feature voxel grid are frozen, and the features of the reference style image are used as the input to the hyper-network, which in turn controls the input of RGBNet. The hyper-network is thus optimized jointly, achieving photorealistic style transfer of the scene for an image of any style.
Compared with the prior art, the embodiment of the application has the following beneficial effects:
First, a NeRF model represents the initial three-dimensional scene, an effective two-dimensional style migration image is obtained through the style migration network, and a three-dimensional scene style migration module is then trained on the original image and the stylized image together, realizing style migration of the three-dimensional scene.
Second, three-dimensional scene style migration can be achieved for both artistic styles and realistic scene styles.
Third, style migration of the three-dimensional scene can be completed for any style picture without training from scratch again.
The following provides a possible implementation of the three-dimensional scene style migration apparatus, which executes the steps, and achieves the corresponding technical effects, of the three-dimensional scene style migration method shown in the above embodiments and their possible implementations. The apparatus comprises:
a preprocessing module, configured to collect RGB images from multiple view angles as original images and preprocess the original images to obtain camera pose information;
a training module, configured to input the original images and the camera pose information into the neural radiance field model for training and construct the original three-dimensional scene;
a style migration module, configured to perform style migration between the original images of the original three-dimensional scene and the style image using a style migration network, to obtain style-migrated original images;
and a scene generation module, configured to use the style-migrated original images as supervision data and optimize to obtain the style-migrated three-dimensional scene.
An embodiment further provides a computer device that can perform the steps of any embodiment of the three-dimensional scene style migration method provided by the embodiments of the present application, and can therefore achieve the beneficial effects of that method; details already given in the foregoing embodiments are not repeated here.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor. To this end, an embodiment of the present application provides a storage medium having stored therein a plurality of instructions capable of being loaded by a processor to perform the steps of any one of the embodiments of the three-dimensional scene style migration method provided by the embodiment of the present application.
Wherein the storage medium may include: read Only Memory (ROM), random access Memory (RAM, random Access Memory), magnetic or optical disk, and the like.
Because of the instructions stored in the storage medium, the steps of any three-dimensional scene style migration method embodiment provided by the embodiments of the present application can be executed, so the beneficial effects of any such method can be achieved; details already given in the foregoing embodiments are not repeated here.
The foregoing description of the preferred embodiments of the application is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the application.

Claims (10)

1. A method for style migration of a three-dimensional scene, the method comprising:
collecting RGB images from multiple view angles as original images, and preprocessing the original images to obtain camera pose information;
inputting the original images and the camera pose information into a neural radiance field model for training, and constructing an original three-dimensional scene;
performing style migration between the original images of the original three-dimensional scene and a style image using a style migration network, to obtain style-migrated original images;
and using the style-migrated original images as supervision data, optimizing to obtain the style-migrated three-dimensional scene.
2. The three-dimensional scene style migration method according to claim 1, wherein the step of preprocessing the original images to obtain camera pose information comprises:
performing image screening and resolution adjustment on the original images to obtain adjusted original images;
and extracting image feature points from each adjusted original image, performing stereo matching on the extracted feature points across the multiple view angles to generate a sparse point cloud, and taking the sparse point cloud as the camera pose information.
3. The three-dimensional scene style migration method according to claim 1, wherein the step of inputting the original images and the camera pose information into a neural radiance field model for training, and constructing an original three-dimensional scene, comprises:
inputting the original images and the camera pose information into a dense voxel grid and a feature voxel grid;
querying density information at spatial points by interpolating the dense voxel grid;
querying color information at spatial points by interpolating the feature voxel grid;
obtaining a rendered image from the density information and the color information using a rendering formula;
and computing the loss between the rendered image and the original image for back propagation.
4. The three-dimensional scene style migration method according to claim 3, wherein the density information is: σ(x) = softplus(interp(x, V_density)), wherein σ(·) is the volume density function, softplus is the activation function, interp(·) is the interpolation function, x is the spatial point coordinate, and V_density is the dense voxel grid;
the color information is: c(x) = interp(x, V_feature), wherein V_feature is the feature voxel grid;
the rendering formula is: C = Σ_{i=1}^{K} T_i α_i c_i + T_{K+1} c_bg, with T_i = Π_{j<i} (1 − α_j), wherein α_i is the attenuation parameter at the i-th sample point, K is the number of sample points along the ray, c_bg is the background color, and α_K is the attenuation parameter at the K-th point.
5. The three-dimensional scene style migration method according to claim 1, wherein the step of performing style migration between the original images of the original three-dimensional scene and the style image using a style migration network, to obtain style-migrated original images, comprises:
extracting style features and content features from the original image and the style image, respectively, using a pretrained VGG19 convolutional neural network;
fusing the style features and the content features using a feature pyramid network;
applying an image style transfer network that aligns the mean and variance of the fused style features and content features to obtain a stylized image;
filtering outliers introduced by feature transfer in the stylized image using a Gaussian filter to obtain a result image;
converting the result image to the YUV domain, and processing the result image and the style image in the YUV domain using the image style transfer network;
and splicing the Y channel obtained from the result image and the style image with the UV channels of the result image to obtain a style-migrated original image.
6. The three-dimensional scene style migration method according to claim 1, wherein the step of using the style-migrated original images as supervision data and optimizing to obtain the style-migrated three-dimensional scene comprises:
rendering the original three-dimensional scene in a stylized manner by volume rendering to obtain a stylized rendered image;
and computing the loss between the stylized rendered image and the style-migrated original image and back-propagating it.
7. The three-dimensional scene style migration method according to claim 6, wherein the step of rendering the original three-dimensional scene in a stylized manner by volume rendering comprises:
sampling the feature voxel grid to obtain original scene color information;
extracting style features of the style image using a pretrained style feature encoder;
processing the style features with a hyper-network to generate control parameters;
adjusting the weights of the color generation module using the control parameters;
and performing feature migration on the original color information to obtain the final rendering result.
8. A three-dimensional scene style migration apparatus, the apparatus comprising:
a preprocessing module, configured to collect RGB images from multiple view angles as original images and preprocess the original images to obtain camera pose information;
a training module, configured to input the original images and the camera pose information into a neural radiance field model for training and construct an original three-dimensional scene;
a style migration module, configured to perform style migration between the original images of the original three-dimensional scene and a style image using a style migration network, to obtain style-migrated original images;
and a scene generation module, configured to use the style-migrated original images as supervision data and optimize to obtain the style-migrated three-dimensional scene.
9. A computer device comprising a processor and a memory, the memory having stored therein a computer program that is loaded and executed by the processor to implement the three-dimensional scene style migration method of any of claims 1-7.
10. A computer readable storage medium having stored therein a computer program that is loaded and executed by a processor to implement the three-dimensional scene style migration method of any of claims 1-7.
CN202311205617.2A 2023-09-19 2023-09-19 Three-dimensional scene style migration method, device, equipment and storage medium Pending CN116934936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311205617.2A CN116934936A (en) 2023-09-19 2023-09-19 Three-dimensional scene style migration method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311205617.2A CN116934936A (en) 2023-09-19 2023-09-19 Three-dimensional scene style migration method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116934936A 2023-10-24

Family

ID=88377554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311205617.2A Pending CN116934936A (en) 2023-09-19 2023-09-19 Three-dimensional scene style migration method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116934936A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117541732A (en) * 2024-01-09 2024-02-09 成都信息工程大学 Text-guided neural radiation field building scene stylization method
CN118096978A (en) * 2024-04-25 2024-05-28 深圳臻像科技有限公司 3D artistic content rapid generation method based on arbitrary stylization

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129708A (en) * 2010-12-10 2011-07-20 北京邮电大学 Fast multilevel imagination and reality occlusion method at actuality enhancement environment
CN113706714A (en) * 2021-09-03 2021-11-26 中科计算技术创新研究院 New visual angle synthesis method based on depth image and nerve radiation field
CN114298895A (en) * 2021-12-24 2022-04-08 成都索贝数码科技股份有限公司 Image realistic style migration method, device, equipment and storage medium
WO2022095757A1 (en) * 2020-11-09 2022-05-12 华为技术有限公司 Image rendering method and apparatus
CN115409937A (en) * 2022-08-19 2022-11-29 中国人民解放军战略支援部队信息工程大学 Facial video expression migration model construction method based on integrated nerve radiation field and expression migration method and system
CN115587930A (en) * 2022-12-12 2023-01-10 成都索贝数码科技股份有限公司 Image color style migration method, device and medium
CN115661403A (en) * 2022-10-13 2023-01-31 阿里巴巴(中国)有限公司 Explicit radiation field processing method, device and storage medium
CN116310028A (en) * 2023-03-07 2023-06-23 上海学深智能科技有限公司 Style migration method and system of three-dimensional face model
CN116418961A (en) * 2023-06-09 2023-07-11 深圳臻像科技有限公司 Light field display method and system based on three-dimensional scene stylization
CN116543086A (en) * 2023-05-04 2023-08-04 阿里巴巴达摩院(杭州)科技有限公司 Nerve radiation field processing method and device and electronic equipment

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129708A (en) * 2010-12-10 2011-07-20 北京邮电大学 Fast multilevel imagination and reality occlusion method at actuality enhancement environment
WO2022095757A1 (en) * 2020-11-09 2022-05-12 华为技术有限公司 Image rendering method and apparatus
CN113706714A (en) * 2021-09-03 2021-11-26 中科计算技术创新研究院 New visual angle synthesis method based on depth image and nerve radiation field
CN114298895A (en) * 2021-12-24 2022-04-08 成都索贝数码科技股份有限公司 Image realistic style migration method, device, equipment and storage medium
CN115409937A (en) * 2022-08-19 2022-11-29 中国人民解放军战略支援部队信息工程大学 Facial video expression migration model construction method based on integrated nerve radiation field and expression migration method and system
CN115661403A (en) * 2022-10-13 2023-01-31 阿里巴巴(中国)有限公司 Explicit radiation field processing method, device and storage medium
CN115587930A (en) * 2022-12-12 2023-01-10 成都索贝数码科技股份有限公司 Image color style migration method, device and medium
CN116310028A (en) * 2023-03-07 2023-06-23 上海学深智能科技有限公司 Style migration method and system of three-dimensional face model
CN116543086A (en) * 2023-05-04 2023-08-04 阿里巴巴达摩院(杭州)科技有限公司 Nerve radiation field processing method and device and electronic equipment
CN116418961A (en) * 2023-06-09 2023-07-11 深圳臻像科技有限公司 Light field display method and system based on three-dimensional scene stylization

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117541732A (en) * 2024-01-09 2024-02-09 成都信息工程大学 Text-guided neural radiation field building scene stylization method
CN118096978A (en) * 2024-04-25 2024-05-28 深圳臻像科技有限公司 3D artistic content rapid generation method based on arbitrary stylization
CN118096978B (en) * 2024-04-25 2024-07-12 深圳臻像科技有限公司 3D artistic content rapid generation method based on arbitrary stylization

Similar Documents

Publication Publication Date Title
CN109255831B (en) Single-view face three-dimensional reconstruction and texture generation method based on multi-task learning
CN113706714B (en) New view angle synthesizing method based on depth image and nerve radiation field
CN107578436B (en) Monocular image depth estimation method based on full convolution neural network FCN
US11367239B2 (en) Textured neural avatars
CN110349247B (en) Indoor scene CAD three-dimensional reconstruction method based on semantic understanding
CN116934936A (en) Three-dimensional scene style migration method, device, equipment and storage medium
CN111968217A (en) SMPL parameter prediction and human body model generation method based on picture
CN114049420B (en) Model training method, image rendering method, device and electronic equipment
CN111951368B (en) Deep learning method for point cloud, voxel and multi-view fusion
CN114429538B (en) Method for interactively editing nerve radiation field geometry
US11055892B1 (en) Systems and methods for generating a skull surface for computer animation
WO2024055211A1 (en) Method and system for three-dimensional video reconstruction based on nerf combination of multi-view layers
CN113077545A (en) Method for reconstructing dress human body model from image based on graph convolution
CN117274501B (en) Drivable digital person modeling method, device, equipment and medium
JP7446566B2 (en) Volumetric capture and mesh tracking based machine learning
CN117501313A (en) Hair rendering system based on deep neural network
CN110322548B (en) Three-dimensional grid model generation method based on geometric image parameterization
CN116681839A (en) Live three-dimensional target reconstruction and singulation method based on improved NeRF
CN116934972A (en) Three-dimensional human body reconstruction method based on double-flow network
CN116091705A (en) Variable topology dynamic scene reconstruction and editing method and device based on nerve radiation field
CN116452715A (en) Dynamic human hand rendering method, device and storage medium
CN116402943A (en) Indoor three-dimensional reconstruction method and device based on symbol distance field
CN116342377A (en) Self-adaptive generation method and system for camouflage target image in degraded scene
CN115063562A (en) Virtual-real fusion augmented reality presentation method based on multi-view three-dimensional reconstruction
CN114820323A (en) Multi-scale residual binocular image super-resolution method based on stereo attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination