CN115147577A - VR scene generation method, device, equipment and storage medium - Google Patents
- Publication number
- CN115147577A (application CN202211081918.4A)
- Authority
- CN
- China
- Prior art keywords
- rendering
- scene
- neural network
- network model
- scene picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/10—Geometric effects
- G06T15/20—Perspective computation
- G06T15/205—Image-based rendering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Graphics (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- General Engineering & Computer Science (AREA)
- Geometry (AREA)
- Computer Hardware Design (AREA)
- Image Analysis (AREA)
- Image Generation (AREA)
Abstract
The invention discloses a VR scene generation method, a device, equipment and a storage medium, wherein the method comprises the following steps: extracting a scene picture from a preset training set, and inputting the spatial feature information of the scene picture into an initial neural network model to obtain rendering feature information of the scene picture; performing optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture; comparing the rendering imaging result with the scene picture, and determining the model loss of the initial neural network model; and optimizing the initial neural network model according to the model loss to obtain a target neural network model, and displaying a VR scene containing the scene picture based on the target neural network model. According to the method, the scene picture is used as input, the target neural network model obtained through training is used for VR scene generation, and the technical effect of reducing the cost of VR scene generation is achieved.
Description
Technical Field
The invention relates to the technical field of computer graphics, in particular to a VR scene generation method, a VR scene generation device, VR scene generation equipment and a storage medium.
Background
Under epidemic conditions, on-site house viewing has been sharply reduced. Real-estate enterprises have therefore moved the traditional offline marketing model online, guiding consumers to view houses or experience VR (Virtual Reality) immersive sample rooms through self-built online marketing platforms.
The main existing approach to VR house viewing is to construct the VR scene with a professional 3D camera. Such a camera includes multiple special lenses and 3D sensors, and the pictures it captures carry depth information that assists scene reconstruction. However, such cameras are expensive, which makes VR scene generation costly.
Disclosure of Invention
The invention mainly aims to provide a VR scene generation method, a VR scene generation device, equipment and a storage medium, and aims to solve the problem that the cost of VR scene generation is high.
In order to achieve the above object, the present invention provides a VR scene generating method, including:
extracting a scene picture from a preset training set, and inputting the spatial feature information of the scene picture into an initial neural network model to obtain rendering feature information of the scene picture;
performing optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture;
comparing the rendering imaging result with the scene picture, and determining the model loss of the initial neural network model;
and optimizing the initial neural network model according to the model loss to obtain a target neural network model, and displaying a VR scene containing the scene picture based on the target neural network model.
Optionally, the step of inputting the spatial feature information of the scene picture into an initial neural network model to obtain rendering feature information of the scene picture includes:
acquiring a space three-dimensional coordinate and a shooting angle associated with the scene picture, and forming a vector to be input by the space three-dimensional coordinate and the shooting angle;
carrying out Hash coding on the vector to be input to obtain a target input vector;
and inputting the target input vector into the initial neural network model to obtain color characteristic information and volume density information of the scene picture.
Optionally, the step of optically rendering according to the rendering feature information to obtain a rendering imaging result of the scene picture includes:
determining a space imaging plane and an imaging point in the space imaging plane according to the space characteristic information of the scene picture;
establishing a coordinate axis by taking a preset shooting position as an origin and the direction of the imaging point as a positive direction, and performing layered sampling in a preset imaging range of the coordinate axis to obtain sampling points;
and carrying out voxel rendering processing on the sampling points according to the rendering characteristic information of the sampling points to obtain a rendering imaging result in the shooting position direction.
Optionally, the step of performing hierarchical sampling within a preset imaging range of the coordinate axis to obtain sampling points includes:
equally dividing the preset imaging range, and uniformly sampling in the equally divided imaging interval to obtain coarse sampling points;
determining the sampling weight of the coarse sampling points according to the volume density information of the coarse sampling points, and taking the sampling weight as the probability density distribution in the imaging range;
and performing fine sampling according to the probability density distribution to obtain fine sampling points.
Optionally, the step of comparing the rendered imaging result with the scene picture and determining the model loss of the initial neural network model comprises:
taking the pixel color of the scene picture as a true value, taking the pixel color of the rendering imaging result as a predicted value, and determining the color difference between the predicted value and the true value;
determining the model loss from the color difference.
Optionally, the step of optimizing the initial neural network model according to the model loss to obtain a target neural network model includes:
judging whether the model loss is smaller than a preset loss threshold value or not;
and when the model loss is not less than a preset loss threshold value, performing gradient descent optimization processing on the initial neural network model until the model loss is less than the loss threshold value, and obtaining the target neural network model.
Optionally, after the step of optimizing the initial neural network model according to the model loss to obtain a target neural network model, the method further includes:
acquiring current observation angle information, and inputting the current observation angle information into the target neural network model to obtain current rendering information;
and performing rendering calculation on each pixel point in the current imaging plane according to the current rendering information and a preset rendering formula to generate a VR scene under the current observation angle.
In addition, to achieve the above object, the present invention provides a VR scene generating apparatus, including:
the input module is used for extracting a scene picture from a preset training set, and inputting the spatial feature information of the scene picture into the initial neural network model to obtain rendering feature information of the scene picture;
the rendering module is used for carrying out optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture;
the comparison module is used for comparing the rendering imaging result with the scene picture and determining the model loss of the initial neural network model;
and the optimization module is used for optimizing the initial neural network model according to the model loss to obtain a target neural network model so as to display the VR scene containing the scene picture based on the target neural network model.
In addition, to achieve the above object, the present invention also provides an electronic device, including: a memory, a processor, and a VR scene generation program stored on the memory and executable on the processor, the VR scene generation program configured to implement the steps of the VR scene generation method as described above.
Furthermore, to achieve the above object, the present invention also provides a computer readable storage medium having stored thereon a VR scene generation program, which when executed by a processor, implements the steps of the VR scene generation method as described above.
The VR scene generation method provided by the invention extracts a scene picture from a preset training set and inputs the spatial feature information of the scene picture into an initial neural network model to obtain rendering feature information of the scene picture; performs optical rendering according to the rendering feature information to obtain a rendering imaging result of the scene picture; compares the rendering imaging result with the scene picture to determine the model loss of the initial neural network model; and optimizes the initial neural network model according to the model loss to obtain a target neural network model, based on which a VR scene containing the scene picture is displayed. The spatial feature information of the scene picture is used in place of depth information as the input of the neural network model, and the trained neural network model outputs the rendering feature information, so that the VR scene generated after rendering according to the rendering feature information achieves a good display effect. A high-priced professional 3D camera is therefore not required, which reduces the hardware cost of VR scene generation; and by setting up training sets corresponding to different rooms, VR scenes for different rooms can be generated in batches to meet market demand.
Drawings
Fig. 1 is a schematic structural diagram of an electronic device in a hardware operating environment according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a VR scene generation method according to a first embodiment of the present invention;
FIG. 3 is a schematic diagram of a radiance model involved in the VR scene generation method of the invention;
fig. 4 is a flowchart illustrating a VR scene generation method according to a second embodiment of the present invention;
fig. 5 is a schematic diagram of hierarchical sampling related to the VR scene generation method of the present invention;
fig. 6 is a schematic diagram of a VR scene generation apparatus according to the present invention.
The implementation, functional features and advantages of the present invention will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an electronic device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the electronic device may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); the optional user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the electronic device, and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a storage medium, may include therein an operating system, a network communication module, a user interface module, and a VR scene generation program.
In the electronic device shown in fig. 1, the network interface 1004 is mainly used for data communication with other devices; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 of the electronic device according to the present invention may be disposed in the electronic device, and the electronic device calls the VR scene generation program stored in the memory 1005 through the processor 1001 and executes the VR scene generation method provided in the embodiment of the present invention.
An embodiment of the present invention provides a VR scene generation method, and referring to fig. 2, fig. 2 is a schematic flow diagram of a first embodiment of a VR scene generation method according to the present invention.
In this embodiment, the VR scene generation method includes:
step S10, extracting a scene picture from a preset training set, and inputting the spatial feature information of the scene picture into an initial neural network model to obtain rendering feature information of the scene picture;
for a suite needing VR scene generation, scene picture collection can be performed first, and the embodiment has no special requirements on equipment used for collecting the scene pictures. In the acquisition process, a plurality of scene pictures at different angles can be shot so as to show the complete picture of the suite, and the spatial characteristic information of the scene pictures is recorded during shooting. The number of scene pictures for each suite may be required to be no less than 32. And putting the scene pictures of a single suite into the same training set, and inputting the recorded spatial feature information into the initial neural network model for training to obtain the output rendering feature information.
As an example, the step of inputting the spatial feature information of the scene picture into the initial neural network model to obtain the rendering feature information of the scene picture may include:
a1, acquiring a spatial three-dimensional coordinate and a shooting angle associated with a scene picture, and forming a vector to be input by the spatial three-dimensional coordinate and the shooting angle;
step A2, carrying out Hash coding on the vector to be input to obtain a target input vector;
and A3, inputting the target input vector into the initial neural network model to obtain color characteristic information and volume density information of the scene picture.
The spatial feature information may include the spatial three-dimensional coordinates and shooting angles of the imaging points in the scene picture. The spatial three-dimensional coordinates may be represented by (x, y, z), and the shooting angle by (θ, φ), where θ is the horizontal angle at the time of shooting and φ is the pitch angle. The spatial three-dimensional coordinates and the shooting angle together form the five-dimensional vector to be input, (x, y, z, θ, φ).
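As a simple illustration (not part of the patent text), the following Python sketch assembles such a five-dimensional input vector; the function name and the example values are assumptions made for illustration only.

```python
import numpy as np

def build_input_vector(xyz, theta, phi):
    """Concatenate a spatial 3D coordinate and the shooting angles into (x, y, z, theta, phi)."""
    x, y, z = xyz
    return np.array([x, y, z, theta, phi], dtype=np.float32)

# Example: a point at (1.0, 0.5, 2.0) shot at a 30-degree horizontal angle
# and a -10-degree pitch angle (angles expressed in radians here).
v = build_input_vector((1.0, 0.5, 2.0), np.deg2rad(30.0), np.deg2rad(-10.0))
print(v.shape)  # (5,)
```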
If the vector to be input is fed directly into the initial neural network model, high-frequency feature information of the image will be lost; the vector to be input can therefore be hash coded, which improves the training effect of the model on the high-frequency part. The initial neural network model used for training may be a neural radiance field (NeRF) model, which is an implicit scene representation. In the hash coding process, a fully-connected neural network m(y; Φ) is given, where Φ is the weight parameter and its input is y = enc(x; θ), with θ the encoding parameter; the hyperparameters used include the number of levels L, the hash table size T, the number of feature dimensions F, the maximum resolution N_max and the minimum resolution N_min.
In the encoding process, the domain is divided into L levels with different resolutions, and the resolution varies with the level according to:

N_l = ⌊N_min · b^l⌋,

where b is the growth factor, which may be taken as b = exp((ln N_max − ln N_min) / (L − 1)), and l is the level index.
For a given input coordinate x, the surrounding voxels are found at each of the L resolution levels, indices are assigned to their corners by hashing the corners' integer coordinates, the corresponding F-dimensional feature vectors are looked up in the hash table according to the corner indices, linear interpolation is performed according to the relative position of x within the voxel at each level, and the results of all resolution levels are concatenated with the auxiliary input to produce the encoded fully-connected-layer input y, i.e., the target input vector.
The encoded target input vector is input into the initial neural radiance field model to obtain the four-dimensional output (R, G, B, σ), where (R, G, B) represents the color feature information and σ represents the volume density information.
Encoding the input of the initial neural network model with hash coding improves approximation quality and training speed without noticeable performance overhead, so training efficiency is improved while the cost level remains low.
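As an illustrative sketch only (the patent gives no code), the following Python function implements a multiresolution hash encoding of this kind; the hyperparameter values, the prime constants, the random table initialization and the function names are assumptions, not part of the invention.

```python
import numpy as np

# Illustrative hyperparameters: L levels, hash table size T, feature dimension F,
# minimum and maximum grid resolutions N_min and N_max.
L_LEVELS, TABLE_SIZE, FEAT_DIM, N_MIN, N_MAX = 4, 2**14, 2, 16, 512
GROWTH = np.exp((np.log(N_MAX) - np.log(N_MIN)) / (L_LEVELS - 1))   # growth factor b
PRIMES = (1, 2654435761, 805459861)

rng = np.random.default_rng(0)
tables = rng.uniform(-1e-4, 1e-4, size=(L_LEVELS, TABLE_SIZE, FEAT_DIM)).astype(np.float32)

def hash_corner(corner):
    """Spatial hash of an integer voxel corner."""
    h = 0
    for c, p in zip(corner.tolist(), PRIMES):
        h ^= int(c) * p
    return h % TABLE_SIZE

def encode(x):
    """Multiresolution hash encoding of a 3D point x in [0, 1]^3."""
    feats = []
    for level in range(L_LEVELS):
        res = int(np.floor(N_MIN * GROWTH**level))
        pos = x * res
        lo = np.floor(pos).astype(np.int64)
        frac = pos - lo
        f = np.zeros(FEAT_DIM, dtype=np.float32)
        # Trilinear interpolation over the 8 corners of the enclosing voxel.
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    w = ((frac[0] if dx else 1 - frac[0])
                         * (frac[1] if dy else 1 - frac[1])
                         * (frac[2] if dz else 1 - frac[2]))
                    f += w * tables[level, hash_corner(lo + np.array([dx, dy, dz]))]
        feats.append(f)
    return np.concatenate(feats)   # the encoded target input vector (length L * F)

y = encode(np.array([0.3, 0.7, 0.2]))
```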
Step S20, performing optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture;
rendering may be viewed as simulating the process of generating photographs by taking a picture with a camera. The rendering feature information may include a color feature vector C composed of three colors of R, G, and B, and a volume density σ. The rendering method used for optical rendering may be voxel rendering.
As an example, the step of performing optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture may include:
step B1, determining a space imaging plane and an imaging point in the space imaging plane according to the space characteristic information of the scene picture;
b2, establishing a coordinate axis by taking a preset shooting position as an original point and the direction of the imaging point as a positive direction, and performing layered sampling in a preset imaging range of the coordinate axis to obtain a sampling point;
and B3, performing voxel rendering processing on the sampling points according to the rendering characteristic information of the sampling points to obtain a rendering imaging result in the shooting position direction.
In the process of voxel rendering, a new image is rendered and synthesized by solving for the color of the rays passing through the scene picture. The plane where the scene picture lies serves as the spatial imaging plane, the preset shooting position serves as the origin, and the origin and an imaging point in the spatial imaging plane define a ray. Voxel rendering can use a radiance model, as shown in FIG. 3: assuming that each point in space on a ray entering the human eye radiates a certain energy, the total energy finally received by the eye is the accumulation of the radiated energies of all particles along the line of sight. The eye position is the origin o, d is a unit direction vector, any point on the coordinate axis can be represented as o + t·d, and t is the distance from the point to the origin. A near plane (near) and a far plane (far), parallel to the imaging plane and perpendicular to the ray, are then determined; the near plane serves as the lower limit and the far plane as the upper limit, forming the preset imaging range.
Hierarchical sampling is performed within the preset imaging range, and voxel rendering is carried out using the rendering feature information of the sampling points to obtain the rendering imaging result in the shooting-position direction of each imaging point; the rendering imaging results of all imaging points together form the rendering imaging result of the scene picture. The rendering formula used for voxel rendering may be:

C(r) = ∫_{t_n}^{t_f} T(t) · σ(r(t)) · c(r(t), d) dt, with T(t) = exp(−∫_{t_n}^{t} σ(r(s)) ds),

where t denotes a sampling point along the ray r(t) = o + t·d, t_n denotes the starting point (near bound), t_f denotes the end point (far bound), σ denotes the volume density, c denotes the color, and T denotes the accumulated transmittance, i.e., the probability that the ray travels from t_n to t without being blocked.
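For illustration only, a minimal Python (NumPy) sketch of the discrete (quadrature) form of this rendering formula is given below; the function name, sample count and example values are assumptions, not part of the patent.

```python
import numpy as np

def volume_render(sigmas, colors, t_vals):
    """Numerical quadrature of the voxel-rendering integral along one ray.

    sigmas: (N,) volume densities at the sampled points
    colors: (N, 3) RGB colors at the sampled points
    t_vals: (N,) distances of the sampled points from the ray origin
    """
    deltas = np.diff(t_vals, append=t_vals[-1] + 1e10)       # spacing between samples
    alphas = 1.0 - np.exp(-sigmas * deltas)                   # per-sample opacity
    # Accumulated transmittance: probability the ray reaches sample i unblocked.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = alphas * trans
    return (weights[:, None] * colors).sum(axis=0)            # rendered pixel color

# Example with 64 samples along a ray between the near and far planes.
t = np.linspace(2.0, 6.0, 64)
rgb = volume_render(np.full(64, 0.5), np.tile([0.8, 0.2, 0.1], (64, 1)), t)
```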
Step S30, comparing the rendering imaging result with the scene picture, and determining the model loss of the initial neural network model;
the rendered imaging effect can be evaluated through the difference between the rendered imaging result and the scene picture in the training set, and the closer the rendered imaging result is to the scene picture, the better the rendered effect is.
As an example, comparing the rendered imaging result with a picture of the scene, the step of determining a model loss of the initial neural network model may comprise:
step C1, taking the pixel color of the scene picture as a true value, taking the pixel color of the rendering imaging result as a predicted value, and determining the color difference between the predicted value and the true value;
and step C2, determining the model loss according to the color difference.
In evaluating the rendering effect, the color difference between the scene picture and the rendering imaging result may be used as the index. The RGB color of a pixel of the scene picture is taken as the true value, the RGB color of the corresponding pixel of the rendering imaging result is taken as the predicted value, the difference between the predicted value and the true value is calculated, and this difference is taken as the model loss. The loss function used for the calculation may be:

L = Σ_{r∈R} [ ||Ĉ_c(r) − C(r)||² + ||Ĉ_f(r) − C(r)||² ],

where R represents the set of rays, C(r) represents the true color of the ray r, Ĉ_c(r) represents the RGB color predicted by the coarse volume, and Ĉ_f(r) represents the RGB color predicted by the fine volume.
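As an illustrative sketch (not part of the patent), the coarse-plus-fine color loss over a batch of rays can be computed as follows; the names and the example batch are assumptions.

```python
import numpy as np

def nerf_loss(coarse_rgb, fine_rgb, true_rgb):
    """Sum of squared color errors for the coarse and fine predictions over a batch of rays.

    coarse_rgb, fine_rgb, true_rgb: (num_rays, 3) arrays of RGB values.
    """
    coarse_term = np.sum((coarse_rgb - true_rgb) ** 2, axis=-1)
    fine_term = np.sum((fine_rgb - true_rgb) ** 2, axis=-1)
    return float(np.sum(coarse_term + fine_term))

# Example on a batch of 1024 randomly generated rays (stand-in data).
rng = np.random.default_rng(0)
gt = rng.uniform(size=(1024, 3))
loss = nerf_loss(gt + 0.05, gt + 0.02, gt)
```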
And S40, optimizing the initial neural network model according to the model loss to obtain a target neural network model, and displaying a VR scene containing the scene picture based on the target neural network model.
Iterative training optimization is performed on the initial neural network model, and the training result is evaluated according to the value of the model loss. During optimization, scene pictures are repeatedly and randomly drawn from the training set to train the initial neural network model, and the training result is compared with the original picture for optimization.
As an example, optimizing the initial neural network model according to the model loss to obtain the target neural network model may include:
step D1, judging whether the model loss is smaller than a preset loss threshold value or not;
and D2, when the model loss is not less than a preset loss threshold value, performing gradient descent optimization processing on the initial neural network model until the model loss is less than the loss threshold value, and obtaining the target neural network model.
Gradient descent is applied to the model loss; when the model loss is not less than the loss threshold, training continues by randomly sampling scene pictures from the training set until the model converges, yielding the target neural network model. When the model loss is smaller than the loss threshold, the model has reached the preset optimization requirement and training ends.
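For illustration, the following PyTorch-style sketch shows a gradient-descent training loop that stops once the loss falls below the threshold; the stand-in model, the learning rate, the threshold value and the random stand-in data are all assumptions and do not reproduce the actual radiance-field model or renderer described above.

```python
import torch

# A minimal stand-in MLP; the real encoder, radiance-field model and voxel renderer
# described above are assumed to exist elsewhere.
model = torch.nn.Sequential(torch.nn.Linear(5, 64), torch.nn.ReLU(), torch.nn.Linear(64, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
loss_threshold = 1e-3  # illustrative value for the preset loss threshold

def render_batch(batch_inputs):
    # Placeholder: run the model (and, in practice, the voxel-rendering step)
    # to get predicted pixel colors for a batch of rays.
    return torch.sigmoid(model(batch_inputs)[:, :3])

for step in range(10000):
    batch_inputs = torch.rand(1024, 5)        # (x, y, z, theta, phi) samples, stand-in data
    true_rgb = torch.rand(1024, 3)            # ground-truth pixel colors, stand-in data
    pred_rgb = render_batch(batch_inputs)
    loss = torch.sum((pred_rgb - true_rgb) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                          # gradient-descent optimization step
    if loss.item() < loss_threshold:          # stop once the loss falls below the threshold
        break
```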
As an example, after the step of optimizing the initial neural network model according to the model loss to obtain the target neural network model, the method may further include:
step E1, obtaining current observation angle information, inputting the current observation angle information into the target neural network model, and obtaining current rendering information;
and E2, performing rendering calculation on each pixel point in the current imaging plane according to the current rendering information and a preset rendering formula, and generating a VR scene under the current observation angle.
After the model training process is completed, VR scene generation may be performed using the target neural network model. During a VR experience, the VR experiencer has a specific current viewing angle at each moment, and the model can output what the experiencer sees at that angle, that is, the image on the imaging plane at that viewing angle. Consider first a single pixel of the imaging plane for the current viewing angle. According to the principle of optical rendering, the three-dimensional points along the viewing ray are computed first; their coordinates are found and fed into the target neural network model in parallel, and the final current rendering imaging result is then computed from the model output according to the rendering formula. All pixel points are generated in this way to obtain the image of the whole imaging plane. The pixel points on the imaging plane are independent of each other, so they can all be computed in parallel to meet the real-time requirements of VR.
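A minimal sketch of this inference step is given below (Python/NumPy, illustrative only); the function names, the near/far bounds and the angle convention are assumptions, and `model_fn` stands in for the trained target neural network model together with its encoding.

```python
import numpy as np

def render_view(model_fn, origin, ray_dirs, near=2.0, far=6.0, n_samples=64):
    """Render every pixel of the current imaging plane for one viewing position.

    model_fn: maps (num_points, 5) inputs (x, y, z, theta, phi) to (sigma, r, g, b)
    origin:   (3,) current viewing position
    ray_dirs: (H, W, 3) unit direction vectors, one per pixel
    """
    h, w, _ = ray_dirs.shape
    t_vals = np.linspace(near, far, n_samples)
    # Sample points along every ray at once: shape (H, W, n_samples, 3).
    points = origin + ray_dirs[..., None, :] * t_vals[:, None]
    theta = np.arctan2(ray_dirs[..., 0], ray_dirs[..., 2])    # horizontal angle per pixel
    phi = np.arcsin(np.clip(ray_dirs[..., 1], -1.0, 1.0))     # pitch angle per pixel
    angles = np.broadcast_to(np.stack([theta, phi], -1)[..., None, :], points.shape[:-1] + (2,))
    inputs = np.concatenate([points, angles], axis=-1).reshape(-1, 5)
    sigma_rgb = model_fn(inputs).reshape(h, w, n_samples, 4)  # all pixels queried in parallel
    sigmas, colors = sigma_rgb[..., 0], sigma_rgb[..., 1:]
    deltas = np.diff(t_vals, append=t_vals[-1] + 1e10)
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([np.ones_like(alphas[..., :1]), 1 - alphas[..., :-1]], -1), -1)
    return (alphas[..., None] * trans[..., None] * colors).sum(axis=-2)  # (H, W, 3) image
```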
In this embodiment, a scene picture is extracted from a preset training set, spatial feature information of the scene picture is input into an initial neural network model to obtain rendering feature information of the scene picture, optical rendering is performed according to the rendering feature information to obtain a rendering imaging result of the scene picture, the rendering imaging result is compared with the scene picture to determine a model loss of the initial neural network model, the initial neural network model is optimized according to the model loss to obtain a target neural network model, a VR scene including the scene picture is displayed based on the target neural network model, the spatial feature information of the scene picture is used to replace depth information to serve as input of the neural network model, the trained neural network model outputs the rendering feature information, the VR scene generated after rendering according to the rendering feature information can achieve a good display effect, a high-priced professional 3D camera is not required, hardware cost for VR scene generation investment is reduced, training sets corresponding to different rooms are set, batch VR scenes of different rooms can be generated, and market demands are met.
Further, in a second embodiment of the VR scene generation method of the present invention, referring to fig. 4, the method includes:
step S11, equally dividing the preset imaging range, and uniformly sampling in the equally divided imaging interval to obtain coarse sampling points;
modeling with the same resolution for each region in the scene results in a decrease in the accuracy of the model while increasing the amount of computation.
FIG. 5 is a schematic diagram of the hierarchical sampling. As shown in FIG. 5, the interval [t_near, t_far] represented by the preset imaging range is first divided into N subintervals, and a sample t_i is drawn uniformly at random within each subinterval, giving the coarse sampling points C_i; the process can be expressed as:

t_i ~ U[ t_near + (i − 1)/N · (t_far − t_near), t_near + i/N · (t_far − t_near) ].

Denoting by c_i the color of the sampling point C_i, the color formula of the imaging point C is:

C = Σ_{i=1}^{N} T_i · α_i · c_i, with T_i = (1 − α_1)(1 − α_2)···(1 − α_{i−1}),

where α is defined as α = 1 − e^(−σδ) and δ denotes the distance between adjacent sampling points.

When σ = 0, α = 0, meaning that when the volume density is 0 the opacity is 0 and the point is completely transparent: the light emitted from the camera continues onward, the current point contributes nothing to the color of the imaging point, and it does not affect the contribution of the points behind it. When σ → +∞, α → 1, meaning that the opacity is 1 and the point is completely opaque: the light emitted from the camera is completely blocked at the current point, so the points behind it contribute nothing to the color of the imaging point C.
S12, determining the sampling weight of the coarse sampling point according to the volume density information of the coarse sampling point, and taking the sampling weight as the probability density distribution in the imaging range;
On the basis of the coarse sampling points, for each sampling point C_i the corresponding transmittance is transmittance_i = (1 − α_1)(1 − α_2)···(1 − α_{i−1}), and the weight of C_i for the color of the imaging point C is weight_i = α_i · transmittance_i. These sampling weights are taken as the probability density distribution along the ray direction.
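Continuing the sketch above (again an illustrative assumption, not patent text), the sampling weights and the resulting probability density can be computed as:

```python
import numpy as np

def sampling_weights(alphas):
    """weight_i = alpha_i * (1 - alpha_1)...(1 - alpha_{i-1}), also normalized into a PDF."""
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = alphas * transmittance
    pdf = weights / np.maximum(weights.sum(), 1e-8)   # probability density over the coarse samples
    return weights, pdf

weights, pdf = sampling_weights(alpha)  # `alpha` from the coarse-sampling sketch above
```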
And S13, performing fine sampling according to the probability density distribution to obtain fine sampling points.
The probability density function can be used to estimate the distribution of particles along the ray; fine sampling according to this probability density function places sampling points more densely where the probability is high, yielding the fine sampling points.
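A minimal inverse-transform sampling sketch for this fine-sampling step is shown below (illustrative only; names, bin handling and values are assumptions):

```python
import numpy as np

def fine_sample(t_coarse, pdf, n_fine, rng=np.random.default_rng()):
    """Inverse-transform sampling: draw fine points where the coarse PDF is large."""
    # Treat consecutive coarse points as bin edges; bin mass taken from the lower point.
    bin_pdf = pdf[:-1] / np.maximum(pdf[:-1].sum(), 1e-8)
    cdf = np.concatenate([[0.0], np.cumsum(bin_pdf)])
    u = rng.uniform(size=n_fine)
    idx = np.clip(np.searchsorted(cdf, u, side="right") - 1, 0, len(bin_pdf) - 1)
    frac = (u - cdf[idx]) / np.maximum(cdf[idx + 1] - cdf[idx], 1e-8)
    return t_coarse[idx] + frac * (t_coarse[idx + 1] - t_coarse[idx])

# Combine the coarse and fine points (using `t_coarse` and `pdf` from the sketches above).
t_fine = np.sort(np.concatenate([t_coarse, fine_sample(t_coarse, pdf, 128)]))
```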
In this embodiment, a hierarchical sampling mode is used, so that the calculation amount is reduced while the model precision is maintained, and the model training efficiency is improved.
An embodiment of the present invention further provides a VR scene generation apparatus, as shown in fig. 6, the VR scene generation apparatus includes:
the input module 101 is configured to extract a scene picture from a preset training set, input spatial feature information of the scene picture into an initial neural network model, and obtain rendering feature information of the scene picture;
the rendering module 102 is configured to perform optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture;
a comparison module 103, configured to compare the rendered imaging result with the scene picture, and determine a model loss of the initial neural network model;
and the optimizing module 104 is configured to optimize the initial neural network model according to the model loss to obtain a target neural network model, so as to display a VR scene including the scene picture based on the target neural network model.
Optionally, the input module 101 is further configured to:
acquiring a space three-dimensional coordinate and a shooting angle associated with the scene picture, and forming a vector to be input by the space three-dimensional coordinate and the shooting angle;
carrying out Hash coding on the vector to be input to obtain a target input vector;
and inputting the target input vector into the initial neural network model to obtain color characteristic information and volume density information of the scene picture.
Optionally, the rendering module 102 is further configured to:
determining a space imaging plane and an imaging point in the space imaging plane according to the space characteristic information of the scene picture;
establishing a coordinate axis by taking a preset shooting position as an origin and the direction of the imaging point as a positive direction, and performing layered sampling in a preset imaging range of the coordinate axis to obtain a sampling point;
and carrying out voxel rendering processing on the sampling points according to the rendering characteristic information of the sampling points to obtain a rendering imaging result in the shooting position direction.
Optionally, the rendering module 102 is further configured to:
equally dividing the preset imaging range, and uniformly sampling in the equally divided imaging interval to obtain coarse sampling points;
determining the sampling weight of the coarse sampling points according to the volume density information of the coarse sampling points, and taking the sampling weight as the probability density distribution in the imaging range;
and performing fine sampling according to the probability density distribution to obtain fine sampling points.
Optionally, the comparing module 103 is further configured to:
taking the pixel color of the scene picture as a true value, taking the pixel color of the rendering imaging result as a predicted value, and determining the color difference between the predicted value and the true value;
determining the model loss from the color difference.
Optionally, the optimization module 104 is further configured to:
judging whether the model loss is smaller than a preset loss threshold value or not;
and when the model loss is not less than a preset loss threshold value, performing gradient descent optimization processing on the initial neural network model until the model loss is less than the loss threshold value, and obtaining the target neural network model.
Optionally, the VR scene generating apparatus further includes a generating module, configured to:
acquiring current observation angle information, and inputting the current observation angle information into the target neural network model to obtain current rendering information;
and performing rendering calculation on each pixel point in the current imaging plane according to the current rendering information and a preset rendering formula to generate a VR scene under the current observation angle.
An embodiment of the present invention further provides a computer-readable storage medium, where a VR scene generation program is stored, and when executed by a processor, the VR scene generation program implements the steps of the VR scene generation method as described above. For a specific implementation of the computer-readable storage medium according to the embodiment of the present invention, reference is made to the embodiments of the VR scene generation method, and details are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a/an ..." does not exclude the presence of other identical elements in the process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (10)
1. A VR scene generation method, comprising the steps of:
extracting a scene picture from a preset training set, and inputting the spatial feature information of the scene picture into an initial neural network model to obtain rendering feature information of the scene picture;
performing optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture;
comparing the rendering imaging result with the scene picture, and determining the model loss of the initial neural network model;
and optimizing the initial neural network model according to the model loss to obtain a target neural network model, and displaying a VR scene containing the scene picture based on the target neural network model.
2. The VR scene generation method of claim 1, wherein the step of inputting the spatial feature information of the scene picture into an initial neural network model to obtain the rendering feature information of the scene picture comprises:
acquiring a space three-dimensional coordinate and a shooting angle associated with the scene picture, and forming a vector to be input by the space three-dimensional coordinate and the shooting angle;
carrying out Hash coding on the vector to be input to obtain a target input vector;
and inputting the target input vector into the initial neural network model to obtain color feature information and volume density information of the scene picture.
3. The VR scene generating method of claim 1, wherein the optically rendering according to the rendering feature information to obtain the rendering imaging result of the scene picture includes:
determining a space imaging plane and an imaging point in the space imaging plane according to the space characteristic information of the scene picture;
establishing a coordinate axis by taking a preset shooting position as an origin and the direction of the imaging point as a positive direction, and performing layered sampling in a preset imaging range of the coordinate axis to obtain a sampling point;
and carrying out voxel rendering processing on the sampling points according to the rendering characteristic information of the sampling points to obtain a rendering imaging result in the shooting position direction.
4. The VR scene generation method of claim 3, wherein the step of performing hierarchical sampling within a preset imaging range of the coordinate axis to obtain the sampling points comprises:
equally dividing the preset imaging range, and uniformly sampling in the equally divided imaging interval to obtain coarse sampling points;
determining the sampling weight of the coarse sampling points according to the volume density information of the coarse sampling points, and taking the sampling weight as the probability density distribution in the imaging range;
and performing fine sampling according to the probability density distribution to obtain fine sampling points.
5. The VR scene generation method of claim 4, wherein comparing the rendered imaging result to the scene picture to determine a model loss for the initial neural network model comprises:
taking the pixel color of the scene picture as a real value, taking the pixel color of the rendering imaging result as a predicted value, and determining the color difference between the predicted value and the real value;
determining the model loss from the color difference.
6. The VR scene generation method of claim 5, wherein said optimizing the initial neural network model based on the model losses to obtain a target neural network model comprises:
judging whether the model loss is smaller than a preset loss threshold value or not;
and when the model loss is not less than a preset loss threshold, performing gradient descent optimization processing on the initial neural network model until the model loss is less than the loss threshold, and obtaining the target neural network model.
7. The VR scene generation method of claim 6, further comprising, after the step of optimizing the initial neural network model based on the model losses to obtain a target neural network model:
acquiring current observation angle information, and inputting the current observation angle information into the target neural network model to obtain current rendering information;
and performing rendering calculation on each pixel point in the current imaging plane according to the current rendering information and a preset rendering formula to generate a VR scene under the current observation angle.
8. A VR scene generation apparatus, comprising:
the input module is used for extracting a scene picture from a preset training set, inputting the spatial characteristic information of the scene picture into an initial neural network model, and obtaining rendering characteristic information of the scene picture;
the rendering module is used for carrying out optical rendering according to the rendering characteristic information to obtain a rendering imaging result of the scene picture;
the comparison module is used for comparing the rendering imaging result with the scene picture and determining the model loss of the initial neural network model;
and the optimization module is used for optimizing the initial neural network model according to the model loss to obtain a target neural network model so as to display the VR scene containing the scene picture based on the target neural network model.
9. An electronic device, characterized in that the electronic device comprises: a memory, a processor, and a VR scene generation program stored on the memory and executable on the processor, the VR scene generation program configured to implement the steps of the VR scene generation method of any of claims 1 to 7.
10. A computer-readable storage medium, having a VR scene generation program stored thereon, which when executed by a processor implements the steps of the VR scene generation method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211081918.4A CN115147577A (en) | 2022-09-06 | 2022-09-06 | VR scene generation method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115147577A true CN115147577A (en) | 2022-10-04 |
Family
ID=83416089
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211081918.4A Pending CN115147577A (en) | 2022-09-06 | 2022-09-06 | VR scene generation method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115147577A (en) |
- 2022-09-06 CN CN202211081918.4A patent/CN115147577A/en active Pending
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160335795A1 (en) * | 2015-05-13 | 2016-11-17 | Google Inc. | Deepstereo: learning to predict new views from real world imagery |
CN111240484A (en) * | 2020-01-13 | 2020-06-05 | 孝感峰创智能科技有限公司 | Protection method and system based on omnidirectional motion platform and readable storage medium |
US20210390761A1 (en) * | 2020-06-15 | 2021-12-16 | Microsoft Technology Licensing, Llc | Computing images of dynamic scenes |
CN112613609A (en) * | 2020-12-18 | 2021-04-06 | 中山大学 | Nerve radiation field enhancement method based on joint pose optimization |
CN113327299A (en) * | 2021-07-07 | 2021-08-31 | 北京邮电大学 | Neural network light field method based on joint sampling structure |
CN114004941A (en) * | 2022-01-04 | 2022-02-01 | 苏州浪潮智能科技有限公司 | Indoor scene three-dimensional reconstruction system and method based on nerve radiation field |
CN114493995A (en) * | 2022-01-17 | 2022-05-13 | 上海壁仞智能科技有限公司 | Image rendering model training method, image rendering method and image rendering device |
CN114972632A (en) * | 2022-04-21 | 2022-08-30 | 阿里巴巴达摩院(杭州)科技有限公司 | Image processing method and device based on nerve radiation field |
CN114898028A (en) * | 2022-04-29 | 2022-08-12 | 厦门大学 | Scene reconstruction and rendering method based on point cloud, storage medium and electronic equipment |
CN114998548A (en) * | 2022-05-31 | 2022-09-02 | 北京非十科技有限公司 | Image reconstruction method and system |
Non-Patent Citations (3)
Title |
---|
互联网: "[NeRF]基于Mindspore的NeRF实现", 《HTTPS://WWW.ICODE9.COM/CONTENT-4-1418983.HTML》 * |
常远 等: "基于神经辐射场的视点合成算法综述", 《图形学报 等》 * |
敲键盘少女: "NeRF论文解析-Neural Radiance Field", 《HTTPS://BLOG.CSDN.NET/SJJSSUWJN/ARTICLE/DETAILS/123259875》 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117611727A (en) * | 2024-01-24 | 2024-02-27 | 腾讯科技(深圳)有限公司 | Rendering processing method, device, equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wynn et al. | Diffusionerf: Regularizing neural radiance fields with denoising diffusion models | |
CN106803267B (en) | Kinect-based indoor scene three-dimensional reconstruction method | |
CN110378838B (en) | Variable-view-angle image generation method and device, storage medium and electronic equipment | |
RU2215326C2 (en) | Image-based hierarchic presentation of motionless and animated three-dimensional object, method and device for using this presentation to visualize the object | |
CN107484428B (en) | Method for displaying objects | |
CN115699093A (en) | Computing images of a dynamic scene | |
CN111523398A (en) | Method and device for fusing 2D face detection and 3D face recognition | |
CN110288697A (en) | 3D face representation and method for reconstructing based on multiple dimensioned figure convolutional neural networks | |
CN103530907B (en) | Complicated three-dimensional model drawing method based on images | |
CN116310076A (en) | Three-dimensional reconstruction method, device, equipment and storage medium based on nerve radiation field | |
US11961266B2 (en) | Multiview neural human prediction using implicit differentiable renderer for facial expression, body pose shape and clothes performance capture | |
CN114758337A (en) | Semantic instance reconstruction method, device, equipment and medium | |
CN115731336B (en) | Image rendering method, image rendering model generation method and related devices | |
WO2022208440A1 (en) | Multiview neural human prediction using implicit differentiable renderer for facial expression, body pose shape and clothes performance capture | |
CN117730530A (en) | Image processing method and device, equipment and storage medium | |
CN116416376A (en) | Three-dimensional hair reconstruction method, system, electronic equipment and storage medium | |
CN115861515A (en) | Three-dimensional face reconstruction method, computer program product and electronic device | |
CN116778063A (en) | Rapid virtual viewpoint synthesis method and device based on characteristic texture grid and hash coding | |
CN115147577A (en) | VR scene generation method, device, equipment and storage medium | |
CN116863069A (en) | Three-dimensional light field face content generation method, electronic equipment and storage medium | |
CN115272575B (en) | Image generation method and device, storage medium and electronic equipment | |
CN116051737A (en) | Image generation method, device, equipment and storage medium | |
CN115409949A (en) | Model training method, visual angle image generation method, device, equipment and medium | |
CN112785494B (en) | Three-dimensional model construction method and device, electronic equipment and storage medium | |
JP2019149112A (en) | Composition device, method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20221004 |