CN115439388B - Free viewpoint image synthesis method based on multilayer neural surface representation - Google Patents

Free viewpoint image synthesis method based on multilayer neural surface representation

Info

Publication number
CN115439388B
CN115439388B (application CN202211391996.4A; published as CN115439388A)
Authority
CN
China
Prior art keywords
viewpoint
image
layer
module
scale
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211391996.4A
Other languages
Chinese (zh)
Other versions
CN115439388A (en)
Inventor
戴翘楚
吴翼天
曹静萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Yilan Technology Co ltd
Original Assignee
Hangzhou Yilan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Yilan Technology Co ltd filed Critical Hangzhou Yilan Technology Co ltd
Priority to CN202211391996.4A priority Critical patent/CN115439388B/en
Publication of CN115439388A publication Critical patent/CN115439388A/en
Application granted granted Critical
Publication of CN115439388B publication Critical patent/CN115439388B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a free viewpoint image synthesis method based on multilayer neural surface representation, relating to the field of computer vision. The method comprises the following steps: S1, acquiring image data captured from sparse viewpoints and estimating the poses of the sparse viewpoints; S2, designing a sparse-viewpoint free viewpoint image synthesis network based on multilayer neural surface representation; S3, training this network with a large-scale multi-view dataset; and S4, after the trained network model parameters are obtained, applying them to the free viewpoint synthesis task on the sparse multi-view data acquired in step S1. By designing a multilayer neural surface representation model, the invention makes full use of the features of sparse multi-view images, achieves a high-quality and generalizable free viewpoint image synthesis algorithm, and is suitable for the free viewpoint image synthesis tasks of multi-view acquisition systems.

Description

Free viewpoint image synthesis method based on multilayer neural surface representation
Technical Field
The invention relates to the field of computer vision, in particular to a free viewpoint image synthesis method based on multilayer neural surface representation.
Background
Free viewpoint image synthesis is a major problem in the field of computer vision. With the advent of the 5G age and the development and popularization of virtual reality technology and augmented reality technology, digital images have developed into a necessary trend toward interactivity and immersion.
Owing to its strong three-dimensional immersion, large viewing freedom and rich interactive experience, free viewpoint synthesis is widely applied in fields such as virtual reality, film and television production, live sports broadcasting, and cultural and social interaction.
However, current free viewpoint systems still require hundreds of cameras and are structurally complex and expensive; meanwhile, most deployed domestic systems adopt fixed imaging trajectories, so the viewing viewpoints are limited and the sense of immersion is insufficient, and their practicality and economy remain to be improved.
Disclosure of Invention
To this end, the invention studies the free viewpoint image synthesis problem under sparse viewpoints and addresses two shortcomings of existing free viewpoint generation algorithms: they either require lengthy training for every group of multi-view images, or their defective geometry estimation under sparse viewpoints degrades the final view synthesis result. A framework based on multilayer neural surface representation is proposed that realizes, end to end, scene geometry estimation and texture mapping synthesis for the viewpoint to be synthesized, achieving high-quality and efficient free viewpoint image generation and solving the problems described in the background art.
In this application, based on a novel multilayer neural surface representation, a workflow for free viewpoint image synthesis from sparse viewpoint inputs is designed. The scene structure information of the new viewpoint to be synthesized, together with accurate texture migration and fusion, is fully learned within the network, completing high-quality and efficient synthesis of free viewpoint images.
The technical solution adopted is as follows: the free viewpoint image synthesis method based on multilayer neural surface representation comprises the following steps:
S1, acquiring multi-view synchronized or static-scene image data captured from sparse viewpoints, and estimating the sparse viewpoint poses;
S2, designing a free viewpoint image synthesis network;
S3, training the free viewpoint image synthesis network with a large-scale multi-view dataset so that it can generalize to various multi-view data;
and S4, after obtaining the trained free viewpoint image synthesis network model parameters, applying them to the free viewpoint synthesis task on the sparse multi-view data obtained in step S1.
Further, after step S4 there is also: S5, since the trained network has a certain degree of generalization to data not seen in the training set, forward prediction with the network model trained in step S3 is used directly to achieve high-quality free viewpoint image synthesis for the sparse multi-view data under test.
Further, in step S1, the poses are obtained by a Structure-from-Motion method or by a multi-view calibration method with a calibration object of known scale.
Further, the free viewpoint synthesis network comprises a multi-scale image feature extraction module, a target-oriented multi-scale refinable scene depth estimation MVS (Multi-view Stereo) module, a multi-layer neural surface density estimation module, a reverse feature fusion and multi-layer neural surface color decoding module, and a multi-layer neural surface volume rendering module;
the multi-scale image feature extraction module consists of a convolution layer and a jump connection layer, and the multi-scale image feature extraction module is expressed as follows:
wherein,network representing the module->For arbitrary input of the image of the module, the output of the module can be three-scale image features +.>
The MVS module realizes scene geometry estimation for an arbitrary viewpoint by modifying a learning-based MVS network; the realization comprises the following steps:
the M source viewpoint images are passed through the multi-scale image feature extraction module to obtain M × 3 image feature maps;
at each scale, the source viewpoint features are warped to each depth hypothesis of the target viewpoint; a variance-based cost volume is constructed and regularized by 3D convolutions to output, for every pixel of the target image, a probability over the depth hypotheses;
the estimate is progressively refined from the small scale to the large scale, with the depth sampling updated according to the depth probability of the previous level; the module finally outputs, at the original image resolution of the target viewpoint, a multi-layer surface whose depth probabilities are determined at the finally sampled depth values;
the multi-layer nerve surface density estimation module samples depth probability bodies from the output of the MVS moduleIn recovering the density value +.>Providing for volume rendering to obtain a final output image corresponding to the opacity representing the multi-layer surface;
the reverse feature fusion and multi-layer nerve surface color decoding module uses the multi-layer surface sampling point set obtained in the MVS module to access the source viewpoint feature in a reverse wayCorresponding characteristic values in the code, and fusing and decoding the corresponding characteristics to formColor values of the multi-layer surface;
and the multi-layer neural surface volume rendering module performs volume rendering on the multi-layer surface densities obtained from the density estimation module and the multi-layer surface colors obtained from the reverse feature fusion and color decoding module, so as to complete the synthesis of the final target image.
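The five modules above can be chained into a single end-to-end network. The following is a minimal sketch of how such a composition might look in PyTorch; the class names, constructor arguments and tensor conventions are illustrative assumptions and are not taken from the patent itself.

```python
import torch
import torch.nn as nn

class FreeViewpointSynthesisNet(nn.Module):
    """Hypothetical composition of the five modules described above."""
    def __init__(self, feat_net, mvs_net, density_head, color_decoder, renderer):
        super().__init__()
        self.feat_net = feat_net            # multi-scale image feature extraction
        self.mvs_net = mvs_net              # target-oriented coarse-to-fine MVS
        self.density_head = density_head    # multi-layer neural surface density
        self.color_decoder = color_decoder  # reverse feature fusion + color decoding
        self.renderer = renderer            # multi-layer volume rendering

    def forward(self, src_images, src_poses, tgt_pose):
        # 1. three-scale features for each of the M source views
        src_feats = [self.feat_net(img) for img in src_images]
        # 2. multi-layer surface sample points and depth-probability volume for the target view
        surface_pts, depth_prob = self.mvs_net(src_feats, src_poses, tgt_pose)
        # 3. per-layer opacity from the sampled depth probabilities
        sigma = self.density_head(depth_prob)
        # 4. per-layer colors by projecting surface points back into the source features
        color = self.color_decoder(surface_pts, src_feats, src_poses, tgt_pose)
        # 5. composite the layers into the target image
        return self.renderer(sigma, color)
```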
Further, in step S3, the training data are multi-view images with camera poses, divided into a training set, a validation set and a test set; training is run until the network converges, as monitored on the validation set.
Further, in step S3, the multi-view synchronized (or static-scene) image data captured from sparse viewpoints are denoted \(\{I_i\}_{i=1}^{N}\), where \(N\) is the number of input viewpoints; estimating the sparse viewpoint poses yields, for each viewpoint, the pose \(\{K_i, [R_i \mid t_i]\}\), comprising the intrinsic parameters \(K_i\) and the extrinsic parameters \([R_i \mid t_i]\) (rotation and translation) of each viewpoint.
Further, in step S3, the pose of the target viewpoint is defined as \(\{K_t, [R_t \mid t_t]\}\); according to the position and orientation of the target viewpoint, the images \(\{I_j\}_{j=1}^{M}\) of the M source viewpoints closest to the target viewpoint, together with their camera poses \(\{K_j, [R_j \mid t_j]\}\), are selected from the input viewpoints as the input to the network.
Compared with the prior art, the invention has the following beneficial effects:
(1) Because the trained network generalizes to a certain extent to data not seen in the training set, forward prediction with the network can be used directly to complete the free viewpoint image synthesis task for the sparse multi-view data under test;
(2) By designing a multi-layer neural surface representation model, the features of sparse multi-view images are fully exploited to realize a high-quality, generalizable free viewpoint image synthesis algorithm suitable for the free viewpoint synthesis tasks of multi-view acquisition systems;
(3) A network is designed around the multi-layer neural surface representation; the reconstruction of the multi-layer surface of a scene is realized within an end-to-end novel view synthesis framework, and the fusion and generation of high-quality novel-view textures are completed on the basis of the multi-layer surface representation.
Drawings
Fig. 1 is a flowchart of the free viewpoint image synthesis method based on multi-layer neural surface representation in the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In this application, based on a novel multilayer neural surface representation, a workflow for free viewpoint image synthesis from sparse viewpoint inputs is designed. The scene structure information of the new viewpoint to be synthesized, together with accurate texture migration and fusion, is fully learned within the network, completing high-quality and efficient synthesis of free viewpoint images.
Examples
As shown in fig. 1, the free viewpoint image synthesis method based on multilayer neural surface representation described in the present embodiment includes the following steps:
S1, acquiring multi-view synchronized or static-scene image data captured from sparse viewpoints, and estimating the sparse viewpoint poses;
the poses are obtained by a Structure-from-Motion method or by multi-view calibration with a calibration object of known scale;
that is, this step collects multi-view data of the same static scene, or of a dynamic scene at the same moment; the viewpoints may be sparse (i.e. the viewpoint poses differ greatly);
in use, the purpose of the acquisition is to capture different viewpoints of the scene to be observed with a limited amount of acquisition equipment, with the expectation that free viewpoint images of the scene can then be recovered algorithmically.
S2, designing a free viewpoint image synthesis network;
In use, the network is designed around the multi-layer neural surface representation: the reconstruction of the multi-layer surface of the scene is realized within the end-to-end novel view synthesis framework, and high-quality fusion and generation of the novel-view texture are completed on the basis of the multi-layer surface representation.
Wherein the free viewpoint synthesis network comprises: a multi-scale image feature extraction module, a target-oriented multi-scale refinable scene depth estimation MVS (Multi-view Stereo) module, a multi-layer neural surface density estimation module, a reverse feature fusion and multi-layer neural surface color decoding module, and a multi-layer neural surface volume rendering module.
In use, the design of this deep neural network is the core part of the method.
S3, training the free viewpoint image synthesis network by utilizing a large-scale multi-viewpoint data set so that the free viewpoint image synthesis network can be generalized to various multi-viewpoint data;
the training data are multi-view image data with camera pose; the training data is divided into a training set, a validation set and a test set, the training causing the network to converge and be on the validation set.
The step S3 comprises the following steps:
S31, inputting into the network the M viewpoints most similar to the viewpoint to be synthesized, and outputting the predicted image at the viewpoint to be synthesized;
S32, supervising with a pixel-level loss function such as L1 or L2, or with a perceptual loss function.
And S4, after obtaining the trained free viewpoint image synthesis network model parameters, applying them to the free viewpoint synthesis task on the sparse multi-view data obtained in step S1.
As shown in fig. 1, the free viewpoint image synthesis method based on multilayer neural surface representation in this embodiment specifically comprises the following steps.
Acquiring multi-view synchronized (or static-scene) image data captured from sparse viewpoints;
the data are denoted \(\{I_i\}_{i=1}^{N}\), where \(N\) is the number of input viewpoints; estimating the sparse viewpoint poses yields, for each viewpoint, the pose \(\{K_i, [R_i \mid t_i]\}\), comprising the intrinsic parameters \(K_i\) and the extrinsic parameters \([R_i \mid t_i]\) (rotation and translation) of each viewpoint.
Specifically, this step collects multi-view data of the same static scene, or of a dynamic scene at the same moment; the viewpoints may be sparse (i.e. the viewpoint poses differ greatly). The purpose of the acquisition is to capture different viewpoints of the scene to be observed with a limited amount of acquisition equipment, with the expectation that free viewpoint images of the scene can then be recovered algorithmically.
The pose of the target viewpoint is defined as \(\{K_t, [R_t \mid t_t]\}\); according to the position and orientation of the target viewpoint, the images \(\{I_j\}_{j=1}^{M}\) of the M source viewpoints closest to the target viewpoint, together with their camera poses \(\{K_j, [R_j \mid t_j]\}\), are selected from the input viewpoints as the input to the network.
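The patent does not prescribe how "closest" is measured when choosing the M source viewpoints. The following is a minimal sketch of one plausible heuristic, combining camera-center distance and viewing-direction angle; the function name and the scoring formula are illustrative assumptions only.

```python
import numpy as np

def select_source_views(src_extrinsics, tgt_extrinsic, m):
    """Pick the M input views whose cameras are closest to the target view.
    Extrinsics are 4x4 world-to-camera matrices [R | t; 0 1]."""
    def center_and_dir(ext):
        R, t = ext[:3, :3], ext[:3, 3]
        center = -R.T @ t                              # camera center in world coordinates
        direction = R.T @ np.array([0.0, 0.0, 1.0])    # optical axis in world coordinates
        return center, direction

    tgt_c, tgt_d = center_and_dir(tgt_extrinsic)
    scores = []
    for ext in src_extrinsics:
        c, d = center_and_dir(ext)
        dist = np.linalg.norm(c - tgt_c)                       # positional proximity
        angle = np.arccos(np.clip(d @ tgt_d, -1.0, 1.0))       # orientation proximity
        scores.append(dist + angle)                            # smaller is better
    return np.argsort(scores)[:m]
```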
The network specifically comprises the following modules:
the multi-scale image feature extraction module is based on a U-Net model and consists of a multi-scale convolution layer and a jump connection layer, and can be expressed as:
wherein,network representing the module->For arbitrary input of the image of the module, the output of the module can be three-scale image features +.>
In use, passing a three-channel image through the feature extraction module yields multi-scale features of different resolutions and channel counts, extracted at different depths of the network; these correspond to different receptive fields and are used for subsequent neural surface localization and reverse feature fusion.
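A minimal sketch of such a U-Net-style extractor is given below, assuming an illustrative three-level encoder with skip connections; the channel counts and layer depths are assumptions, as the patent does not specify the exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFeatureNet(nn.Module):
    """U-Net-style extractor returning features at 1/4, 1/2 and full resolution."""
    def __init__(self, in_ch=3, ch=(8, 16, 32)):
        super().__init__()
        def block(cin, cout, stride=1):
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, stride, 1), nn.BatchNorm2d(cout), nn.ReLU(True),
                nn.Conv2d(cout, cout, 3, 1, 1), nn.BatchNorm2d(cout), nn.ReLU(True))
        self.enc1 = block(in_ch, ch[0])               # full resolution
        self.enc2 = block(ch[0], ch[1], stride=2)     # 1/2 resolution
        self.enc3 = block(ch[1], ch[2], stride=2)     # 1/4 resolution
        self.up2 = nn.Conv2d(ch[2] + ch[1], ch[1], 3, 1, 1)  # skip-connection fusion
        self.up1 = nn.Conv2d(ch[1] + ch[0], ch[0], 3, 1, 1)

    def forward(self, img):
        f1 = self.enc1(img)
        f2 = self.enc2(f1)
        f3 = self.enc3(f2)                            # coarsest scale
        u2 = self.up2(torch.cat([F.interpolate(f3, scale_factor=2), f2], dim=1))
        u1 = self.up1(torch.cat([F.interpolate(u2, scale_factor=2), f1], dim=1))
        return f3, u2, u1                             # three-scale features, coarse to fine
```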
The target-oriented multi-scale refinable scene depth estimation MVS module realizes scene geometry estimation for an arbitrary viewpoint by modifying a learning-based MVS network; the realization comprises the following steps:
(1) The M source viewpoint images are passed through the multi-scale image feature extraction module to obtain M × 3 image feature maps;
(2) At each scale, the source viewpoint features are warped to each depth hypothesis of the target viewpoint; a variance-based cost volume is constructed and regularized by 3D convolutions to output, for every pixel of the target image, a probability over the depth hypotheses;
(3) The estimate is progressively refined from the small scale to the large scale, with the depth sampling updated according to the depth probability of the previous level; the module outputs, at the original image resolution of the target viewpoint, a multi-layer surface whose depth probabilities are determined at the finally sampled depth values.
The MVS module can be expressed as

\( (X, P) = \Phi_{mvs}\big(\{F_j\}_{j=1}^{M}, \{K_j, [R_j \mid t_j]\}_{j=1}^{M}, K_t, [R_t \mid t_t]\big) \)

where \(\Phi_{mvs}\) is the target-oriented multi-scale refinable scene depth estimation MVS module; its outputs are the set of sampling points \(X\) of the target viewpoint on the multi-layer surface and the corresponding sampled depth probability volume \(P\).
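The following sketch illustrates the variance-based cost volume and the 3D-convolution regularization at a single scale. It assumes that homography warping of the source features to the target depth hypotheses has already produced a tensor of shape (M, C, D, H, W); the layer widths and helper names are illustrative assumptions, not the patented implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def variance_cost_volume(warped_feats):
    """warped_feats: (M, C, D, H, W) source features warped to the target view
    at D depth hypotheses. Returns the variance over the M views, (C, D, H, W)."""
    mean = warped_feats.mean(dim=0)
    return ((warped_feats - mean) ** 2).mean(dim=0)

class DepthProbabilityHead(nn.Module):
    """Illustrative 3D-conv regularization producing a per-pixel probability
    over the D depth hypotheses (one scale of the coarse-to-fine MVS module)."""
    def __init__(self, feat_ch):
        super().__init__()
        self.reg = nn.Sequential(
            nn.Conv3d(feat_ch, 8, 3, padding=1), nn.ReLU(True),
            nn.Conv3d(8, 8, 3, padding=1), nn.ReLU(True),
            nn.Conv3d(8, 1, 3, padding=1))

    def forward(self, warped_feats):
        cost = variance_cost_volume(warped_feats).unsqueeze(0)  # (1, C, D, H, W)
        logits = self.reg(cost).squeeze(1)                      # (1, D, H, W)
        return F.softmax(logits, dim=1)                         # probability over depth
```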
The multi-layer neural surface density estimation module recovers density values \(\sigma\) from the sampled depth probability volume \(P\) output by the MVS module; these correspond to the opacities of the multi-layer surface and are supplied to volume rendering to obtain the final output image.
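A minimal sketch of one possible density head follows; the patent only states that densities are recovered from the sampled depth probabilities, so the small MLP used here is an assumption.

```python
import torch
import torch.nn as nn

class DensityHead(nn.Module):
    """Illustrative mapping from sampled depth probabilities to per-layer densities."""
    def __init__(self, hidden=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(True), nn.Linear(hidden, 1))

    def forward(self, prob_samples):
        # prob_samples: (..., L) probability sampled at the L surface layers
        sigma = self.mlp(prob_samples.unsqueeze(-1)).squeeze(-1)
        return torch.relu(sigma)  # non-negative density / opacity per layer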
reverse feature fusion and multi-layer nerve surface color decoding module, and reverse access source viewpoint feature by utilizing multi-layer surface sampling point set obtained in MVS moduleAnd fusing and decoding the corresponding characteristic values to form the color values of the multi-layer surface.
For convenience of explanation, the processing procedure of the reverse feature fusion and multi-layer neural surface color decoding module is described as follows:
(1) For a given depth \(d_k\) and pixel \(p\), the corresponding source-view features are located through the surface point \(X(p, d_k)\), and the feature set taken from the M source views is denoted \(\{f_j(p, d_k)\}_{j=1}^{M}\);
(2) The M feature vectors are each encoded by an MLP (multi-layer perceptron), and the fused feature \(\bar{f}(p, d_k)\) is obtained by a mean operation;
(3) Reverse feature fusion is carried out for every pixel \(p\) and every depth \(d_k\), yielding the multi-layer features \(\{\bar{f}(p, d_k)\}\);
It should be noted that other forms of feature fusion may also be used;
(4) The fused features are decoded by a decoder to obtain the multi-layer colors:

\( c_k(p) = \Psi\big(\bar{f}(p, d_k)\big), \quad k = 1, \dots, L \)

where \(L\) is the number of layers of the multi-layer neural surface and \(\Psi\) is the image decoder; the final output is the multi-layer color \(\{c_k\}_{k=1}^{L}\).
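The sketch below illustrates steps (1)-(4) for a set of surface sample points, assuming per-view feature maps and 3x4 projection matrices as inputs; the module name, hidden sizes and the use of bilinear grid sampling are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReverseFusionColorDecoder(nn.Module):
    """Project each multi-layer surface point into the M source views, sample
    features, encode with a shared MLP, average, and decode to an RGB color."""
    def __init__(self, feat_ch, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_ch, hidden), nn.ReLU(True))
        self.decoder = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(True),
                                     nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, pts, src_feats, src_proj):
        # pts: (P, 3) surface sample points; src_feats: list of M (C, H, W) maps;
        # src_proj: list of M 3x4 projection matrices (intrinsics @ extrinsics).
        encoded = []
        for feat, proj in zip(src_feats, src_proj):
            hom = torch.cat([pts, torch.ones_like(pts[:, :1])], dim=1)   # (P, 4)
            uvw = hom @ proj.T                                           # (P, 3)
            uv = uvw[:, :2] / uvw[:, 2:].clamp(min=1e-6)                 # pixel coordinates
            h, w = feat.shape[-2:]
            grid = torch.stack([2 * uv[:, 0] / (w - 1) - 1,
                                2 * uv[:, 1] / (h - 1) - 1], dim=-1)     # normalize to [-1, 1]
            sampled = F.grid_sample(feat[None], grid[None, :, None],     # (1, C, P, 1)
                                    align_corners=True)[0, :, :, 0].T    # (P, C)
            encoded.append(self.encoder(sampled))
        mean_feat = torch.stack(encoded, dim=0).mean(dim=0)              # mean fusion over M views
        return self.decoder(mean_feat)                                   # (P, 3) per-point color
```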
The multi-layer neural surface volume rendering module performs volume rendering on the multi-layer surface densities obtained from the density estimation module and the multi-layer surface colors obtained from the reverse feature fusion and color decoding module, completing the synthesis of the final target image; this can be expressed as

\( \hat{C}(p) = \sum_{k=1}^{L} T_k\, \alpha_k\, c_k(p), \qquad \alpha_k = 1 - \exp(-\sigma_k \delta_k), \qquad T_k = \prod_{l<k} (1 - \alpha_l) \)

where \(\delta_k\) is the spacing between adjacent surface layers along the viewing ray.
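A minimal sketch of this layer-wise compositing, under the standard volume-rendering weighting given above, is:

```python
import torch

def composite_layers(sigma, color, deltas):
    """Front-to-back compositing of L surface layers per pixel.
    sigma, deltas: (..., L); color: (..., L, 3)."""
    alpha = 1.0 - torch.exp(-sigma * deltas)                     # per-layer opacity
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)           # transmittance after each layer
    trans = torch.cat([torch.ones_like(trans[..., :1]), trans[..., :-1]], dim=-1)
    weights = alpha * trans                                      # contribution of each layer
    return (weights.unsqueeze(-1) * color).sum(dim=-2)           # (..., 3) rendered pixel color
```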
the sparse view free view image synthesis network based on the multi-layer nerve surface expression is trained by utilizing a large-scale multi-view data set, so that the sparse view free view image synthesis network can be generalized to various multi-view data.
In particular, the training data may be multi-view image data with camera pose.
The input of the network is M viewpoint images most similar to the viewpoint to be synthesizedAnd corresponding camera poseOutput as predicted image +.>The supervision may be a pixel level loss function such as L1, L2 or a perceptual loss function, etc.; the loss function may be an L2 loss:
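A minimal sketch of one training step with the L2 loss might look as follows; `model` and the batch field names are assumptions standing in for the components described above.

```python
import torch

# Illustrative optimizer setup; learning rate is an assumption.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(batch):
    # batch: source images/poses nearest to the target view plus the ground-truth target image
    pred = model(batch["src_images"], batch["src_poses"], batch["tgt_pose"])
    loss = torch.mean((pred - batch["tgt_image"]) ** 2)   # L2 photometric loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```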
it is noted that relational terms such as first and second, and the like, if any, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Any such modification or substitution whose solved technical problem remains consistent with that of the invention is included within the protection scope of the invention.

Claims (6)

1. A free viewpoint image synthesis method based on multilayer neural surface representation, characterized by comprising the following steps:
S1, acquiring multi-view synchronized or static-scene image data captured from sparse viewpoints, and estimating the sparse viewpoint poses;
S2, designing a free viewpoint image synthesis network;
S3, training the free viewpoint image synthesis network with a large-scale multi-view dataset so that it can generalize to various multi-view data;
S4, after obtaining the trained free viewpoint image synthesis network model parameters, applying them to the free viewpoint synthesis task on the sparse multi-view data obtained in step S1;
the free viewpoint image synthesis network comprises a multi-scale image feature extraction module, a target-oriented multi-scale refined scene depth estimation MVS module, a multi-layer nerve surface density estimation module, a reverse feature fusion multi-layer nerve surface color decoding module and a multi-layer nerve surface voxel rendering module;
the multi-scale image feature extraction module consists of a convolution layer and a jump connection layer, and the multi-scale image feature extraction module is expressed as follows:
wherein,network representing the module->For arbitrary input of the image of the module, the output of the module is the three-scale image feature +.>
the target-oriented multi-scale refinable scene depth estimation MVS module realizes scene geometry estimation for an arbitrary viewpoint by modifying a learning-based MVS network; the realization comprises the following steps:
the M source viewpoint images are passed through the multi-scale image feature extraction module to obtain M × 3 image feature maps;
at each scale, the source viewpoint features are warped to each depth hypothesis of the target viewpoint; a variance-based cost volume is constructed and regularized by 3D convolutions to output, for every pixel of the target image, a probability over the depth hypotheses;
the estimate is progressively refined from the small scale to the large scale, with the depth sampling updated according to the depth probability of the previous level; the module finally outputs, at the original image resolution of the target viewpoint, a multi-layer surface whose depth probabilities are determined at the finally sampled depth values;
the multi-layer nerve surface density estimation module outputs a sampling depth probability volume of an MVS module for target-oriented multi-scale refineable scene depth estimationIn recovering the density value +.>Providing for volume rendering to obtain a final output image corresponding to the opacity representing the multi-layer surface;
the reverse feature fusion and multi-layer nerve surface color decoding module utilizes a multi-layer surface sampling point set obtained in an MVS module of target-oriented multi-scale refinement scene depth estimation to access source viewpoint features reverselyCorresponding characteristic values in the multi-layer surface are fused and decoded to form color values of the multi-layer surface;
and the multi-layer nerve surface voxel rendering module performs voxel rendering after the density corresponding to the multi-layer nerve surface obtained by the multi-layer nerve surface density estimation module and the color corresponding to the multi-layer nerve surface obtained by the reverse feature fusion and multi-layer nerve surface color decoding module, so as to complete the synthesis of a final target image.
2. The free viewpoint image synthesis method based on multilayer neural surface representation according to claim 1, characterized in that, after step S4, the method further comprises:
S5, since the trained network has a certain degree of generalization to data not seen in the training set, forward prediction with the network model trained in step S3 is used directly to achieve high-quality free viewpoint image synthesis for the sparse multi-view data under test.
3. The free viewpoint image synthesis method based on multilayer neural surface representation according to claim 1, characterized in that the acquisition method in step S1 is a Structure-from-Motion method or a multi-view calibration method with a calibration object of known scale.
4. The free viewpoint image synthesis method based on multilayer neural surface representation according to claim 1, characterized in that, in step S3, the large-scale multi-view dataset is multi-view image data with camera poses, divided into a training set, a validation set and a test set.
5. The free viewpoint image synthesis method based on multilayer neural surface representation according to claim 1, characterized in that, in step S3, the multi-view synchronized or static-scene image data captured from sparse viewpoints are denoted \(\{I_i\}_{i=1}^{N}\), where \(N\) is the number of input viewpoints; estimating the sparse viewpoint poses yields, for each viewpoint, the pose \(\{K_i, [R_i \mid t_i]\}\), comprising the intrinsic parameters \(K_i\) and the extrinsic parameters \([R_i \mid t_i]\) of each viewpoint.
6. The free viewpoint image synthesis method based on multilayer neural surface representation according to claim 1, characterized in that, in step S3, the pose of the target viewpoint is defined as \(\{K_t, [R_t \mid t_t]\}\); according to the position and orientation of the target viewpoint, the images \(\{I_j\}_{j=1}^{M}\) of the M source viewpoints closest to the target viewpoint, together with their camera poses \(\{K_j, [R_j \mid t_j]\}\), are selected from the input viewpoints as the input to the network.
CN202211391996.4A 2022-11-08 2022-11-08 Free viewpoint image synthesis method based on multilayer nerve surface expression Active CN115439388B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211391996.4A CN115439388B (en) 2022-11-08 2022-11-08 Free viewpoint image synthesis method based on multilayer nerve surface expression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211391996.4A CN115439388B (en) 2022-11-08 2022-11-08 Free viewpoint image synthesis method based on multilayer nerve surface expression

Publications (2)

Publication Number Publication Date
CN115439388A CN115439388A (en) 2022-12-06
CN115439388B true CN115439388B (en) 2024-02-06

Family

ID=84252759

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211391996.4A Active CN115439388B (en) 2022-11-08 2022-11-08 Free viewpoint image synthesis method based on multilayer nerve surface expression

Country Status (1)

Country Link
CN (1) CN115439388B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105141956A (en) * 2015-08-03 2015-12-09 西安电子科技大学 Incremental rate distortion optimization method based on free viewpoint video depth map coding
CN105247862A (en) * 2013-04-09 2016-01-13 联发科技股份有限公司 Method and apparatus of view synthesis prediction in three-dimensional video coding
JP2019159840A (en) * 2018-03-13 2019-09-19 萩原電気ホールディングス株式会社 Image synthesizing apparatus and image synthesizing method
CN111028273A (en) * 2019-11-27 2020-04-17 山东大学 Light field depth estimation method based on multi-stream convolution neural network and implementation system thereof
CN111144214A (en) * 2019-11-27 2020-05-12 中国石油大学(华东) Hyperspectral image unmixing method based on multilayer stack type automatic encoder
CN111951203A (en) * 2020-07-01 2020-11-17 北京大学深圳研究生院 Viewpoint synthesis method, apparatus, device and computer readable storage medium
CN112637582A (en) * 2020-12-09 2021-04-09 吉林大学 Three-dimensional fuzzy surface synthesis method for monocular video virtual view driven by fuzzy edge
CN114463408A (en) * 2021-12-20 2022-05-10 北京邮电大学 Free viewpoint image generation method, device, equipment and storage medium
CN114627223A (en) * 2022-03-04 2022-06-14 华南师范大学 Free viewpoint video synthesis method and device, electronic equipment and storage medium
CN114663543A (en) * 2022-03-31 2022-06-24 西安交通大学 Virtual view synthesis method based on deep learning and multi-view geometry
CN114666564A (en) * 2022-03-23 2022-06-24 南京邮电大学 Method for synthesizing virtual viewpoint image based on implicit neural scene representation
CN114820901A (en) * 2022-04-08 2022-07-29 浙江大学 Large-scene free viewpoint interpolation method based on neural network
CN114820945A (en) * 2022-05-07 2022-07-29 北京影数科技有限公司 Sparse sampling-based method and system for generating image from ring shot image to any viewpoint image

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10950037B2 (en) * 2019-07-12 2021-03-16 Adobe Inc. Deep novel view and lighting synthesis from sparse images

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105247862A (en) * 2013-04-09 2016-01-13 联发科技股份有限公司 Method and apparatus of view synthesis prediction in three-dimensional video coding
CN105141956A (en) * 2015-08-03 2015-12-09 西安电子科技大学 Incremental rate distortion optimization method based on free viewpoint video depth map coding
JP2019159840A (en) * 2018-03-13 2019-09-19 萩原電気ホールディングス株式会社 Image synthesizing apparatus and image synthesizing method
CN111028273A (en) * 2019-11-27 2020-04-17 山东大学 Light field depth estimation method based on multi-stream convolution neural network and implementation system thereof
CN111144214A (en) * 2019-11-27 2020-05-12 中国石油大学(华东) Hyperspectral image unmixing method based on multilayer stack type automatic encoder
CN111951203A (en) * 2020-07-01 2020-11-17 北京大学深圳研究生院 Viewpoint synthesis method, apparatus, device and computer readable storage medium
CN112637582A (en) * 2020-12-09 2021-04-09 吉林大学 Three-dimensional fuzzy surface synthesis method for monocular video virtual view driven by fuzzy edge
CN114463408A (en) * 2021-12-20 2022-05-10 北京邮电大学 Free viewpoint image generation method, device, equipment and storage medium
CN114627223A (en) * 2022-03-04 2022-06-14 华南师范大学 Free viewpoint video synthesis method and device, electronic equipment and storage medium
CN114666564A (en) * 2022-03-23 2022-06-24 南京邮电大学 Method for synthesizing virtual viewpoint image based on implicit neural scene representation
CN114663543A (en) * 2022-03-31 2022-06-24 西安交通大学 Virtual view synthesis method based on deep learning and multi-view geometry
CN114820901A (en) * 2022-04-08 2022-07-29 浙江大学 Large-scene free viewpoint interpolation method based on neural network
CN114820945A (en) * 2022-05-07 2022-07-29 北京影数科技有限公司 Sparse sampling-based method and system for generating image from ring shot image to any viewpoint image

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Complete Multi-View Reconstruction of Dynamic Scenes from Probabilistic Fusion of Narrow and Wide Baseline Stereo; Tony Tung et al.; 2009 IEEE 12th International Conference on Computer Vision; 2009-12-31; pp. 1709-1716 *
Neural Sparse Voxel Fields; Lingjie Liu et al.; arXiv:2007.11571v2; 2021-01-31; pp. 1-22 *
VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids; Katja Schwarz et al.; arXiv:2206.07695v2; 2022-06-30; pp. 1-20 *
Research on 3D Face Reconstruction and Free-Viewpoint Video Generation; Wang Yanru; China Master's Theses Full-text Database, Information Science and Technology; 2021-04-15; vol. 2021, no. 04; p. I138-552 *
Research on Image-Based Free Viewpoint Synthesis Methods; Li Minghao; China Master's Theses Full-text Database, Information Science and Technology; 2022-01-15; vol. 2022, no. 01; p. I138-2246 *
Light Field Image Depth Estimation Based on a Multi-Stream Epipolar Convolutional Neural Network; Wang Shuo et al.; Computer Applications and Software; 2020-08-12; vol. 37, no. 08; pp. 194-201 *

Also Published As

Publication number Publication date
CN115439388A (en) 2022-12-06

Similar Documents

Publication Publication Date Title
Lee et al. From big to small: Multi-scale local planar guidance for monocular depth estimation
CN110782490B (en) Video depth map estimation method and device with space-time consistency
Wang et al. Deep learning for hdr imaging: State-of-the-art and future trends
WO2021048607A1 (en) Motion deblurring using neural network architectures
CN111901598B (en) Video decoding and encoding method, device, medium and electronic equipment
CN114936605A (en) Knowledge distillation-based neural network training method, device and storage medium
Gu et al. Coupled real-synthetic domain adaptation for real-world deep depth enhancement
CN112288788A (en) Monocular image depth estimation method
JP2022536381A (en) MOTION TRANSITION METHOD, APPARATUS, DEVICE, AND STORAGE MEDIUM
Zhao et al. Vcgan: Video colorization with hybrid generative adversarial network
Li et al. Uphdr-gan: Generative adversarial network for high dynamic range imaging with unpaired data
CN115002379B (en) Video frame inserting method, training device, electronic equipment and storage medium
CN114996814A (en) Furniture design system based on deep learning and three-dimensional reconstruction
CN115496663A (en) Video super-resolution reconstruction method based on D3D convolution intra-group fusion network
CN115170388A (en) Character line draft generation method, device, equipment and medium
CN116091955A (en) Segmentation method, segmentation device, segmentation equipment and computer readable storage medium
CN113379606A (en) Face super-resolution method based on pre-training generation model
CN116342675B (en) Real-time monocular depth estimation method, system, electronic equipment and storage medium
CN115439388B (en) Free viewpoint image synthesis method based on multilayer nerve surface expression
Xiao et al. Progressive motion boosting for video frame interpolation
Nie et al. Context and detail interaction network for stereo rain streak and raindrop removal
CN116402908A (en) Dense light field image reconstruction method based on heterogeneous imaging
CN116486009A (en) Monocular three-dimensional human body reconstruction method and device and electronic equipment
Jung et al. Depth image interpolation using confidence-based Markov random field
CN115830094A (en) Unsupervised stereo matching method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant