CN114022639A - Three-dimensional reconstruction model generation method and system, electronic device and storage medium

Info

Publication number
CN114022639A
Authority
CN
China
Prior art keywords
reconstruction model
model
dimensional
camera
generating
Prior art date
Legal status
Pending
Application number
CN202111255424.9A
Other languages
Chinese (zh)
Inventor
魏辉
卢丽华
李茹杨
赵雅倩
李仁刚
Current Assignee
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN202111255424.9A
Publication of CN114022639A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Architecture (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a method for generating a three-dimensional reconstruction model, which comprises the following steps: acquiring target images acquired by a camera, and calculating a camera pose corresponding to each target image; determining three-dimensional point cloud data corresponding to the target image according to the camera pose, and combining the three-dimensional point cloud data corresponding to all the target images to obtain an intermediate reconstruction model; performing simulated imaging on the intermediate reconstruction model by using a virtual camera; and determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result, and adjusting the intermediate reconstruction model by using the model error to obtain the three-dimensional reconstruction model. The application also discloses a system for generating a three-dimensional reconstruction model, an electronic device and a storage medium, which have the same beneficial effects.

Description

Three-dimensional reconstruction model generation method and system, electronic device and storage medium
Technical Field
The present application relates to the field of three-dimensional vision technologies, and in particular, to a method and a system for generating a three-dimensional reconstruction model, an electronic device, and a storage medium.
Background
Three-dimensional vision is a multidisciplinary technology at the intersection of computer vision and computer graphics, and is one of the core technologies in many application fields such as robotics, autonomous driving and VR/AR. Three-dimensional reconstruction is a technology that scans and computes a real physical scene with various sensor devices and generates a corresponding digital model, and is one of the key problems in three-dimensional vision research. With advances in sensors and computing technology, three-dimensional reconstruction is showing new changes and characteristics, and real-time, dense and stable three-dimensional reconstruction has become a goal pursued by the industry.
At present, real-time three-dimensional reconstruction based on consumer-grade sensor devices is affected by factors such as sensor measurement errors, image data quality, the sensor motion state and the richness of scene features, so the quality of the reconstructed three-dimensional model is generally not ideal and is difficult to apply in VR/AR. The fast three-dimensional reconstruction process based on consumer-grade devices mainly comprises two parts, camera tracking and model fusion: camera tracking estimates the camera pose (position and orientation, six degrees of freedom) corresponding to each frame in the series of data collected by the sensor by finding matching relationships, then transforms the different frames into the same world coordinate system and performs data fusion. Errors are introduced by the sensor measurement precision, data matching, model fusion and other stages, and the whole three-dimensional reconstruction process can only obtain an approximate solution through iterative optimization, so the accuracy of the reconstruction result is not high.
Therefore, how to improve the generation accuracy of the three-dimensional reconstruction model is a technical problem that needs to be solved by those skilled in the art at present.
Disclosure of Invention
The application aims to provide a method, a system, an electronic device and a storage medium for generating a three-dimensional reconstruction model, which can improve the generation precision of the three-dimensional reconstruction model.
In order to solve the above technical problem, the present application provides a method for generating a three-dimensional reconstruction model, where the method for generating a three-dimensional reconstruction model includes:
acquiring target images acquired by a camera, and calculating a camera pose corresponding to each target image;
determining three-dimensional point cloud data corresponding to the target image according to the camera pose, and combining the three-dimensional point cloud data corresponding to all the target images to obtain an intermediate reconstruction model;
performing simulated imaging on the intermediate reconstruction model by using a virtual camera;
and determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result, and adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model.
Optionally, the performing, by using a virtual camera, simulation imaging on the intermediate reconstruction model includes:
adjusting the pose of the virtual camera to the camera pose;
and controlling the virtual camera to project rays to the intermediate reconstruction model, and generating the simulated imaging result according to the intersection result of the rays and the intermediate reconstruction model.
Optionally, generating the simulated imaging result according to the intersection result of the ray and the intermediate reconstruction model includes:
judging whether the ray intersects with the intermediate reconstruction model;
if not, setting the image pixel value of the corresponding position of the ray in the simulated imaging result to 0;
if yes, determining a target pixel value according to the color information or the distance information at the intersection position of the ray, and setting the image pixel value of the corresponding position of the ray in the simulated imaging result to the target pixel value.
Optionally, determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result includes:
calculating pixel difference values of pixel points at the same position in the target image and the simulated imaging result;
and determining the model error of the intermediate reconstruction model according to the pixel difference value.
Optionally, after the three-dimensional point cloud data corresponding to all the images are combined to obtain an intermediate reconstruction model, the method further includes:
dividing the intermediate reconstruction model into a plurality of spatial units according to a preset spatial resolution;
setting the confidence coefficient of the space unit according to the quantity of the point cloud data fused in the space unit;
correspondingly, the adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model comprises:
setting the pixel points with the pixel difference value larger than a first threshold value in the simulated imaging result as abnormal pixel points according to the model error, and reducing the confidence coefficient of a space unit where the abnormal pixel points are located;
judging whether the confidence of the space unit is smaller than a second threshold value;
if so, setting the space unit with the confidence coefficient smaller than the second threshold value as a space unit to be adjusted;
and adjusting pixel points corresponding to the space units to be adjusted in the intermediate reconstruction model, and generating the three-dimensional reconstruction model by using the adjusted pixel points.
Optionally, adjusting the pixel point corresponding to the space unit to be adjusted in the intermediate reconstruction model includes:
acquiring pixel points corresponding to the space unit to be adjusted in all the target images, and weighting to obtain a standard value corresponding to the space unit to be adjusted;
and adjusting the pixel points corresponding to the space units to be adjusted in the intermediate reconstruction model by taking the standard values as optimization targets.
Optionally, after determining whether the confidence of the space unit is smaller than a second threshold, the method further includes:
and restoring the confidence level of the space unit with the confidence level being larger than or equal to the second threshold value to the initial value.
The present application also provides a system for generating a three-dimensional reconstruction model, the system including:
the pose calculation module is used for acquiring target images acquired by a camera and calculating a camera pose corresponding to each target image;
the reconstruction module is used for determining three-dimensional point cloud data corresponding to the target image according to the camera pose and combining the three-dimensional point cloud data corresponding to all the target images to obtain an intermediate reconstruction model;
the simulation imaging module is used for performing simulation imaging on the intermediate reconstruction model by using a virtual camera;
and the model adjusting module is used for determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result and adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model.
The present application further provides a storage medium on which a computer program is stored, which when executed, implements the steps performed by the above-described method for generating a three-dimensional reconstructed model.
The application also provides an electronic device, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps executed by the generation method of the three-dimensional reconstruction model when calling the computer program in the memory.
The application provides a method for generating a three-dimensional reconstruction model, which comprises the following steps: acquiring target images acquired by a camera, and calculating a camera pose corresponding to each target image; determining three-dimensional point cloud data corresponding to the target image according to the camera pose, and combining the three-dimensional point cloud data corresponding to all the target images to obtain an intermediate reconstruction model; performing simulated imaging on the intermediate reconstruction model by using a virtual camera; and determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result, and adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model.
According to the present application, the three-dimensional point cloud data of each target image is determined from the camera pose at the time the target image was acquired, and an intermediate reconstruction model is obtained by fusing the three-dimensional point cloud data. The application uses a virtual camera to perform simulated imaging on the intermediate reconstruction model and compares the simulated imaging result with the target image to determine the difference between the intermediate reconstruction model and the actual imaging, obtaining a model error. The three-dimensional reconstruction model obtained by adjusting the intermediate reconstruction model according to the model error eliminates the errors in the intermediate reconstruction model, so the generation precision of the three-dimensional reconstruction model can be improved. The application also provides a system for generating a three-dimensional reconstruction model, a storage medium and an electronic device, which have the above beneficial effects and are not repeated herein.
Drawings
In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a method for generating a three-dimensional reconstruction model according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a visually consistent three-dimensional reconstruction method according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a system for generating a three-dimensional reconstruction model according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of a method for generating a three-dimensional reconstruction model according to an embodiment of the present disclosure.
The specific steps may include:
s101: acquiring target images acquired by a camera, and calculating a camera pose corresponding to each target image;
the present embodiment can be applied to a three-dimensional model reconstruction apparatus, and a camera can capture target images of a specific object or scene in a plurality of poses. On the basis of obtaining a target image acquired by a camera, the embodiment can perform camera pose estimation operation on the target image; specifically, after the camera collects the target images, the camera pose corresponding to each target image can be calculated. The camera may be a consumer grade sensor device such as a general camera, a depth camera, etc.
For each acquired target image, the camera pose corresponding to that image can be estimated. The estimation can be realized by optimization based on geometric constraints or by a learning-based construction of the optimization problem. The optimization objective function may be built from various geometric and photometric constraints, including but not limited to reprojection errors and photometric consistency. Depending on the sensor type, the optimization problem can be constructed from color images alone, from depth images alone, or from both color and depth maps.
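As an illustration only (the patent does not fix a concrete solver), the sketch below estimates a six-degree-of-freedom camera pose by minimizing reprojection error, one of the geometric constraints mentioned above. The matched 3D points, their pixel observations and the intrinsic matrix K are hypothetical inputs, and scipy's least_squares is used as a stand-in optimizer.

```python
import numpy as np
from scipy.optimize import least_squares

def rodrigues(rvec):
    """Axis-angle vector -> 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    skew = np.array([[0.0, -k[2], k[1]],
                     [k[2], 0.0, -k[0]],
                     [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * skew + (1.0 - np.cos(theta)) * (skew @ skew)

def reprojection_residuals(pose, points_3d, pixels_2d, K):
    """pose = [rx, ry, rz, tx, ty, tz] (world -> camera); residuals in pixels."""
    R, t = rodrigues(pose[:3]), pose[3:]
    cam = points_3d @ R.T + t          # transform points into the camera frame
    proj = cam @ K.T
    proj = proj[:, :2] / proj[:, 2:3]  # perspective division
    return (proj - pixels_2d).ravel()

def estimate_pose(points_3d, pixels_2d, K, init=None):
    init = np.zeros(6) if init is None else init
    result = least_squares(reprojection_residuals, init,
                           args=(points_3d, pixels_2d, K))
    return result.x                    # six-degree-of-freedom pose estimate
```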
S102: determining three-dimensional point cloud data corresponding to the target images according to the camera pose, and combining the three-dimensional point cloud data corresponding to all the target images to obtain an intermediate reconstruction model;
on the basis of obtaining the camera pose, the target images can be restored according to the camera pose to obtain three-dimensional point cloud data, and the three-dimensional point cloud data corresponding to a plurality of target images can be combined to obtain an intermediate reconstruction model.
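The patent does not spell out the restoration step; a minimal sketch of one common realization, back-projecting a depth image into a world-space point cloud with known intrinsics (fx, fy, cx, cy are assumed) and the estimated camera-to-world pose, is given below.

```python
import numpy as np

def depth_to_world_points(depth, fx, fy, cx, cy, R, t):
    """depth: HxW depth map in meters; R, t: camera-to-world pose."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx                 # back-project to camera space
    y = (v - cy) * z / fy
    pts_cam = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    pts_cam = pts_cam[pts_cam[:, 2] > 0]  # drop invalid (zero-depth) pixels
    return pts_cam @ R.T + t              # transform into world coordinates
```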
S103: performing simulated imaging on the intermediate reconstruction model by using a virtual camera;
On the basis of obtaining the intermediate reconstruction model, the intermediate reconstruction model can be subjected to simulated imaging, so that the model quality of the intermediate reconstruction model can be evaluated using the simulated imaging result. Specifically, the pose of the virtual camera may be determined according to the camera pose corresponding to each target image, so that the virtual camera performs simulated imaging on the intermediate reconstruction model in the corresponding camera pose; the simulated imaging result is the image the virtual camera would obtain by shooting the intermediate reconstruction model in that camera pose.
S104: and determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result, and adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model.
In this embodiment, the target image actually shot by the camera is compared with the simulated imaging result of the virtual camera to obtain the difference between the intermediate reconstruction model and the actual object, namely the model error, which serves as the measurement standard for the model's reconstruction quality. On the basis of obtaining the model error, the present embodiment adjusts the intermediate reconstruction model by using the model error to obtain the three-dimensional reconstruction model.
This embodiment determines the three-dimensional point cloud data of each target image according to the camera pose at the time the target image was acquired, and obtains an intermediate reconstruction model by fusing the three-dimensional point cloud data. The virtual camera is used to perform simulated imaging on the intermediate reconstruction model, and the simulated imaging result is compared with the target image to determine the difference between the intermediate reconstruction model and the actual imaging, yielding a model error. The three-dimensional reconstruction model obtained by adjusting the intermediate reconstruction model according to the model error eliminates the errors in the intermediate reconstruction model, so the generation precision of the three-dimensional reconstruction model can be improved.
As a further description of the corresponding embodiment of fig. 1, the present embodiment may perform simulated imaging on the intermediate reconstruction model in the following manner: adjusting the pose of the virtual camera to the camera pose; and controlling the virtual camera to project rays to the intermediate reconstruction model, and generating the simulated imaging result according to the intersection result of the rays and the intermediate reconstruction model.
Each ray projected by the virtual camera corresponds to a pixel point in the simulated imaging result, so that after the rays are projected onto the intermediate reconstruction model, the embodiment can judge whether each ray intersects with the intermediate reconstruction model; if not, the image pixel value at the corresponding position of the ray in the simulated imaging result is set to 0; if yes, a target pixel value is determined according to the color information or the distance information at the intersection position of the ray, and the image pixel value at the corresponding position of the ray in the simulated imaging result is set to the target pixel value.
Specifically, after the point cloud data of a certain frame of target image is fused into the intermediate reconstruction model, the fused model is subjected to simulated imaging according to the camera pose estimated for that frame, and the result is subsequently used to evaluate the accuracy of the estimated camera pose and of the reconstruction model. The method mainly comprises the following operations:
The estimated camera pose corresponding to the frame of target image is taken as the pose of the virtual camera, and the virtual camera is used to perform simulated imaging on the reconstructed model. The internal and external camera parameters, such as the imaging resolution, are determined according to the parameters of the actual camera and kept consistent with them.
Simulated imaging may be performed by ray casting or similar methods. Specifically, from the camera pose, a ray may be projected toward each pixel of the simulated image, and the ray may or may not intersect the reconstructed model. The image pixel values corresponding to rays that do not intersect the model are set to 0 or another default value, and the image pixel values corresponding to rays that intersect the model are determined from the information at the intersection of the model and the ray. Each pixel may correspond to one projected ray, in which case the value of the pixel is determined by this ray alone. Each pixel can also project a plurality of rays, with the value of the pixel determined by weighting the plurality of rays together; the positions of the rays within the pixel's footprint can be random samples, the four corner positions, or another pattern.
The simulated imaging result can be a color image, a depth image, or both. When generating a color map, the value of each pixel is determined by the model color value at the intersection of the model and the ray. When generating a depth map, the value of each pixel is the distance from the camera to the intersection point of the ray with the model, which can be either the straight-line distance between the two points or the projection of that distance onto the camera's optical axis. The distance may be a dimensional value in some measurement unit, or a dimensionless value under some normalization.
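A hedged sketch of such a ray-casting renderer follows: each pixel's ray is marched through a signed-distance voxel grid and the depth of the first zero crossing is recorded. The dense cubic array stands in for the voxel-hash structure, and the step size and near/far bounds are illustrative assumptions.

```python
import numpy as np

def render_depth(tsdf, voxel_size, K, R, t, height, width,
                 near=0.2, far=5.0, step=0.01):
    """Render a depth map from a cubic DxDxD TSDF grid (positive = outside).

    R, t give the camera-to-world pose; K holds the intrinsics. Pixels whose
    rays never cross the surface keep the default value 0.
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    depth = np.zeros((height, width))
    dim = tsdf.shape[0]
    for v in range(height):
        for u in range(width):
            ray_cam = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
            ray = R @ (ray_cam / np.linalg.norm(ray_cam))  # world-space direction
            prev, d = None, near
            while d < far:
                p = t + d * ray                        # sample point on the ray
                idx = np.floor(p / voxel_size).astype(int)
                if np.any(idx < 0) or np.any(idx >= dim):
                    break                              # left the reconstructed volume
                val = tsdf[idx[0], idx[1], idx[2]]
                if prev is not None and prev > 0 >= val:
                    depth[v, u] = d                    # first surface crossing
                    break
                prev, d = val, d + step
    return depth
```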
As a further introduction to the corresponding embodiment of fig. 1, determining a model error of the intermediate reconstruction model by comparing the target image and the simulated imaging result includes: calculating pixel difference values of pixel points at the same position in the target image and the simulated imaging result; and determining the model error of the intermediate reconstruction model according to the pixel difference values.
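As a minimal illustration of this error measure (how the per-pixel differences are aggregated into a single model error is not specified by the patent, so a plain mean is assumed):

```python
import numpy as np

def model_error(target_image, rendered_image):
    """Per-pixel absolute difference; the mean summarizes the model error."""
    diff = np.abs(target_image.astype(float) - rendered_image.astype(float))
    return diff, float(diff.mean())
```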
Further, after the three-dimensional point cloud data corresponding to all the images are combined to obtain an intermediate reconstruction model, the intermediate reconstruction model can be divided into a plurality of space units according to a preset spatial resolution; and setting the confidence coefficient of the space unit according to the quantity of the point cloud data fused in the space unit.
Specifically, the target image can be restored to three-dimensional point cloud data according to the calculated camera pose, and the three-dimensional point cloud data corresponding to different target images are fused (transformed into the same coordinate system, with points corresponding to the same position merged) to obtain an intermediate reconstruction model. The intermediate reconstruction model adopts a space-occupancy representation: the space where it is located is divided as a whole according to a certain resolution into space units of the same size, and the efficiency of the spatial representation can be improved by algorithms such as voxel hashing. Each space unit stores either the reconstructed object surface points located within the unit or the signed closest distance from the unit to the reconstructed surface. The method specifically comprises the following operations:
operation 1: and transforming the image into three-dimensional point cloud data according to the pose of the camera.
Operation 2: and subdividing the space occupied by the newly reconstructed intermediate reconstruction model according to a certain spatial resolution requirement.
And if the three-dimensional point cloud reconstructed from the new image exceeds the representation range of all the units already subdivided in the space, subdividing the newly added space positions according to the preset resolution on the basis of the original subdivision. The resolution of the subdivision is determined comprehensively according to the required representation precision of the reconstruction model, the available storage resources and other conditions. Each space unit is given an initial confidence value during subdivision; the confidence value represents the accuracy of the reconstruction points located in the space unit and is subsequently used for evaluating the quality of the reconstruction model.
Operation 3: and fusing the newly reconstructed three-dimensional point cloud into the corresponding space unit.
Each point in the reconstructed point cloud corresponds to a certain space unit according to its spatial position, and each space unit stores the coordinates of the reconstructed points within it. The coordinates of a newly added point (or the signed closest distance of the cell from the reconstruction surface, depending on the data stored in the cell) and the coordinates (or distance) already in the cell are averaged with weights; the weight can be determined by the number of fused points: for example, if n points have been fused into the cell, the weight of the cell is n, the weight of the newly added point is 1, and the weight of the cell becomes n + 1 after fusion. The point fusion process can also adopt other strategies. In the model fusion process, the confidence of each cell can remain unchanged or change according to a certain rule, and the change of confidence can reflect the change of the model's reconstruction quality.
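The following sketch illustrates this running weighted average under assumed data structures; the SpatialCell class and its field names are inventions for illustration, not the patent's implementation.

```python
import numpy as np

class SpatialCell:
    def __init__(self, initial_confidence=1.0):
        self.value = None        # stored point coordinate (or signed distance)
        self.weight = 0          # number of points fused so far (n)
        self.confidence = initial_confidence
        self.calibrations = 0    # counter used later for segmented evaluation

    def fuse(self, new_value):
        """Weighted average: cell weight n, new point weight 1, then n + 1."""
        new_value = np.asarray(new_value, dtype=float)
        if self.weight == 0:
            self.value = new_value
        else:
            self.value = (self.weight * self.value + new_value) / (self.weight + 1)
        self.weight += 1
```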
As a further introduction to the corresponding embodiment of fig. 1, the intermediate reconstruction model may be adjusted to obtain a three-dimensional reconstruction model in the following manner: setting the pixel points with the pixel difference value larger than a first threshold value in the simulated imaging result as abnormal pixel points according to the model error, and reducing the confidence coefficient of a space unit where the abnormal pixel points are located; judging whether the confidence of the space unit is smaller than a second threshold value; if so, setting the space unit with the confidence coefficient smaller than the second threshold value as a space unit to be adjusted; and adjusting pixel points corresponding to the space units to be adjusted in the intermediate reconstruction model, and generating the three-dimensional reconstruction model by using the adjusted pixel points.
Specifically, the embodiment can obtain all the pixel points in the target image corresponding to the space unit to be adjusted for weighting, so as to obtain the standard value corresponding to the space unit to be adjusted; and adjusting the pixel points corresponding to the space units to be adjusted in the intermediate reconstruction model by taking the standard values as optimization targets. After determining whether the confidence of the spatial unit is smaller than a second threshold, the confidence of the spatial unit whose confidence is greater than or equal to the second threshold may be restored to an initial value.
In the process, the image obtained by the simulated imaging can be compared with the image actually acquired by the camera, so that the difference between the reconstructed model and the actual object can be obtained and used as the measurement standard of the reconstruction quality of the model. The method mainly comprises the following operations:
and calibrating the confidence coefficient of the space unit according to the difference between the corresponding pixel values of the actually acquired image and the simulated imaging image. If the pixel value difference exceeds a preset threshold value, the confidence coefficient of the space unit corresponding to the pixel is considered to be poor, and the confidence coefficient of the corresponding space unit is reduced according to a certain rule. If the difference between the pixel values is smaller than the predetermined threshold, the confidence of the spatial unit corresponding to the pixel remains unchanged.
And according to the set number of calibrations, evaluating the space units in a segmented manner and adjusting and optimizing them. The reconstruction of each frame of image can calibrate all the space units visible in that image once; each space unit maintains a counter of calibration times, and when the count reaches a certain number, the space unit is evaluated once. If the confidence of the space unit is smaller than a preset threshold, the space unit is considered to need adjustment and optimization. If the confidence of the space unit is larger than the preset threshold at that moment, the space unit is considered not to need adjustment and optimization, and its confidence is restored to the initial value. The purpose of this restoration is to avoid, as much as possible, the influence of accumulated errors in the camera pose estimation and reconstruction process.
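Continuing the SpatialCell sketch above, the calibration and segmented evaluation just described might look as follows; all threshold values, the confidence decrement and the evaluation period are illustrative assumptions, not figures from the patent.

```python
PIXEL_DIFF_THRESHOLD = 25.0    # "first threshold": abnormal-pixel test (assumed)
CONFIDENCE_THRESHOLD = 0.5     # "second threshold": cell needs adjustment (assumed)
CONFIDENCE_DECREMENT = 0.1     # assumed reduction rule
INITIAL_CONFIDENCE = 1.0
EVAL_PERIOD = 10               # calibrations between two evaluations (assumed)

def calibrate_cell(cell, captured_pixel, rendered_pixel):
    """One calibration of a visible SpatialCell against a captured frame.

    Returns True when the cell is due for adjustment and optimization.
    """
    if abs(float(captured_pixel) - float(rendered_pixel)) > PIXEL_DIFF_THRESHOLD:
        cell.confidence -= CONFIDENCE_DECREMENT    # abnormal pixel: lower trust
    cell.calibrations += 1
    if cell.calibrations % EVAL_PERIOD == 0:       # segmented evaluation
        if cell.confidence < CONFIDENCE_THRESHOLD:
            return True                            # mark cell for adjustment
        cell.confidence = INITIAL_CONFIDENCE       # restore to the initial value
    return False
```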
The process of generating the three-dimensional reconstruction model by adjusting the intermediate reconstruction model is as follows: adjust and optimize part of the space units in the reconstructed model so that the corresponding values of the space units in the simulated image and the actually acquired image remain consistent. The adjustment process takes the actual values in the collected images as accurate values, and the specific adjustment scheme is determined according to the meaning of the data represented by the space unit or the stored data content.
The method mainly comprises the following operations:
operation 1: and calculating to obtain a space unit adjustment optimization target based on the acquired image.
And for a certain space unit needing to be adjusted, firstly acquiring the pixel values corresponding to the space unit in all the acquired images that participated in its calibration, then restoring the pixels to three-dimensional space points according to data such as the camera poses corresponding to the acquired images, and weighting each point according to a certain rule to obtain an accurate value corresponding to the space unit, to be used as the target of adjustment and optimization.
Operation 2: and if the accurate value of the position of the space unit obtained in the operation 1 does not exceed the range of the current space unit, updating the data in the space unit to be adjusted of the current reconstruction model according to the accurate value.
Operation 3: and if the accurate value of the position of the space unit obtained in the operation 1 exceeds the range of the current space unit, performing weighted fusion on the data in the space unit to be adjusted of the current reconstruction model and the data in the space unit corresponding to the accurate value. The confidence of the accurate value in the corresponding space unit can be kept unchanged or adjusted according to a preset strategy.
The flow described in the above embodiment is explained below by an embodiment in practical use.
The types of sensors used differ across application scenarios, and so do the formats of the acquired data, so three-dimensional reconstruction takes on different technical architectures and implementation methods in different applications. In VR/AR, constructing a virtual environment/object that is highly consistent with the real world is crucial and directly affects the VR/AR application experience. Real-time, convenient and automatic three-dimensional reconstruction of a real scene/object is an important way to achieve this goal and is one of the research hotspots in the field. High-precision three-dimensional reconstruction techniques represented by volumetric capture can achieve very high-quality reconstruction with the support of complex sensor equipment and sufficient computing power, and are well applied in many professional fields. However, due to limitations of site, cost and convenience of use of complex sensor devices, such methods cannot be popularized on a large scale, and fast, high-quality three-dimensional reconstruction based on consumer-grade sensor devices is an urgent need for VR/AR applications.
The existing three-dimensional model reconstruction schemes mainly use depth maps and color maps to construct a joint optimization objective, build multi-dimensional constraints with information such as shadows and contours, and improve the accuracy of camera pose estimation through coarse-to-fine layer-by-layer optimization strategies, thereby indirectly improving the model reconstruction accuracy. Although these methods greatly improve the quality of the reconstructed model, it is still difficult to meet practical application requirements. Current reconstruction processes cannot effectively measure the error of the reconstruction result in real time during reconstruction; generally, the quality of the reconstructed model is measured indirectly by the deviation of the estimated camera trajectory after reconstruction is finished. Meanwhile, existing methods address error accumulation during reconstruction by performing global optimization to correct the camera pose, thereby only indirectly improving the quality of the reconstructed model.
Considering that three-dimensional reconstruction is a process of gradual accumulation and fusion, this embodiment works backward from the reconstruction result: according to the requirement that the reconstructed model and the actual target object should remain visually consistent, the intermediate result obtained during reconstruction is subjected to simulated imaging to find its difference from the camera's actual imaging, and the reconstruction result is optimized in a targeted manner based on this difference, so that the geometric details of the model can be better preserved and recovered during reconstruction, directly improving the quality of the three-dimensional reconstruction.
The embodiment provides a three-dimensional reconstruction method conforming to visual consistency aiming at the problem of low quality of a real-time three-dimensional reconstruction model based on consumer-level equipment in the field of three-dimensional vision, and is used for realizing rapid and convenient three-dimensional reconstruction with high geometric quality. Aiming at the problem that the quality of the model is difficult to measure in the reconstruction process, a simulation imaging method for the reconstruction result is provided, a simulation image of the reconstruction result can be obtained, and the quality of the reconstruction model can be measured from the perspective of visual consistency; aiming at the problems of noise in an image used for reconstruction, accumulated error caused by a reconstruction algorithm and the like, a model credibility measuring method based on image sequence segmented accumulation is provided, and the difference between a reconstructed model and a real object is accurately evaluated; aiming at the problem of error correction in the reconstruction process, a reconstruction model optimization and adjustment method based on a simulation imaging result is provided, so that the reconstruction result and a real object meet visual consistency.
This embodiment is based on consumer-grade sensor devices such as ordinary cameras and depth cameras: three-dimensional reconstruction is performed with the acquired image data, simulated imaging is performed on the intermediate reconstruction result, the geometric quality of the reconstructed model is evaluated from the perspective of visual consistency, an image sequence composed of a plurality of images is used to provide a reliability measure of the reconstruction result, the high-error parts of the reconstructed model are found by comparison with the images directly measured by the camera, and those high-error positions are optimized and adjusted in a targeted manner to improve the model precision.
Referring to fig. 2, fig. 2 is a flowchart of a visually consistent three-dimensional reconstruction method according to an embodiment of the present application, which may include the following steps:
step one, a camera collects images.
And step two, estimating the pose of the camera.
And step three, point cloud recovery and model fusion.
And recovering the image to a three-dimensional point cloud according to the camera pose estimated in the second step, and fusing point cloud data corresponding to different images (transforming to the same coordinate system and combining points corresponding to the same position). The reconstruction model adopts a space occupation representation method, the space where the reconstruction model is located is wholly divided according to a certain resolution, the space is divided into space units with the same size, and the efficiency of space representation can be improved by means of algorithms like voxel hash and the like. Each cell stores the reconstructed object surface points located within the cell or stores the signed closest distance of the cell from the reconstructed surface.
And fourthly, simulating and imaging the reconstruction result.
And after the point cloud of a certain frame is fused into the model, performing simulated imaging on the fused model according to the camera pose of the frame estimated in step two, for subsequently evaluating the estimated camera pose and the accuracy of the reconstructed model.
And step five, reconstructing model error measurement based on the simulation imaging result.
And comparing the image obtained by the simulated imaging in the step four with the image actually acquired by the camera to obtain the difference between the reconstruction model and the actual object as the measurement standard of the reconstruction quality of the model.
And step six, adjusting and optimizing the reconstruction result based on the error measurement.
And adjusting and optimizing a part of space units in the reconstructed model according to the evaluation of the step five, wherein the aim is to keep corresponding values of the space units in the simulated imaging image and the actually acquired image consistent. The adjustment process takes the actual value in the collected image as an accurate value, and the specific adjustment scheme is determined according to the meaning of the data represented by the spatial unit or the stored data content.
And step seven, repeating the operations from the step one to the step six on each frame of acquired data until the scanning of the camera is finished.
The embodiment can measure the quality of the reconstruction model by performing simulated imaging on the reconstruction intermediate result and guide the optimization adjustment of the reconstruction result, so that the reconstruction result and the actual object are kept visually consistent. And each space unit is given an initial confidence value during subdivision and is used for subsequently evaluating the quality of the reconstruction model.
In the embodiment, the difference between the reconstructed model and the actual object can be obtained by comparing the image obtained by performing simulated imaging on the reconstruction result with the image actually acquired by the camera, and the difference is used as the measurement standard of the reconstruction quality of the model.
The embodiment can calibrate the confidence of the spatial units according to the difference between corresponding pixel values of the actually acquired image and the simulated image. If the pixel value difference exceeds a preset threshold, the reconstruction quality of the spatial unit corresponding to that pixel is considered poor, and the confidence of the corresponding spatial unit is reduced according to a certain rule. If the difference between the pixel values is smaller than the preset threshold, the confidence of the spatial unit corresponding to the pixel remains unchanged.
The reconstruction process of each frame of image can calibrate all the space units visible in the image once; each space unit maintains a counter of calibration times, and when the calibration count reaches a certain value, the space unit is evaluated once. If the confidence of the space unit is smaller than a preset threshold, the space unit is considered to need adjustment and optimization. If the confidence of the space unit is larger than the preset threshold at that moment, the space unit is considered not to need adjustment and optimization, and its confidence is restored to the initial value; the purpose of this restoration is to avoid, as much as possible, the influence of accumulated errors in the camera pose estimation and reconstruction processes.
In this embodiment, for a space unit in the reconstructed model that needs adjustment and optimization, the adjustment mode can be determined according to the meaning of the data represented by the space unit or its stored data content, so that the corresponding values of the space unit in the simulated image and the actually acquired image are kept consistent; during adjustment, the value in the acquired image is taken as the accurate value.
For a certain space unit needing adjustment, first obtain the pixel values corresponding to the space unit in all the collected images that calibrated the unit, then restore these pixels to points in three-dimensional space according to data such as the camera pose corresponding to each image, and then weight the points according to a certain rule to obtain an accurate value corresponding to the space unit, which serves as the target of adjustment and optimization.
If the adjusted space unit position accurate value does not exceed the range of the current space unit, updating the data in the current space unit according to the accurate value; and if the adjusted position accurate value of the space unit exceeds the range of the current space unit, performing weighted fusion on the data in the current space unit and the data in the space unit corresponding to the accurate value, and keeping the confidence coefficient in the space unit corresponding to the accurate value unchanged.
The embodiment can provide a three-dimensional reconstruction method conforming to visual consistency aiming at the problem of low quality of a real-time three-dimensional reconstruction model based on consumer-level equipment in the field of three-dimensional vision, and can realize high-quality and rapid three-dimensional reconstruction of a physical scene. The provided simulation imaging method for the reconstruction result can obtain a simulation image of the reconstruction result, realize the measurement of the geometric quality of the reconstruction model from the perspective of visual consistency, and solve the problem that the model quality is difficult to effectively measure in the reconstruction process; the reconstruction model confidence coefficient method based on image sequence segmentation accumulation can realize accurate evaluation of the difference between a reconstruction model and a real object, and solves the problems of inaccurate evaluation result and the like caused by noise existing in an image and algorithm accumulated error; the reconstruction model optimization adjustment method based on the simulated imaging result can enable the reconstruction result and the real object to be consistent in vision, and solves the problem of error correction in the reconstruction process.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a system for generating a three-dimensional reconstruction model according to an embodiment of the present application, where the system may include:
the pose calculation module 301 is configured to acquire target images acquired by a camera and calculate a camera pose corresponding to each target image;
a reconstruction module 302, configured to determine three-dimensional point cloud data corresponding to the target image according to the camera pose, and merge all the three-dimensional point cloud data corresponding to the target image to obtain an intermediate reconstruction model;
a simulation imaging module 303, configured to perform simulation imaging on the intermediate reconstruction model by using a virtual camera;
a model adjusting module 304, configured to determine a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result, and adjust the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model.
This embodiment determines the three-dimensional point cloud data of each target image according to the camera pose at the time the target image was acquired, and obtains an intermediate reconstruction model by fusing the three-dimensional point cloud data. The virtual camera is used to perform simulated imaging on the intermediate reconstruction model, and the simulated imaging result is compared with the target image to determine the difference between the intermediate reconstruction model and the actual imaging, yielding a model error. The three-dimensional reconstruction model obtained by adjusting the intermediate reconstruction model according to the model error eliminates the errors in the intermediate reconstruction model, so the generation precision of the three-dimensional reconstruction model can be improved.
Further, the analog imaging module 303 includes:
a pose adjusting unit for adjusting the pose of the virtual camera to the camera pose;
and the imaging unit is used for controlling the virtual camera to project rays to the intermediate reconstruction model and generating the simulated imaging result according to the intersection result of the rays and the intermediate reconstruction model.
Further, the imaging unit is used for judging whether the ray intersects with the intermediate reconstruction model; if not, setting the image pixel value of the corresponding position of the ray in the simulated imaging result to 0; and if yes, determining a target pixel value according to the color information or the distance information at the intersection position of the ray, and setting the image pixel value of the corresponding position of the ray in the simulated imaging result to the target pixel value.
Further, the model adjusting module 304 is configured to calculate pixel difference values between pixel points at the same position in the target image and the simulated imaging result, and to determine the model error of the intermediate reconstruction model according to the pixel difference values.
Further, the system also comprises:
the model dividing module is used for dividing the intermediate reconstruction model into a plurality of space units according to a preset spatial resolution after the three-dimensional point cloud data corresponding to all the images are combined to obtain the intermediate reconstruction model; the confidence coefficient of the space unit is set according to the quantity of the point cloud data fused in the space unit;
correspondingly, the model adjusting module 304 is configured to set, according to the model error, the pixel points in the simulated imaging result whose pixel difference value is greater than the first threshold as abnormal pixel points, and to reduce the confidence of the space unit where each abnormal pixel point is located; to judge whether the confidence of the space unit is smaller than a second threshold; if so, to set the space unit whose confidence is smaller than the second threshold as a space unit to be adjusted; to adjust the pixel points corresponding to the space unit to be adjusted in the intermediate reconstruction model; and to generate the three-dimensional reconstruction model by using the adjusted pixel points.
Further, the adjusting, by the model adjusting module 304, pixel points corresponding to the space unit to be adjusted in the intermediate reconstruction model includes: acquiring pixel points corresponding to the space unit to be adjusted in all the target images, and weighting to obtain a standard value corresponding to the space unit to be adjusted; and adjusting the pixel points corresponding to the space units to be adjusted in the intermediate reconstruction model by taking the standard values as optimization targets.
Further, the system also comprises:
and the confidence coefficient restoring module is used for restoring the confidence coefficient of the space unit of which the confidence coefficient is greater than or equal to a second threshold value to an initial value after judging whether the confidence coefficient of the space unit is less than the second threshold value.
The embodiment provides a three-dimensional reconstruction scheme conforming to visual consistency, and the quality evaluation and adjustment optimization method for the reconstruction result is used to realize fast, high-geometric-quality three-dimensional reconstruction based on consumer-grade equipment. A simulated imaging scheme for the reconstruction result is provided, with which a simulated image of the reconstruction result can be obtained and the quality of the reconstructed model measured from the perspective of visual consistency; a model confidence measurement scheme based on segmented accumulation over the image sequence is provided, which can overcome the influence of image noise and accumulated algorithm errors on quality evaluation and accurately evaluate the difference between the reconstructed model and the real object; and a confidence-based reconstruction model optimization and adjustment scheme is provided, which solves the error correction problem in the reconstruction process and makes the reconstruction result and the real object satisfy visual consistency.
Since the embodiment of the system part corresponds to the embodiment of the method part, the embodiment of the system part is described with reference to the embodiment of the method part, and is not repeated here.
The present application also provides a storage medium having a computer program stored thereon, which when executed, may implement the steps provided by the above-described embodiments. The storage medium may include: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The application further provides an electronic device, which may include a memory and a processor, where the memory stores a computer program, and the processor may implement the steps provided by the foregoing embodiments when calling the computer program in the memory. Of course, the electronic device may also include various network interfaces, power supplies, and the like.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A method for generating a three-dimensional reconstruction model, comprising:
acquiring target images acquired by a camera, and calculating a camera pose corresponding to each target image;
determining three-dimensional point cloud data corresponding to the target image according to the camera pose, and combining the three-dimensional point cloud data corresponding to all the target images to obtain an intermediate reconstruction model;
performing simulated imaging on the intermediate reconstruction model by using a virtual camera;
and determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result, and adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model.
2. The method for generating the three-dimensional reconstruction model according to claim 1, wherein the performing simulation imaging on the intermediate reconstruction model by using the virtual camera comprises:
adjusting the pose of the virtual camera to the camera pose;
and controlling the virtual camera to project rays to the intermediate reconstruction model, and generating the simulated imaging result according to the intersection result of the rays and the intermediate reconstruction model.
3. The method for generating the three-dimensional reconstruction model according to claim 2, wherein generating the simulated imaging result according to the intersection result of the ray and the intermediate reconstruction model comprises:
judging whether the ray intersects with the intermediate reconstruction model;
if not, setting the image pixel value at the position corresponding to the ray in the simulated imaging result to 0;
if yes, determining a target pixel value according to color information or distance information at the intersection point of the ray, and setting the image pixel value at the position corresponding to the ray in the simulated imaging result to the target pixel value.
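As a minimal, self-contained illustration of the simulated imaging in claims 2 and 3, the sketch below places a virtual camera at an estimated pose, casts one ray per pixel, writes 0 where a ray misses, and writes distance information at the hit point where it intersects. The single sphere standing in for the intermediate reconstruction model, and all names and numeric defaults, are assumptions for this example only.

import numpy as np

def simulate_depth_image(center, radius, K, cam_to_world, height, width):
    # Virtual camera at the estimated pose (claim 2): rotation R and
    # camera centre t are taken from the 4x4 camera-to-world matrix.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    K_inv = np.linalg.inv(K)
    image = np.zeros((height, width), dtype=np.float32)  # miss -> 0
    for v in range(height):
        for u in range(width):
            # World-space ray through the centre of pixel (u, v).
            d = R @ (K_inv @ np.array([u + 0.5, v + 0.5, 1.0]))
            d /= np.linalg.norm(d)
            # Ray-sphere intersection test (stand-in for intersecting the
            # ray with the intermediate reconstruction model).
            oc = t - center
            b = 2.0 * d.dot(oc)
            c = oc.dot(oc) - radius * radius
            disc = b * b - 4.0 * c
            if disc >= 0.0:
                s = (-b - np.sqrt(disc)) / 2.0
                if s > 0.0:
                    # Hit: the pixel takes its value from distance
                    # information at the intersection point; color
                    # information could be sampled the same way.
                    image[v, u] = s
    return image

For example, simulate_depth_image(np.array([0.0, 0.0, 5.0]), 1.0, np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]]), np.eye(4), 48, 64) renders a 64x48 depth image of a unit sphere five units in front of an identity-pose camera.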
4. The method for generating the three-dimensional reconstruction model according to claim 1, wherein determining the model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result comprises:
calculating pixel difference values between pixel points at the same position in the target image and the simulated imaging result;
and determining the model error of the intermediate reconstruction model according to the pixel difference value.
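A possible reading of claim 4 in code: pixel points at the same position in the target image and the simulated imaging result are differenced, and the differences are aggregated into a scalar model error. The mean aggregation and the threshold used to pre-flag abnormal pixel points (anticipating claim 5) are assumptions, not fixed by the claim.

import numpy as np

def model_error(target_image, simulated_image, first_threshold=0.05):
    # Pixel difference values at identical positions in the two images.
    diff = np.abs(np.asarray(target_image, dtype=np.float32)
                  - np.asarray(simulated_image, dtype=np.float32))
    # Pixels whose difference exceeds the first threshold are the
    # abnormal pixel points of claim 5.
    abnormal = diff > first_threshold
    # The claim leaves the aggregation open; a mean is one simple choice.
    return float(diff.mean()), abnormal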
5. The method for generating the three-dimensional reconstruction model according to claim 4, wherein after combining the three-dimensional point cloud data corresponding to all the target images to obtain the intermediate reconstruction model, the method further comprises:
dividing the intermediate reconstruction model into a plurality of spatial units according to a preset spatial resolution;
setting a confidence for each spatial unit according to the amount of point cloud data fused in that spatial unit;
correspondingly, the adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model comprises:
according to the model error, setting the pixel points whose pixel difference value is larger than a first threshold in the simulated imaging result as abnormal pixel points, and reducing the confidence of the spatial unit in which each abnormal pixel point is located;
judging whether the confidence of the spatial unit is smaller than a second threshold;
if so, setting the spatial unit whose confidence is smaller than the second threshold as a spatial unit to be adjusted;
and adjusting pixel points corresponding to the spatial unit to be adjusted in the intermediate reconstruction model, and generating the three-dimensional reconstruction model by using the adjusted pixel points.
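The confidence bookkeeping of claim 5 might look like the sketch below: each spatial unit carries a confidence (assumed here to be a float array initialised from the number of fused points per unit), abnormal pixel points reduce the confidence of the unit their ray hit, and units that fall below the second threshold become units to be adjusted. The pixel_to_unit lookup, the penalty, and the threshold values are all assumptions.

import numpy as np

def update_confidence(confidence, abnormal, pixel_to_unit,
                      penalty=1.0, second_threshold=3.0):
    # Count only abnormal pixels whose ray hit a spatial unit; -1 marks
    # background pixels with no intersection.
    hit = abnormal & (pixel_to_unit >= 0)
    units, counts = np.unique(pixel_to_unit[hit], return_counts=True)
    # Reduce the confidence of each spatial unit in proportion to the
    # number of abnormal pixel points observed in it.
    confidence[units] -= penalty * counts
    # Spatial units below the second threshold are flagged for adjustment;
    # the rest keep (or, per claim 7, are restored to) their confidence.
    to_adjust = np.flatnonzero(confidence < second_threshold)
    return confidence, to_adjust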
6. The method for generating the three-dimensional reconstruction model according to claim 5, wherein adjusting the pixel points corresponding to the spatial unit to be adjusted in the intermediate reconstruction model comprises:
acquiring the pixel points corresponding to the spatial unit to be adjusted in all the target images, and weighting them to obtain a standard value corresponding to the spatial unit to be adjusted;
and adjusting the pixel points corresponding to the spatial unit to be adjusted in the intermediate reconstruction model by taking the standard value as the optimization target.
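Claim 6's weighted combination could be as simple as the sketch below; uniform weights are an assumption, since the claim does not fix a weighting scheme.

import numpy as np

def standard_value(observed_pixels, weights=None):
    # Pixel values observed for one spatial unit across all target images.
    observed = np.asarray(observed_pixels, dtype=np.float32)
    if weights is None:
        weights = np.ones_like(observed)
    # The weighted average becomes the optimization target for adjusting
    # the spatial unit in the intermediate reconstruction model.
    return float(np.average(observed, weights=weights))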
7. The method for generating the three-dimensional reconstruction model according to claim 5, wherein after judging whether the confidence of the spatial unit is smaller than the second threshold, the method further comprises:
restoring the confidence of each spatial unit whose confidence is greater than or equal to the second threshold to its initial value.
8. A system for generating a three-dimensional reconstruction model, comprising:
the pose calculation module is used for acquiring target images acquired by a camera and calculating a camera pose corresponding to each target image;
the reconstruction module is used for determining three-dimensional point cloud data corresponding to the target image according to the camera pose and combining the three-dimensional point cloud data corresponding to all the target images to obtain an intermediate reconstruction model;
the simulation imaging module is used for performing simulation imaging on the intermediate reconstruction model by using a virtual camera;
and the model adjusting module is used for determining a model error of the intermediate reconstruction model by comparing the target image with the simulated imaging result and adjusting the intermediate reconstruction model by using the model error to obtain a three-dimensional reconstruction model.
9. An electronic device, comprising a memory in which a computer program is stored and a processor which, when calling the computer program in the memory, implements the steps of the method for generating a three-dimensional reconstructed model according to any one of claims 1 to 7.
10. A storage medium having stored thereon computer-executable instructions which, when loaded and executed by a processor, carry out the steps of a method of generating a three-dimensional reconstructed model according to any one of claims 1 to 7.
CN202111255424.9A 2021-10-27 2021-10-27 Three-dimensional reconstruction model generation method and system, electronic device and storage medium Pending CN114022639A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111255424.9A CN114022639A (en) 2021-10-27 2021-10-27 Three-dimensional reconstruction model generation method and system, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN114022639A true CN114022639A (en) 2022-02-08

Family

ID=80058175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111255424.9A Pending CN114022639A (en) 2021-10-27 2021-10-27 Three-dimensional reconstruction model generation method and system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN114022639A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023206780A1 (en) * 2022-04-26 2023-11-02 苏州元脑智能科技有限公司 Three-dimensional reconstruction effect detection method and apparatus, and device and storage medium
CN115328320A (en) * 2022-10-13 2022-11-11 中交第四航务工程勘察设计院有限公司 Hydraulic engineering online learning method and system
CN115328320B (en) * 2022-10-13 2023-01-17 中交第四航务工程勘察设计院有限公司 Hydraulic engineering online learning method and system
CN116045852A (en) * 2023-03-31 2023-05-02 板石智能科技(深圳)有限公司 Three-dimensional morphology model determining method and device and three-dimensional morphology measuring equipment
CN116503566A (en) * 2023-06-25 2023-07-28 深圳市其域创新科技有限公司 Three-dimensional modeling method and device, electronic equipment and storage medium
CN116503566B (en) * 2023-06-25 2024-03-29 深圳市其域创新科技有限公司 Three-dimensional modeling method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114022639A (en) Three-dimensional reconstruction model generation method and system, electronic device and storage medium
US10643347B2 (en) Device for measuring position and orientation of imaging apparatus and method therefor
US20240153143A1 (en) Multi view camera registration
CN108765328B (en) High-precision multi-feature plane template and distortion optimization and calibration method thereof
WO2021174939A1 (en) Facial image acquisition method and system
US11830216B2 (en) Information processing apparatus, information processing method, and storage medium
US6868191B2 (en) System and method for median fusion of depth maps
CN112444242A (en) Pose optimization method and device
CN113689578B (en) Human body data set generation method and device
CN114494589A (en) Three-dimensional reconstruction method, three-dimensional reconstruction device, electronic equipment and computer-readable storage medium
CN112991515A (en) Three-dimensional reconstruction method, device and related equipment
CN116563493A (en) Model training method based on three-dimensional reconstruction, three-dimensional reconstruction method and device
JP2014216813A (en) Camera attitude estimation device and program therefor
CN114758011B (en) Zoom camera online calibration method fusing offline calibration results
CN114217665A (en) Camera and laser radar time synchronization method, device and storage medium
CN114332125A (en) Point cloud reconstruction method and device, electronic equipment and storage medium
CN110428461B (en) Monocular SLAM method and device combined with deep learning
CN114255285A (en) Method, system and storage medium for fusing three-dimensional scenes of video and urban information models
JP2018179577A (en) Position measuring device
CN117726747A (en) Three-dimensional reconstruction method, device, storage medium and equipment for complementing weak texture scene
CN113034582A (en) Pose optimization device and method, electronic device and computer readable storage medium
CN116704112A (en) 3D scanning system for object reconstruction
KR20170037804A (en) Robust visual odometry system and method to irregular illumination changes
CN111462321B (en) Point cloud map processing method, processing device, electronic device and vehicle
CN113421292A (en) Three-dimensional modeling detail enhancement method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination