CN109147025B - RGBD three-dimensional reconstruction-oriented texture generation method - Google Patents

RGBD three-dimensional reconstruction-oriented texture generation method

Info

Publication number
CN109147025B
Authority
CN
China
Prior art keywords
data
image
key frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810757144.XA
Other languages
Chinese (zh)
Other versions
CN109147025A (en)
Inventor
齐越
王晨
衡亦舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201810757144.XA
Publication of CN109147025A
Application granted
Publication of CN109147025B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects

Abstract

The invention relates to a texture generation method for RGBD three-dimensional reconstruction. A joint space-time sampling method that considers both temporal and spatial factors extracts RGB and depth data with low blur and high uniqueness from the RGBD data stream as key frames, which preserves image quality, reduces data redundancy, and at the same time ensures that the model is covered by the key frames as completely as possible. Using the model data and the camera poses of the key frames, images from different viewing angles are projected into a common view, an error energy function is established and optimized alternately, and mutually aligned image data from the different viewing angles are generated as the optimized key frame data. The model is then parameterized onto a two-dimensional plane, and the data from the different viewing angles are fused onto this plane to produce the final texture image. The method fully accounts for the camera pose drift and geometric modeling errors that arise during texture generation, and is of great significance for producing clear, seamless, high-quality texture images and realistic three-dimensional models.

Description

RGBD three-dimensional reconstruction-oriented texture generation method
Technical Field
The invention belongs to the field of computer vision and computer graphics/image processing, and in particular relates to a method for automatically generating a single clear and seamless texture atlas from a three-dimensional mesh model and the RGBD data of key frames obtained by scanning. The method generates clear texture images for the three-dimensional model and is of great significance for realistic rendering of the model and for three-dimensional content creation.
Background
The continuous growth of VR/AR applications in recent years has created a demand for the mass production of three-dimensional content. Scene reconstruction based on RGBD data streams gives models high geometric accuracy. However, a realistic three-dimensional model also requires high-quality textures to convey details that the geometric model alone cannot represent. Rendering a three-dimensional model together with a texture atlas can reveal such details while using few computing resources. In the three-dimensional reconstruction pipeline, the continuity and sharpness of the texture atlas therefore play a vital role in the quality of the final model.
Current research on three-dimensional reconstruction from RGBD data streams has largely focused on how to generate a high-precision geometric model, with the subsequent texture generation typically proceeding as follows. First, while the geometric model is being generated, a number of key frames are extracted according to some rule; each key frame contains the RGB data and depth data at that moment together with the camera pose estimated during modeling. Second, each triangular mesh patch on the geometric model is associated with key frames of one or more viewpoints according to certain constraints, and the RGB information is projected onto the model using the camera intrinsics. Because of illumination and shooting angle, the RGB information obtained from different key frames differs, so obvious seams appear in the texture on the model; these seams can be removed by a Poisson-based blending method that smooths the mesh regions on both sides of each seam. Finally, the RGB information of the mesh triangles is arranged and packed into a standalone texture atlas to obtain the final texture image.
Under ideal conditions the above process yields a high-quality texture map, but in practice, when key frames are extracted, both their blur and their number adversely affect the texture optimization. Error accumulation during modeling also causes the camera poses to drift, so the RGB information of different key frames is misaligned when projected onto the model surface. Some existing methods try to align the RGB data of different viewing angles with the model by optimizing the camera poses during modeling, so that textures can be attached correctly to the reconstructed model. However, texture quality is affected not only by the camera poses but also by the geometric accuracy of the reconstructed model and by illumination changes during scanning, all of which affect the consistency of the textures and hence the quality of the final texture of the mesh model.
Disclosure of Invention
The technical problem solved by the invention is as follows: a texture generation method for RGBD three-dimensional reconstruction that automatically generates a globally consistent texture atlas within a few minutes.
The invention studies how to generate a single continuous texture image during RGBD three-dimensional reconstruction by combining the known reconstructed mesh model and key frame data with the requirements of the texture generation process, using the geometric characteristics of the mesh model together with the color and depth data in the key frames, and thereby supports the generation of realistic three-dimensional models.
The technical solution of the invention is as follows: a texture generation method for RGBD three-dimensional reconstruction, comprising the following steps:
S1, performing space-time sampling on the color and depth (RGB and Depth, RGBD for short) data stream used to reconstruct the model, according to the time stamp of each frame combined with the uniqueness metric of each frame of RGB data, to obtain a key frame sequence;
S2, according to the key frame sequence sampled in the first step, projecting key frames taken from different positions into the same camera view to construct an energy function, and solving the energy function by grouped alternating iteration to generate a mutually aligned key frame sequence;
S3, according to the camera poses of the different key frames and the aligned key frame sequence generated in the second step, dividing the triangular patches of the model surface into different regions, parameterizing these regions onto a two-dimensional plane, assigning to them the data projected from the corresponding key frame regions, and arranging them by region size to finally obtain the texture atlas of the model.
The step S1 is as follows:
S11, performing temporal sampling on the RGBD data stream to obtain time-sampled key frames, the specific process being as follows:
For the registered RGBD data stream, a threshold filter is first applied to the RGB data using the depth data to separate foreground from background. Then, for the RGB data in the stream, the blur metric D of all RGB frames within a window of δ_max frames (a preset threshold) is computed, the frame with the smallest D value is selected, and the corresponding RGBD data and camera pose are stored as a key frame. After a frame has been selected, the following δ_min frames are not processed at all; starting from frame δ_min + 1, the frame with the smallest D value among the next δ_max frames is again selected as a key frame, and so on until all RGBD data have been processed, giving the time-sampled key frame set K_0. Each key frame K_t ∈ K_0 contains RGB data C_t, depth data D_t and camera pose T_t.
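As a concrete illustration of this temporal sampling, the following sketch (an assumption, not the patent's exact procedure) uses the negated variance of the Laplacian as a stand-in for the blur metric D, since the formula for D is not reproduced in the text; delta_max and delta_min correspond to δ_max and δ_min above.

```python
import cv2


def blur_metric(rgb):
    """Stand-in for the blur measure D: negated variance of the Laplacian,
    so that a smaller value means a sharper image (matching the convention
    'select the minimum D' used in the text)."""
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
    return -cv2.Laplacian(gray, cv2.CV_64F).var()


def temporal_sampling(frames, delta_max=30, delta_min=10):
    """frames: list of dicts with 'rgb', 'depth' and 'pose'.
    Within every window of delta_max frames, keep the frame with the smallest
    blur score as a key frame, then skip the next delta_min frames."""
    keyframes, i = [], 0
    while i < len(frames):
        window = frames[i:i + delta_max]
        best = min(window, key=lambda f: blur_metric(f['rgb']))
        keyframes.append(best)            # store RGB, depth and camera pose together
        i += delta_max + delta_min        # the next delta_min frames are not processed
    return keyframes
```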
S12, performing spatial sampling on the key frames obtained by temporal sampling, reducing the number of key frames while preserving their coverage of the model, the specific process being as follows:
For the depth data of each time-sampled key frame K_t ∈ K_0, the uniqueness metric of the key frame is calculated by the following formula:
wherein Q(I) denotes the uniqueness metric of key frame I, a real number between 0 and 1; I' denotes another key frame in the set K_0; D_I'(p') denotes the value of the corresponding depth image at the pixel p' onto which point p projects in image I'; z_I'(p') is the z-value of the three-dimensional point corresponding to p after transformation into the camera space of image I'; and |I| denotes the number of pixels in image I. All key frames are added to a priority queue, which ranks them by their uniqueness metric. After all key frames have been evaluated, the key frame with the lowest uniqueness metric in the queue is deleted, and the uniqueness metrics of all key frames that can observe the deleted key frame's pixels are recalculated; this is repeated until the smallest uniqueness metric in the queue is greater than the threshold σ. The key frames remaining in the queue form the final spatio-temporally sampled key frame sequence K'.
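Because the formula for Q(I) appears only as an image in the original, the sketch below encodes one plausible reading consistent with the definitions above: Q(I) is taken as the fraction of pixels of I whose back-projected 3-D points are not observed (within a depth tolerance) by any other key frame, so that D_I'(p') and z_I'(p') are compared per pixel. The camera intrinsics K and camera-to-world poses are assumed inputs; this is a sketch, not the patent's exact metric.

```python
import numpy as np


def backproject(depth, K):
    """Lift a depth image (H, W) to 3-D points in camera space with intrinsics K."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1)


def uniqueness(I, others, K, tol=0.02):
    """Assumed form of Q(I): fraction of valid pixels of key frame I whose 3-D
    points are not seen by any other key frame I' within depth tolerance tol."""
    pts = backproject(I['depth'], K).reshape(-1, 3)
    valid = pts[:, 2] > 0
    covered = np.zeros(len(pts), dtype=bool)
    for J in others:
        T = np.linalg.inv(J['pose']) @ I['pose']          # camera I -> camera J
        p = (T[:3, :3] @ pts.T).T + T[:3, 3]
        z = np.maximum(p[:, 2], 1e-6)                     # z_I'(p')
        u = np.round(p[:, 0] / z * K[0, 0] + K[0, 2]).astype(int)
        v = np.round(p[:, 1] / z * K[1, 1] + K[1, 2]).astype(int)
        h, w = J['depth'].shape
        ok = (p[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        d = np.zeros(len(pts))
        d[ok] = J['depth'][v[ok], u[ok]]                  # D_I'(p')
        covered |= ok & (np.abs(d - p[:, 2]) < tol)
    return np.count_nonzero(valid & ~covered) / max(np.count_nonzero(valid), 1)
```

Frames with the lowest Q value would then be popped from a priority queue and the metric recomputed, exactly as described above, until the minimum exceeds σ.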
The specific implementation of the step S2 is as follows:
s21, constructing an image pyramid;
The RGB images of all key frames obtained by space-time sampling in S1 are copied into three groups {S_i}, {T_i}, {M_i}, where the set {S_i} is called the source images, the set {T_i} the target images and the set {M_i} the texture images. Each group is downsampled to build an image pyramid with v levels of scale from small to large, and the following iteration is carried out on the images of each level, from the smallest scale to the largest;
S22, using the source image S_i ∈ {S_i} and the data in {M_i}, the target image T_i is generated in combination with the following formula (1):
wherein T_i(x_i) denotes the value of the x_i-th pixel of image T_i, realizing the construction of the target image; the first term of the numerator on the right-hand side of the formula and the first two terms of the denominator are the result of a standard patch-match algorithm, which gives the pixel-block correspondence between image T_i and image S_i; L is the total number of pixels in a pixel block, and the invention uses 7 × 7 pixel blocks, so L = 49;
w_i(x_i) denotes the weight at pixel x_i of the i-th frame, defined as θ/d², where θ is the angle between the normal of the model surface at the three-dimensional point corresponding to pixel x_i and the viewing direction, and d is the distance from that three-dimensional point to the viewpoint; in the invention, w is obtained as interpolated data in the shader;
M_k(x_{i→k}) denotes the pixel value of image M_k at the position x_{i→k} obtained by projecting pixel x_i according to the transformation matrix between views i and k;
N denotes the total number of key frames; α and λ are balance coefficients, set to 2 and 0.1 respectively;
S23, using the set {T_i}, the set {M_i} is generated in conjunction with formula (2):
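Formula (2) itself appears only as an image in the original; a plausible form consistent with the surrounding definitions (each texture image M_i as the w-weighted average of all target images T_j reprojected into view i), stated here as an assumption rather than the patent's exact expression, is:

```latex
M_i(x_i) = \frac{\sum_{j=1}^{N} w_j(x_{i \to j}) \, T_j(x_{i \to j})}
                {\sum_{j=1}^{N} w_j(x_{i \to j})}
```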
This formula expresses that each M_i is obtained as a weighted average of the target images {T_i};
S24, an alternating optimization scheme is adopted: when optimizing {M_i}, {T_i} is kept fixed, and when optimizing {T_i}, {M_i} is kept fixed; each execution of formulas (1) and (2) counts as one iteration; at the smallest scale level, v × 5 iterations are considered sufficient for convergence, and as the pyramid scale increases the number of iterations of each larger level is reduced by 5;
S25, after the iterations of a level are finished, the optimized results of {M_i} and {T_i} are directly upsampled and used as the initial data for the iterations of the next level, and S22 is executed again, until the iterations at the largest scale level are completed, finally yielding the mutually aligned key frame sequence K.
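The coarse-to-fine alternation of S21 to S25 can be sketched as follows; the per-pixel updates of formulas (1) and (2) are replaced by simple stand-in functions (update_targets, update_textures), since the real updates involve the patch-match correspondences and reprojection weights described above, so this only illustrates the control flow and iteration schedule.

```python
import cv2


def update_targets(S, M):
    """Stand-in for formula (1): blend each source image with its texture image.
    The real update also uses patch-match correspondences and the reprojected
    images of the other views."""
    return [cv2.addWeighted(s, 0.5, m, 0.5, 0.0) for s, m in zip(S, M)]


def update_textures(T):
    """Stand-in for formula (2): the real update reprojects and weight-averages
    all target images into each view."""
    return [t.copy() for t in T]


def optimize_keyframes(sources, v=4):
    """Alternating optimization of target images {T_i} and texture images {M_i}
    over a v-level pyramid, coarse to fine (sketch of S21-S25)."""
    pyramids = [[cv2.resize(s, None, fx=0.5 ** (v - 1 - l), fy=0.5 ** (v - 1 - l))
                 for s in sources] for l in range(v)]      # level 0 = smallest scale
    T = [img.copy() for img in pyramids[0]]
    M = [img.copy() for img in pyramids[0]]
    for level in range(v):
        S = pyramids[level]
        iters = max(v * 5 - 5 * level, 1)                  # v*5 at the coarsest level
        for _ in range(iters):
            T = update_targets(S, M)                       # formula (1), {M_i} fixed
            M = update_textures(T)                         # formula (2), {T_i} fixed
        if level + 1 < v:                                  # upsample as next-level init
            size = pyramids[level + 1][0].shape[1::-1]     # (width, height) of next level
            T = [cv2.resize(t, size) for t in T]
            M = [cv2.resize(m, size) for m in M]
    return T, M
```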
The specific implementation of the step S3 is as follows:
s31, dividing the reconstructed model into different areas according to the pose of the camera;
The viewing angles of all key frames are numbered sequentially as {1..N}, and all triangular patches of the reconstructed model are projected into each key frame; for each triangular patch, the viewing angle that maximizes the value described by the following formula is selected, and the number of that viewing angle is assigned to the patch;
wherein f denotes a triangular patch, C denotes the camera of a certain viewing angle, θ denotes the angle between the normal of f and the viewing direction of C, d denotes the distance between the center of f and C, a_f denotes the area of f, and α and λ denote smoothing coefficients;
After all triangular patches have been processed, patches with the same number are considered to belong to the same region; to maintain the continuity of illumination, if the number of mutually connected triangular patches in a region is less than the set value σ = 50, the patches of that region are merged into an adjacent region;
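The view-selection score D(f, C) is likewise reproduced only as an image; the sketch below uses an assumed combination (faces seen frontally and from close by score higher, weighted by face area) purely to illustrate the labeling loop, not the patent's exact formula with its smoothing coefficients α and λ.

```python
import numpy as np


def view_score(center, normal, area, cam_center):
    """Assumed stand-in for D(f, C): larger when the face is seen frontally
    (small angle theta between the normal and the view direction) and from
    close by, weighted by the face area a_f."""
    to_cam = cam_center - center
    d = np.linalg.norm(to_cam)
    cos_theta = max(float(np.dot(normal, to_cam / d)), 0.0)
    return cos_theta * area / (d * d)


def label_faces(faces, cam_centers):
    """Assign each triangular patch the number of the viewing angle with the
    highest score; faces is a list of dicts with 'center', 'normal', 'area'."""
    labels = []
    for f in faces:
        scores = [view_score(f['center'], f['normal'], f['area'], c)
                  for c in cam_centers]
        labels.append(int(np.argmax(scores)))
    return labels
```

Small connected regions (fewer than σ = 50 patches) would then be merged into a neighbouring region, as described above.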
All regions are projected into the viewing angles corresponding to their region numbers to obtain the two-dimensional coordinates of the three-dimensional vertices, and the set of bounding-box sizes B = {(w_0, h_0), (w_1, h_1), ..., (w_m, h_m)} is obtained, where m is the number of regions; according to the width of each bounding box, the two-dimensional vertices in each bounding box are moved in turn onto a plane of a size specified by the user, giving the texture coordinates of the whole reconstructed model;
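The bounding-box arrangement can be sketched as a simple shelf packer (a stand-in for the greedy layout mentioned in the detailed description): each region's projected 2-D vertices are shifted into a free slot of a user-sized atlas and normalized into texture coordinates. atlas_size and padding are illustrative parameters, not values from the patent.

```python
import numpy as np


def pack_charts(charts, atlas_size=4096, padding=4):
    """charts: list of (region_id, uv) pairs, uv being an (n, 2) array of the
    region's vertex coordinates in its selected view.  Returns normalized
    texture coordinates per region (simple shelf-packing sketch)."""
    items = sorted(charts, key=lambda c: np.ptp(c[1][:, 0]), reverse=True)
    x = y = shelf_h = 0.0
    texcoords = {}
    for region_id, uv in items:
        uv0 = uv - uv.min(axis=0)                  # move the chart to its own origin
        w, h = uv0.max(axis=0) + padding
        if x + w > atlas_size:                     # start a new shelf (row)
            x, y, shelf_h = 0.0, y + shelf_h, 0.0
        texcoords[region_id] = (uv0 + [x, y]) / atlas_size
        x += w
        shelf_h = max(shelf_h, h)
    return texcoords
```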
S32, fusing the key frame data of the different viewing angles onto the texture image to finally obtain the texture atlas;
Each triangular patch of the reconstructed model is projected onto all key frames that observe it, and the corresponding color information is recorded; a priority queue is established for each triangular patch, using the angle between the patch normal and the central ray of the viewing angle as the priority, and only the first N key frames (N = 3) are kept in each queue; after all triangular patches have been processed, the weighted average of the key frames in each patch's queue is computed, and a single continuous texture atlas is finally generated as the result.
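The per-face fusion of S32 reduces, for each triangular patch, to a weighted average of its best few observations. The sketch below assumes the observations (viewing angle, sampled color) have already been gathered from the aligned key frames, and uses cosine weights, which is an assumption, since the text only states "weighted average".

```python
import heapq
import numpy as np


def fuse_face_color(observations, keep=3):
    """observations: list of (angle, color) pairs for one triangular patch, one
    per key frame that observes it; angle is between the patch normal and the
    view's central ray.  Keeps the 'keep' smallest-angle entries (N = 3 in the
    text) and blends them with cosine weights."""
    best = heapq.nsmallest(keep, observations, key=lambda o: o[0])
    weights = np.array([np.cos(a) for a, _ in best])
    colors = np.stack([np.asarray(c, dtype=float) for _, c in best])
    return (weights[:, None] * colors).sum(axis=0) / weights.sum()
```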
Compared with the prior art, the invention analyzes the requirements of texture generation for three-dimensional reconstruction and exploits the characteristics of the model data obtained by reconstruction and of the RGBD data stream, giving the following advantages:
(1) Considering that the depth data of common consumer-level depth cameras is of low quality, and that the data in a typical three-dimensional reconstruction is highly redundant yet easily affected by motion blur, the RGBD data stream is sampled using both its temporal and its spatial characteristics, which reduces the data volume while preserving the quality of the texture result.
(2) For texture mapping in three-dimensional reconstruction, and to address the discontinuities in the texture result caused by inaccurate camera poses, errors in the geometric model and illumination changes, a block-based image optimization method is used to optimize the key frame color data of the different viewing angles and generate mutually aligned texture data; the model is parameterized onto a two-dimensional plane and the optimized key frame data are fused to generate a high-quality texture map. Fig. 1 shows the raw color data used in the invention. Fig. 2 shows a three-dimensional model with the texture generated by the invention; compared with Fig. 1, it can be seen that the texture reveals the surface details of the model well and produces a continuous result.
Drawings
FIG. 1 shows raw color data used in the present invention;
FIG. 2 shows the result of rendering from another perspective by the method of the present invention;
FIG. 3 shows a comparison before and after applying the method of the present invention; the left image shows the result with the invention applied and the right image the result without it, showing that the method effectively improves texture quality;
fig. 4 shows a schematic diagram of the RGBD-oriented three-dimensional reconstruction texture generation method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The principle of the invention is as follows. A joint space-time sampling method that considers both temporal and spatial factors extracts RGB and depth data with low blur and high uniqueness from the RGBD data stream as key frames, which preserves image quality, reduces data redundancy, and ensures that the model is covered by the key frames as completely as possible. Using the model data and the camera poses of the key frames, images from different viewing angles are projected into a common view, an error energy function is established and optimized alternately, and mutually aligned image data from the different viewing angles are generated as the optimized key frame data. The model is then parameterized onto a two-dimensional plane, and the data from the different viewing angles are fused onto this plane to generate the final texture image.
The invention provides a texture generation method for RGBD three-dimensional reconstruction, which is characterized by comprising the following steps of:
Step (1): the RGBD data stream acquired by the camera is traversed according to the time stamps, the blur metric of every RGB image is computed, and in each time interval the RGB data with the smallest blur metric, together with the corresponding depth data, is extracted as a key frame; the whole RGBD data stream is processed in this way to obtain the initial key frames. The uniqueness metric is then computed for every initial key frame, and the key frames with the lowest uniqueness are deleted one by one, updating the uniqueness metrics of all images after each deletion, until the uniqueness metrics of all remaining images exceed a given threshold. Finally, the RGB data, the corresponding depth data and the camera pose are stored as the three items kept in each required key frame.
Step (2): all key frames are copied into three groups, the source images, the target images and the texture images, and scale pyramids of different heights are built according to the width and height of the key frame RGB images. From the low scale to the high scale, the target images and the texture images are optimized alternately at each pyramid level; after convergence, the target and texture images of the current scale are interpolated to produce the images of the next higher scale, while the source images are downsampled to the scale about to be iterated as the input data of that level. After all scale iterations are completed, mutually aligned map data that can be projected onto the model surface are generated and used as the optimized key frame data.
Step (3): the model surface is divided into different regions according to the camera viewpoints, and the regions are unfolded onto a two-dimensional plane by parameterization. Using a greedy strategy, the regions are arranged on the two-dimensional plane into a square layout according to their sizes, and after normalization the two-dimensional texture coordinate of every three-dimensional vertex is obtained. For each triangular patch of the model, all the optimized key frame information that can be projected onto it is averaged with weights, and the averaged result is copied onto the two-dimensional plane to obtain the final texture map.
As shown in fig. 4, the implementation of the texture generation method for RGBD three-dimensional reconstruction of the invention mainly comprises the following steps: spatio-temporal sampling of key frames, optimization of the key frame RGB data, and texture atlas generation. The concrete realization is as follows:
s1, performing space-time sampling on key frames;
The RGBD data stream acquired by the camera is traversed according to the time stamps, the blur metric of every RGB image is computed, and in each time interval the RGB data with the smallest blur metric, together with the corresponding depth data, is extracted as a key frame; the whole RGBD data stream is processed in this way to obtain the time-sampled key frames. The uniqueness metric is then computed for each time-sampled key frame, and the key frames with the lowest uniqueness are deleted one by one, updating the uniqueness metrics of all images after each deletion, until the uniqueness metrics of all remaining images exceed a given threshold. Finally, the RGB data, the corresponding depth data and the camera pose are stored as the three items kept in each space-time sampled key frame.
S11, performing time sampling on RGBD data flow;
For the registered RGBD data stream, a threshold filter is first applied to the RGB data using the depth data to separate foreground from background. Then the blur metric D of all RGB frames within a window of δ_max frames (a preset threshold) is computed, the frame with the smallest D value is selected, and the corresponding RGBD data and camera pose are stored as a key frame. After a frame has been selected, the following δ_min frames are not processed at all; starting from frame δ_min + 1, the frame with the smallest D value among the next δ_max frames is again selected as a key frame, and so on until all RGBD data have been processed, giving the time-sampled key frame set K_0. Each key frame K_t ∈ K_0 contains RGB data C_t, depth data D_t and camera pose T_t.
S12, performing spatial sampling on the key frames obtained by the time sampling;
For each time-sampled key frame, the uniqueness metric is calculated from its depth data D_i by the following formula:
wherein Q(I) denotes the uniqueness metric of key frame I, a real number between 0 and 1; I' denotes another key frame in the set K_0; D_I'(p') denotes the value of the corresponding depth image at the pixel p' onto which point p projects in image I'; z_I'(p') is the z-value of the three-dimensional point corresponding to p after transformation into the camera space of image I'; and |I| denotes the number of pixels in image I. All key frames are added to a priority queue, which ranks them by their uniqueness metric. After all key frames have been evaluated, the key frame with the lowest uniqueness in the queue is deleted, and the uniqueness metrics of all key frames that can observe the deleted key frame's pixels are recalculated; this is repeated until the minimum uniqueness metric in the queue is greater than the threshold σ, and the key frames remaining in the queue are the final spatio-temporally extracted key frames.
S2, optimizing key frame RGB data;
All key frames are copied into three groups, the source images, the target images and the texture images, and scale pyramids of different heights are built according to the width and height of the key frame RGB images. From the low scale to the high scale, the target images and the texture images are optimized alternately at each pyramid level; after convergence, the target and texture images of the current scale are interpolated to produce the images of the next higher scale, while the source images are downsampled to the scale about to be iterated as the input data of that level. After all scale iterations are completed, mutually aligned map data that can be projected onto the model surface are generated, giving the optimized, mutually aligned key frame data. As shown in fig. 3, the left picture shows the result textured with the optimized key frames and the right picture shows the result with non-optimized textures; it can be seen that the invention removes the texture discontinuities caused by camera pose errors and inaccuracies of the model geometry.
S21, constructing an image pyramid;
The key frame RGB images obtained by space-time sampling are copied into three groups {S_i}, {T_i}, {M_i}, where the images in the set {S_i} are called the source images, the images in the set {T_i} are the target images and the images in the set {M_i} are the texture images. The three groups of images are downsampled separately to build three image pyramids with v levels of scale from small to large, and the following iteration is carried out on the images of each level, from the smallest scale to the largest;
S22, using the source image S_i ∈ {S_i} and the data in {M_i}, the target image T_i is generated in combination with the following formula (1):
wherein T_i(x_i) denotes the value of the x_i-th pixel of image T_i, realizing the construction of the target image; the first term of the numerator on the right-hand side of the formula and the first two terms of the denominator are the result of a standard patch-match algorithm, which gives the pixel-block correspondence between T_i and S_i; L is the total number of pixels in a pixel block, and the invention uses 7 × 7 pixel blocks, so L = 49.
w_i(x_i) denotes the weight at pixel x_i of the i-th frame, defined as θ/d², where θ is the angle between the normal of the model surface at the three-dimensional point corresponding to pixel x_i and the viewing direction, and d is the distance from that three-dimensional point to the viewpoint; in the invention, w is obtained as interpolated data in the shader.
M_k(x_{i→k}) denotes the pixel value of image M_k at the position x_{i→k} obtained by projecting pixel x_i according to the transformation matrix between views i and k.
N denotes the total number of key frames; α and λ are balance coefficients, here 2 and 0.1 respectively.
S23, using the target image set {T_i}, the texture image set {M_i} is generated in conjunction with formula (2):
This formula expresses that each image M_i is obtained as a weighted average of the images {T_i}.
S24, an alternating optimization scheme is adopted: when optimizing the set {M_i}, the set {T_i} is kept fixed, and when optimizing the set {T_i}, the set {M_i} is kept fixed; each execution of formulas (1) and (2) counts as one iteration. Experiments show that v × 5 iterations at the smallest scale level are sufficient for convergence, and for the larger-scale levels the number of iterations is reduced by 5 each time the pyramid scale increases.
S25, after the iterations of a level are finished, the optimized results of {M_i} and {T_i} are directly upsampled and used as the initial data for the iterations of the next level, and S22 is executed again, until the iterations at the largest scale level are completed, finally giving the mutually aligned key frame data.
S3, generating a texture atlas.
According to the viewpoint of the camera, the model surface is divided into different areas, and the areas are unfolded to a two-dimensional plane through a parameterization method. And arranging the areas on the two-dimensional plane into a square plane according to the size by utilizing a greedy strategy, and obtaining the two-dimensional texture coordinates of each vertex in three dimensions after normalization. And for each triangular patch generated by the model, carrying out weighted average on all optimized key frame information which can be projected on the triangular patch, and then copying the average result to a two-dimensional plane to obtain a final texture map.
S31, dividing the reconstructed model into different areas according to the pose of the camera;
The viewing angles of all key frames are numbered {1..N}, and all triangular patches of the reconstructed model are projected into each key frame. For each triangular patch, the viewing angle that maximizes the value described by the following formula is selected, and the number of that viewing angle is assigned to the patch:
wherein f denotes a triangular patch, C denotes the camera of a certain viewing angle, θ denotes the angle between the normal of f and the viewing direction of C, d denotes the distance between the center of f and C, a_f denotes the area of f, and α and λ denote smoothing coefficients.
After all triangular patches are processed, patches with the same number are considered to belong to the same region. To maintain the continuity of illumination, if the number of mutually connected triangular patches in a region is less than σ = 50, the patches of that region are merged into the adjacent regions.
For the triangular patches of all the areas, the triangular patches are projected to the view angles corresponding to the area numbers to obtain two-dimensional coordinates corresponding to three-dimensional vertexes, and the size set { (w) of the bounding boxes of the two-dimensional coordinates is obtained 0 ,h 0 ),(w 1 ,h 1 ),...,(w m ,h m ) And m is the number of areas. And sequentially moving the two-dimensional vertexes in the bounding boxes to a plane with a size specified by a user according to the width of each bounding box to obtain texture coordinates of the whole model.
S32, fusing key frame data of different visual angles to the texture image.
For each triangular patch on the model, projecting the triangular patch onto all key frames capable of observing the triangular patch, recording corresponding color information, and establishing a priority queue for each triangular patch. And taking the included angle value between the normal line of the triangular surface patch and the central ray of the visual angle as a priority, and only retaining the first N key frame data in each queue. After all triangular patches are processed. And calculating a weighted average value of key frames in the queue corresponding to each triangular patch, and finally generating a single continuous texture image as a final result.

Claims (3)

1. The RGBD-oriented three-dimensional reconstruction texture generation method is characterized by comprising the following steps of:
S1, performing space-time sampling on the RGBD data stream used to reconstruct the model, according to the time stamp of each frame combined with the uniqueness metric of each frame of RGB data, to obtain a key frame sequence;
S2, according to the key frame sequence sampled in S1, projecting key frames taken from different positions into the same camera view to construct an energy function, and solving the energy function by grouped alternating iteration to generate a mutually aligned key frame sequence;
S3, according to the camera poses of the different key frames and the aligned key frame sequence generated in S2, dividing the triangular patches of the model surface into different regions, parameterizing these regions onto a two-dimensional plane, assigning to them the data projected from the corresponding key frame regions, and arranging them by region size to finally obtain the texture atlas of the model;
the specific implementation of the step S3 is as follows:
s31, dividing the reconstructed model into different areas according to the pose of the camera;
The viewing angles of all key frames are numbered, all triangular patches of the reconstructed model are projected into each key frame, and for each triangular patch the viewing angle with the largest value of D(f, C) described by the following formula is selected and its number is assigned to the patch;
wherein f denotes a triangular patch, C denotes the camera of a certain viewing angle, θ denotes the angle between the normal of f and the viewing direction of C, d denotes the distance between the center of f and C, a_f denotes the area of f, and α and λ denote smoothing coefficients;
after all triangular patches are processed, patches with the same number are considered to belong to the same region; to maintain the continuity of illumination, if the number of mutually connected triangular patches in a region is less than a set value, the patches of that region are merged into an adjacent region;
all regions are projected into the viewing angles corresponding to their region numbers to obtain the two-dimensional coordinates of the three-dimensional vertices, and the set of bounding-box sizes B = {(w_0, h_0), (w_1, h_1), ..., (w_m, h_m)} is obtained, where m is the number of regions; according to the width of each bounding box, the two-dimensional vertices in each bounding box are moved in turn onto a plane of a size specified by the user, giving the texture coordinates of the whole reconstructed model;
s32, fusing key frame data of different view angles to the texture image to obtain a final texture atlas;
each triangular patch of the reconstructed model is projected onto all key frames that observe it and the corresponding color information is recorded; a priority queue is established for each triangular patch, using the angle between the patch normal and the central ray of the viewing angle as the priority, and only the first N_1 key frames are kept in each queue; after all triangular patches have been processed, the weighted average of the key frames in each patch's queue is computed, and a single continuous texture atlas is finally generated as the final result.
2. The RGBD-oriented three-dimensional reconstruction texture generation method of claim 1, wherein: the step S1 is as follows:
s11, firstly, performing time sampling on RGBD data flow to obtain a time-sampled key frame, wherein the specific process is as follows:
For the registered RGBD data stream, a threshold filter is first applied to the RGB data using the depth data to separate foreground from background; the blur metric D of the RGB data in the RGBD data stream is computed within a window of δ_max frames (a preset threshold), the frame with the smallest D value is selected, and the corresponding RGBD data and camera pose are stored as a key frame; after a frame has been selected, the following δ_min frames are not processed at all, and starting from frame δ_min + 1 the frame with the smallest D value among the next δ_max frames is again selected as a key frame, until all RGBD data have been processed, giving the time-sampled key frame set K_0; each key frame K_i ∈ K_0 contains RGB data C_i, depth data D_i and camera pose E_i.
S12, performing spatial sampling on the key frames obtained by time sampling, and reducing the number of the key frames under the condition of guaranteeing the coverage of the key frames, wherein the specific process is as follows:
For the depth data of each time-sampled key frame K_i ∈ K_0, the uniqueness metric of the key frame is calculated by the following formula:
wherein Q(I) denotes the uniqueness metric of key frame I, a real number between 0 and 1; I' denotes another key frame in the set K_0; D_I'(p') denotes the value of the corresponding depth image at the pixel p' onto which point p projects in image I'; z_I'(p') is the z-value of the three-dimensional point corresponding to p after transformation into the camera space of image I'; and |I| denotes the number of pixels in image I; all key frames are added to a priority queue, which ranks them by their uniqueness metric; after all key frames have been evaluated, the key frame with the lowest uniqueness metric in the queue is deleted, and the uniqueness metrics of all key frames that can observe the deleted key frame's pixels are recalculated, until the smallest uniqueness metric in the queue is greater than the threshold σ_d, at which point the key frames in the queue are the final spatio-temporally sampled key frame sequence K'.
3. The RGBD-oriented three-dimensional reconstruction texture generation method of claim 1, wherein: the specific implementation of the step S2 is as follows:
s21, constructing an image pyramid;
The RGB images of all key frames obtained by space-time sampling in S1 are copied into three groups {S_i}, {T_i}, {M_i}, where the set {S_i} is called the source images, the set {T_i} the target images and the set {M_i} the texture images; the three groups of images are downsampled separately to build an image pyramid with v levels of scale from small to large, and the following iteration is carried out on the images of each level, from the smallest scale to the largest;
S22, using the source image S_i ∈ {S_i} and the RGB data in the set {M_i}, the target image T_i(x_i) is generated in combination with the following formula (1):
wherein T_i(x_i) denotes the value of the x_i-th pixel of image T_i, realizing the construction of the target image, and L is the total number of pixels in a pixel block;
w_i(x_i) denotes the weight at pixel x_i of the i-th frame, defined as φ/d², where φ is the angle between the normal of the model surface at the three-dimensional point corresponding to pixel x_i and the viewing direction, and d is the distance from that three-dimensional point to the viewpoint;
M_k(x_{i→k}) denotes the pixel value of image M_k at the position x_{i→k} obtained by projecting pixel x_i according to the transformation matrix between views i and k;
N_2 denotes the total number of key frames;
S23, using the set {T_i}, the set {M_i} is generated in conjunction with formula (2):
this formula expresses that each M_i is obtained as a weighted average of the target images {T_i};
S24, an alternating optimization scheme is adopted: when optimizing {M_i}, {T_i} is kept fixed, and when optimizing {T_i}, {M_i} is kept fixed; each execution of formulas (1) and (2) counts as one iteration; at the smallest scale level, v × 5 iterations are considered sufficient for convergence, and as the pyramid scale increases the number of iterations of each larger level is reduced by 5;
S25, after the iterations of a level are finished, the optimized results of {M_i} and {T_i} are directly upsampled and used as the initial data for the iterations of the next level, and S22 is executed again, until the iterations at the largest scale level are completed, finally yielding the mutually aligned key frame sequence K.
CN201810757144.XA 2018-07-11 2018-07-11 RGBD three-dimensional reconstruction-oriented texture generation method Active CN109147025B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810757144.XA CN109147025B (en) 2018-07-11 2018-07-11 RGBD three-dimensional reconstruction-oriented texture generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810757144.XA CN109147025B (en) 2018-07-11 2018-07-11 RGBD three-dimensional reconstruction-oriented texture generation method

Publications (2)

Publication Number Publication Date
CN109147025A CN109147025A (en) 2019-01-04
CN109147025B (granted) 2023-07-18

Family

ID=64800075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810757144.XA Active CN109147025B (en) 2018-07-11 2018-07-11 RGBD three-dimensional reconstruction-oriented texture generation method

Country Status (1)

Country Link
CN (1) CN109147025B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827397B (en) * 2019-11-01 2021-08-24 浙江大学 Texture fusion method for real-time three-dimensional reconstruction of RGB-D camera
EP4099270A1 (en) * 2021-06-03 2022-12-07 Koninklijke Philips N.V. Depth segmentation in multi-view videos
CN115810101A (en) * 2021-09-14 2023-03-17 北京字跳网络技术有限公司 Three-dimensional model stylizing method and device, electronic equipment and storage medium
CN114373041B (en) * 2021-12-15 2024-04-02 聚好看科技股份有限公司 Three-dimensional reconstruction method and device
CN114742956B (en) * 2022-06-09 2022-09-13 腾讯科技(深圳)有限公司 Model processing method, device, equipment and computer readable storage medium
CN117197365A (en) * 2023-11-07 2023-12-08 江西求是高等研究院 Texture reconstruction method and system based on RGB-D image


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015154601A1 (en) * 2014-04-08 2015-10-15 中山大学 Non-feature extraction-based dense sfm three-dimensional reconstruction method
CN107292965A (en) * 2017-08-03 2017-10-24 北京航空航天大学青岛研究院 A kind of mutual occlusion processing method based on depth image data stream
CN107862707A (en) * 2017-11-06 2018-03-30 深圳市唯特视科技有限公司 A kind of method for registering images based on Lucas card Nader's image alignment
CN107845134A (en) * 2017-11-10 2018-03-27 浙江大学 A kind of three-dimensional rebuilding method of the single body based on color depth camera

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Keyframe-Based Texture Mapping for RGBD Human Reconstruction; Yishu Heng et al.; 2018 International Conference on Virtual Reality and Visualization (ICVRV); 2018-10-24; full text *
High-quality texture mapping for complex three-dimensional scenes (面向复杂三维场景的高质量纹理映射); Jiang Hanqing et al.; Chinese Journal of Computers (计算机学报); 2015-12-15 (No. 12); full text *

Also Published As

Publication number Publication date
CN109147025A (en) 2019-01-04

Similar Documents

Publication Publication Date Title
CN109147025B (en) RGBD three-dimensional reconstruction-oriented texture generation method
CN110363858B (en) Three-dimensional face reconstruction method and system
CN107833253B (en) RGBD three-dimensional reconstruction texture generation-oriented camera attitude optimization method
CN103021017B (en) Three-dimensional scene rebuilding method based on GPU acceleration
CN108876814B (en) Method for generating attitude flow image
CN108335352B (en) Texture mapping method for multi-view large-scale three-dimensional reconstruction scene
CN110223370B (en) Method for generating complete human texture map from single-view picture
CN115690324A (en) Neural radiation field reconstruction optimization method and device based on point cloud
CN101303772A (en) Method for modeling non-linear three-dimensional human face based on single sheet image
CN104661010A (en) Method and device for establishing three-dimensional model
CN106875437A (en) A kind of extraction method of key frame towards RGBD three-dimensional reconstructions
JP2023514289A (en) 3D face model construction method, 3D face model construction device, computer equipment, and computer program
CN109410133B (en) Face texture repairing method based on 3DMM
CN112001926B (en) RGBD multi-camera calibration method, system and application based on multi-dimensional semantic mapping
CN113192179A (en) Three-dimensional reconstruction method based on binocular stereo vision
CN113313828B (en) Three-dimensional reconstruction method and system based on single-picture intrinsic image decomposition
CN112734890A (en) Human face replacement method and device based on three-dimensional reconstruction
CN113012293A (en) Stone carving model construction method, device, equipment and storage medium
CN111462030A (en) Multi-image fused stereoscopic set vision new angle construction drawing method
US11928778B2 (en) Method for human body model reconstruction and reconstruction system
CN113160335A (en) Model point cloud and three-dimensional surface reconstruction method based on binocular vision
WO2019164497A1 (en) Methods, devices, and computer program products for gradient based depth reconstructions with robust statistics
CN116977522A (en) Rendering method and device of three-dimensional model, computer equipment and storage medium
WO2020184174A1 (en) Image processing device and image processing method
CN109461197B (en) Cloud real-time drawing optimization method based on spherical UV and re-projection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant