CN115294285B - Three-dimensional reconstruction method and system of deep convolutional network

Info

Publication number
CN115294285B
CN115294285B
Authority
CN
China
Prior art keywords
network
point cloud
data
cloud data
layer
Prior art date
Legal status
Active
Application number
CN202211230767.4A
Other languages
Chinese (zh)
Other versions
CN115294285A (en)
Inventor
陶丽桢
任建国
Current Assignee
Shandong Tianda Qingyuan Information Technology Co., Ltd.
Original Assignee
Shandong Tianda Qingyuan Information Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Shandong Tianda Qingyuan Information Technology Co., Ltd.
Priority to CN202211230767.4A
Publication of CN115294285A
Application granted
Publication of CN115294285B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06T5/70


Abstract

The invention relates to the technical field of computer systems based on specific computational models, and in particular to a three-dimensional reconstruction method and system using a deep convolutional network. The method reduces mismatches, improves matching precision, and lowers the cost of three-dimensional reconstruction.

Description

Three-dimensional reconstruction method and system of deep convolutional network
Technical Field
The invention relates to the technical field of computer systems based on specific computational models, and in particular to a three-dimensional reconstruction method and system using a deep convolutional network.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Three-dimensional reconstruction technology turns a real scene into a mathematical model that a computer can represent, through depth data acquisition, preprocessing, point cloud registration and fusion, surface generation, and similar steps; that is, it rebuilds a three-dimensional virtual model of an object's surface in the computer and constructs a complete three-dimensional model of the object. At present, depending on the reconstruction mode, the main three-dimensional reconstruction methods are binocular stereo vision and trinocular stereo vision.
In binocular stereo vision, the matching result often depends on how much texture and color information the surface of the photographed object carries. When the object has little texture, matching pixels between the left and right images easily produces mismatches, and the result is unreliable. To reduce mismatches and improve binocular matching precision, a trinocular stereo vision system is one solution. Matching in multi-view stereo vision applies multi-angle forward intersection of multiple overlapping points based on the optical triangulation principle, so redundant data can be used effectively, the mismatching problem is alleviated to some extent, and three-dimensional reconstruction precision improves. However, this approach is cumbersome, and the hardware it requires is more complex and expensive. A method is therefore needed that reduces mismatches, improves matching precision, and performs three-dimensional reconstruction at lower cost.
Disclosure of Invention
The present invention provides a three-dimensional reconstruction method and system for a deep convolutional network, so as to solve the problems in the background art.
In order to achieve the purpose, the invention provides the following technical scheme:
a first aspect of the present invention provides a method for three-dimensional reconstruction of a deep convolutional network, the method comprising:
s1, acquiring data, including acquiring point cloud data of an object, and acquiring a depth image;
s2, carrying out data preprocessing, comprising the following substeps:
(1) Denoising the depth image to remove noise points introduced during real-time data acquisition,
(2) Normalizing the coordinates of the denoised point cloud data and transforming the point cloud data into the same coordinate system,
(3) Selecting a principal component analysis method for feature extraction;
and S3, carrying out data input, comprising the following substeps:
(1) Storing each feature matrix obtained by the preceding preprocessing into a data packet,
(2) Splitting each feature matrix into two batches, one of geometric information (the coordinates, surface normals, normal vectors, and tangent vectors of the points) and one of non-geometric information, and feeding the batches into the deep convolutional network for training over successive iterations;
s4, constructing a deep convolutional network for fine registration of point cloud data, and comprising the following substeps:
(1) Building three convolutional layers, with a batch normalization layer and an activation layer arranged in sequence between every two convolutional layers,
(2) Building a global average pooling layer after the convolutional layers to regularize the whole network structure and prevent overfitting,
(3) Finally, adding a loss layer to the network and introducing a loss function to compute the network loss;
s5, network training is carried out to optimize the performance of the network, and the method comprises the following substeps:
(1) Initializing parameters of each layer in the previously built deep convolution network;
(2) Feeding the preprocessed point cloud feature matrices into the network in batches and updating the network weights at each iteration, until the network converges to an optimal state;
(3) Repeating back-propagation training; during this process the loss gradually decreases as the training epochs increase, until the preset number of epochs is reached or the loss plateaus;
and S6, fusing the point cloud data to obtain a more refined three-dimensional reconstruction result.
A second aspect of the present invention provides a three-dimensional reconstruction system of a deep convolutional network, comprising:
a data acquisition module configured to: acquiring point cloud data of an object to obtain a depth image;
a data pre-processing module configured to: denoising the depth image, and eliminating noise points acquired in real-time data acquisition; normalizing the coordinates of the denoised point cloud data, and transforming the point cloud data to the same coordinate system; selecting a principal component analysis method for feature extraction;
a data input module configured to: store each feature matrix obtained by the preceding preprocessing into a data packet, then split each feature matrix into two batches, one of geometric information (the coordinates, surface normals, normal vectors, and tangent vectors of the points) and one of non-geometric information, and feed the batches into the deep convolutional network for training over successive iterations;
a deep convolutional network construction module configured to: build three convolutional layers, with a batch normalization layer and an activation layer arranged in sequence between every two convolutional layers; build a global average pooling layer after the convolutional layers to regularize the whole network structure and prevent overfitting; and finally add a loss layer to the network, introducing a loss function to compute the network loss;
a deep convolutional network training module configured to: initializing parameters of each layer in the previously built deep convolution network; sending the point cloud data characteristic matrix preprocessed in advance to a network in batches, and updating the weight in the network each time in an iteration mode until the network converges to an optimal state; repeatedly carrying out back propagation training;
a data fusion module configured to: and fusing the point cloud data to obtain a more precise three-dimensional reconstruction result graph.
Compared with the prior art, the invention has the following beneficial effects: the proposed deep convolutional network reduces mismatches and improves matching precision relative to traditional methods, and it can replace the existing trinocular stereo vision approach, lowering the cost of three-dimensional reconstruction.
Drawings
FIG. 1 is a block diagram of a system according to the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
It is noted that the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and systems according to various embodiments of the present disclosure. It should be noted that each block in the flowchart or block diagrams may represent a module, a segment, or a portion of code, which may comprise one or more executable instructions for implementing the logical function specified in the respective embodiment. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example one
The embodiment provides a three-dimensional reconstruction method of a deep convolutional network, which comprises the following steps:
s1, acquiring data, including acquiring point cloud data of an object, and acquiring a depth image;
s2, data preprocessing is carried out, and the data preprocessing method comprises the following substeps:
(1) Denoising the depth image to remove noise points introduced during real-time data acquisition,
(2) Normalizing the coordinates of the denoised point cloud data, transforming the point cloud data into the same coordinate system and scaling it to [0,1] (sketched in code after this step) with the formula
x' = (x - min) / (max - min)
where x is an input value, min and max are the minimum and maximum of the point cloud data sample, and x' is the corresponding output value,
(3) Selecting a principal component analysis (PCA) method for feature extraction (see the second sketch below), specifically comprising:
(3.1) building the feature vector matrix from the point cloud data sample;
(3.2) computing the mean of each feature column, then subtracting the column mean from each dimension;
(3.3) computing the covariance matrix of the features;
(3.4) computing the eigenvalues and eigenvectors of the covariance matrix;
(3.5) sorting the computed eigenvalues from largest to smallest;
(3.6) taking the first K eigenvectors and eigenvalues and projecting the data onto them to obtain the dimensionality-reduced feature matrix;
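A minimal NumPy sketch of the min-max normalization in substep (2); the (N, 3) coordinate array shape is an assumption of this example, not something the patent fixes.

```python
import numpy as np

def normalize_point_cloud(points: np.ndarray) -> np.ndarray:
    """Min-max normalize point cloud coordinates into [0, 1].

    Implements x' = (x - min) / (max - min) independently per axis.
    `points` is assumed to be an (N, 3) array of x, y, z values,
    with max > min on every axis (degenerate axes would need a guard).
    """
    p_min = points.min(axis=0)  # per-axis minimum of the sample
    p_max = points.max(axis=0)  # per-axis maximum of the sample
    return (points - p_min) / (p_max - p_min)
```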
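And a corresponding sketch of PCA substeps (3.1)-(3.6); the function name and the shape of the feature matrix are illustrative assumptions.

```python
import numpy as np

def pca_reduce(X: np.ndarray, k: int) -> np.ndarray:
    """Reduce an (N, D) point cloud feature matrix to (N, k) via PCA.

    Follows substeps (3.2)-(3.6): center the columns, build the
    covariance matrix, eigen-decompose it, sort eigenvalues in
    descending order, and project onto the first K eigenvectors.
    """
    X_centered = X - X.mean(axis=0)          # (3.2) subtract column means
    cov = np.cov(X_centered, rowvar=False)   # (3.3) covariance of features
    eigvals, eigvecs = np.linalg.eigh(cov)   # (3.4) eigenvalues/eigenvectors
    order = np.argsort(eigvals)[::-1]        # (3.5) sort large to small
    top_k = eigvecs[:, order[:k]]            # (3.6) first K eigenvectors
    return X_centered @ top_k                # dimensionality-reduced matrix
```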
and S3, carrying out data input, comprising the following substeps:
(1) Storing each feature matrix obtained by the preceding preprocessing into a data packet,
(2) Splitting each feature matrix into two batches, one of geometric information (the coordinates, surface normals, normal vectors, and tangent vectors of the points) and one of non-geometric information, and feeding the batches into the deep convolutional network for training over successive iterations, as sketched below;
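The patent does not fix a column layout for the feature matrix, so the split indices below are purely illustrative; the sketch only shows the mechanics of dividing one matrix into a geometric and a non-geometric batch.

```python
import numpy as np

# Hypothetical layout: 3 coordinate + 3 normal + 3 tangent columns,
# then any non-geometric attributes. Not fixed by the patent.
GEOMETRIC = slice(0, 9)
NON_GEOMETRIC = slice(9, None)

def split_into_batches(feature_matrix: np.ndarray):
    """Split one preprocessed feature matrix into the two batches
    (geometric vs. non-geometric information) described in step S3(2)."""
    return feature_matrix[:, GEOMETRIC], feature_matrix[:, NON_GEOMETRIC]
```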
s4, constructing a deep convolutional network for fine registration of point cloud data, and comprising the following substeps:
(1) Building three convolutional layers, with a batch normalization layer and an activation layer arranged in sequence between every two convolutional layers. The batch normalization layer applies a linear transformation within the deep convolutional network so that the input to the next layer is closer to a Gaussian distribution, mitigating the covariate shift phenomenon. The activation layer uses a non-saturating activation function to counter the vanishing-gradient problem; preferably, the activation layer uses a sigmoid function;
(2) Building a global average pooling layer after the convolutional layers to regularize the whole network structure and prevent overfitting; the global average pooling layer replaces the traditional role of one or more fully connected layers, reducing the parameter redundancy those layers introduce (fully connected layers can account for roughly 80% of all network parameters);
(3) Finally, adding a loss layer to the network and introducing a loss function, such as cross-entropy or softmax, as the network objective to guide the learning process; a sketch of this architecture follows;
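A PyTorch sketch of such a network, under several assumptions the patent does not state: 1-D convolutions over per-point features, the channel widths, and the output dimensionality are all illustrative choices; the sigmoid activation follows the stated preference above.

```python
import torch
import torch.nn as nn

class RegistrationNet(nn.Module):
    """Three convolutional layers with batch normalization and a sigmoid
    activation between every two of them, followed by global average
    pooling in place of fully connected layers (step S4). Channel widths,
    the use of Conv1d, and `num_outputs` are illustrative assumptions."""

    def __init__(self, in_channels: int = 9, num_outputs: int = 16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=1),   # convolution layer 1
            nn.BatchNorm1d(64),                          # batch normalization
            nn.Sigmoid(),                                # activation layer
            nn.Conv1d(64, 128, kernel_size=1),           # convolution layer 2
            nn.BatchNorm1d(128),
            nn.Sigmoid(),
            nn.Conv1d(128, num_outputs, kernel_size=1),  # convolution layer 3
        )
        # Global average pooling regularizes the network and replaces
        # the parameter-heavy fully connected layers.
        self.pool = nn.AdaptiveAvgPool1d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, num_points)
        return self.pool(self.features(x)).squeeze(-1)
```

The loss layer from substep (3) is applied outside the module; the training sketch after step S5 uses cross-entropy, one of the loss functions named above.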
s5, network training is carried out to optimize the performance of the network, and the method comprises the following substeps:
(1) Initializing parameters of each layer in the previously constructed deep convolutional network;
(2) Feeding the preprocessed point cloud feature matrices into the network in batches and updating the network weights at each iteration, until the network converges to an optimal state;
(3) Repeating back-propagation training; during this process the loss gradually decreases as the training epochs increase, until the preset number of epochs is reached or the loss plateaus (see the training sketch below);
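A training loop corresponding to step S5 might look like the following; the optimizer, learning rate, epoch count, and the cross-entropy loss layer are assumptions consistent with, but not dictated by, the description above.

```python
import torch

def train(model: torch.nn.Module, loader, epochs: int = 100, lr: float = 1e-3):
    """Feed preprocessed feature matrices in batches and repeat
    back-propagation until the preset epoch count (step S5).
    `loader` is assumed to be a sequence of (features, labels) pairs."""
    criterion = torch.nn.CrossEntropyLoss()   # loss layer from step S4(3)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        running_loss = 0.0
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()       # back-propagation
            optimizer.step()      # update the weights at each iteration
            running_loss += loss.item()
        # The loss is expected to fall as epochs increase and then
        # plateau; training could also stop early once it stabilizes.
        print(f"epoch {epoch}: loss {running_loss / max(len(loader), 1):.4f}")
```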
s6, fusing point cloud data to obtain a more detailed three-dimensional reconstruction result graph; preferably, data fusion is performed by using a truncated symbolic distance field technology, and only a few layers of voxels close to the real surface are stored, so that the memory consumption of the traditional Kinectfusion technology can be greatly reduced, and redundant points of a model are reduced.
Example two
As shown in fig. 1, the present embodiment provides a three-dimensional reconstruction system of a deep convolutional network, including:
a data acquisition module configured to: collecting point cloud data of an object to obtain a depth image;
a data pre-processing module configured to: denoising the depth image, and eliminating noise points acquired in real-time data acquisition; normalizing the coordinates of the denoised point cloud data, and transforming the point cloud data to the same coordinate system; selecting a principal component analysis method for feature extraction;
a data input module configured to: store each feature matrix obtained by the preceding preprocessing into a data packet, then split each feature matrix into two batches, one of geometric information (the coordinates, surface normals, normal vectors, and tangent vectors of the points) and one of non-geometric information, and feed the batches into the deep convolutional network for training over successive iterations; the deep convolutional network is provided by the following construction and training modules;
a deep convolutional network construction module configured to: build three convolutional layers, with a batch normalization layer and an activation layer arranged in sequence between every two convolutional layers; build a global average pooling layer after the convolutional layers to regularize the structure of the whole network and prevent overfitting; and finally add a loss layer to the network, introducing a loss function to compute the network loss;
a deep convolutional network training module configured to: initializing parameters of each layer in the previously constructed deep convolutional network; sending the point cloud data characteristic matrix preprocessed in advance to a network in batches, and updating the weight in the network each time in an iteration mode until the network converges to an optimal state; repeatedly carrying out back propagation training;
a data fusion module configured to: and fusing point cloud data to obtain a more refined three-dimensional reconstruction result graph.
Although the preferred embodiments of this patent have been described in detail, the patent is not limited to the above embodiments, and those skilled in the art can make various changes within their knowledge without departing from the spirit of the patent.

Claims (4)

1. A three-dimensional reconstruction method of a deep convolutional network, characterized by comprising the following steps: S1, performing data acquisition, including collecting point cloud data of an object and obtaining a depth image; S2, performing data preprocessing, comprising the following substeps: (1) denoising the depth image to remove noise points introduced during real-time data acquisition, (2) normalizing the coordinates of the denoised point cloud data and transforming the point cloud data into the same coordinate system, and (3) selecting a principal component analysis method for feature extraction; S3, performing data input, comprising the following substeps: (1) storing each feature matrix obtained by the preceding preprocessing into a data packet, and (2) splitting each feature matrix into two batches according to geometric and non-geometric information and feeding the batches into a deep convolutional network for training over successive iterations; S4, constructing the deep convolutional network for fine registration of the point cloud data, comprising the following substeps: (1) building three convolutional layers, with a batch normalization layer and an activation layer arranged in sequence between every two convolutional layers, (2) building a global average pooling layer after the convolutional layers to regularize the whole network structure and prevent overfitting, and (3) finally adding a loss layer to the network and introducing a loss function to compute the network loss; S5, performing network training to optimize network performance, comprising the following substeps: (1) initializing the parameters of each layer in the constructed deep convolutional network, (2) feeding the preprocessed point cloud feature matrices into the network in batches and updating the network weights at each iteration, and (3) repeating back-propagation training; and S6, fusing the point cloud data to obtain a three-dimensional reconstruction result.
2. The three-dimensional reconstruction method of the deep convolutional network as claimed in claim 1, wherein: the activation layer uses a sigmoid function.
3. The three-dimensional reconstruction method of the deep convolutional network as claimed in claim 2, wherein: the point cloud data is fused using a truncated signed distance field technique.
4. A three-dimensional reconstruction system of a deep convolutional network, characterized by comprising: a data acquisition module configured to: collect point cloud data of an object and obtain a depth image; a data preprocessing module configured to: denoise the depth image, removing noise points introduced during real-time data acquisition, normalize the coordinates of the denoised point cloud data, transforming the point cloud data into the same coordinate system, and select a principal component analysis method for feature extraction; a data input module configured to: store each preprocessed feature matrix into a data packet, split each feature matrix into two batches according to geometric and non-geometric information, and feed the batches into a deep convolutional network over successive iterations; a deep convolutional network construction module configured to: build the deep convolutional network, which comprises three convolutional layers with a batch normalization layer and an activation layer arranged in sequence between every two convolutional layers, a global average pooling layer after the convolutional layers to regularize the whole network structure and prevent overfitting, and finally a loss layer added to the network with a loss function introduced to compute the network loss; a deep convolutional network training module configured to: initialize the parameters of each layer in the deep convolutional network, feed the preprocessed point cloud feature matrices into the deep convolutional network in batches, updating the weights at each iteration, and repeat back-propagation training; and a data fusion module configured to: fuse the point cloud data to obtain a more refined three-dimensional reconstruction result.
CN202211230767.4A (priority date 2022-10-10, filing date 2022-10-10) - Three-dimensional reconstruction method and system of deep convolutional network - Active - granted as CN115294285B

Priority Applications (1)

Application Number: CN202211230767.4A - Priority Date: 2022-10-10 - Filing Date: 2022-10-10 - Title: Three-dimensional reconstruction method and system of deep convolutional network (granted as CN115294285B)


Publications (2)

Publication Number Publication Date
CN115294285A CN115294285A (en) 2022-11-04
CN115294285B true CN115294285B (en) 2023-01-17

Family

ID=83819293


Country Status (1)

Country Link
CN (1) CN115294285B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109410321A (en) * 2018-10-17 2019-03-01 大连理工大学 Three-dimensional rebuilding method based on convolutional neural networks
CN109598754A (en) * 2018-09-29 2019-04-09 天津大学 A kind of binocular depth estimation method based on depth convolutional network
CN110691243A (en) * 2019-10-10 2020-01-14 叠境数字科技(上海)有限公司 Point cloud geometric compression method based on deep convolutional network
CN112288859A (en) * 2020-10-30 2021-01-29 西安工程大学 Three-dimensional face modeling method based on convolutional neural network
CN112488210A (en) * 2020-12-02 2021-03-12 北京工业大学 Three-dimensional point cloud automatic classification method based on graph convolution neural network

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN111739078B (en) * 2020-06-15 2022-11-18 大连理工大学 Monocular unsupervised depth estimation method based on context attention mechanism


Non-Patent Citations (3)

Title
Rethinking Motion Representation: Residual Frames With 3D ConvNets; litao; IEEE; 2021-11-04; full text *
A lightweight deep neural network for point cloud classification; Yan Lin et al.; Journal of Xidian University; 2020-12-31 (No. 02); full text *
Research on 3D model reconstruction and retrieval algorithms based on deep learning; Feng Zhijian et al.; Fujian Computer; 2017-11-25 (No. 11); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant