CN113822825A - Optical building target three-dimensional reconstruction method based on 3D-R2N2 - Google Patents

Optical building target three-dimensional reconstruction method based on 3D-R2N2

Info

Publication number
CN113822825A
Authority
CN
China
Prior art keywords
layer
input end
convolution layer
convolution
dimensional
Prior art date
Legal status
Granted
Application number
CN202111409413.1A
Other languages
Chinese (zh)
Other versions
CN113822825B (en)
Inventor
邹倩颖
郭雪
蔡雨静
喻淋
Current Assignee
Chengdu College of University of Electronic Science and Technology of China
Original Assignee
Chengdu College of University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by Chengdu College of University of Electronic Science and Technology of China filed Critical Chengdu College of University of Electronic Science and Technology of China
Priority to CN202111409413.1A priority Critical patent/CN113822825B/en
Publication of CN113822825A publication Critical patent/CN113822825A/en
Application granted granted Critical
Publication of CN113822825B publication Critical patent/CN113822825B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T5/70
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T5/40 Image enhancement or restoration by the use of histogram techniques
    • G06T5/90
    • G06T7/344 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving models
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses a 3D-R2N2-based method for three-dimensional reconstruction of optical building targets, and relates to the technical field of three-dimensional reconstruction. The method comprises: acquiring an optical image and preprocessing it; constructing a 3D-R2N2 network and inputting the preprocessed optical image into the constructed network, the 3D-R2N2 network including a CNN module; performing feature extraction and encoding on the optical image and processing it into a low-dimensional feature vector; sending the low-dimensional feature vector into a 3D-LSTM unit to obtain a three-dimensional grid structure composed of voxels; inputting the three-dimensional grid structure into a decoder, which converts the voxels into a three-dimensional probability matrix; and performing pixel reconstruction through the three-dimensional probability matrix, thereby completing the three-dimensional reconstruction of the optical building target. The method stabilizes model training, improves convergence, raises the accuracy of the reconstructed model, and recovers more accurate images with a good visual effect.

Description

Optical building target three-dimensional reconstruction method based on 3D-R2N2
Technical Field
The invention relates to the technical field of three-dimensional reconstruction, in particular to a 3D-R2N2-based optical building target three-dimensional reconstruction method.
Background
Three-dimensional reconstruction refers to establishing a mathematical model of a target object that is suitable for computer representation and processing. With the development of modern science and technology, building three-dimensional reconstruction has attracted wide attention, and the construction of three-dimensional models has become one of the key elements of urban geospatial data frameworks; how to construct three-dimensional models of urban areas quickly, automatically and accurately, especially for buildings with complex shapes, is currently a hot research problem in many fields. In 2015, three-dimensional reconstruction networks based on voxel representation (3D renderets) were first proposed, but these networks suffer from matching problems such as texture defects, specular reflection and baselines. In 2016, the 3D-R2N2 method was proposed, which mainly solves the problem of object feature matching, but its reconstruction accuracy and efficiency are not high; the WarpNet framework based on convolutional neural networks achieves reconstruction quality similar to that of supervised methods, but the targets it reconstructs are distorted; and the MarrNet model, trained end-to-end on real images, suffers from complex computation and a lack of finer geometric shapes. In 2018, images containing complex objects were reconstructed in three dimensions with a voxel-level reconstruction algorithm, but for low-resolution images its reconstruction accuracy is low. In 2017, the B-Rep algorithm was adopted for three-dimensional reconstruction; it is a polyhedron-oriented algorithm and is only suitable for simple polyhedra. Traditional three-dimensional reconstruction thus suffers from low modeling efficiency, poor visual quality of the models, and low modeling precision in texture-missing regions, which places higher requirements on reconstruction algorithms.
Disclosure of Invention
Aiming at the defects in the prior art, the optical building target three-dimensional reconstruction method based on 3D-R2N2 solves the problems of low modeling efficiency, poor model visual effect and low modeling precision of texture missing areas in the prior art.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that:
the method for three-dimensional reconstruction of the optical building target based on the 3D-R2N2 is provided, and comprises the following steps:
s1, acquiring an optical image and preprocessing the optical image;
s2, constructing a 3D-R2N2 network, and inputting the preprocessed optical image into the constructed 3D-R2N2 network; the 3D-R2N2 network comprises an image extraction module, a pyramid pooling layer, a CNN module and a 3D-LSTM unit which are connected in sequence;
s3, adjusting the size of the input optical image to be a uniform size through the pyramid pooling layer;
s4, extracting features from the uniformly sized optical image by using the CNN module (a deep residual variant) of the 3D-R2N2 network, and encoding the extracted features;
s5, performing one-dimensional convolution on the coded features, and compressing the features into 1024-dimensional feature vectors, namely low-dimensional feature vectors, through a coder;
s6, sending the low-dimensional feature vector into a 3D-LSTM unit to obtain a three-dimensional grid structure; wherein the three-dimensional grid structure comprises voxels;
s7, inputting the three-dimensional grid structure into a decoder, and improving the hidden state resolution of the three-dimensional grid structure through the decoder until the target output resolution is reached;
s8, converting the three-dimensional grid structure reaching the target output resolution into the existence probability of the voxel at the voxel coordinate point by using a cross entropy loss function, and processing the probability into a Bernoulli distribution form;
s9, establishing the existence probability of the voxels in the Bernoulli distribution form in the voxel coordinate points into a three-dimensional probability matrix;
and S10, performing pixel reconstruction through the three-dimensional probability matrix, namely completing the three-dimensional reconstruction of the optical building target.
Further, the specific method of preprocessing in step S1 is:
s1-1, according to the formula:

min J(u) = min ∫_Ω |∇u| dΩ = min ∫_Ω √(u_x² + u_y²) dx dy

obtaining the minimized total variation min J(u); wherein ∫_Ω(·) dΩ denotes the differential taken over Ω, Ω is the definition domain of the pixel points, u is the original sharp analog-noise high-frequency image, i.e. the optical image, x and y are the pixel coordinates in that image, ∇u is the differential operator applied to u, and u_x and u_y are the derivatives with respect to the pixel coordinates x and y;
s1-2, carrying out noise reduction processing on the optical image by utilizing the minimized total variation;
s1-3, processing the optical image after noise reduction into a vertical image, and graying the vertical image to obtain a gray image;
s1-4, extending all the area spaces with concentrated gray scales in the gray scale image to all the gray scale area space ranges to obtain a non-uniform extension and stretching gray scale image;
and S1-5, redistributing the pixel values of the non-uniform extension stretched gray-scale image to finish preprocessing.
Further, in step S2, the CNN module includes 12 convolutional layers, 5 residual connection layers, 4 bottleneck layers, and 1 transition Layer, where the residual connection layers include a first convolutional Layer, a first Leaky_Relu activation function Layer, a second Leaky_Relu activation function Layer, a third Leaky_Relu activation function Layer, and a PC Layer path control Layer, the first bottleneck Layer to the fourth bottleneck Layer each include a BN normalization Layer and a ReLU activation function Layer, and the first convolutional Layer has a 7 × 7 structure, and the second convolutional Layer to the thirteenth convolutional Layer each have a 3 × 3 structure;
the first convolution Layer, the first Leaky _ Relu activation function Layer, the second convolution Layer, the third convolution Layer, the fourth convolution Layer, the fifth convolution Layer, the sixth convolution Layer, the seventh convolution Layer, the eighth convolution Layer, the second Leaky _ Relu activation function Layer, the ninth convolution Layer, the first bottleneck Layer, the tenth convolution Layer, the second bottleneck Layer, the eleventh convolution Layer, the third bottleneck Layer, the twelfth convolution Layer, the fourth bottleneck Layer, the thirteenth convolution Layer, the transition Layer, the third Leaky _ Relu activation function Layer and the PC Layer access control Layer are connected in sequence; wherein the first convolution Layer is an input Layer, and the PC Layer channel control Layer is an output Layer;
the output end of the second convolution layer is respectively connected with the input end of the third convolution layer, the input end of the fourth convolution layer, the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the third convolution layer is respectively connected with the input end of the fourth convolution layer, the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the fourth convolution layer is respectively connected with the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the fifth convolution layer is respectively connected with the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the sixth convolution layer is connected with the input end of the seventh convolution layer;
the output end of the ninth convolution layer is respectively connected with the input end of the first bottleneck layer, the input end of the second bottleneck layer, the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the tenth convolution layer is respectively connected with the input end of the second bottleneck layer, the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the eleventh convolution layer is respectively connected with the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the twelfth convolution layer is respectively connected with the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the thirteenth convolution layer is connected to the input end of the transition layer.
Further, the 3D-LSTM unit in step S2 includes several basic modules; the basic module comprises a first current full-connection layer, and the output end of the first current full-connection layer is respectively connected with the input end of the first full-connection layer, the input end of the second full-connection layer and the input end of the third full-connection layer; the output end of the first full connection layer is connected with the output end of the first 3 multiplied by 3 convolution layer and is connected with the input end of the first adder; the output end of the second full connection layer is connected with the output end of the second 3 multiplied by 3 convolution layer and is connected with the input end of the second adder; the output end of the third full connection layer is connected with the output end of the third 3 multiplied by 3 convolutional layer and is connected with the input end of the third adder; the output end of the second adder and the output end of the third adder are respectively connected with the input end of the first multiplier, and the output end of the first multiplier is connected with the input end of the fourth adder; the input end of the first 3 × 3 × 3 convolutional layer, the input end of the second 3 × 3 × 3 convolutional layer and the input end of the third 3 × 3 × 3 convolutional layer are respectively connected with the output end of the hidden layer at the first previous moment; the output end of the first adder is respectively connected with the input end of the second multiplier and the storage unit at the previous moment; the output end of the second multiplier is connected with the output end of the fourth adder; the output end of the fourth adder is respectively connected with the hidden layer at the current moment and the storage unit at the first current moment.
Further, the pyramid pooling layers in step S2 include a feature map 16-block division layer, a feature map 4-block division layer, and a feature map 1-block division layer.
Further, the decoder in step S7 includes several sets of two-dimensional vector matrices, a pooling layer, a one-dimensional vector matrix, a full-link layer, and a Softmax activation layer.
Further, the formula of the cross entropy loss function in step S8 is:

L(p, y) = −Σ_(i,j,k) [ y_(i,j,k) · lg p_(i,j,k) + (1 − y_(i,j,k)) · lg(1 − p_(i,j,k)) ]

wherein L(p, y) is the cross-entropy loss function, (i, j, k) is the pixel (voxel) coordinate, y_(i,j,k) is the real voxel point, p_(i,j,k) is the voxel probability, and lg is the logarithmic function with base 10.
The invention has the beneficial effects that:
1. designing a CNN module with dense links, wherein each convolution layer of the CNN module is connected with a subsequent convolution layer, so that each convolution layer can increase the feature mapping of the corresponding layer, the subsequent convolution layer can acquire the information of the previous convolution layer, and the information of the first convolution layer can be acquired even the last convolution layer, thereby fully utilizing the number of channels and completing information transmission;
2. a bottleneck layer structure is added between convolution layers of the CNN module which is densely linked, so that the dimensionality of a convolution result can be reduced, the dimensionality increase caused by acquiring a large amount of characteristic information by each convolution layer can be relieved, the dimensionality is reduced, the training is stable, and the coding efficiency is improved;
3. a pyramid pooling layer is added between the extraction module and the CNN module, so that the size of the image can be unified, the image classification effect is improved by fusing multi-scale features, and the target recognition rate is improved;
4. the algorithm can improve the characteristic extraction effect, so that the details of an image reconstruction model are more perfect, and the accuracy is higher;
5. compared with the traditional algorithm, the method can shorten the registration time, improve the convergence of the registration result and reduce the complexity of the algorithm.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a block diagram of a CNN module according to the present invention;
FIG. 3 is a diagram of a 3D-LSTM unit structure;
FIG. 4 is a block diagram of a decoder;
FIG. 5 is an optical image;
FIG. 6 is a modeling diagram of the B-Rep algorithm;
FIG. 7 is a modeling diagram of a prior 3D-R2N2 algorithm;
FIG. 8 is a modeling diagram of the 3D-R2N2 algorithm of the present invention;
FIG. 9 is a graph of the reconstruction result of the 3D-R2N2 algorithm of the present invention;
fig. 10 is a reconstruction result diagram of the voxel reconstruction algorithm.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art, but it should be understood that the invention is not limited to the scope of these embodiments; for those skilled in the art, various changes may be made without departing from the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept is protected.
As shown in FIG. 1, the optical building target three-dimensional reconstruction method based on 3D-R2N2 comprises the following steps:
s1, acquiring an optical image and preprocessing the optical image;
s2, constructing a 3D-R2N2 network, and inputting the preprocessed optical image into the constructed 3D-R2N2 network; the 3D-R2N2 network comprises an image extraction module, a pyramid pooling layer, a CNN module and a 3D-LSTM unit which are connected in sequence;
s3, adjusting the size of the input optical image to be a uniform size through the pyramid pooling layer;
s4, extracting features from the uniformly sized optical image by using the CNN module (a deep residual variant) of the 3D-R2N2 network, and encoding the extracted features;
s5, performing one-dimensional convolution on the coded features, and compressing the features into 1024-dimensional feature vectors, namely low-dimensional feature vectors, through a coder;
s6, sending the low-dimensional feature vector into a 3D-LSTM unit to obtain a three-dimensional grid structure; wherein the three-dimensional grid structure comprises voxels;
s7, inputting the three-dimensional grid structure into a decoder, and improving the hidden state resolution of the three-dimensional grid structure through the decoder until the target output resolution is reached;
s8, converting the three-dimensional grid structure reaching the target output resolution into the existence probability of the voxel at the voxel coordinate point by using a cross entropy loss function, and processing the probability into a Bernoulli distribution form;
s9, establishing the existence probability of the voxels in the Bernoulli distribution form in the voxel coordinate points into a three-dimensional probability matrix;
and S10, performing pixel reconstruction through the three-dimensional probability matrix, namely completing the three-dimensional reconstruction of the optical building target.
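For illustration only, the data flow of steps S2 to S10 can be summarized by the following minimal Python sketch; the objects `encoder`, `lstm3d` and `decoder` are hypothetical stand-ins for the CNN module, the 3D-LSTM unit and the decoder described below (assumed to behave like PyTorch modules operating on tensors), and the 0.5 binarization threshold is likewise an assumption, not a value taken from the patent:

    def reconstruct_building(images, encoder, lstm3d, decoder, threshold=0.5):
        """Illustrative end-to-end flow: encode each view, update the 3-D grid state,
        decode voxel occupancy probabilities, then binarize them (steps S2-S10)."""
        state = None
        for view in images:                    # one preprocessed optical image per view
            feat = encoder(view.unsqueeze(0))  # S3-S5: 1024-dimensional low-dimensional feature vector
            h, state = lstm3d(feat, state)     # S6: update the three-dimensional grid structure
        prob = decoder(h)                      # S7-S9: three-dimensional probability matrix
        return (prob > threshold).float()      # S10: reconstruct voxels from the probabilities
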
The specific method of preprocessing in step S1 is:
s1-1, according to the formula:

min J(u) = min ∫_Ω |∇u| dΩ = min ∫_Ω √(u_x² + u_y²) dx dy

obtaining the minimized total variation min J(u); wherein ∫_Ω(·) dΩ denotes the differential taken over Ω, Ω is the definition domain of the pixel points, u is the original sharp analog-noise high-frequency image, i.e. the optical image, x and y are the pixel coordinates in that image, ∇u is the differential operator applied to u, and u_x and u_y are the derivatives with respect to the pixel coordinates x and y;
s1-2, carrying out noise reduction processing on the optical image by utilizing the minimized total variation;
s1-3, processing the optical image after noise reduction into a vertical image, and graying the vertical image to obtain a gray image;
s1-4, extending all the area spaces with concentrated gray scales in the gray scale image to all the gray scale area space ranges to obtain a non-uniform extension and stretching gray scale image;
and S1-5, redistributing the pixel values of the non-uniform extension stretched gray-scale image to finish preprocessing.
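The noise reduction of steps S1-1 and S1-2 above can be sketched as a simple gradient-descent minimization of the total variation; the following Python/NumPy code is a minimal illustration under an assumed step size, iteration count and regularization weight (none of these values are given in the patent):

    import numpy as np

    def tv_denoise(u0, weight=0.1, step=0.2, iters=100):
        """Reduce noise by gradient descent on weight*TV(u) + 0.5*||u - u0||^2."""
        u = u0.astype(float).copy()
        for _ in range(iters):
            ux = np.gradient(u, axis=1)          # derivative w.r.t. pixel x coordinate
            uy = np.gradient(u, axis=0)          # derivative w.r.t. pixel y coordinate
            mag = np.sqrt(ux ** 2 + uy ** 2) + 1e-8  # |grad u|, avoiding division by zero
            # divergence of the normalized gradient field (gradient of the TV term)
            div = np.gradient(ux / mag, axis=1) + np.gradient(uy / mag, axis=0)
            u += step * (weight * div - (u - u0))
        return u
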
The cumulative frequency sum of the histogram entries is

p_i = n_i / n

s_k = Σ_{i=0}^{k} p_i = Σ_{i=0}^{k} n_i / n,  k = 0, 1, …, L − 1

wherein n_i is the number of pixels at gray level i, n is the total number of pixels, and L is the number of gray levels; according to the formula:

t_k = int[(L − 1) · s_k + 0.5]

s_k is rounded; where int is the rounding function.
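Steps S1-3 to S1-5, together with the cumulative-frequency and rounding formulas above, correspond to histogram equalization of the gray image; a compact sketch, assuming an 8-bit grayscale input, is:

    import numpy as np

    def equalize_histogram(gray, levels=256):
        """Map gray levels through the rounded cumulative distribution s_k (gray: uint8 array)."""
        hist = np.bincount(gray.ravel(), minlength=levels)            # n_i for each gray level i
        s = np.cumsum(hist) / gray.size                               # s_k = sum(n_i) / n
        mapping = np.floor((levels - 1) * s + 0.5).astype(np.uint8)   # int[(L - 1) * s_k + 0.5]
        return mapping[gray]                                          # redistribute the pixel values
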
As shown in fig. 2, the CNN module in step S2 includes 12 convolutional layers, 5 residual connection layers, 4 bottleneck layers, and 1 transition Layer, where the residual connection layers include a first convolutional Layer, a first Leaky_Relu activation function Layer, a second Leaky_Relu activation function Layer, a third Leaky_Relu activation function Layer, and a PC Layer path control Layer, the first bottleneck Layer to the fourth bottleneck Layer each include one BN normalization Layer and one ReLU activation function Layer, and the first convolutional Layer has a 7 × 7 structure, and the second convolutional Layer to the thirteenth convolutional Layer each have a 3 × 3 structure;
the first convolution Layer, the first Leaky _ Relu activation function Layer, the second convolution Layer, the third convolution Layer, the fourth convolution Layer, the fifth convolution Layer, the sixth convolution Layer, the seventh convolution Layer, the eighth convolution Layer, the second Leaky _ Relu activation function Layer, the ninth convolution Layer, the first bottleneck Layer, the tenth convolution Layer, the second bottleneck Layer, the eleventh convolution Layer, the third bottleneck Layer, the twelfth convolution Layer, the fourth bottleneck Layer, the thirteenth convolution Layer, the transition Layer, the third Leaky _ Relu activation function Layer and the PC Layer access control Layer are connected in sequence; wherein the first convolution Layer is an input Layer, and the PC Layer channel control Layer is an output Layer;
the output end of the second convolution layer is respectively connected with the input end of the third convolution layer, the input end of the fourth convolution layer, the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the third convolution layer is respectively connected with the input end of the fourth convolution layer, the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the fourth convolution layer is respectively connected with the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the fifth convolution layer is respectively connected with the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the sixth convolution layer is connected with the input end of the seventh convolution layer;
the output end of the ninth convolution layer is respectively connected with the input end of the first bottleneck layer, the input end of the second bottleneck layer, the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the tenth convolution layer is respectively connected with the input end of the second bottleneck layer, the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the eleventh convolution layer is respectively connected with the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the twelfth convolution layer is respectively connected with the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the thirteenth convolution layer is connected to the input end of the transition layer.
As shown in fig. 3, the 3D-LSTM unit in step S2 includes several basic modules; the basic module comprises a first current full-connection layer, and the output end of the first current full-connection layer is respectively connected with the input end of the first full-connection layer, the input end of the second full-connection layer and the input end of the third full-connection layer; the output end of the first full connection layer is connected with the output end of the first 3 multiplied by 3 convolution layer and is connected with the input end of the first adder; the output end of the second full connection layer is connected with the output end of the second 3 multiplied by 3 convolution layer and is connected with the input end of the second adder; the output end of the third full connection layer is connected with the output end of the third 3 multiplied by 3 convolutional layer and is connected with the input end of the third adder; the output end of the second adder and the output end of the third adder are respectively connected with the input end of the first multiplier, and the output end of the first multiplier is connected with the input end of the fourth adder; the input end of the first 3 × 3 × 3 convolutional layer, the input end of the second 3 × 3 × 3 convolutional layer and the input end of the third 3 × 3 × 3 convolutional layer are respectively connected with the output end of the hidden layer at the first previous moment; the output end of the first adder is respectively connected with the input end of the second multiplier and the storage unit at the previous moment; the output end of the second multiplier is connected with the output end of the fourth adder; the output end of the fourth adder is respectively connected with the hidden layer at the current moment and the storage unit at the first current moment.
The grid equations governing the 3D-LSTM unit are:

o_t = σ(W_o · T(x_t) + U_o * h_(t−1) + b_o)
i_t = σ(W_i · T(x_t) + U_i * h_(t−1) + b_i)
s_t = o_t ⊙ s_(t−1) + i_t ⊙ tanh(W_s · T(x_t) + U_s * h_(t−1) + b_s)
h_t = tanh(s_t)

wherein o_t is the output gate at time t, σ is the sigmoid function, W_o is the weight matrix of the output gate, T(·) is the input function, x_t is the input at time t, U_o is the hidden transition matrix of the output gate, * is the convolution operation, h_(t−1) is the hidden state at time t − 1, and b_o is the output gate offset; i_t is the input gate, W_i is the weight matrix of the input gate, U_i is the hidden state matrix of the input gate, and b_i is the offset of the input gate; ⊙ is element-wise multiplication, s_(t−1) is the storage unit at time t − 1, tanh is the activation function, W_s is the weight matrix of the storage unit, U_s is the hidden state matrix of the storage unit, b_s is the storage unit offset, and h_t is the hidden unit.
The pyramid pooling layers in step S2 include a feature map 16-block division layer, a feature map 4-block division layer, and a feature map 1-block division layer.
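As a sketch of such a pyramid pooling layer, the feature map can be pooled over 4 × 4, 2 × 2 and 1 × 1 grids (16, 4 and 1 blocks) and the results concatenated into a fixed-length vector; adaptive average pooling is assumed here as the block-pooling operator, which the patent does not specify:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PyramidPooling(nn.Module):
        """Pool the feature map over 4x4, 2x2 and 1x1 grids (16, 4 and 1 blocks)."""
        def __init__(self, bins=(4, 2, 1)):
            super().__init__()
            self.bins = bins
        def forward(self, x):
            pooled = [F.adaptive_avg_pool2d(x, b).flatten(1) for b in self.bins]
            return torch.cat(pooled, dim=1)   # fixed-length output regardless of input image size
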
As shown in fig. 4, the decoder in step S7 includes several sets of two-dimensional vector matrices, a pooling layer, a one-dimensional vector matrix, a full-link layer, and a Softmax activation layer.
The formula of the cross entropy loss function in step S8 is:

L(p, y) = −Σ_(i,j,k) [ y_(i,j,k) · lg p_(i,j,k) + (1 − y_(i,j,k)) · lg(1 − p_(i,j,k)) ]

wherein L(p, y) is the cross-entropy loss function, (i, j, k) is the pixel (voxel) coordinate, y_(i,j,k) is the real voxel point, p_(i,j,k) is the voxel probability, and lg is the logarithmic function with base 10.
In one embodiment of the invention, the IoU (intersection over union) value between the three-dimensional reconstruction result output by the network and the real model is adopted as the evaluation criterion of three-dimensional reconstruction accuracy:

IoU = (P ∩ Y) / (P ∪ Y) = Σ_i [ I(p_i) · I(y_i) ] / Σ_i [ I( I(p_i) + I(y_i) ) ]

wherein P is the predicted value, Y is the true value, ∩ denotes taking the intersection, ∪ denotes taking the union, I(·) is the indicator function, p_i is the i-th voxel of the predicted value, and y_i is the i-th voxel of the true value.
As shown in fig. 5, fig. 6, fig. 7, fig. 8 and table 1, the details of the reconstruction model obtained by the method of the present invention are more complete, that is, its accuracy is higher; compared with the other two algorithms, the improved 3D-R2N2 algorithm has a higher IoU value and therefore performs better on the three-dimensional reconstruction model: its reconstruction accuracy is 7.8% higher than that of the B-Rep algorithm and 5.3% higher than that of the original 3D-R2N2 algorithm.
TABLE 1 (IoU comparison of the B-Rep algorithm, the original 3D-R2N2 algorithm and the improved 3D-R2N2 algorithm; the table is rendered as an image in the original publication)
In another embodiment of the invention, a teaching building is used as the test object to evaluate the three-dimensional building target reconstruction performance of the 3D-R2N2 algorithm of the present invention. Three-dimensional reconstruction of the image is performed with the 3D-R2N2 algorithm and with the voxel-level reconstruction algorithm in turn, and the results are shown in FIGS. 9 and 10. The registration time for the optical image data and the convergence of the registration result are also simulated in turn, and the comparison is shown in table 2. The registration time of both algorithms is proportional to the data scale: as the data scale increases, the registration time increases, but the registration time of the voxel-level reconstruction algorithm on the image data is higher than that of the 3D-R2N2 algorithm in every case; analysis of the results shows that the 3D-R2N2 algorithm is 3.2% faster than the voxel-level reconstruction algorithm in three-dimensional image reconstruction. The convergence of the registration results of the two algorithms is then analyzed; the convergence of the 3D-R2N2 algorithm is stronger than that of the voxel-level reconstruction algorithm in every case, and the 3D-R2N2 algorithm reduces algorithm complexity by 15.1% compared with the voxel-level reconstruction algorithm in three-dimensional image reconstruction.
TABLE 2 (comparison of registration time and convergence of the registration result between the 3D-R2N2 algorithm of the invention and the voxel-level reconstruction algorithm; the table is rendered as an image in the original publication)
The invention designs a densely linked CNN module in which each convolution layer is connected to the subsequent convolution layers, so that every convolution layer adds the feature maps of its own layer, each subsequent convolution layer can obtain the information of the previous convolution layers, and even the last convolution layer can obtain the information of the first convolution layer, thereby making full use of the number of channels and completing information transmission. A bottleneck layer structure is added between the convolution layers of the densely linked CNN module, which reduces the dimensionality of the convolution results and alleviates the dimensionality growth caused by each convolution layer acquiring a large amount of feature information; the reduced dimensionality makes training stable and improves coding efficiency. A pyramid pooling layer is added between the extraction module and the CNN module, which unifies the image size, improves the image classification effect by fusing multi-scale features, and raises the target recognition rate. The algorithm improves the feature extraction effect, so that the details of the image reconstruction model are more complete and the accuracy is higher. Compared with traditional algorithms, the method shortens the registration time, improves the convergence of the registration result, and reduces the complexity of the algorithm.

Claims (7)

1. A three-dimensional reconstruction method of an optical building target based on 3D-R2N2 is characterized by comprising the following steps:
s1, acquiring an optical image and preprocessing the optical image;
s2, constructing a 3D-R2N2 network, and inputting the preprocessed optical image into the constructed 3D-R2N2 network; the 3D-R2N2 network comprises an image extraction module, a pyramid pooling layer, a CNN module and a 3D-LSTM unit which are connected in sequence;
s3, adjusting the size of the input optical image to be a uniform size through the pyramid pooling layer;
s4, extracting features from the uniformly sized optical image by using the CNN module (a deep residual variant) of the 3D-R2N2 network, and encoding the extracted features;
s5, performing one-dimensional convolution on the coded features, and compressing the features into 1024-dimensional feature vectors, namely low-dimensional feature vectors, through a coder;
s6, sending the low-dimensional feature vector into a 3D-LSTM unit to obtain a three-dimensional grid structure; wherein the three-dimensional grid structure comprises voxels;
s7, inputting the three-dimensional grid structure into a decoder, and improving the hidden state resolution of the three-dimensional grid structure through the decoder until the target output resolution is reached;
s8, converting the three-dimensional grid structure reaching the target output resolution into the existence probability of the voxel at the voxel coordinate point by using a cross entropy loss function, and processing the probability into a Bernoulli distribution form;
s9, establishing the existence probability of the voxels in the Bernoulli distribution form in the voxel coordinate points into a three-dimensional probability matrix;
and S10, performing pixel reconstruction through the three-dimensional probability matrix, namely completing the three-dimensional reconstruction of the optical building target.
2. The 3D-R2N2-based optical building target three-dimensional reconstruction method according to claim 1, wherein the preprocessing in step S1 comprises:
s1-1, according to the formula:

min J(u) = min ∫_Ω |∇u| dΩ = min ∫_Ω √(u_x² + u_y²) dx dy

obtaining the minimized total variation min J(u); wherein ∫_Ω(·) dΩ denotes the differential taken over Ω, Ω is the definition domain of the pixel points, u is the original sharp analog-noise high-frequency image, i.e. the optical image, x and y are the pixel coordinates in that image, ∇u is the differential operator applied to u, and u_x and u_y are the derivatives with respect to the pixel coordinates x and y;
s1-2, carrying out noise reduction processing on the optical image by utilizing the minimized total variation;
s1-3, processing the optical image after noise reduction into a vertical image, and graying the vertical image to obtain a gray image;
s1-4, extending all the area spaces with concentrated gray scales in the gray scale image to all the gray scale area space ranges to obtain a non-uniform extension and stretching gray scale image;
and S1-5, redistributing the pixel values of the non-uniform extension stretched gray-scale image to finish preprocessing.
3. The 3D-R2N2-based optical building target three-dimensional reconstruction method according to claim 1, wherein the CNN module in step S2 includes 12 convolutional layers, 5 residual connection layers, 4 bottleneck layers and 1 transition Layer, wherein the residual connection layers include a first convolutional Layer, a first Leaky_Relu activation function Layer, a second Leaky_Relu activation function Layer, a third Leaky_Relu activation function Layer and a PC Layer channel control Layer, the first bottleneck Layer to the fourth bottleneck Layer each include a BN normalization Layer and a ReLU activation function Layer, the first convolutional Layer has a 7 × 7 structure, and the second convolutional Layer to the thirteenth convolutional Layer each have a 3 × 3 structure;
the first convolution Layer, the first Leaky _ Relu activation function Layer, the second convolution Layer, the third convolution Layer, the fourth convolution Layer, the fifth convolution Layer, the sixth convolution Layer, the seventh convolution Layer, the eighth convolution Layer, the second Leaky _ Relu activation function Layer, the ninth convolution Layer, the first bottleneck Layer, the tenth convolution Layer, the second bottleneck Layer, the eleventh convolution Layer, the third bottleneck Layer, the twelfth convolution Layer, the fourth bottleneck Layer, the thirteenth convolution Layer, the transition Layer, the third Leaky _ Relu activation function Layer and the PC Layer access control Layer are connected in sequence; wherein the first convolution Layer is an input Layer, and the PC Layer channel control Layer is an output Layer;
the output end of the second convolution layer is respectively connected with the input end of the third convolution layer, the input end of the fourth convolution layer, the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the third convolution layer is respectively connected with the input end of the fourth convolution layer, the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the fourth convolution layer is respectively connected with the input end of the fifth convolution layer, the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the fifth convolution layer is respectively connected with the input end of the sixth convolution layer and the input end of the seventh convolution layer; the output end of the sixth convolution layer is connected with the input end of the seventh convolution layer;
the output end of the ninth convolution layer is respectively connected with the input end of the first bottleneck layer, the input end of the second bottleneck layer, the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the tenth convolution layer is respectively connected with the input end of the second bottleneck layer, the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the eleventh convolution layer is respectively connected with the input end of the third bottleneck layer, the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the twelfth convolution layer is respectively connected with the input end of the fourth bottleneck layer and the input end of the transition layer; the output end of the thirteenth convolution layer is connected to the input end of the transition layer.
4. The method for three-dimensional reconstruction of an optical building object based on 3D-R2N2, wherein the 3D-LSTM unit comprises several basic modules in step S2; the basic module comprises a first current full-connection layer, and the output end of the first current full-connection layer is respectively connected with the input end of the first full-connection layer, the input end of the second full-connection layer and the input end of the third full-connection layer; the output end of the first full connection layer is connected with the output end of the first 3 multiplied by 3 convolution layer and is connected with the input end of the first adder; the output end of the second full connection layer is connected with the output end of the second 3 multiplied by 3 convolution layer and is connected with the input end of the second adder; the output end of the third full connection layer is connected with the output end of the third 3 multiplied by 3 convolutional layer and is connected with the input end of the third adder; the output end of the second adder and the output end of the third adder are respectively connected with the input end of the first multiplier, and the output end of the first multiplier is connected with the input end of the fourth adder; the input end of the first 3 × 3 × 3 convolutional layer, the input end of the second 3 × 3 × 3 convolutional layer and the input end of the third 3 × 3 × 3 convolutional layer are respectively connected with the output end of the hidden layer at the first previous moment; the output end of the first adder is respectively connected with the input end of the second multiplier and the storage unit at the previous moment; the output end of the second multiplier is connected with the output end of the fourth adder; the output end of the fourth adder is respectively connected with the hidden layer at the current moment and the storage unit at the first current moment.
5. The 3D-R2N2-based optical building object three-dimensional reconstruction method according to claim 1, wherein the pyramid pooling layer in step S2 includes a feature mapping 16-block partition layer, a feature mapping 4-block partition layer and a feature mapping 1-block partition layer.
6. The 3D-R2N2-based optical building target three-dimensional reconstruction method according to claim 1, wherein the decoder in step S7 includes several sets of two-dimensional vector matrices, a pooling layer, a one-dimensional vector matrix, a fully connected layer and a Softmax activation layer.
7. The 3D-R2N2-based optical building target three-dimensional reconstruction method according to claim 1, wherein the formula of the cross entropy loss function in step S8 is as follows:

L(p, y) = −Σ_(i,j,k) [ y_(i,j,k) · lg p_(i,j,k) + (1 − y_(i,j,k)) · lg(1 − p_(i,j,k)) ]

wherein L(p, y) is the cross-entropy loss function, (i, j, k) is the pixel (voxel) coordinate, y_(i,j,k) is the real voxel point, p_(i,j,k) is the voxel probability, and lg is the logarithmic function with base 10.
CN202111409413.1A 2021-11-25 2021-11-25 Optical building target three-dimensional reconstruction method based on 3D-R2N2 Active CN113822825B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111409413.1A CN113822825B (en) 2021-11-25 2021-11-25 Optical building target three-dimensional reconstruction method based on 3D-R2N2

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111409413.1A CN113822825B (en) 2021-11-25 2021-11-25 Optical building target three-dimensional reconstruction method based on 3D-R2N2

Publications (2)

Publication Number Publication Date
CN113822825A true CN113822825A (en) 2021-12-21
CN113822825B CN113822825B (en) 2022-02-11

Family

ID=78918240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111409413.1A Active CN113822825B (en) 2021-11-25 2021-11-25 Optical building target three-dimensional reconstruction method based on 3D-R2N2

Country Status (1)

Country Link
CN (1) CN113822825B (en)


Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101482971A (en) * 2009-02-23 2009-07-15 公安部第一研究所 Non-uniform correction method for compensation of low-gray scale X-ray image signal
CN102831573A (en) * 2012-08-14 2012-12-19 电子科技大学 Linear stretching method of infrared image
CN103327245A (en) * 2013-06-07 2013-09-25 电子科技大学 Automatic focusing method of infrared imaging system
CN103337053A (en) * 2013-06-13 2013-10-02 华中科技大学 Switching non-local total variation based filtering method for image polluted by salt and pepper noise
CN104143101A (en) * 2014-07-01 2014-11-12 华南理工大学 Method for automatically identifying breast tumor area based on ultrasound image
CN105954994A (en) * 2016-06-30 2016-09-21 深圳先进技术研究院 Image enhancement method for lensless digital holography microscopy imaging
CN106251315A (en) * 2016-08-23 2016-12-21 南京邮电大学 A kind of image de-noising method based on full variation
CN106355561A (en) * 2016-08-30 2017-01-25 天津大学 TV (total variation) image noise removal method based on noise priori constraint
US20200126289A1 (en) * 2016-09-23 2020-04-23 Blue Vision Labs UK Limited Method and system for creating a virtual 3d model
CN108737298A (en) * 2018-04-04 2018-11-02 东南大学 A kind of SCMA blind checking methods based on image procossing
US20200027269A1 (en) * 2018-07-23 2020-01-23 Fudan University Network, System and Method for 3D Shape Generation
CN112785684A (en) * 2020-11-13 2021-05-11 北京航空航天大学 Three-dimensional model reconstruction method based on local information weighting mechanism
CN112767272A (en) * 2021-01-20 2021-05-07 南京信息工程大学 Weight self-adaptive mixed-order fully-variable image denoising algorithm

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHRISTOPHER B. CHOY et al.: "3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction", 《ARXIV》 *
KAIMING HE et al.: "Deep Residual Learning for Image Recognition", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
ZHENGBO LUO et al.: "Rethinking ResNets: Improved Stacking Strategies With High Order Schemes", 《ARXIV》 *
邹倩颖 et al.: "Research on the Improved ORB-SLAM Algorithm for Outdoor Offline Real-time Navigation" (改进 ORB-SLAM 算法在户外离线即时导航的研究), 《实验室研究与探索》 (Research and Exploration in Laboratory) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116310844A (en) * 2023-05-18 2023-06-23 四川凯普顿信息技术股份有限公司 Agricultural crop growth monitoring system
CN116958455A (en) * 2023-09-21 2023-10-27 北京飞渡科技股份有限公司 Roof reconstruction method and device based on neural network and electronic equipment
CN116958455B (en) * 2023-09-21 2023-12-26 北京飞渡科技股份有限公司 Roof reconstruction method and device based on neural network and electronic equipment

Also Published As

Publication number Publication date
CN113822825B (en) 2022-02-11

Similar Documents

Publication Publication Date Title
CN108875813B (en) Three-dimensional grid model retrieval method based on geometric image
CN110390638B (en) High-resolution three-dimensional voxel model reconstruction method
CN109410321A (en) Three-dimensional rebuilding method based on convolutional neural networks
CN113822825B (en) Optical building target three-dimensional reconstruction method based on 3D-R2N2
CN112396703A (en) Single-image three-dimensional point cloud model reconstruction method
CN112818764B (en) Low-resolution image facial expression recognition method based on feature reconstruction model
CN107301643B (en) Well-marked target detection method based on robust rarefaction representation Yu Laplce's regular terms
CN113436237B (en) High-efficient measurement system of complicated curved surface based on gaussian process migration learning
CN113159232A (en) Three-dimensional target classification and segmentation method
CN104077742B (en) Human face sketch synthetic method and system based on Gabor characteristic
CN113269224A (en) Scene image classification method, system and storage medium
Kang et al. Competitive learning of facial fitting and synthesis using uv energy
Zhu et al. Nonlocal low-rank point cloud denoising for 3-D measurement surfaces
CN112581626B (en) Complex curved surface measurement system based on non-parametric and multi-attention force mechanism
CN104299201A (en) Image reconstruction method based on heredity sparse optimization and Bayes estimation model
CN113034371A (en) Infrared and visible light image fusion method based on feature embedding
CN112767539B (en) Image three-dimensional reconstruction method and system based on deep learning
CN116721216A (en) Multi-view three-dimensional reconstruction method based on GCF-MVSNet network
CN116758219A (en) Region-aware multi-view stereo matching three-dimensional reconstruction method based on neural network
CN116309221A (en) Method for constructing multispectral image fusion model
CN111062274A (en) Context-aware embedded crowd counting method, system, medium, and electronic device
CN112785684B (en) Three-dimensional model reconstruction method based on local information weighting mechanism
CN113011506B (en) Texture image classification method based on deep fractal spectrum network
CN112215241B (en) Image feature extraction device based on small sample learning
Wang et al. Rethinking separable convolutional encoders for end-to-end semantic image segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant