CN112652059B - Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method

Info

Publication number
CN112652059B
CN112652059B (application CN202011642349.7A)
Authority
CN
China
Prior art keywords: dimensional, voxel, target object, target detection, model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011642349.7A
Other languages
Chinese (zh)
Other versions
CN112652059A (en)
Inventors: 刘嵩, 周梓涵, 来庆涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Original Assignee
Qilu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology filed Critical Qilu University of Technology
Priority to CN202011642349.7A
Publication of CN112652059A
Application granted
Publication of CN112652059B

Classifications

    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06V 20/647: Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • G06V 2201/07: Target detection

Abstract

The disclosure provides a Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method. The method comprises: obtaining an image containing the target to be identified; preprocessing the image with a generative adversarial network; performing two-dimensional target detection on the original and preprocessed images with a GA-RPN network model to obtain the position, anchor box, and classification of the target object; performing voxel conversion using the preprocessed image together with the obtained position, anchor-box, and classification data to obtain three-dimensional voxel information of the target object; and refining the three-dimensional voxel information to obtain the final 3D model of the target object. The method enables faster and more efficient target detection and construction of a 3D model of the target in an image.

Description

Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method
Technical Field
The disclosure relates to the technical field of image processing, and in particular to an improved target detection and three-dimensional reconstruction method based on the Mesh R-CNN (mesh region-based convolutional neural network) model.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In recent years, deep learning has greatly advanced the field that combines 2D object detection with 3D reconstruction: neural networks are used to learn representations such as three-dimensional voxels, point clouds, and meshes, promoting the understanding and development of the three-dimensional world.
2D object detection, represented by Faster R-CNN and Mask R-CNN, has been studied to good effect and applied in many fields. Building on this progress, combined 2D object detection and 3D reconstruction is more accurate and intuitive: 3D detection finds the bounding box of the target object in a picture and generates a three-dimensional model of that object, and compared with 2D detection alone, combining 2D detection with 3D reconstruction extracts the information in the picture more comprehensively.
The inventors of the present disclosure find that, although two-dimensional object detection has developed rapidly, and 2D detection represented by Faster R-CNN and Mask R-CNN has achieved good results and been applied in many fields, a purely two-dimensional detection task ignores the three-dimensional information of the object and cannot extract the 3D information of the object to be detected in the picture.
Disclosure of Invention
To overcome the defects of the prior art, the present disclosure provides a Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method that realizes target detection in an image and construction of the target's 3D model more efficiently.
To this end, the present disclosure adopts the following technical scheme:
the first aspect of the disclosure provides a Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method.
A Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method comprises the following steps:
acquiring an image including an object to be recognized;
preprocessing the image by utilizing a generated countermeasure network;
performing two-dimensional target detection on the original image and the preprocessed image by using a GA-RPN network model to obtain an anchor frame area of a target object;
carrying out voxel conversion on the anchor-box region generated by target detection in combination with the Pix2Vox method to obtain three-dimensional voxel information of the target object;
and refining the obtained three-dimensional voxel information of the target object to obtain the final 3D model of the target object.
The second aspect of the disclosure provides a Mesh R-CNN model-based improved target detection and three-dimensional reconstruction system.
A Mesh R-CNN model-based improved target detection and three-dimensional reconstruction system comprises:
a data acquisition module configured to: acquiring an image including an object to be recognized;
an image processing module configured to: preprocessing the image by utilizing a generated countermeasure network;
a two-dimensional object recognition module configured to: performing two-dimensional target detection on the original image and the preprocessed image by using a GA-RPN network model to obtain an anchor frame area of a target object;
a voxel conversion module configured to: carrying out voxel conversion by using an anchor frame region generated by target detection and combining a Pix2Vox method to obtain three-dimensional voxel information of a target object;
a 3D conversion module configured to: refine the obtained three-dimensional voxel information of the target object using the PNA method to obtain the final 3D model of the target object.
A third aspect of the present disclosure provides a medium, on which a program is stored, which when executed by a processor, implements the steps in the Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method according to the first aspect of the present disclosure.
A fourth aspect of the present disclosure provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, where the processor implements the steps in the Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method according to the first aspect of the present disclosure when executing the program.
Compared with the prior art, the beneficial effect of this disclosure is:
1. According to the Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method of the disclosure, processing the original image with a generative adversarial network makes the model more robust.
2. According to the method of the disclosure, GA-RPN replaces Mask R-CNN in Mesh R-CNN for target detection, and the anchor guiding-and-positioning method processes the picture to determine the position and anchor box of the target object, improving the accuracy of target recognition.
3. According to the method of the disclosure, the Pix2Vox method performs the voxel conversion, obtaining more accurate three-dimensional voxel information of the target object.
4. According to the method of the disclosure, PNA replaces the GCN in Mesh R-CNN to refine the three-dimensional volume; since PNA adopts a multiple-aggregator scheme, it can extract voxel information better.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
Fig. 1 is a schematic flow chart of a target detection and three-dimensional reconstruction method provided in embodiment 1 of the present disclosure.
Detailed Description
The present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
Example 1:
As shown in fig. 1, embodiment 1 of the present disclosure provides a more efficient and accurate 2D target detection and 3D reconstruction framework (AFM R-CNN) based on Mesh R-CNN, which specifically comprises:
processing the image with a generative adversarial network to improve the robustness of the model;
running the picture generated by the adversarial network in parallel with the original picture, and performing two-dimensional target detection with GA-RPN to obtain the two-dimensional position of the target object;
inputting the anchor-box region obtained by target detection into Pix2Vox, converting the two-dimensional features of the image into three-dimensional voxels, and fusing them to generate initial coarse voxels;
inputting the obtained initial coarse voxels into a cubify layer, converting the three-dimensional voxels into a three-dimensional mesh, and optimizing the mesh with an accurate mesh prediction branch;
and refining the optimized mesh through two PNA refinement layers to obtain the final 3D model (a minimal structural sketch of this pipeline follows).
More specifically, the method comprises the following steps:
s1: generating a countermeasure network
In order to enhance the robustness of the model, the embodiment uses a generation countermeasure network (GAN) to process the original picture to generate a false picture, and the GAN is a generation model using back propagation and can generate a false sample close to a real sample.
Two models are contained in the GAN model framework: generating a Model (generic Model) and a discriminant Model (discriminant Model), wherein the generating Model is used for generating a picture, the discriminant Model is used for determining whether the picture is true or false, the two models play a game with each other to generate an output picture, and the calculation formula is as follows:
$$\min_G \max_D V(D,G) = \mathbb{E}_{x_{img} \sim P_{data}(x_{img})}[\log D(x_{img})] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))] \quad (1)$$

where $x_{img}$ represents an input picture and $P_{data}(x_{img})$ represents the distribution of the input pictures; $\mathbb{E}_{x_{img} \sim P_{data}(x_{img})}[\log D(x_{img})]$ is the discriminator term and $\mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))]$ is the generator term, with $z$ a noise vector drawn from the prior $P_z(z)$.
Through this mutual game, the data generated by the generator G becomes ever closer to the real data while the discriminating ability of the discriminator D is maximized; iterating this process continuously improves the modeling capability of the generator and the judgment capability of the discriminator, finally yielding the output picture.
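As an illustration of the adversarial game in equation (1), the following toy sketch trains a generator and discriminator with the standard binary cross-entropy formulation; the tiny fully connected architectures, the 784-dimensional flattened images, and the learning rates are assumptions made for brevity, not the networks of this disclosure:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(),
                  nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def gan_step(real):
    """One iteration of the minimax game on a batch of real images."""
    b = real.size(0)
    fake = G(torch.randn(b, 100))
    # Discriminator: maximize log D(x) + log(1 - D(G(z)))
    loss_d = bce(D(real), torch.ones(b, 1)) + \
             bce(D(fake.detach()), torch.zeros(b, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool D (non-saturating form of minimizing log(1 - D(G(z))))
    loss_g = bce(D(fake), torch.ones(b, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

d_loss, g_loss = gan_step(torch.randn(16, 784))   # toy "real" batch
```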
S2: object detection module
And replacing Mask R-CNN in the Mesh R-CNN with GA-RPN to detect the target, processing the picture by using an anchor frame guiding and positioning method, and determining the position and the anchor frame of the target object.
The GA-RPN generates a probability map indicating the position of the target object according to the feature map by the position prediction branch, then outputs the position information of the anchor frame, then the shape prediction branch generates a related anchor frame shape according to the generated position information, finally determines the most probable shape at each position by setting a threshold value, and generates a group of anchor frames by combining the position information.
The probability distribution of the position and shape is calculated as follows:
$$p(x, y, w, h \mid I) = p(x, y \mid I)\, p(w, h \mid x, y, I) \quad (2)$$

where I is the input image, (x, y) denotes the position of the anchor box, and w and h denote the width and height of the anchor box.
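A hedged sketch of this two-branch prediction follows; the 1x1 convolution heads, the 0.5 probability threshold, and the stride-based decoding of (w, h) are illustrative assumptions rather than the exact GA-RPN design:

```python
import torch
import torch.nn as nn

class GuidedAnchorHead(nn.Module):
    def __init__(self, in_ch=256):
        super().__init__()
        self.loc = nn.Conv2d(in_ch, 1, 1)    # location branch: p(x, y | I)
        self.shape = nn.Conv2d(in_ch, 2, 1)  # shape branch: (w, h) per cell

    def forward(self, feat, thresh=0.5, stride=16.0):
        prob = torch.sigmoid(self.loc(feat))[0, 0]    # (H, W) probability map
        wh = stride * torch.exp(self.shape(feat))[0]  # (2, H, W) anchor sizes
        ys, xs = torch.nonzero(prob > thresh, as_tuple=True)
        # one (x, y, w, h) anchor per cell whose probability passes the threshold
        return torch.stack(
            [xs * stride, ys * stride, wh[0, ys, xs], wh[1, ys, xs]], dim=1)

anchors = GuidedAnchorHead()(torch.randn(1, 256, 32, 32))  # (K, 4) anchors
```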
S3: voxel transform fusion
Voxel conversion turns the two-dimensional information of a picture into three-dimensional information. Mesh R-CNN performs this conversion with a single voxel conversion layer, which cannot make full use of the two-dimensional information.
This embodiment uses the Pix2Vox method for the conversion to 3D: an encoder generates feature maps from the input images; a decoder takes each feature map as input and generates a corresponding coarse three-dimensional volume; finally, context-aware fusion is applied to the decoded results, where the fusion module adaptively selects, for each part, the highest-quality result among the coarse three-dimensional voxels and outputs the final fused three-dimensional volume (a compact sketch of the encoder-decoder stage follows).
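The sketch below illustrates the encoder-decoder stage under stated assumptions: a small convolutional stack stands in for the paper-style backbone, and a 3D transposed-convolution decoder upsamples each per-view feature vector into a 32x32x32 occupancy volume; the channel counts and image size are illustrative only:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 128 -> 64
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
    nn.AdaptiveAvgPool2d(1), nn.Flatten())                  # -> (B, 128)

decoder = nn.Sequential(
    nn.Unflatten(1, (128, 1, 1, 1)),
    nn.ConvTranspose3d(128, 64, 4, stride=1), nn.ReLU(),            # 1 -> 4
    nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 4 -> 8
    nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 8 -> 16
    nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1), nn.Sigmoid())# 16 -> 32

views = torch.randn(3, 3, 128, 128)      # one cropped input per view
coarse = decoder(encoder(views))          # (3, 1, 32, 32, 32) coarse voxels
```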
The coarse voxels generated by the decoder enter the context-aware fusion module, which generates context information for each coarse voxel volume and fuses the obtained coarse voxel information. For each voxel volume the fusion module produces a score map, calculated as:

$$s^{q}_{(n,m,k)} = \frac{\exp\left(x^{q}_{(n,m,k)}\right)}{\sum_{r=1}^{f} \exp\left(x^{r}_{(n,m,k)}\right)} \quad (3)$$

where f represents the number of views, $s^{q}_{(n,m,k)}$ is the score at point (n, m, k) of the q-th coarse voxel volume, and $x^{r}_{(n,m,k)}$ is the raw rating generated by the context scoring network for the r-th coarse voxel volume.

The score maps are then used for a weighted summation that fuses the coarse volumes into a single voxel volume $V_z$, retaining the spatial information of the voxels to the greatest extent:

$$V_z = \sum_{q=1}^{f} s^{q} \odot V^{q} \quad (4)$$
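A minimal sketch of this fusion step follows; a single 3D convolution stands in for the context scoring network, which is an assumption made for brevity:

```python
import torch
import torch.nn as nn

score_net = nn.Conv3d(1, 1, kernel_size=3, padding=1)  # stand-in scorer

def fuse(coarse):
    """coarse: (f, 1, N, N, N) voxel volumes, one per view."""
    raw = score_net(coarse)          # x^q: raw score map per view
    s = torch.softmax(raw, dim=0)    # Eq. (3): normalize over the f views
    return (s * coarse).sum(dim=0)   # Eq. (4): V_z = sum_q s^q * V^q

fused = fuse(torch.rand(3, 1, 32, 32, 32))   # -> (1, 32, 32, 32)
```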
S4: Generating the final 3D model
PNA is used instead of the GCN in Mesh R-CNN to refine the three-dimensional volume; PNA adopts a multiple-aggregator scheme and can extract voxel information better. Because PNA cannot refine voxels directly, the generated voxels are first converted into a three-dimensional mesh using the cubify method (a sketch of this conversion follows).
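For illustration, here is a pure-Python sketch of the cubify idea: each occupied voxel contributes the faces of a unit cube, faces shared by two occupied voxels are dropped, and shared corner vertices are merged; the 0.5 occupancy threshold and the use of quad (rather than triangle) faces are simplifying assumptions:

```python
import numpy as np

# For each of the six axis-aligned neighbor offsets, the four corners
# (relative to the voxel origin) of the face bordering that neighbor.
_FACES = {
    (-1, 0, 0): [(0, 0, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1)],
    (1, 0, 0):  [(1, 0, 0), (1, 0, 1), (1, 1, 1), (1, 1, 0)],
    (0, -1, 0): [(0, 0, 0), (0, 0, 1), (1, 0, 1), (1, 0, 0)],
    (0, 1, 0):  [(0, 1, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1)],
    (0, 0, -1): [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)],
    (0, 0, 1):  [(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 0, 1)],
}

def cubify(vox, thresh=0.5):
    """Convert an occupancy grid into surface vertices and quad faces."""
    occ = vox > thresh
    verts, index, faces = [], {}, []
    for x, y, z in zip(*np.nonzero(occ)):
        for (dx, dy, dz), corners in _FACES.items():
            nx, ny, nz = x + dx, y + dy, z + dz
            inside = (0 <= nx < occ.shape[0] and 0 <= ny < occ.shape[1]
                      and 0 <= nz < occ.shape[2])
            if inside and occ[nx, ny, nz]:
                continue                      # internal face: skip it
            quad = []
            for cx, cy, cz in corners:
                v = (x + cx, y + cy, z + cz)
                if v not in index:            # merge shared vertices
                    index[v] = len(verts)
                    verts.append(v)
                quad.append(index[v])
            faces.append(quad)
    return np.array(verts, float), faces

vox = np.zeros((3, 3, 3)); vox[1, 1, 1] = 1.0
v, f = cubify(vox)            # a single cube: 8 vertices, 6 quad faces
```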
The final 3D model is obtained by refining the mesh through two PNA refinement layers, as sketched below. PNA (Principal Neighbourhood Aggregation) combines multiple aggregators with degree-based scalers that can amplify or attenuate a signal according to the degree of each node; this lets each node better understand the distribution of the messages it receives and effectively improves GNN performance.
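The following simplified layer illustrates the multiple-aggregator idea on mesh-vertex features; it assumes a dense adjacency matrix with at least one neighbor per vertex (true on a mesh) and, for brevity, omits PNA's degree-based scalers and towers:

```python
import torch
import torch.nn as nn

class PNALiteLayer(nn.Module):
    """Concatenate self features with mean/max/min neighbor aggregations,
    then apply a shared linear update."""
    def __init__(self, dim):
        super().__init__()
        self.update = nn.Linear(4 * dim, dim)

    def forward(self, x, adj):
        # x: (V, D) vertex features; adj: (V, V) 0/1 adjacency, no self-loops
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        mean = adj @ x / deg                                   # mean aggregator
        nbr = x.unsqueeze(0).expand(x.size(0), -1, -1)         # (V, V, D)
        mask = adj.unsqueeze(-1) == 0
        mx = nbr.masked_fill(mask, float("-inf")).max(dim=1).values  # max agg.
        mn = nbr.masked_fill(mask, float("inf")).min(dim=1).values   # min agg.
        return torch.relu(self.update(torch.cat([x, mean, mx, mn], dim=-1)))

verts = torch.randn(8, 16)                 # 8 vertices, 16-d features
adj = torch.ones(8, 8).fill_diagonal_(0)   # toy fully connected "mesh"
refined = PNALiteLayer(16)(verts, adj)     # (8, 16) refined features
```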
Here, model losses need to be computed to optimize the three-dimensional reconstruction model. Because calculating a loss function directly on a three-dimensional mesh is very difficult, the surface of the mesh is densely sampled into point clouds and the point-cloud losses are taken as the shape losses: given two point cloud sets F and H with normal vectors, the normal distance and the chamfer distance between F and H are used as the point-cloud losses.
The normal distance between F and H is:

$$\mathcal{L}_{norm}(F,H) = -\,|F|^{-1} \sum_{(f,h) \in \Lambda_{F,H}} \left| \mu_f \cdot \mu_h \right| \;-\; |H|^{-1} \sum_{(h,f) \in \Lambda_{H,F}} \left| \mu_h \cdot \mu_f \right| \quad (5)$$

The chamfer distance is:

$$\mathcal{L}_{cham}(F,H) = |F|^{-1} \sum_{(f,h) \in \Lambda_{F,H}} \lVert f - h \rVert^2 \;+\; |H|^{-1} \sum_{(h,f) \in \Lambda_{H,F}} \lVert h - f \rVert^2 \quad (6)$$

where $\Lambda_{F,H} = \{(f, \arg\min_h \lVert f - h \rVert) : f \in F\}$ is the set of pairs (f, h) in which h is the point of cloud H nearest to f, and $\mu_f$ denotes the normal vector at point f.
Optimizing with the point-cloud losses alone can yield degenerate meshes, so an edge loss is added to improve the quality of the mesh prediction:

$$\mathcal{L}_{edge}(V,E) = \frac{1}{\lvert E \rvert} \sum_{(v, v') \in E} \lVert v - v' \rVert^2 \quad (7)$$

where $E \subseteq V \times V$ represents the edges of the predicted mesh and v represents its vertices.
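To make the losses concrete, here is a brute-force sketch of equations (5) through (7); the O(|F| x |H|) pairwise-distance computation is chosen for clarity over the efficient implementations used in practice:

```python
import torch

def chamfer_and_normal(pf, nf, ph, nh):
    """pf, ph: (Nf, 3), (Nh, 3) sampled points; nf, nh: matching unit normals."""
    d = torch.cdist(pf, ph)               # (Nf, Nh) pairwise distances
    f2h = d.min(dim=1)                    # nearest h for each f
    h2f = d.min(dim=0)                    # nearest f for each h
    cham = f2h.values.pow(2).mean() + h2f.values.pow(2).mean()       # Eq. (6)
    norm = -(nf * nh[f2h.indices]).sum(-1).abs().mean() \
           - (nh * nf[h2f.indices]).sum(-1).abs().mean()             # Eq. (5)
    return cham, norm

def edge_loss(verts, edges):
    """verts: (V, 3); edges: (E, 2) vertex index pairs of the predicted mesh."""
    v0, v1 = verts[edges[:, 0]], verts[edges[:, 1]]
    return (v0 - v1).pow(2).sum(-1).mean()                           # Eq. (7)
```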
Example 2:
an embodiment 2 of the present disclosure provides a target detection and three-dimensional reconstruction system, including:
a data acquisition module configured to: acquiring an image including an object to be recognized;
an image processing module configured to: preprocessing the image by utilizing a generated countermeasure network;
a two-dimensional object recognition module configured to: performing two-dimensional target detection on the original image and the preprocessed image by using a GA-RPN network model to obtain the position, the anchor frame and the classification of a target object;
a voxel conversion module configured to: carrying out voxel conversion by utilizing the preprocessed image and the obtained position, anchor frame and classification data of the target object to obtain three-dimensional voxel information of the target object;
a 3D model generation module configured to: refine the obtained three-dimensional voxel information of the target object to obtain the final 3D model of the target object.
The working method of the system is the same as the target detection and three-dimensional reconstruction method provided in embodiment 1, and details are not repeated here.
Example 3:
the embodiment 3 of the present disclosure provides a medium, on which a program is stored, and when the program is executed by a processor, the method implements the steps of the target detection and three-dimensional reconstruction method according to the embodiment 1 of the present disclosure, where the steps are:
acquiring an image including an object to be recognized;
preprocessing the image by utilizing a generated countermeasure network;
performing two-dimensional target detection on the original image and the preprocessed image by using a GA-RPN network model to obtain the position, the anchor frame and the classification of a target object;
carrying out voxel conversion on the anchor-box region generated by target detection in combination with the Pix2Vox method to obtain three-dimensional voxel information of the target object;
and refining the obtained three-dimensional voxel information of the target object to obtain the final 3D model of the target object.
The detailed steps are the same as those of the target detection and three-dimensional reconstruction method provided in embodiment 1, and are not described herein again.
Example 4:
an embodiment 4 of the present disclosure provides an electronic device, including a memory, a processor, and a program stored in the memory and capable of running on the processor, where the processor implements steps in the target detection and three-dimensional reconstruction method according to embodiment 1 of the present disclosure when executing the program, where the steps are:
acquiring an image including an object to be recognized;
preprocessing the image by utilizing a generated countermeasure network;
performing two-dimensional target detection on the original image and the preprocessed image by using a GA-RPN network model to obtain the position, the anchor frame and the classification of a target object;
carrying out voxel conversion on the anchor-box region generated by target detection in combination with the Pix2Vox method to obtain three-dimensional voxel information of the target object;
and refining the obtained three-dimensional voxel information of the target object to obtain the final 3D model of the target object.
The detailed steps are the same as those of the target detection and three-dimensional reconstruction method provided in embodiment 1, and are not described herein again.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (5)

1. A Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method, characterized by comprising the following steps:
acquiring an image including an object to be recognized;
preprocessing the image by utilizing a generated countermeasure network;
performing two-dimensional target detection on the original image and the preprocessed image by using a GA-RPN network model to obtain an anchor frame area of a target object;
carrying out voxel conversion by using an anchor frame region generated by target detection and combining a Pix2Vox method to obtain three-dimensional voxel information of a target object;
refining the obtained three-dimensional voxel information of the target object to obtain the final 3D model of the target object;
the GA-RPN generates, by its location prediction branch and from the feature map, a probability map indicating the position of the target object, and outputs the position information of the anchor boxes;
the shape prediction branch generates the corresponding anchor-box shape from the generated position information, determines the most probable shape at each position by a set threshold, and generates a set of anchor boxes by combining the position information;
performing two-dimensional target detection on the original image and the preprocessed image by using the GA-RPN network model also obtains the position and classification information of the target object;
an encoder generates feature maps from the input preprocessed image, and a decoder takes each feature map as input to generate a corresponding coarse three-dimensional volume;
context fusion is performed on the generated decoding results, the fusion module adaptively selecting the highest-quality result for each part from the coarse voxels and outputting the final fused three-dimensional volume;
the coarse voxels generated by the decoder enter the fusion module, which generates context information for each coarse voxel and fuses the obtained coarse voxel information, wherein the fusion module:
generates a score map for each voxel volume;
then weights and sums the score maps, fusing them into a single voxel volume;
first, the generated voxels are converted into a three-dimensional mesh by the cubify method, and then the three-dimensional volume is refined by PNA.
2. The Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method of claim 1, wherein:
the surface of the three-dimensional mesh is densely sampled into point clouds, the point-cloud loss is taken as the shape loss, and an edge loss is added to improve the mesh prediction quality and optimize the three-dimensional reconstruction model.
3. A Mesh R-CNN model-based improved target detection and three-dimensional reconstruction system is characterized in that: the method comprises the following steps:
a data acquisition module configured to: acquiring an image including an object to be recognized;
an image processing module configured to: preprocessing the image by utilizing a generated countermeasure network;
a two-dimensional object recognition module configured to: performing two-dimensional target detection on the original image and the preprocessed image by using a GA-RPN network model to obtain an anchor frame area of a target object;
a voxel conversion module configured to: carrying out voxel conversion by using an anchor frame region generated by target detection and combining a Pix2Vox method to obtain three-dimensional voxel information of a target object;
a 3D model generation module configured to: refine the obtained three-dimensional voxel information of the target object to obtain the final 3D model of the target object;
the GA-RPN generates, by its location prediction branch and from the feature map, a probability map indicating the position of the target object, and outputs the position information of the anchor boxes;
the shape prediction branch generates the corresponding anchor-box shape from the generated position information, determines the most probable shape at each position by a set threshold, and generates a set of anchor boxes by combining the position information;
performing two-dimensional target detection on the original image and the preprocessed image by using the GA-RPN network model also obtains the position and classification information of the target object;
an encoder generates feature maps from the input preprocessed image, and a decoder takes each feature map as input to generate a corresponding coarse three-dimensional volume;
context fusion is performed on the generated decoding results, the fusion module adaptively selecting the highest-quality result for each part from the coarse voxels and outputting the final fused three-dimensional volume;
the coarse voxels generated by the decoder enter the fusion module, which generates context information for each coarse voxel and fuses the obtained coarse voxel information, wherein the fusion module:
generates a score map for each voxel volume;
then weights and sums the score maps, fusing them into a single voxel volume;
first, the generated voxels are converted into a three-dimensional mesh using the cubify method, and then the three-dimensional volume is refined using PNA.
4. A medium having a program stored thereon, wherein the program, when executed by a processor, implements the steps of the Mesh R-CNN model based improved object detection and three-dimensional reconstruction method of any one of claims 1-2.
5. An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor implements the steps of the Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method according to any one of claims 1-2 when executing the program.
CN112652059B (en): Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method; application CN202011642349.7A; priority date 2020-12-31; filing date 2020-12-31; status Active

Priority Applications (1)

Application Number: CN202011642349.7A; Publication: CN112652059B (en); Title: Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method

Publications (2)

Publication Number / Publication Date
CN112652059A (en) / 2021-04-13
CN112652059B (en) / 2022-06-14

Family

ID=75367057

Family Applications (1)

Application Number: CN202011642349.7A; Status: Active; Publication: CN112652059B (en); Priority Date: 2020-12-31; Filing Date: 2020-12-31; Title: Mesh R-CNN model-based improved target detection and three-dimensional reconstruction method

Country Status (1)

Country Link
CN (1) CN112652059B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027547A (en) * 2019-12-06 2020-04-17 南京大学 Automatic detection method for multi-scale polymorphic target in two-dimensional image
CN112052860A (en) * 2020-09-11 2020-12-08 中国人民解放军国防科技大学 Three-dimensional target detection method and system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
3D Model Inpainting Based on 3D Deep Convolutional Generative Ad; Xinying Wang et al.; IEEE Access; 2020-09-15; entire document *
Pix2Vox: Context-aware 3D Reconstruction from Single and Multi-view Images; Xie, HZ et al.; 2019 IEEE/CVF International Conference on Computer Vision (ICCV); 2019-11-02; entire document *
Research on image three-dimensional reconstruction technology based on convolutional neural networks (基于卷积神经网络的图像三维重构技术研究); Wan Xiaoxiao (万潇潇); China Master's Theses Full-text Database, Information Science and Technology (中国优秀博硕士学位论文全文数据库(硕士)信息科技辑); 2019-12-15; entire document *
A survey of deep-learning-based methods for analyzing and understanding 3D data (基于深度学习的三维数据分析理解方法研究综述); Li Haisheng (李海生) et al.; Chinese Journal of Computers (计算机学报); 2020-01-15; entire document *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant