CN113591556A - Three-dimensional point cloud semantic analysis method based on neural network three-body model - Google Patents
- Publication number
- CN113591556A (application CN202110688525.9A)
- Authority
- CN
- China
- Prior art keywords
- mesoscopic system
- mesoscopic
- point cloud
- level
- semantic analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention provides a three-dimensional point cloud semantic analysis method based on a neural network three-body model. The method first constructs a learner for semantic analysis of laser 3D point cloud data; this construction has two main stages, local spatial encoding and the introduction of an attention mechanism. A memory and an interpreter are then constructed: tensor units are found in the mesoscopic system, entity parameters are extracted through the mesoscopic system, the information of the other particles in the mesoscopic system is treated as background context as the mesoscopic system rises level by level in the model, the description of the tensor units is continuously updated, and finally, for each mesoscopic system, the spatial relationship between the whole mesoscopic system group and a single mesoscopic system is predicted. The resulting analysis strategy, based on a mesoscopic-system neural network three-body model, can be applied to large-scene dynamic laser 3D point cloud semantic analysis and is expected to generalize to some extent as data dimensionality continues to increase.
Description
Technical Field
The invention belongs to the field of neural-network-based three-dimensional point cloud semantic analysis, and particularly relates to a three-dimensional point cloud semantic analysis method based on a neural network three-body model.
Background
In recent years, demand for deep-learning-based recognition and tracking of targets in large-scene laser three-dimensional (3D) point clouds has grown in fields such as industrial inspection and intelligent operation. However, because laser point clouds are irregular, unstructured and unordered, a conventional convolutional neural network (CNN) cannot be applied to such data directly. Moreover, because deep learning differs fundamentally from human cognition, black-box models suffer from serious problems such as a lack of interpretability.
Therefore, to meet the requirements of complex systems and move beyond the original parallel-dimension concept, the invention draws on the ideas of "dimension raising" and "boundary crossing" and introduces the concept and characteristics of the mesoscopic system, in order to develop systematically a 3D point cloud semantic analysis method built on a three-body neural network model consisting of a learner, a memory and an interpreter.
Disclosure of Invention
In order to solve the above technical problem, the invention provides a three-dimensional point cloud semantic analysis method based on a neural network three-body model, which comprises the following steps:
Step 1: Local spatial encoding: encode the spatial geometry information of the 3D point cloud, so that the network can better learn the spatial geometric structure from the relative positions and distances of the points;
Step 2: Introduce an attention mechanism: for each point, output the set of neighborhood point features carrying relative position and distance information, and learn and aggregate this set automatically through an attention mechanism; because the input point cloud undergoes continuous, heavy down-sampling, use the more efficient nearest-neighbor interpolation in the up-sampling stage of the decoder to further improve execution efficiency;
Step 3: Through continuous learning and iteration, drive the mesoscopic system to keep enlarging the receptive field of each point and to promote feature propagation among neighborhood points, thereby completing the evolution from the low-level mesoscopic system to the high-level mesoscopic system and completing the construction of the learner;
Step 4: Construct the memory and the interpreter: find the tensor units in the mesoscopic system, extract entity parameters through the mesoscopic system, treat the information of the other particles in the mesoscopic system as background context as the mesoscopic system rises level by level in the model, continuously update the description of the tensor units, and finally predict, for each mesoscopic system, the spatial relationship between the whole mesoscopic system group and a single mesoscopic system;
Step 5: In such iteration the high-level mesoscopic system should have a certain interpretive capability, so using the interpreter carries the implicit assumption that one of the candidate interpretations is correct; to find the correct interpretation, select an objective function that maximizes the log-likelihood of the poses already observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model;
Step 6: During back-propagation through the interpreter, the model learns how to instantiate the high-level mesoscopic system; because some high-level instances do not explain the data elements well, a parse tree is built so that the elements that explain the data best receive the largest derivatives and can thus be learned and optimized.
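The local spatial encoding of step 1 can be illustrated with a small sketch. The patent does not give a concrete encoding, so the layout below (center coordinates, neighbor coordinates, relative offset, Euclidean distance) and the brute-force neighbor search are assumptions chosen only to make the idea of "relative position and distance information" concrete; the function name `local_spatial_encoding` is hypothetical.

```python
import math


def local_spatial_encoding(points, k):
    """Encode each point's k nearest neighbors (assumed layout, not the patent's).

    Each neighbor row is: center xyz (3) + neighbor xyz (3)
    + relative offset center - neighbor (3) + Euclidean distance (1).
    """
    encoded = []
    for i, p in enumerate(points):
        # Brute-force nearest-neighbor search, excluding the point itself.
        neighbors = sorted(
            (q for j, q in enumerate(points) if j != i),
            key=lambda q: math.dist(p, q),
        )[:k]
        rows = []
        for q in neighbors:
            offset = tuple(a - b for a, b in zip(p, q))
            rows.append(tuple(p) + tuple(q) + offset + (math.dist(p, q),))
        encoded.append(rows)
    return encoded


cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
features = local_spatial_encoding(cloud, k=2)
```

Each point ends up with `k` rows of ten numbers. A practical system would replace the quadratic search with a spatial index and feed the rows through a learned layer, neither of which is shown here.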
Preferably, in step 2, the automatic learning and aggregation through the attention mechanism uses a shared function to learn, for each point, attention weights that automatically select the important features; the finally obtained feature is the weighted sum of the neighborhood feature set.
Preferably, in step 4, predicting the spatial relationship between the whole mesoscopic system group and a single mesoscopic system for each mesoscopic system means that each instantiated high-level mesoscopic system predicts a pose for each low-level mesoscopic system extracted from the image; during pose prediction, an objective function is selected that maximizes the log-likelihood of the poses already observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model.
Preferably, the mesoscopic particle is a neuron structure in the neural network and consists mainly of three parts: a logic unit, a matrix unit and a vector unit. The logic unit indicates whether the entity is present in the current image, wherever the entity lies within the image region covered by the set. The matrix unit represents the spatial relationship between the entity and the observer, or between an intrinsic coordinate system embedded in the entity and the observer. The vector unit represents the information not captured by the logic unit or the matrix unit.
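A minimal sketch of this attention-based aggregation follows. It assumes that the shared function is a single linear map applied identically to every neighbor and that the scores are normalized with a softmax over the neighbors; the patent specifies neither, and the names `attentive_pooling` and `shared_weights` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pooling(neighbor_features, shared_weights):
    """Aggregate K neighbor feature vectors (each of length D) into one D-vector.

    shared_weights is a D x D matrix: the same (shared) scoring function is
    applied to every neighbor. Assumed form, for illustration only.
    """
    dim = len(neighbor_features[0])
    # scores[k][d] = sum_c f[k][c] * W[c][d]
    scores = [
        [sum(f[c] * shared_weights[c][d] for c in range(dim)) for d in range(dim)]
        for f in neighbor_features
    ]
    pooled = []
    for d in range(dim):
        # Attention weights over the K neighbors for feature channel d.
        weights = softmax([s[d] for s in scores])
        pooled.append(sum(w * f[d] for w, f in zip(weights, neighbor_features)))
    return pooled


neighbors = [[1.0, 2.0], [3.0, 6.0]]
uniform = attentive_pooling(neighbors, [[0.0, 0.0], [0.0, 0.0]])
focused = attentive_pooling(neighbors, [[1.0, 0.0], [0.0, 1.0]])
```

With all-zero weights every neighbor scores the same, so the result is a plain average. With the identity matrix the larger feature values receive more attention and the pooled value moves toward them.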
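To make the objective concrete, the sketch below evaluates the log-likelihood of observed low-level poses under a mixture whose components are centered on the poses predicted by the high-level systems. The isotropic Gaussian components, the flattening of poses into vectors, and the names `mixture_log_likelihood`, `mixing` and `sigma` are all assumptions; the patent only states that a mixture model is used.

```python
import math


def log_gaussian(x, mean, sigma):
    """Log density of an isotropic Gaussian with standard deviation sigma."""
    d = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * sq / sigma**2 - d * math.log(sigma) - 0.5 * d * math.log(2 * math.pi)


def mixture_log_likelihood(observed_poses, predicted_poses, mixing, sigma=1.0):
    """Sum over observed poses of log sum_k mixing[k] * N(pose | predicted[k])."""
    total = 0.0
    for x in observed_poses:
        terms = [
            math.log(pi) + log_gaussian(x, mu, sigma)
            for pi, mu in zip(mixing, predicted_poses)
        ]
        # Log-sum-exp keeps the computation stable for very unlikely poses.
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total


observed = [(0.0, 0.0), (4.0, 4.0)]
good = mixture_log_likelihood(observed, [(0.0, 0.0), (4.0, 4.0)], [0.5, 0.5])
bad = mixture_log_likelihood(observed, [(10.0, 10.0), (-10.0, -10.0)], [0.5, 0.5])
```

Predictions that sit on the observed poses give a higher log-likelihood than predictions far from them, which is what maximizing this objective rewards.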
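The three-part structure of a mesoscopic particle maps naturally onto a small record type. The sketch below is only an assumed representation: the field names, the use of a presence probability for the logic unit, and the 4x4 homogeneous matrix for the matrix unit are illustrative choices that the patent does not fix.

```python
from dataclasses import dataclass, field
from typing import List


def identity_pose() -> List[List[float]]:
    """4x4 homogeneous identity transform (entity frame equals observer frame)."""
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


@dataclass
class MesoscopicParticle:
    # Logic unit: assumed here to be a probability that the entity is present.
    presence: float = 0.0
    # Matrix unit: entity-to-observer spatial relationship (assumed 4x4 transform).
    pose: List[List[float]] = field(default_factory=identity_pose)
    # Vector unit: everything not captured by the other two units.
    attributes: List[float] = field(default_factory=list)

    def is_present(self, threshold: float = 0.5) -> bool:
        return self.presence >= threshold


particle = MesoscopicParticle(presence=0.9, attributes=[0.2, 0.7])
```

Keeping the three units as separate fields mirrors the text: presence can be thresholded on its own, while the pose can be composed with other transforms independently of the remaining attributes.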
Compared with the prior art, the invention has the following beneficial effect: it provides a three-dimensional point cloud semantic analysis strategy based on a mesoscopic-system neural network three-body model that can be applied to large-scene dynamic laser 3D point cloud semantic analysis and is expected to generalize to some extent as data dimensionality continues to increase.
Drawings
FIG. 1 is an analytical roadmap for the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
Example:
As shown in FIG. 1, the invention provides a three-dimensional point cloud semantic analysis method based on a neural network three-body model, which comprises the following steps:
(1) Completing the construction of a learner for semantic analysis of laser 3D point cloud data:
The construction of the learner has two main stages. The first is local spatial encoding: the spatial geometry information of the 3D point cloud is encoded so that the network can better learn the spatial geometric structure from the relative positions and distances of the points. The second is the introduction of an attention mechanism: for each point, the set of neighborhood point features carrying relative position and distance information is output and then learned and aggregated automatically through the attention mechanism. A shared function is designed to learn, for each point, attention weights that automatically select the important features, and the finally obtained feature is the weighted sum of the neighborhood feature set. Because the input point cloud undergoes continuous, heavy down-sampling, continuous learning and iteration drive the mesoscopic system to keep enlarging the receptive field of each point and to promote feature propagation among neighborhood points, which completes the evolution from the low-level mesoscopic system to the high-level mesoscopic system. Finally, in the up-sampling stage of the decoder, the more efficient nearest-neighbor interpolation is used to further improve execution efficiency;
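The nearest-neighbor interpolation used in the decoder's up-sampling stage can be sketched as follows: every point of the denser cloud simply copies the feature of its closest point in the down-sampled cloud. The function name and the brute-force search are assumptions for illustration.

```python
import math


def nearest_neighbor_upsample(dense_points, sparse_points, sparse_features):
    """Give each dense point the feature of its nearest sparse point."""
    upsampled = []
    for p in dense_points:
        nearest = min(
            range(len(sparse_points)),
            key=lambda i: math.dist(p, sparse_points[i]),
        )
        upsampled.append(sparse_features[nearest])
    return upsampled


dense = [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0), (2.0, 0.0, 0.0)]
sparse = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
sparse_feats = [[1.0], [2.0]]
dense_feats = nearest_neighbor_upsample(dense, sparse, sparse_feats)
```

Copying a single neighbor's feature needs no weighting or learned parameters, which is why it is cheaper than distance-weighted interpolation over several neighbors.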
(2) Completing the construction of the memory and the interpreter in the three-body neural network model:
First, the tensor units in the mesoscopic system are found, and entity parameters are extracted through the mesoscopic system. As the mesoscopic system rises level by level in the model, the information of the other particles in the mesoscopic system is treated as background context and the description of the tensor units is continuously updated. Finally, each instantiated high-level mesoscopic system predicts a pose for each low-level mesoscopic system extracted from the image. In such iteration the high-level mesoscopic system should have a certain interpretive capability, so using the interpreter carries the implicit assumption that one of the candidate interpretations is correct, although in general it is not known which one. For this purpose, an objective function is selected that maximizes the log-likelihood of the poses already observed on the low-level mesoscopic systems, as generated by the high-level mesoscopic systems through a mixture model. During back-propagation through the interpreter, the model learns how to instantiate the high-level mesoscopic systems; because some of them do not explain the data elements well, a parse tree is built so that the elements that explain the data best receive the largest derivatives and can thus be learned and optimized.
Specifically, the mesoscopic particle is a neuron structure in the neural network and consists mainly of three parts: a logic unit, a matrix unit and a vector unit. The logic unit indicates whether the entity is present in the current image, wherever the entity lies within the image region covered by the set. The matrix unit represents the spatial relationship between the entity and the observer, or between an intrinsic coordinate system embedded in the entity and the observer. The vector unit represents the information not captured by the logic unit or the matrix unit.
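The statement that the best-explaining elements receive the largest derivatives follows from the standard gradient of a mixture log-likelihood. The notation below is an assumption introduced for illustration: $x_n$ is an observed low-level pose, $\theta_k$ are the parameters of the $k$-th high-level mesoscopic system, and $\pi_k$ are mixing weights.

```latex
\begin{aligned}
\mathcal{L} &= \sum_{n} \log \sum_{k} \pi_k \, p(x_n \mid \theta_k), \\
r_{nk} &= \frac{\pi_k \, p(x_n \mid \theta_k)}{\sum_{j} \pi_j \, p(x_n \mid \theta_j)}, \\
\frac{\partial \mathcal{L}}{\partial \theta_k}
  &= \sum_{n} r_{nk} \, \frac{\partial \log p(x_n \mid \theta_k)}{\partial \theta_k}.
\end{aligned}
```

The responsibility $r_{nk}$ scales each component's gradient, so the component that explains $x_n$ best (largest $r_{nk}$) receives the largest share of the update, while poorly explaining components receive almost none.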
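One way to read the matrix unit is as a coordinate transform, in which case a high-level system can predict the pose of a low-level system by composing its own pose with a part-to-whole relationship. The sketch below assumes 4x4 homogeneous matrices and pure translations; this composition rule is borrowed from capsule-style models and is not stated in the patent.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def translation(tx, ty, tz):
    """4x4 homogeneous transform that translates by (tx, ty, tz)."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Pose of the high-level (whole) entity relative to the observer.
whole_to_observer = translation(10.0, 0.0, 0.0)
# Assumed fixed relationship of a low-level (part) entity to the whole.
part_to_whole = translation(0.0, 2.0, 0.0)
# Predicted pose of the part relative to the observer.
part_to_observer = matmul(whole_to_observer, part_to_whole)
```

The predicted part pose can then be compared with the pose actually observed on the low-level system, which is the quantity the log-likelihood objective scores.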
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (4)
1. A three-dimensional point cloud semantic analysis method based on a neural network three-body model is characterized by comprising the following steps:
Step 1: Local spatial encoding: encoding the spatial geometry information of the 3D point cloud, so that the network can better learn the spatial geometric structure from the relative positions and distances of the points;
Step 2: Introducing an attention mechanism: for each point, outputting the set of neighborhood point features carrying relative position and distance information, and learning and aggregating this set automatically through an attention mechanism; because the input point cloud undergoes continuous, heavy down-sampling, using the more efficient nearest-neighbor interpolation in the up-sampling stage of the decoder to further improve execution efficiency;
Step 3: Through continuous learning and iteration, driving the mesoscopic system to keep enlarging the receptive field of each point and to promote feature propagation among neighborhood points, thereby completing the evolution from the low-level mesoscopic system to the high-level mesoscopic system and completing the construction of the learner;
Step 4: Constructing the memory and the interpreter: finding the tensor units in the mesoscopic system, extracting entity parameters through the mesoscopic system, treating the information of the other particles in the mesoscopic system as background context as the mesoscopic system rises level by level in the model, continuously updating the description of the tensor units, and finally predicting, for each mesoscopic system, the spatial relationship between the whole mesoscopic system group and a single mesoscopic system;
Step 5: In such iteration the high-level mesoscopic system should have a certain interpretive capability, so using the interpreter carries the implicit assumption that one of the candidate interpretations is correct; to find the correct interpretation, selecting an objective function that maximizes the log-likelihood of the poses already observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model;
Step 6: During back-propagation through the interpreter, learning how to instantiate the high-level mesoscopic system; because some high-level instances do not explain the data elements well, building a parse tree so that the elements that explain the data best receive the largest derivatives and can thus be learned and optimized.
2. The three-dimensional point cloud semantic analysis method based on the neural network three-body model as claimed in claim 1, wherein in step 2 the automatic learning and aggregation through the attention mechanism uses a shared function to learn, for each point, attention weights that automatically select the important features, and the finally obtained feature is the weighted sum of the neighborhood feature set.
3. The method as claimed in claim 1, wherein in step 4, predicting the spatial relationship between the whole mesoscopic system group and a single mesoscopic system for each mesoscopic system comprises: each instantiated high-level mesoscopic system predicting a pose for each low-level mesoscopic system extracted from the image, and, during pose prediction, selecting an objective function that maximizes the log-likelihood of the poses already observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model.
4. The method as claimed in claim 1, wherein the mesoscopic particle is a neuron structure in the neural network and consists mainly of three parts, namely a logic unit, a matrix unit and a vector unit; the logic unit indicates whether the entity is present in the current image, wherever the entity lies within the image region covered by the set; the matrix unit represents the spatial relationship between the entity and the observer, or between an intrinsic coordinate system embedded in the entity and the observer; and the vector unit represents the information not captured by the logic unit or the matrix unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110688525.9A CN113591556A (en) | 2021-06-22 | 2021-06-22 | Three-dimensional point cloud semantic analysis method based on neural network three-body model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110688525.9A CN113591556A (en) | 2021-06-22 | 2021-06-22 | Three-dimensional point cloud semantic analysis method based on neural network three-body model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113591556A true CN113591556A (en) | 2021-11-02 |
Family
ID=78244184
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110688525.9A Pending CN113591556A (en) | 2021-06-22 | 2021-06-22 | Three-dimensional point cloud semantic analysis method based on neural network three-body model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113591556A (en) |
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018065351A1 (en) * | 2016-10-04 | 2018-04-12 | L'oreal | Method for characterizing human skin relief |
CN111279362A (en) * | 2017-10-27 | 2020-06-12 | 谷歌有限责任公司 | Capsule neural network |
CN107703517A (en) * | 2017-11-03 | 2018-02-16 | 长春理工大学 | Airborne multiple beam optical phased array laser three-dimensional imaging radar system |
CN107886464A (en) * | 2017-11-09 | 2018-04-06 | 哈尔滨工业大学 | A kind of method that point cloud model is generated by two-phase composite material meso-mechanical model |
WO2019109181A1 (en) * | 2017-12-05 | 2019-06-13 | Simon Fraser University | Methods for analysis of single molecule localization microscopy to define molecular architecture |
CN109345575A (en) * | 2018-09-17 | 2019-02-15 | 中国科学院深圳先进技术研究院 | A kind of method for registering images and device based on deep learning |
CN110009671A (en) * | 2019-02-22 | 2019-07-12 | 南京航空航天大学 | A kind of grid surface reconstructing system of scene understanding |
CN110120097A (en) * | 2019-05-14 | 2019-08-13 | 南京林业大学 | Airborne cloud Semantic Modeling Method of large scene |
CN110276814A (en) * | 2019-06-05 | 2019-09-24 | 上海大学 | A kind of woven composite microscopical structure method for fast reconstruction based on topological characteristic |
CN112308089A (en) * | 2019-07-29 | 2021-02-02 | 西南科技大学 | Attention mechanism-based capsule network multi-feature extraction method |
US20210090302A1 (en) * | 2019-09-24 | 2021-03-25 | Apple Inc. | Encoding Three-Dimensional Data For Processing By Capsule Neural Networks |
CN112308137A (en) * | 2020-10-30 | 2021-02-02 | 闽江学院 | Image matching method for aggregating neighborhood points and global features by using attention mechanism |
CN112883976A (en) * | 2021-02-06 | 2021-06-01 | 罗普特科技集团股份有限公司 | Point cloud based semantic segmentation method, device and system and storage medium |
CN112818999A (en) * | 2021-02-10 | 2021-05-18 | 桂林电子科技大学 | Complex scene 3D point cloud semantic segmentation method based on convolutional neural network |
Non-Patent Citations (6)
Title |
---|
LOC HUYNH et al.: "Mesoscopic Facial Geometry Inference Using Deep Neural Networks", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION * |
SARA SABOUR et al.: "Dynamic Routing Between Capsules", PROCEEDINGS OF THE 31ST INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS, pages 3859 * |
THABO BEELER et al.: "High-Quality Single-Shot Capture of Facial Geometry", ACM TRANSACTIONS ON GRAPHICS, vol. 29, no. 4, XP058157927, DOI: 10.1145/1778765.1778777 * |
YONGHENG ZHAO et al.: "3D Point Capsule Networks", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), pages 1 - 14 * |
王华; 徐明亮; 毛天露; 金小刚; 王兆其: "三维汽车群组动画仿真研究综述", 计算机辅助设计与图形学学报, no. 02 * |
顾军华; 李炜; 董永峰: "基于点云数据的分割方法综述", 燕山大学学报, no. 02 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Sarker | Deep learning: a comprehensive overview on techniques, taxonomy, applications and research directions | |
WO2016122787A1 (en) | Hyper-parameter selection for deep convolutional networks | |
Wang et al. | An advanced YOLOv3 method for small-scale road object detection | |
EP4028956A1 (en) | Performing xnor equivalent operations by adjusting column thresholds of a compute-in-memory array | |
Chu et al. | Method of image segmentation based on fuzzy C-means clustering algorithm and artificial fish swarm algorithm | |
CN116612288B (en) | Multi-scale lightweight real-time semantic segmentation method and system | |
CN116703947A (en) | Image semantic segmentation method based on attention mechanism and knowledge distillation | |
CN114926636A (en) | Point cloud semantic segmentation method, device, equipment and storage medium | |
Wang et al. | Semaffinet: Semantic-affine transformation for point cloud segmentation | |
US20220301311A1 (en) | Efficient self-attention for video processing | |
Dong et al. | SGOP: Surrogate-assisted global optimization using a Pareto-based sampling strategy | |
CN116152611A (en) | Multistage multi-scale point cloud completion method, system, equipment and storage medium | |
Fang et al. | Sparse point‐voxel aggregation network for efficient point cloud semantic segmentation | |
CN113591556A (en) | Three-dimensional point cloud semantic analysis method based on neural network three-body model | |
KR102234917B1 (en) | Data processing apparatus through neural network learning, data processing method through the neural network learning, and recording medium recording the method | |
US20230031512A1 (en) | Surrogate hierarchical machine-learning model to provide concept explanations for a machine-learning classifier | |
US20210334623A1 (en) | Natural graph convolutions | |
He et al. | ECS-SC: Long-tailed classification via data augmentation based on easily confused sample selection and combination | |
US20240160998A1 (en) | Representing atomic structures as a gaussian process | |
Zhang et al. | Road segmentation using point cloud BEV based on fully convolution network | |
US20220159278A1 (en) | Skip convolutions for efficient video processing | |
Zeng et al. | Feature difference for single‐shot object detection | |
CN113850270B (en) | Semantic scene completion method and system based on point cloud-voxel aggregation network model | |
CN112927248B (en) | Point cloud segmentation method based on local feature enhancement and conditional random field | |
Janovský et al. | On Improving 3D U-net Architecture. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||