CN113591556A - Three-dimensional point cloud semantic analysis method based on neural network three-body model

Info

Publication number
CN113591556A
Authority
CN
China
Prior art keywords
mesoscopic system
mesoscopic
point cloud
level
semantic analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110688525.9A
Other languages
Chinese (zh)
Inventor
胡奇
王春阳
段锦
翟朗
田嘉政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun University of Science and Technology
Original Assignee
Changchun University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-06-22
Filing date
2021-06-22
Publication date
2021-11-02
Application filed by Changchun University of Science and Technology
Priority to CN202110688525.9A
Publication of CN113591556A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention provides a three-dimensional point cloud semantic analysis method based on a neural network three-body model. First, a learner for semantic analysis of laser 3D point cloud data is constructed; its construction has two main stages, namely local spatial encoding and the introduction of an attention mechanism. Then a memory and an interpreter are constructed: tensor units are found in the mesoscopic system and entity parameters are extracted through the mesoscopic system; as the mesoscopic system rises through the levels of the model, the information of the other particles in the mesoscopic system is taken as the background environment and the description of the tensor units is continuously updated; finally, for each mesoscopic system, the spatial relationship between the whole mesoscopic system group and a single mesoscopic system is predicted. The invention thus provides a three-dimensional point cloud semantic analysis strategy based on a mesoscopic-system neural network three-body model, which can be applied to semantic analysis of large-scene dynamic laser 3D point clouds and generalizes to some extent to the trend of continuously increasing dimensionality.

Description

Three-dimensional point cloud semantic analysis method based on neural network three-body model
Technical Field
The invention belongs to the field of three-dimensional point cloud semantic analysis using neural networks, and particularly relates to a three-dimensional point cloud semantic analysis method based on a neural network three-body model.
Background
In recent years, demand for deep-learning-based target identification and tracking in large-scene laser three-dimensional (3D) point clouds has grown steadily in fields such as industrial inspection and intelligent operation. However, because laser point clouds are irregular, unstructured and unordered, a conventional Convolutional Neural Network (CNN) cannot be applied directly to such data. Meanwhile, because of the fundamental difference between deep learning and the human cognitive system, black-box models suffer from serious problems such as a lack of interpretability.
Therefore, to meet the requirements of complex systems and to break through the original parallel-dimension concept, the invention introduces the concept and characteristics of the mesoscopic system and, based on the ideas of 'dimension raising' and 'boundary crossing', systematically studies a 3D point cloud semantic analysis method built on a three-body neural network model consisting of a learner, a memory and an interpreter.
Disclosure of Invention
To solve the above technical problem, the invention provides a three-dimensional point cloud semantic analysis method based on a neural network three-body model, which comprises the following steps:
Step 1: local spatial encoding: the spatial geometry information of the 3D point cloud is encoded, so that the network can better learn the spatial geometric structure from the relative positions of the points and their distance information;
Step 2: introduction of an attention mechanism: the neighborhood point feature set carrying the relative position and distance information of each point is output and is automatically learned and aggregated through the attention mechanism; because the input point cloud undergoes continuous and substantial down-sampling, the more efficient nearest-neighbor interpolation is adopted in the up-sampling stage of the decoder, further improving the execution efficiency of the algorithm;
Step 3: through continuous learning and iteration, the mesoscopic system continuously enlarges the receptive field of each point and promotes feature propagation among neighborhood points, thereby completing the evolution from the low-level mesoscopic system to the high-level mesoscopic system and completing the construction of the learner;
Step 4: construction of the memory and the interpreter: tensor units are found in the mesoscopic system and entity parameters are extracted through the mesoscopic system; as the mesoscopic system rises through the levels of the model, the information of the other particles in the mesoscopic system is taken as the background environment and the description of the tensor units is continuously updated; finally, for each mesoscopic system, the spatial relationship between the whole mesoscopic system group and a single mesoscopic system is predicted;
Step 5: through such iteration, the high-level mesoscopic system should acquire a certain interpretation capability; using the interpreter therefore carries the implicit assumption that one of the candidate interpretations is correct, and, to find the correct interpretation, an objective function is selected that maximizes the log-likelihood of the poses observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model;
Step 6: during back-propagation, the interpreter learns how to instantiate the high-level mesoscopic system; because some instantiations cannot explain the data elements well, a parse tree is established so that the element offering the best explanation receives the largest derivative, which allows learning and optimization to proceed.
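The local spatial encoding of step 1 can be illustrated with a minimal NumPy sketch. The patent does not specify the exact encoding, so the layout used here (centre, neighbour, offset and distance concatenated into a 10-value vector), the brute-force neighbour search and the names knn_indices and local_spatial_encoding are all assumptions made for illustration.

```python
import numpy as np

def knn_indices(points, k):
    """Return, for each point, the indices of its k nearest neighbours (brute force)."""
    diff = points[:, None, :] - points[None, :, :]       # (N, N, 3)
    dist2 = np.sum(diff ** 2, axis=-1)                   # (N, N) squared distances
    # Each point is its own nearest neighbour (distance 0) and is kept in the set.
    return np.argsort(dist2, axis=1)[:, :k]              # (N, k)

def local_spatial_encoding(points, k):
    """Encode each neighbourhood as [centre, neighbour, offset, distance]: (N, k, 10)."""
    idx = knn_indices(points, k)
    neighbours = points[idx]                             # (N, k, 3)
    centres = np.repeat(points[:, None, :], k, axis=1)   # (N, k, 3)
    offsets = centres - neighbours                       # relative positions
    dists = np.linalg.norm(offsets, axis=-1, keepdims=True)  # (N, k, 1)
    return np.concatenate([centres, neighbours, offsets, dists], axis=-1)

# Hypothetical input: 128 random points standing in for a laser scan.
rng = np.random.default_rng(0)
cloud = rng.random((128, 3))
encoded = local_spatial_encoding(cloud, k=8)
```

In a real network the encoded array would normally pass through a learned layer before aggregation; the brute-force search costs O(N^2) memory and is only suitable for small examples.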
Preferably, in step 2, the automatic learning and aggregation through the attention mechanism is carried out by designing a shared function that learns, for each point, an attention weight able to select important features automatically; the final feature is the weighted sum of the neighborhood point feature set.
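A minimal sketch of this attention-weighted aggregation follows. It assumes the shared function is a single linear map followed by a softmax over the neighbours; the patent does not state the form of the function, and the names attentive_pooling and shared_w are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attentive_pooling(neigh_feats, shared_w):
    """Aggregate neighbourhood features with learned attention weights.

    neigh_feats: (N, k, d) features of the k neighbours of each of N points.
    shared_w:    (d, d) weights of the shared function (assumed linear here).
    """
    scores = neigh_feats @ shared_w                 # (N, k, d) attention scores
    weights = softmax(scores, axis=1)               # normalise across the k neighbours
    return np.sum(weights * neigh_feats, axis=1)    # (N, d) weighted sum

rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8, 4))
shared_w = rng.normal(size=(4, 4))
pooled = attentive_pooling(feats, shared_w)
```

With all-zero weights the softmax is uniform and the result reduces to plain mean pooling, which shows what the learned weights add over a fixed aggregation.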
Preferably, in predicting, for each mesoscopic system in step 4, the spatial relationship between the whole mesoscopic system group and a single mesoscopic system, each instantiated high-level mesoscopic system predicts a pose for each low-level mesoscopic system extracted from the image; during pose prediction, an objective function is selected that maximizes the log-likelihood of the poses observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model.
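One possible form of such an objective is sketched below. It assumes that poses are flattened into vectors and that each high-level mesoscopic system contributes an isotropic Gaussian component with a shared standard deviation; neither choice is specified in the patent, and the function name and arguments are hypothetical.

```python
import numpy as np

def mixture_log_likelihood(observed, predicted, log_mix, sigma):
    """Log-likelihood of observed low-level poses under a Gaussian mixture.

    observed:  (M, D) poses observed on M low-level units (flattened, assumed).
    predicted: (K, M, D) pose predicted for each low-level unit by each of K
               high-level units.
    log_mix:   (K,) log mixing proportions.
    sigma:     shared isotropic standard deviation (assumed).
    """
    d = observed.shape[1]
    sq = np.sum((observed[None] - predicted) ** 2, axis=-1)          # (K, M)
    log_gauss = -0.5 * sq / sigma**2 - 0.5 * d * np.log(2 * np.pi * sigma**2)
    joint = log_mix[:, None] + log_gauss                             # (K, M)
    m = joint.max(axis=0)
    log_px = m + np.log(np.sum(np.exp(joint - m), axis=0))           # log-sum-exp over K
    return float(np.sum(log_px))

rng = np.random.default_rng(0)
observed = rng.normal(size=(5, 3))
good = np.stack([observed + 0.01, observed + 2.0])   # one component fits well
bad = np.stack([observed + 2.0, observed - 2.0])     # neither component fits
log_mix = np.log(np.array([0.5, 0.5]))
ll_good = mixture_log_likelihood(observed, good, log_mix, sigma=0.5)
ll_bad = mixture_log_likelihood(observed, bad, log_mix, sigma=0.5)
```

Training would adjust the parameters that produce the predicted poses so that this quantity increases; here ll_good exceeds ll_bad because one component in the first case nearly reproduces the observations.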
Preferably, the mesoscopic particle is a neuron structure in the neural network and consists mainly of three parts: a logic unit, a matrix unit and a vector unit. The logic unit indicates whether the entity is present in the current image, regardless of where the entity lies within the image region covered by the set; the matrix unit represents the spatial relationship between the entity and the observer, or between the intrinsic coordinate system embedded in the entity and the observer; the vector unit represents information other than that held by the logic unit and the matrix unit.
Compared with the prior art, the invention has the following beneficial effects: it provides a three-dimensional point cloud semantic analysis strategy based on a mesoscopic-system neural network three-body model, which can be applied to semantic analysis of large-scene dynamic laser 3D point clouds and generalizes to some extent to the trend of continuously increasing dimensionality.
Drawings
FIG. 1 is an analytical roadmap for the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
Example:
As shown in FIG. 1, the invention provides a three-dimensional point cloud semantic analysis method based on a neural network three-body model, which comprises the following steps:
(1) Completing the construction of the learner for semantic analysis of laser 3D point cloud data:
The construction of the learner is divided into two main stages. The first is local spatial encoding: the spatial geometry information of the 3D point cloud is encoded so that the network can better learn the spatial geometric structure from the relative position and distance information of each point. The second is the introduction of an attention mechanism: the neighborhood point feature set carrying the relative position and distance information of each point is output and is automatically learned and aggregated through the attention mechanism; a shared function is designed to learn, for each point, an attention weight that automatically selects important features, and the final feature is the weighted sum of the neighborhood feature set. Because the input point cloud undergoes continuous and substantial down-sampling, continuous learning and iteration drive the mesoscopic system to keep enlarging the receptive field of each point and to promote feature propagation among neighborhood points, thereby completing the evolution from the low-level mesoscopic system to the high-level mesoscopic system. Finally, in the up-sampling stage of the decoder, the more efficient nearest-neighbor interpolation is adopted, further improving the execution efficiency of the algorithm;
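The nearest-neighbor interpolation used in the up-sampling stage can be sketched as follows. This is a brute-force illustration under the assumption that each dense point simply copies the feature of its closest down-sampled point; the function name and the strided down-sampling in the example are hypothetical.

```python
import numpy as np

def nearest_neighbor_upsample(dense_xyz, sparse_xyz, sparse_feats):
    """Copy to each dense point the feature of its closest sparse point.

    dense_xyz:    (Nd, 3) coordinates at the higher resolution.
    sparse_xyz:   (Ns, 3) coordinates kept after down-sampling.
    sparse_feats: (Ns, C) features attached to the sparse points.
    """
    diff = dense_xyz[:, None, :] - sparse_xyz[None, :, :]      # (Nd, Ns, 3)
    nearest = np.argmin(np.sum(diff ** 2, axis=-1), axis=1)    # (Nd,)
    return sparse_feats[nearest]                               # (Nd, C)

rng = np.random.default_rng(0)
dense = rng.random((100, 3))
sparse = dense[::4]                       # stand-in for a down-sampling step
sparse_feats = rng.normal(size=(25, 8))
upsampled = nearest_neighbor_upsample(dense, sparse, sparse_feats)
```

No weights are computed or blended, which is why this interpolation is cheaper than distance-weighted schemes that combine several neighbours.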
(2) Completing the construction of the memory and the interpreter in the three-body neural network model:
First, tensor units are found in the mesoscopic system, and entity parameters are then extracted through the mesoscopic system. As the mesoscopic system rises through the levels of the model, the information of the other particles in the mesoscopic system is taken as the background environment and the description of the tensor units is continuously updated. Finally, each instantiated high-level mesoscopic system predicts a pose for each low-level mesoscopic system extracted from the image. Through such iteration, the high-level mesoscopic system should acquire a certain interpretation capability; using the interpreter therefore carries the implicit assumption that one of the candidate interpretations is correct, although it is generally not known in advance which one. To find the correct interpretation, an objective function is selected that maximizes the log-likelihood of the poses already observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model. During back-propagation, the interpreter learns how to instantiate the high-level mesoscopic system; because some instantiations cannot explain the data elements well, a parse tree is established so that the element offering the best explanation receives the largest derivative, which allows learning and optimization to proceed.
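One way to read the statement about derivatives is through the posterior responsibilities of a mixture model: the derivative of the log-sum-exp of the per-component log-probabilities with respect to one component equals the softmax weight of that component, so the component that explains an observation best receives the largest share of the gradient. The sketch below assumes this reading and uses a hard argmax to build the parse tree; the function names and the numbers are hypothetical.

```python
import numpy as np

def responsibilities(joint):
    """Posterior weight of each high-level unit for each low-level unit.

    joint: (K, M) values of log(mixing weight) + log-density of the observation
           on low-level unit m under high-level unit k.
    """
    z = joint - joint.max(axis=0, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)      # each column sums to 1

def parse_tree(joint):
    """Assign each low-level unit to the high-level unit that explains it best."""
    return np.argmax(responsibilities(joint), axis=0)   # (M,)

# Hypothetical log-probabilities for K = 2 high-level and M = 3 low-level units.
joint = np.array([[-1.0, -8.0, -3.0],
                  [-5.0, -0.5, -3.5]])
resp = responsibilities(joint)
parents = parse_tree(joint)
```

The third low-level unit is explained almost equally well by both parents, so its gradient is split between them, whereas the first two units send nearly all of their gradient to a single parent.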
Specifically, the mesoscopic particle is a neuron structure in the neural network and consists mainly of three parts: a logic unit, a matrix unit and a vector unit. The logic unit indicates whether the entity is present in the current image, regardless of where the entity lies within the image region covered by the set; the matrix unit represents the spatial relationship between the entity and the observer, or between the intrinsic coordinate system embedded in the entity and the observer; the vector unit represents information other than that held by the logic unit and the matrix unit.
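The three units of a mesoscopic particle can be pictured as a small record type. The sketch below assumes the logic unit is a presence probability, the matrix unit is a 4x4 homogeneous transform from the entity's intrinsic frame to the observer frame, and the vector unit is a 16-value feature vector; these shapes and all names are illustrative choices, not taken from the patent.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class MesoscopicParticle:
    # Logic unit: probability that the entity is present (assumed in [0, 1]).
    presence: float
    # Matrix unit: 4x4 transform from the entity's intrinsic frame to the observer.
    pose: np.ndarray = field(default_factory=lambda: np.eye(4))
    # Vector unit: any remaining attributes of the entity (size 16 is arbitrary).
    features: np.ndarray = field(default_factory=lambda: np.zeros(16))

    def is_present(self, threshold=0.5):
        """Treat the entity as present when the logic unit exceeds the threshold."""
        return self.presence > threshold

    def to_observer(self, xyz):
        """Map a point from the entity's intrinsic frame to the observer frame."""
        homogeneous = np.append(np.asarray(xyz, dtype=float), 1.0)
        return (self.pose @ homogeneous)[:3]

# Example: an entity translated by (1, 2, 3) relative to the observer.
pose = np.eye(4)
pose[:3, 3] = [1.0, 2.0, 3.0]
particle = MesoscopicParticle(presence=0.9, pose=pose)
origin_seen = particle.to_observer([0.0, 0.0, 0.0])
```

Keeping presence separate from pose means the logic unit can stay unchanged while the entity moves within the covered region, which matches the position-independence described for it.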
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A three-dimensional point cloud semantic analysis method based on a neural network three-body model, characterized by comprising the following steps:
Step 1: local spatial encoding: the spatial geometry information of the 3D point cloud is encoded, so that the network can better learn the spatial geometric structure from the relative positions of the points and their distance information;
Step 2: introduction of an attention mechanism: the neighborhood point feature set carrying the relative position and distance information of each point is output and is automatically learned and aggregated through the attention mechanism; because the input point cloud undergoes continuous and substantial down-sampling, the more efficient nearest-neighbor interpolation is adopted in the up-sampling stage of the decoder, further improving the execution efficiency of the algorithm;
Step 3: through continuous learning and iteration, the mesoscopic system continuously enlarges the receptive field of each point and promotes feature propagation among neighborhood points, thereby completing the evolution from the low-level mesoscopic system to the high-level mesoscopic system and completing the construction of the learner;
Step 4: construction of the memory and the interpreter: tensor units are found in the mesoscopic system and entity parameters are extracted through the mesoscopic system; as the mesoscopic system rises through the levels of the model, the information of the other particles in the mesoscopic system is taken as the background environment and the description of the tensor units is continuously updated; finally, for each mesoscopic system, the spatial relationship between the whole mesoscopic system group and a single mesoscopic system is predicted;
Step 5: through such iteration, the high-level mesoscopic system should acquire a certain interpretation capability; using the interpreter therefore carries the implicit assumption that one of the candidate interpretations is correct, and, to find the correct interpretation, an objective function is selected that maximizes the log-likelihood of the poses observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model;
Step 6: during back-propagation, the interpreter learns how to instantiate the high-level mesoscopic system; because some instantiations cannot explain the data elements well, a parse tree is established so that the element offering the best explanation receives the largest derivative, which allows learning and optimization to proceed.
2. The three-dimensional point cloud semantic analysis method based on the neural network three-body model as claimed in claim 1, wherein in step 2 the automatic learning and aggregation through the attention mechanism is carried out by designing a shared function that learns, for each point, an attention weight able to select important features automatically, and the final feature is the weighted sum of the neighborhood point feature set.
3. The method as claimed in claim 1, wherein, in predicting for each mesoscopic system in step 4 the spatial relationship between the whole mesoscopic system group and a single mesoscopic system, each instantiated high-level mesoscopic system predicts a pose for each low-level mesoscopic system extracted from the image, and during pose prediction an objective function is selected that maximizes the log-likelihood of the poses observed on the low-level mesoscopic system, as generated by the high-level mesoscopic system through a mixture model.
4. The method as claimed in claim 1, wherein the mesoscopic particle is a neuron structure in the neural network and consists mainly of three parts, namely a logic unit, a matrix unit and a vector unit; the logic unit indicates whether the entity is present in the current image, regardless of where the entity lies within the image region covered by the set; the matrix unit represents the spatial relationship between the entity and the observer, or between the intrinsic coordinate system embedded in the entity and the observer; the vector unit represents information other than that held by the logic unit and the matrix unit.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110688525.9A | 2021-06-22 | 2021-06-22 | Three-dimensional point cloud semantic analysis method based on neural network three-body model

Publications (1)

Publication Number | Publication Date
CN113591556A | 2021-11-02

Family

ID=78244184

Country Status (1)

Country | Link
CN (1) | CN113591556A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination