CN112365511B - Point cloud segmentation method based on overlapped region retrieval and alignment - Google Patents

Point cloud segmentation method based on overlapped region retrieval and alignment

Info

Publication number
CN112365511B
CN112365511B (application CN202011273571.4A)
Authority
CN
China
Prior art keywords
point cloud
point
original
slice
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011273571.4A
Other languages
Chinese (zh)
Other versions
CN112365511A (en)
Inventor
徐宗懿
王杨滏
黄小水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202011273571.4A priority Critical patent/CN112365511B/en
Publication of CN112365511A publication Critical patent/CN112365511A/en
Application granted granted Critical
Publication of CN112365511B publication Critical patent/CN112365511B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]


Abstract

The invention relates to the technical field of three-dimensional point cloud segmentation, and in particular to a three-dimensional point cloud segmentation method based on overlapping-region retrieval and alignment, comprising the following steps: inputting a point cloud data set into a trained point cloud segmentation model for point cloud segmentation. The method processes large-scene point clouds directly and can therefore learn richer features. The point cloud segmentation model does not require a reference point cloud to be provided; its feature retrieval part automatically searches for the reference point cloud according to the input data. In addition, after the two point clouds with an overlapping region have been aligned through the strategy of overlap-region detection, optimization and alignment, the KNN algorithm is used directly to transfer the labels, so the edge segmentation effect is better.

Description

Point cloud segmentation method based on overlapped region retrieval and alignment
Technical Field
The invention relates to the technical field of three-dimensional point cloud segmentation, in particular to a three-dimensional point cloud segmentation method based on overlapped region retrieval and alignment.
Background
Point cloud segmentation divides a point cloud according to characteristics such as spatial position, geometry and texture, so that points within the same segment share similar characteristics; a good point cloud segmentation method is convenient for many downstream applications. Point cloud segmentation methods fall mainly into two categories. The first category uses mathematical model fitting, region growing, minimum cut, Euclidean clustering and similar methods; these methods are simple and easy to implement, but they lack flexibility, and noise in the point cloud data can greatly degrade the segmentation quality. The second category uses deep learning for segmentation, which can effectively improve segmentation accuracy but consumes memory and time and places higher requirements on the data.
At present, deep learning algorithms based mainly on convolutional neural networks have greatly improved point cloud segmentation accuracy. However, when the point cloud scene is too large, the point cloud must first be sampled region by region and segmentation is then performed on the sampled data, so part of the context information is lost; meanwhile, segmentation cannot be performed directly on the large-scene point cloud. An efficient method for segmenting large-scene point clouds directly is therefore needed.
Disclosure of Invention
In order to solve the above problems, the present invention provides a point cloud segmentation method based on overlapping-region retrieval and alignment, which can better solve the problem of directly segmenting large-scene point clouds and avoids the information loss caused by regionalized sampling.
A point cloud segmentation method based on overlapping-region retrieval and alignment comprises the following steps: inputting the point cloud test set into a trained point cloud segmentation model for point cloud segmentation to obtain the segmentation result, wherein the point cloud segmentation model is used after training, and the training process comprises the following steps:
S1, acquiring a point cloud data set, and dividing the point cloud data into a training set, a validation set and a test set, wherein the point cloud data in the training set are used for training a point cloud segmentation model, the validation set is used for verifying the generalization ability of the point cloud segmentation model, and the test set is used for evaluating the segmentation performance of the point cloud segmentation model;
S2, slicing the point clouds in the training set to obtain the slice point clouds P_n required for model training;
S3, inputting the slice point cloud P_n into the point cloud segmentation model, using a fully convolutional network to search for overlapping regions between P_n and all original point clouds in the point cloud data set, obtaining all original point clouds that overlap with the slice point cloud P_n, and taking the original point cloud O with the largest overlap as the reference point cloud of P_n; selecting a fixed number of interest points from the overlapping region of P_n and O, and aligning the overlapping regions of the two point clouds with a feature-based point cloud alignment algorithm;
S4, completing the point cloud segmentation using a nearest neighbor algorithm;
S5, after the point cloud segmentation is completed, computing the total loss of the training process, propagating the total loss of the training process back through the model according to the chain rule to optimize the parameters, and optimizing the point cloud segmentation model with stochastic gradient descent to obtain the optimal point cloud segmentation model, i.e. the trained point cloud segmentation model.
Further, using a fully convolutional network to search for overlapping regions between the slice point cloud P_n and all original point clouds in the point cloud data set and obtain all original point clouds that overlap with the slice point cloud P_n specifically comprises:
S31, extracting the features of the slice point cloud P_n and of each original point cloud through a Unet network to form a point cloud pair (P, O), where P denotes the slice point cloud P_n and O denotes the retrieved optimal original point cloud;
S32, inputting the point cloud pair (P, O) into the overlap-region detection module, performing overlap-region detection in the overlap-region detection module using the nearest neighbor algorithm KNN, and outputting the set of point pairs (P′, O′) in the overlapping region.
Further, extracting the features of the slice point cloud P_n and of each original point cloud through the Unet network in step S31 to form the point cloud pair (P, O) specifically comprises:
S311, first inputting the slice point cloud P_n into a Unet network and extracting the feature F of P_n through the Unet network;
S312, processing the feature F of the slice point cloud P_n through a global pooling layer to obtain the pooled feature F_P, expressed as:
F_P = MaxPool(Unet(P_n))
S313, then using the Unet network and the global pooling layer to extract the features of each original point cloud O_n, n ∈ {0, …, N−1}, in the original data set, and outputting the feature F_{O_n} corresponding to each original point cloud in the original data set, where O_n denotes the n-th point cloud and N denotes the number of point clouds;
S314, based on the feature F_P of the slice point cloud P_n and the features F_{O_n} corresponding to the original point clouds, computing the feature projection error and finding, among all original point clouds, the original point cloud O whose feature F_O has the minimum projection error with respect to the slice point cloud feature; this original point cloud O overlaps with the slice point cloud P_n, so the original point cloud O is taken as the reference point cloud of P_n, and the reference point cloud O and the target point cloud P form the point cloud pair (P, O).
Further, the feature projection error is computed as:
O = argmin_{O ∈ D_1} ||F_P − F_O||_2
where D_1 denotes the original data set, O denotes a point cloud in the original data set, F_P denotes the feature of the slice point cloud, and F_O denotes the feature of a point cloud in the original data set.
Further, the overlap-region detection using the nearest neighbor algorithm KNN in step S32 is computed as:
F_P = Unet(P)
F_O = Unet(O)
(P′, O′) = KNN(F_P, F_O)
where (P, O) is the input point cloud pair of the overlap-region detection module, and (P′, O′) is the output of the overlap-region detection module, representing the set of point pairs in the overlapping region; F_P denotes the feature of the slice point cloud, F_O denotes the feature of the point cloud in the original data set, Unet(·) denotes the output of the Unet network, and KNN(·) denotes the nearest neighbor algorithm operation.
Further, after the overlap-region detection module outputs the set of point pairs (P′, O′) in the overlapping region, the following optimization steps are also included:
S33, overlap-region optimization: removing noisy point pairs using an overlap-region optimization module to obtain the optimized point pairs (P″, O″);
S34, according to the optimized point pairs (P″, O″), solving the rotation matrix M = [R | t] and spatially transforming the reference point cloud O to obtain the aligned point cloud Ô, i.e.
Ô = R·O + t
where M denotes the rotation matrix, R denotes the rotation amount in the rotation matrix, and t denotes the translation amount in the rotation matrix.
Further, the overlap-region optimization specifically comprises:
S331, concatenating the set of point pairs (P′, O′) obtained by the overlap-region detection module into an n × 6 matrix, where n is the number of corresponding points, and outputting an n × 1 vector through another Unet network, each value in the vector being a weight representing the probability that a point pair (p′_i, o′_i) ∈ (P′, O′), i ∈ {0, …, n−1}, belongs to the overlapping region;
S332, defining a hyper-parameter clip_weights and keeping the point pairs whose weight is greater than or equal to clip_weights to obtain the optimized set of point pairs (P″, O″), namely the point pairs that really belong to the overlapping region, computed as:
weights = Unet(concatenate(P′, O′))
(P″, O″) = {(p′_i, o′_i) ∈ (P′, O′) | weights_i ≥ clip_weights}
where (P′, O′) is the noisy set of point pairs output by the overlap-region detection module, concatenate(P′, O′) is the concatenation operation, i.e. the point pairs are spliced together along a given axis, clip_weights is the hyper-parameter used for denoising, and (P″, O″) is the optimized set of point pairs output by the overlap-region optimization module.
Further, the total loss comprises the overlap-region loss and the point cloud alignment loss.
The overlap-region loss is:
L_region = −(1/M) · Σ_{i=1}^{M} [ y_i · log(ŷ_i) + (1 − y_i) · log(1 − ŷ_i) ]
where y is the true value, ŷ is the predicted value produced by a sigmoid function, and M is the number of points in the overlapping region.
The point cloud alignment loss is the Euclidean distance between the true rotation matrix [R | t] and the predicted rotation matrix [R̂ | t̂]:
L_alignment = ||R − R̂||² + ||t − t̂||²
where R denotes the true rotation transformation, R̂ denotes the predicted rotation transformation, t denotes the true translation, and t̂ denotes the predicted translation.
The total loss function L is: L = L_region + L_alignment
Further, slicing a point cloud comprises: for the original point cloud O_n, n ∈ {0, …, N−1}, determining a random number rand_i ∈ [0, 1), i ∈ {X, Y, Z}, on each of the X, Y and Z axes, and finding the maximum value aMax_i and the minimum value aMin_i, i ∈ {X, Y, Z}, of the original point cloud O_n on each axis; if any point p of the point cloud O_n satisfies the following formula:
(p_i − aMin_i)/(aMax_i − aMin_i) > rand_i, i ∈ {X, Y, Z}
then the points p satisfying the above formula are combined into the slice point cloud P_n, n ∈ {0, …, N−1}, where N denotes the number of original point clouds in the training set.
The invention has the beneficial effects that:
1. The data preprocessing of the present invention differs from that of other methods. Traditional methods limit the number of input points, for example 4096 or 1024 points, which forces preprocessing to be done in two parts: the large scene is first partitioned and a fixed number of points is then obtained by sampling before being input; the sampling loses part of the geometric features, so the segmentation results of traditional methods have large errors. In the present invention the model does not limit the number of input points and data preprocessing only needs to perform slicing, so the method can process large-scene point clouds directly and learn richer features, avoiding the inaccurate results caused by the loss of features during sampling.
2. The point cloud segmentation method does not require a reference point cloud to be provided; the point cloud segmentation model can directly use the feature retrieval part to automatically search for the reference point cloud according to the input data.
3. The slicing strategy used in the invention is different. Traditional slicing easily reduces the model's ability to learn edge features: for example, if only part of an object is retained when the scene is cut into blocks, robust features for that object cannot be learned during feature learning, and the edge segmentation effect is poor. In the invention, after the two point clouds with an overlapping region have been aligned through the strategy of overlap-region detection, optimization and alignment, the KNN algorithm is used directly to transfer the labels, so the edge segmentation effect is better.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of a three-dimensional point cloud segmentation method based on overlapping area retrieval and alignment according to this embodiment;
fig. 2 is a diagram of the structure of the Unet network in the embodiment of the present invention;
fig. 3 is a schematic flowchart of a specific process of a three-dimensional point cloud segmentation model based on overlapping region retrieval and alignment according to an embodiment of the present invention;
fig. 4 is a schematic diagram illustrating an exemplary structure of a three-dimensional point cloud segmentation model based on overlapping region retrieval and alignment according to an embodiment of the present invention;
FIG. 5 is an example of the output of overlap region detection;
FIG. 6 is a sample output of the overlap region optimization module;
FIG. 7 is a sample output with overlapping regions aligned;
fig. 8 is an output sample of point cloud segmentation.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method is a point cloud segmentation method based on overlapping-region retrieval and alignment, and segmentation is completed with a point cloud segmentation model. The point cloud segmentation model is used after training; as shown in fig. 1, the training process of the point cloud segmentation model includes, but is not limited to, the following steps.
S1. Select a point cloud data set. Two large indoor data sets, S3DIS and ScanNet, are selected in the invention. S3DIS contains 6 areas, 271 rooms and 13 categories, and every point carries label information; in the data preprocessing stage each room is divided into random blocks to obtain the point cloud data pairs. ScanNet is also a common point cloud segmentation data set containing room data generated by real reconstruction; its 1513 rooms are divided into 20 classes plus 1 empty class.
The point cloud data are divided into a training set, a validation set and a test set, wherein the point cloud data in the training set are used to train the point cloud segmentation model, the validation set is used to verify the generalization ability of the point cloud segmentation model, and the test set is used to evaluate the segmentation performance of the point cloud segmentation model. For the S3DIS data set, areas 1, 2, 3 and 4 were used for training, area 6 for validation and area 5 for testing. For the ScanNet data set, the ratio of training set, validation set and test set was 1101 : 100 : 312.
S2. Slice the point clouds in the training set to obtain the slice point clouds required for model training.
The specific implementation process of slicing comprises the following steps:
Assume there are N original point clouds in the training set. For each original point cloud scene O_n, n ∈ {0, …, N−1}, in the training set, slicing is performed on O_n: determine a random number rand_i ∈ [0, 1), i ∈ {X, Y, Z}, on each of the X, Y and Z axes, and find the maximum value aMax_i, i ∈ {X, Y, Z}, and the minimum value aMin_i, i ∈ {X, Y, Z}, of the original point cloud O_n on each axis. If any point p of the point cloud O_n satisfies:
(p_i − aMin_i)/(aMax_i − aMin_i) > rand_i, i ∈ {X, Y, Z},
then the points p satisfying the above formula are combined into the slice point cloud P_n, n ∈ {0, …, N−1}. Together these slices form a new data set, called the slice data set, which holds the same number of point clouds as the original data set, i.e. N point clouds.
S3. Input the slice point cloud P_n into the point cloud segmentation model; use a fully convolutional network to search for overlapping regions between P_n and all original point clouds in the point cloud data set, obtain all original point clouds that overlap with the slice point cloud P_n, and take the original point cloud O with the largest overlap as the reference point cloud of the slice point cloud P_n. Fig. 3 and Fig. 4 show the flow and an exemplary structure of the point cloud segmentation model.
S31. Overlap-region retrieval: extract the features of the slice point cloud P_n and of each original point cloud through the Unet network to form a point cloud pair (P, O), where P denotes the slice point cloud P_n and O denotes the retrieved optimal original point cloud.
S311. First input the slice point cloud P_n into a Unet network and extract the feature F of P_n through the Unet network;
S312. Process the feature F of the slice point cloud P_n through a global pooling layer to obtain the pooled feature F_P, expressed as:
F_P = MaxPool(Unet(P_n))
S313. Then use the Unet network and the global pooling layer to extract the features of each original point cloud O_n, n ∈ {0, …, N−1}, in the original data set, and output the feature F_{O_n} corresponding to each original point cloud in the original data set, where O_n denotes the n-th point cloud and N denotes the number of point clouds;
S314. Based on the feature F_P of the slice point cloud P_n and the features F_{O_n} corresponding to the original point clouds, compute the feature projection error and find, among all original point clouds, the original point cloud O whose feature F_O has the minimum projection error with respect to the slice point cloud feature. This original point cloud O overlaps with the slice point cloud P_n, so O is taken as the reference point cloud of P_n; the reference point cloud O and the target point cloud P form the point cloud pair (P, O).
The feature projection error is computed as:
O = argmin_{O ∈ D_1} ||F_P − F_O||_2
where D_1 denotes the original data set, O denotes a point cloud in the original data set, F_P denotes the feature of the slice point cloud, and F_O denotes the feature of a point cloud in the original data set.
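As an illustrative sketch only, assuming the pooled features F_P and F_{O_n} are already available as NumPy vectors (the function name find_reference_cloud is hypothetical), the retrieval by minimum feature projection error could be written as:

```python
import numpy as np

def find_reference_cloud(f_p, original_features):
    """Return the index of the original point cloud whose pooled feature
    F_{O_n} has the smallest projection error ||F_P - F_{O_n}|| with respect
    to the slice feature F_P (step S314).

    f_p: (d,) pooled feature of the slice point cloud.
    original_features: (N, d) pooled features of the N original point clouds.
    """
    errors = np.linalg.norm(original_features - f_p[None, :], axis=1)
    return int(np.argmin(errors))

# Example with random 256-dimensional features for 10 original point clouds
best = find_reference_cloud(np.random.rand(256), np.random.rand(10, 256))
```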
The Unet network is implemented as follows. The input of the Unet network is a point cloud of size N×D, where N is the number of points in the point cloud and D is the dimension of each point; commonly D = 3 for XYZ coordinates or D = 6 for XYZ coordinates plus the RGB color channels. Conv and Deconv operations are performed on the input in the Unet network. Conv is a convolution operation that applies a linear transformation to the input through convolution kernels; in general the convolution operation increases the number of channels and reduces the number of points, i.e. N1 × 64 = Conv(N × D), where N ≥ N1 and D < 64. DeConv is the inverse of convolution, called deconvolution, and is used to restore features to the corresponding points of the point cloud. Fig. 2 shows the structure of the Unet network in this embodiment; the dotted lines in fig. 2 indicate skip connections, in which two parts are concatenated and then used as the input of the next stage.
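The patent does not disclose exact layer sizes, so the following is only a rough sketch of such an encoder-decoder with one downsampling stage, a skip connection and global max pooling, written with PyTorch; the class name TinyPointUnet and all channel counts are assumptions:

```python
import torch
import torch.nn as nn

class TinyPointUnet(nn.Module):
    """Toy encoder-decoder over a point cloud tensor of shape (batch, D, N):
    a Conv1d stage raises the channel count and halves the point count,
    a ConvTranspose1d stage restores it, and a skip connection concatenates
    encoder and decoder features before the per-point output head."""

    def __init__(self, in_dim=3, feat_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv1d(in_dim, feat_dim, 1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv1d(feat_dim, 2 * feat_dim, 2, stride=2), nn.ReLU())
        self.up = nn.Sequential(nn.ConvTranspose1d(2 * feat_dim, feat_dim, 2, stride=2), nn.ReLU())
        self.head = nn.Conv1d(2 * feat_dim, feat_dim, 1)      # applied after the skip concat

    def forward(self, x):                                     # x: (B, D, N), N assumed even
        e = self.enc(x)                                       # (B, 64, N)
        d = self.up(self.down(e))                             # (B, 64, N)
        return self.head(torch.cat([e, d], dim=1))            # skip connection, (B, 64, N)

net = TinyPointUnet()
per_point = net(torch.rand(1, 3, 1024))                       # per-point features (1, 64, 1024)
F_P = per_point.max(dim=2).values                             # global max pooling, (1, 64)
```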
S32, detection of overlapping area: extracting the characteristics of each point cloud through the Unet network and obtaining a point cloud pair, then inputting the point cloud pair into an overlapping area detection module, and performing overlapping area detection in the overlapping area detection module by using a nearest neighbor algorithm (KNN) to obtain a point pair set (P ', O') in the overlapping area.
In the nearest neighbor algorithm (KNN), K is set to 1, and only one nearest neighbor is found. Inputting a set of point cloud pairs (P, O) to the overlap region detection module, and outputting a set of point pair sets (P ', O') in the overlap region, which are mathematically expressed as:
F_P = Unet(P)
F_O = Unet(O)
(P′, O′) = KNN(F_P, F_O)
where (P, O) is the input point cloud pair of the overlap-region detection module, (P′, O′) is the output of the overlap-region detection module, i.e. the set of point pairs in the overlapping region, F_P denotes the feature of the slice point cloud, F_O denotes the feature of the point cloud in the original data set, Unet(·) denotes the output of the Unet network, and KNN(·) denotes the nearest neighbor algorithm operation.
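A minimal sketch of this feature-space nearest-neighbour matching with K = 1, using a KD-tree from SciPy (the per-point feature arrays and the function name detect_overlap_pairs are assumptions for illustration):

```python
import numpy as np
from scipy.spatial import cKDTree

def detect_overlap_pairs(feat_p, feat_o):
    """For every point of P, find its single nearest neighbour (K = 1) in O
    in feature space and return index pairs (index in P, index in O), as in
    the overlap-region detection of step S32.

    feat_p: (n_p, d) per-point features of the slice point cloud P.
    feat_o: (n_o, d) per-point features of the reference point cloud O.
    """
    _, nearest = cKDTree(feat_o).query(feat_p, k=1)
    return np.stack([np.arange(len(feat_p)), nearest], axis=1)

pairs = detect_overlap_pairs(np.random.rand(500, 64), np.random.rand(800, 64))
```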
However, the overlap region obtained by KNN detection alone is noisy; fig. 5 shows an example of the output of the overlap-region detection module in this embodiment. In a preferred embodiment, the overlap-region optimization module is therefore used to optimize the output of the overlap-region detection module, and the optimization processing comprises steps S33 and S34:
S33. Overlap-region optimization: after the set of point pairs in the overlapping region is obtained, remove noisy point pairs using the overlap-region optimization module to obtain the optimized set of point pairs (P″, O″). The specific implementation process comprises the following steps:
S331. Concatenate the set of point pairs (P′, O′) obtained by the overlap-region detection module into an n × 6 matrix (n is the number of corresponding points) and output an n × 1 vector through another Unet network; each value in the vector is a weight representing the probability that a point pair (p′_i, o′_i) ∈ (P′, O′), i ∈ {0, …, n−1}, belongs to the overlapping region.
S332. Define a hyper-parameter clip_weights in order to remove the noisy point pairs in the overlapping region. The specific operation is: keep only the point pairs whose weight is greater than or equal to clip_weights, obtaining the optimized set of point pairs (P″, O″), i.e. the point pairs that really belong to the overlapping region, expressed as:
weights = Unet(concatenate(P′, O′))
(P″, O″) = {(p′_i, o′_i) ∈ (P′, O′) | weights_i ≥ clip_weights}
where (P′, O′) is the noisy set of point pairs output by the overlap-region detection module, concatenate(P′, O′) is the concatenation operation, i.e. the point pairs are spliced together along a given axis, clip_weights is the hyper-parameter used for denoising, and (P″, O″) is the optimized set of point pairs output by the overlap-region optimization module.
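An illustrative sketch of the weight-based filtering of step S332, with the weight-prediction network omitted and the weights vector assumed to be given (names are hypothetical):

```python
import numpy as np

def filter_overlap_pairs(p_prime, o_prime, weights, clip_weights=0.5):
    """Keep only the point pairs whose predicted overlap probability is at
    least clip_weights (step S332).

    p_prime, o_prime: (n, 3) corresponding points from P' and O'.
    weights: (n,) overlap probabilities predicted by the optimization network.
    """
    keep = weights >= clip_weights
    return p_prime[keep], o_prime[keep]

p2, o2 = filter_overlap_pairs(np.random.rand(100, 3), np.random.rand(100, 3),
                              np.random.rand(100), clip_weights=0.5)
```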
After the overlap-region optimization is completed, the optimized point pairs (P″, O″) may still be misaligned, with some translation and rotation in space. The transformation of a rigid object in space consists of a rotation part and a translation part; the rotation matrix represents the transformation relation between two rigid bodies in space and is composed of a rotation amount R and a translation amount t, where the rotation amount R is a 3×3 matrix representing the rotation transformation and the translation amount t is a 1×3 vector representing the translation transformation. Suppose RB_2 is obtained from RB_1 by such a transformation: during the transformation the rotation is applied first, i.e. RB_1 is left-multiplied by R to obtain RB′_2, and the translation is then applied, i.e. t is added to RB′_2 to obtain RB_2. This can therefore be expressed as:
RB_2 = RB′_2 + t = R·RB_1 + t
where RB′_2 is the state after the rotation only.
Therefore, a point cloud alignment operation is required to solve the rotation matrix M and perform the alignment of the overlapping region; fig. 6 shows an output sample of the overlap-region optimization module in this embodiment.
S34. Overlap-region alignment: according to the optimized point pairs (P″, O″), perform overlap-region alignment with a point cloud alignment method and solve the rotation matrix M = [R | t].
In one embodiment, the overlap-region alignment with a point cloud alignment method and the solution of the rotation matrix M are implemented as follows. First, (P″, O″) are passed through an encoder to obtain the features F″_P and F″_O; the features F″_P and F″_O are then decoded with a decoder, which makes the features learned by the encoder more effective. The features F″_P and F″_O are input to the T-estimator, and the T-estimator solves the rotation matrix M by minimizing the projection error, expressed as:
M = [R | t] = argmin_{R,t} Σ_i ||R·o″_i + t − p″_i||²
where M denotes the rotation matrix, R denotes the rotation amount in the rotation matrix, and t denotes the translation amount in the rotation matrix. Fig. 7 shows an output sample of the overlap-region alignment module.
After the overlap-region alignment yields the rotation matrix M = [R | t], the reference point cloud O is spatially transformed to obtain the aligned point cloud Ô, i.e.
Ô = R·O + t
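For illustration only: the rigid transform that minimizes the projection error over corresponding point pairs also has a classical closed-form solution (the Kabsch/Procrustes algorithm). The sketch below shows that closed form and the subsequent transformation of O; it is not the learned encoder, decoder and T-estimator described above, and all names are assumptions:

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Solve R (3x3) and t (3,) minimizing sum ||R src_i + t - dst_i||^2
    over corresponding points, via the Kabsch algorithm."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)      # centroids
    H = (src - cs).T @ (dst - cd)                    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

# Map the reference pairs O'' onto the slice pairs P'', then transform O itself
o_pairs, p_pairs = np.random.rand(50, 3), np.random.rand(50, 3)
R, t = estimate_rigid_transform(o_pairs, p_pairs)
O = np.random.rand(2000, 3)
O_hat = O @ R.T + t                                  # aligned point cloud, O_hat = R*O + t
```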
And S4, completing point cloud segmentation by using a nearest neighbor algorithm.
The labels of the aligned point cloud Ô are transferred to the point cloud P by the nearest neighbor algorithm (KNN): the label of a point p in P is replaced by the label of its nearest neighbor ô in the aligned point cloud Ô. The function L(·) is defined as a label function for obtaining labels. This is expressed as:
ô* = argmin_{ô ∈ Ô} ||p − ô||_2
L(p) = L(ô*)
the specific implementation process of the nearest neighbor algorithm comprises the following steps: assuming that two point clouds P and Q exist, N points exist in the point clouds P, M points exist in the point clouds Q, the distance between a point P in the point clouds P and all the points in the point clouds Q is calculated, and if the distance between a point Q in the point clouds Q and the point P in the point clouds P is minimum, the point Q is marked as the nearest neighbor point of the point P. The mathematical representation is:
Figure BDA0002778430150000112
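A short illustrative sketch of this nearest-neighbour label transfer, again using a KD-tree (function and variable names are assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree

def transfer_labels(p_points, o_hat_points, o_hat_labels):
    """Give every point of P the label of its nearest neighbour in the
    aligned reference cloud O_hat (step S4).

    p_points: (n, 3) points of the slice point cloud P.
    o_hat_points: (m, 3) points of the aligned reference cloud O_hat.
    o_hat_labels: (m,) integer labels of O_hat.
    """
    _, nearest = cKDTree(o_hat_points).query(p_points, k=1)
    return o_hat_labels[nearest]

labels_p = transfer_labels(np.random.rand(1000, 3), np.random.rand(2000, 3),
                           np.random.randint(0, 13, size=2000))
```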
fig. 8 shows a final segmentation output sample of the model.
S5, after point cloud segmentation is completed, calculating the loss of the overlapping area and the point cloud alignment loss according to a loss function; and calculating the sum of the loss of the overlapping area and the loss of the point cloud alignment to obtain the total loss.
The overlap-region loss refers to the error produced by the overlap-region optimization. Overlap-region detection is defined as a supervised binary classification problem: the label of a point p is 1 when it lies in the overlapping region and 0 otherwise, so the overlap-region loss is a binary cross-entropy loss defined as:
L_region = −(1/M) · Σ_{i=1}^{M} [ y_i · log(ŷ_i) + (1 − y_i) · log(1 − ŷ_i) ]
where y is the true value, ŷ is the predicted value produced by a sigmoid function, and M is the number of points in the overlapping region.
The point cloud alignment loss refers to the loss on the rotation matrix M in the process of aligning the training point cloud with the point cloud found under the optimal retrieval strategy. The point cloud alignment loss is the Euclidean distance between the true rotation matrix [R | t] and the predicted rotation matrix [R̂ | t̂]:
L_alignment = ||R − R̂||² + ||t − t̂||²
where R is the true rotation value, R̂ is the predicted rotation value, t is the true translation value, and t̂ is the predicted translation value.
The sum of the overlap-region loss and the point cloud alignment loss gives the total loss; the total loss function L is:
L = L_region + L_alignment
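An illustrative sketch of the combined loss, assuming PyTorch tensors, that the predicted overlap probabilities are already sigmoid outputs, and that the alignment loss is taken as the squared Euclidean distance written above (names are not from the original text):

```python
import torch
import torch.nn.functional as F

def total_loss(overlap_prob, overlap_gt, R_pred, R_true, t_pred, t_true):
    """L = L_region + L_alignment: binary cross-entropy on the overlap labels
    plus the squared Euclidean distance between true and predicted [R | t]."""
    l_region = F.binary_cross_entropy(overlap_prob, overlap_gt)
    l_align = torch.sum((R_pred - R_true) ** 2) + torch.sum((t_pred - t_true) ** 2)
    return l_region + l_align

loss = total_loss(torch.rand(128), torch.randint(0, 2, (128,)).float(),
                  torch.eye(3), torch.eye(3), torch.zeros(3), torch.zeros(3))
```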
The total loss L in the training process is propagated back through the model according to the chain rule of calculus to optimize the parameters of the Unet networks. Taking a parameter w_i of one layer of the Unet in the overlap-region optimization module as an example: to optimize this parameter, the gradient produced by the loss function L must be propagated back to that layer, and the parameter is then updated in the direction of the negative gradient. The chain rule is expressed as:
Δw_i = ∂L/∂w_i = (∂L/∂L_region) · (∂L_region/∂ŷ) · … · (∂ŷ/∂w_i)
where Δw_i is the total gradient of the loss function L with respect to the parameter w_i, ∂ denotes the partial derivative, ∂L/∂L_region is the gradient of the total loss function L with respect to the overlap-region loss L_region, ∂L_region/∂ŷ is, similarly, the gradient of the overlap-region loss with respect to the sigmoid output ŷ, and ∂ŷ/∂w_i is the gradient of the predicted label ŷ with respect to the parameter w_i.
The specific expression of gradient optimization is:
w_i = w_i − η · Δw_i
where η is the learning rate defined in the network to control the step size of the gradient descent.
The optimization strategy selects the random gradient descent method, a group of data is randomly selected by the method to calculate the gradient return updating parameters, the optimization efficiency is high, the return is stopped after 100 times of training, and finally the trained point cloud segmentation model is obtained.
Verifying by using the slice point cloud in the verification data set; and testing the precision of point cloud segmentation of the point cloud segmentation model on the test set.
When the trained point cloud segmentation model is used, the input is a point cloud, and the output is the segmentation result of the point cloud.
According to the point cloud segmentation method based on the overlapped region retrieval and alignment, after the point clouds of two overlapped regions are aligned through the strategies of overlapped region detection, optimization and alignment, the KNN algorithm is directly used for realizing the transmission of the labels, so that the edge segmentation effect is more accurate.
It should be noted that, as one of ordinary skill in the art would understand, all or part of the processes of the above method embodiments may be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The foregoing is directed to embodiments of the present invention and it will be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. A point cloud segmentation method based on overlapping-region retrieval and alignment, characterized by comprising the following steps: inputting the point cloud test set into a trained point cloud segmentation model for point cloud segmentation to obtain the segmentation result, wherein the point cloud segmentation model is used after training, and the training process comprises the following steps:
S1, acquiring a point cloud data set, and dividing the point cloud data into a training set, a validation set and a test set, wherein the point cloud data in the training set are used for training a point cloud segmentation model, the validation set is used for verifying the generalization ability of the point cloud segmentation model, and the test set is used for evaluating the segmentation performance of the point cloud segmentation model;
S2, slicing the point clouds in the training set to obtain the slice point clouds P_n required for model training;
slicing a point cloud comprises: for the original point cloud O_n, n ∈ {0, …, N−1}, determining a random number rand_i ∈ [0, 1), i ∈ {X, Y, Z}, on each of the X, Y and Z axes, and finding the maximum value aMax_i and the minimum value aMin_i, i ∈ {X, Y, Z}, of the original point cloud O_n on each axis; if any point p of the point cloud O_n satisfies the formula (p_i − aMin_i)/(aMax_i − aMin_i) > rand_i, i ∈ {X, Y, Z}, combining the points p satisfying the formula into the slice point cloud P_n, n ∈ {0, …, N−1}, wherein N denotes the number of original point clouds in the training set;
S3, inputting the slice point cloud P_n into the point cloud segmentation model, using a fully convolutional network to search for overlapping regions between P_n and all original point clouds in the point cloud data set, obtaining all original point clouds that overlap with the slice point cloud P_n, and taking the original point cloud O with the largest overlap as the reference point cloud of P_n; selecting a fixed number of interest points from the overlapping region of P_n and O, and aligning the overlapping regions of the two point clouds with a feature-based point cloud alignment algorithm;
wherein using a fully convolutional network to search for overlapping regions between the slice point cloud P_n and all original point clouds in the point cloud data set and obtaining all original point clouds that overlap with P_n specifically comprises:
S31, extracting the features of the slice point cloud P_n and of each original point cloud through a Unet network to form a point cloud pair (P, O), wherein P denotes the slice point cloud P_n and O denotes the retrieved optimal original point cloud;
S32, inputting the point cloud pair (P, O) into an overlap-region detection module, performing overlap-region detection in the overlap-region detection module using the nearest neighbor algorithm KNN, and outputting the set of point pairs (P′, O′) in the overlapping region;
S33, overlap-region optimization: removing noisy point pairs using an overlap-region optimization module to obtain the optimized point pairs (P″, O″);
S34, according to the optimized point pairs (P″, O″), solving the rotation matrix M = [R | t] and spatially transforming the reference point cloud O to obtain the aligned point cloud Ô, i.e.
Ô = R·O + t
wherein M denotes the rotation matrix, R denotes the rotation amount in the rotation matrix, and t denotes the translation amount in the rotation matrix;
S4, completing the point cloud segmentation using a nearest neighbor algorithm;
S5, after the point cloud segmentation is completed, computing the total loss of the training process, propagating the total loss of the training process back through the model according to the chain rule to optimize the parameters, and optimizing the point cloud segmentation model with stochastic gradient descent to obtain the optimal point cloud segmentation model, i.e. the trained point cloud segmentation model.
2. The point cloud segmentation method based on overlapping-region retrieval and alignment according to claim 1, characterized in that extracting the features of the slice point cloud P_n and of each original point cloud through the Unet network in step S31 to form the point cloud pair (P, O) specifically comprises:
S311, first inputting the slice point cloud P_n into a Unet network and extracting the feature F of P_n through the Unet network;
S312, processing the feature F of the slice point cloud P_n through a global pooling layer to obtain the pooled feature F_P, expressed as:
F_P = MaxPool(Unet(P_n))
S313, then using the Unet network and the global pooling layer to extract the features of each original point cloud O_n, n ∈ {0, …, N−1}, in the original data set, and outputting the feature F_{O_n} corresponding to each original point cloud in the original data set, wherein O_n denotes the n-th point cloud and N denotes the number of point clouds;
S314, based on the feature F_P of the slice point cloud P_n and the features F_{O_n} corresponding to the original point clouds, computing the feature projection error and finding, among all original point clouds, the original point cloud O whose feature F_O has the minimum projection error with respect to the slice point cloud feature, this original point cloud O overlapping with the slice point cloud P_n; the original point cloud O is taken as the reference point cloud of P_n, and the reference point cloud O and the target point cloud P form the point cloud pair (P, O).
3. The method of claim 2, wherein the feature projection error is computed as:
O = argmin_{O ∈ D_1} ||F_P − F_O||_2
wherein D_1 denotes the original data set, O denotes a point cloud in the original data set, F_P denotes the feature of the slice point cloud, and F_O denotes the feature of a point cloud in the original data set.
4. The point cloud segmentation method based on overlapping-region retrieval and alignment according to claim 1, wherein the overlap-region detection using the nearest neighbor algorithm KNN in step S32 is computed as:
F_P = Unet(P)
F_O = Unet(O)
(P′, O′) = KNN(F_P, F_O)
wherein (P, O) is the input point cloud pair of the overlap-region detection module, and (P′, O′) is the output of the overlap-region detection module, representing the set of point pairs in the overlapping region; F_P denotes the feature of the slice point cloud, F_O denotes the feature of the point cloud in the original data set, Unet(·) denotes the output of the Unet network, and KNN(·) denotes the nearest neighbor algorithm operation.
5. The method of claim 1, wherein the overlap-region optimization specifically comprises:
S331, concatenating the set of point pairs (P′, O′) obtained by the overlap-region detection module into an n × 6 matrix, wherein n is the number of corresponding points, and outputting an n × 1 vector through another Unet network, each value in the vector being a weight representing the probability that a point pair (p′_i, o′_i) ∈ (P′, O′), i ∈ {0, …, n−1}, belongs to the overlapping region;
S332, defining a hyper-parameter clip_weights and keeping the point pairs whose weight is greater than or equal to clip_weights to obtain the optimized set of point pairs (P″, O″), namely the point pairs that really belong to the overlapping region, computed as:
weights = Unet(concatenate(P′, O′))
(P″, O″) = {(p′_i, o′_i) ∈ (P′, O′) | weights_i ≥ clip_weights}
wherein (P′, O′) is the noisy set of point pairs output by the overlap-region detection module, concatenate(P′, O′) is the concatenation operation, i.e. the point pairs are spliced together along a given axis, clip_weights is the hyper-parameter used for denoising, and (P″, O″) is the optimized set of point pairs output by the overlap-region optimization module.
6. The method of claim 1, wherein the total loss comprises an overlap-region loss and a point cloud alignment loss:
the overlap-region loss is:
L_region = −(1/M) · Σ_{i=1}^{M} [ y_i · log(ŷ_i) + (1 − y_i) · log(1 − ŷ_i) ]
wherein y is the true value, ŷ is the predicted value produced by a sigmoid function, and M is the number of points in the overlapping region;
the point cloud alignment loss is the Euclidean distance between the true rotation matrix [R | t] and the predicted rotation matrix [R̂ | t̂]:
L_alignment = ||R − R̂||² + ||t − t̂||²
wherein R denotes the true rotation transformation, R̂ denotes the predicted rotation transformation, t denotes the true translation, and t̂ denotes the predicted translation;
the total loss function L is: L = L_region + L_alignment.
CN202011273571.4A 2020-11-14 2020-11-14 Point cloud segmentation method based on overlapped region retrieval and alignment Active CN112365511B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011273571.4A CN112365511B (en) 2020-11-14 2020-11-14 Point cloud segmentation method based on overlapped region retrieval and alignment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011273571.4A CN112365511B (en) 2020-11-14 2020-11-14 Point cloud segmentation method based on overlapped region retrieval and alignment

Publications (2)

Publication Number Publication Date
CN112365511A CN112365511A (en) 2021-02-12
CN112365511B true CN112365511B (en) 2022-06-10

Family

ID=74514799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011273571.4A Active CN112365511B (en) 2020-11-14 2020-11-14 Point cloud segmentation method based on overlapped region retrieval and alignment

Country Status (1)

Country Link
CN (1) CN112365511B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112509B (en) * 2021-04-12 2023-07-04 深圳思谋信息科技有限公司 Image segmentation model training method, device, computer equipment and storage medium
CN113343765B (en) * 2021-05-11 2022-07-22 武汉大学 Scene retrieval method and system based on point cloud rigid registration
CN113139991A (en) * 2021-05-13 2021-07-20 电子科技大学 3D point cloud registration method based on overlapping region mask prediction
CN117291845B (en) * 2023-11-27 2024-03-19 成都理工大学 Point cloud ground filtering method, system, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9760996B2 (en) * 2015-08-11 2017-09-12 Nokia Technologies Oy Non-rigid registration for large-scale space-time 3D point cloud alignment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105469103A (en) * 2014-09-11 2016-04-06 清华大学 Scene recovery method and device based on low-quality RGB-D data
CN105844629A (en) * 2016-03-21 2016-08-10 河南理工大学 Automatic segmentation method for point cloud of facade of large scene city building
CN109767464A (en) * 2019-01-11 2019-05-17 西南交通大学 A kind of point cloud registration method of low Duplication
CN110211129A (en) * 2019-05-17 2019-09-06 西安财经学院 Low covering point cloud registration algorithm based on region segmentation

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Xiaoshui Huang et al. "Feature-Metric Registration: A Fast Semi-Supervised Approach for Robust Point Cloud Registration Without Correspondences". 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. *
王帅 et al. "Overlapping region extraction method for laser point cloud registration" (适用于激光点云配准的重叠区域提取方法). Infrared and Laser Engineering (《红外与激光工程》), 2017, Vol. 46 (S1). *
王开鑫 et al. "3D reconstruction algorithm based on point cloud segmentation and matching" (点云分割匹配的三维重建算法). Journal of Changchun University of Science and Technology (Natural Science Edition) (《长春理工大学学报(自然科学版)》), 2020, Vol. 43 (No. 4). *

Also Published As

Publication number Publication date
CN112365511A (en) 2021-02-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant