CN108319957A

CN108319957A - A large-scale point cloud semantic segmentation method based on a superpoint graph

Info

Publication number: CN108319957A
Application number: CN201810132821.9A
Authority: CN
Inventors: 夏春秋
Original assignee: Shenzhen Vision Technology Co Ltd
Current assignee: Shenzhen Vision Technology Co Ltd
Priority date: 2018-02-09
Filing date: 2018-02-09
Publication date: 2018-07-24

Abstract

The present invention proposes a large-scale point cloud semantic segmentation method based on a superpoint graph. Its main components are: geometrically homogeneous partition, superpoint graph construction, superpoint embedding, semantic segmentation, and training and testing. The process is as follows. The point cloud is first partitioned into geometrically simple shapes, called superpoints; this unsupervised step takes the whole point cloud as input. A superpoint graph is computed from the geometric partition. Each superpoint is then embedded into a vector of fixed dimension, which serves as its descriptor. Finally, because the superpoint graph is much smaller than a graph built on the original point cloud, a deep learning algorithm based on graph convolutions classifies its nodes using rich edge features, and the superpoint embeddings are refined according to the information passed along the superedges. The invention addresses semantic segmentation of large-scale 3D point clouds: the superpoint graph allows a deep learning framework to handle large-scale data, improving segmentation efficiency while preserving fine details.

Description

A large-scale point cloud semantic segmentation method based on a superpoint graph

Technical field

The present invention relates to the field of semantic segmentation, and more particularly to a large-scale point cloud semantic segmentation method based on a superpoint graph.

Background

Semantic segmentation lets a machine automatically partition and recognize the content of an image. It is a fundamental technique for image understanding and plays a pivotal role in autonomous driving systems, unmanned aerial vehicle (UAV) applications and wearable devices. An image is composed of many pixels, and semantic segmentation, as the name suggests, groups those pixels according to the semantic meaning they express. It is an important branch of artificial intelligence and a key part of image understanding in machine vision. In autonomous driving, once the on-board camera has captured images of obstacles such as pedestrians, vehicles, trees or utility poles, the on-board computer can automatically segment and classify the image and prompt the driver to take avoiding action. Semantic segmentation can also process medical images pixel by pixel: images taken by medical instruments can be segmented to detect heart disease, delineate tumors, identify dental caries and so on, assisting diagnosis. A camera can likewise be mounted on a UAV, which photographs its surroundings and uses semantic segmentation to separate buildings, vegetation, roads and other elements of the environment in order to determine waypoints. In robotics, after a robot receives an instruction, its built-in computer calls the camera to photograph surrounding objects and identifies them using image segmentation, so that it can avoid obstacles, reach the designated destination and complete its task, which greatly facilitates people's lives. Although there has been much research on point cloud semantic segmentation, existing datasets are small in scale and lack clear structure, and convolutional neural networks are inefficient on irregular data. Semantic segmentation of large-scale 3D point clouds therefore remains challenging.

The present invention proposes a large-scale point cloud semantic segmentation method based on a superpoint graph, using a deep-learning-based framework to segment large-scale point clouds of millions of points. The point cloud is first partitioned into geometrically simple shapes, called superpoints; this unsupervised step takes the whole point cloud as input. A superpoint graph is computed from the geometric partition. Each superpoint is then embedded into a vector of fixed dimension, which serves as its descriptor. Finally, because the superpoint graph is much smaller than a graph built on the original point cloud, a deep learning algorithm based on graph convolutions classifies its nodes using rich edge features, and the superpoint embeddings are refined according to the information passed along the superedges. The invention addresses semantic segmentation of large-scale 3D point clouds: the superpoint graph allows a deep learning framework to handle large-scale data, improving segmentation efficiency while preserving fine details.

Summary of the invention

To address the small data scale and unclear structure encountered in semantic segmentation, the object of the present invention is to provide a large-scale point cloud semantic segmentation method based on a superpoint graph, using a deep-learning-based framework to segment large-scale point clouds of millions of points. The point cloud is first partitioned into geometrically simple shapes, called superpoints; this unsupervised step takes the whole point cloud as input. A superpoint graph is computed from the geometric partition. Each superpoint is then embedded into a vector of fixed dimension, which serves as its descriptor. Finally, because the superpoint graph is much smaller than a graph built on the original point cloud, a deep learning algorithm based on graph convolutions classifies its nodes using rich edge features, and the superpoint embeddings are refined according to the information passed along the superedges.

To solve the above problems, the present invention provides a large-scale point cloud semantic segmentation method based on a superpoint graph, which mainly comprises:

(1) geometrically homogeneous partition;

(2) superpoint graph construction;

(3) superpoint embedding;

(4) semantic segmentation;

(5) training and testing.

Wherein, in the geometrically homogeneous partition, the point cloud is partitioned into geometrically simple shapes, called superpoints. This unsupervised step takes the whole point cloud as input, and the superpoint graph (SPG) is computed from the geometric partition. Each node of the SPG corresponds to a small part of the point cloud belonging to a geometrically simple object, which is expected to be semantically homogeneous and can be represented by down-sampling its small point cloud to a few hundred points.

Further, regarding the partition: a global energy model can partition the point cloud while adapting to local geometric complexity, so a global energy model is used for computational efficiency. The input point cloud C is treated as a set of n 3D points. Each point i ∈ C is defined by its 3D position p_i and, where available, other observations o_i such as color or intensity. For each point, a set of d_g geometric features f_i is computed to characterize the shape of its local neighborhood; variable sampling density is compensated by using an adaptive neighborhood. Three dimensionality values are used, namely linearity, planarity and scattering, and verticality is introduced in addition. The elevation of each point is also computed, defined as the z coordinate of p_i normalized over the whole input cloud. The geometrically homogeneous partition is defined as the constant connected components of the solution of the optimization problem given by the following formula:

where [· ≠ 0] is an indicator function that equals 0 when its argument is 0 and 1 otherwise. The coefficient μ is the regularization strength and determines the coarseness of the resulting partition. The constant connected components S = {S₁, …, S_k} of the solution of equation (1) define the geometrically simple elements, i.e. the superpoints.
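The per-point features and the energy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the eigenvalue formulas for linearity, planarity and scattering are the commonly used definitions and are not spelled out in this text; a fixed `k`-nearest neighborhood stands in for the adaptive neighborhood; and the energy is assumed to be a squared fidelity term plus μ times the weighted number of adjacency edges whose endpoints take different values. The function names and the `edges` and `weights` arguments are hypothetical.

```python
import numpy as np

def eigen_features(points, k=8):
    """Linearity, planarity and scattering per point (assumed standard definitions)."""
    n = len(points)
    feats = np.zeros((n, 3))
    # Brute-force pairwise distances: fine for a toy cloud, not for millions of points.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    for i in range(n):
        nbrs = points[np.argsort(dists[i])[:k]]  # fixed-size neighborhood (assumption)
        # Eigenvalues of the neighborhood covariance, sorted in decreasing order.
        lam = np.sort(np.linalg.eigvalsh(np.cov(nbrs.T)))[::-1]
        lam = np.maximum(lam, 1e-12)  # guard against zero or slightly negative values
        feats[i] = [(lam[0] - lam[1]) / lam[0],   # linearity
                    (lam[1] - lam[2]) / lam[0],   # planarity
                    lam[2] / lam[0]]              # scattering
    return feats

def partition_energy(f, g, edges, mu, weights=None):
    """Assumed energy: sum ||g_i - f_i||^2 + mu * sum w_ij [g_i - g_j != 0]."""
    f, g, edges = np.asarray(f, float), np.asarray(g, float), np.asarray(edges)
    if weights is None:
        weights = np.ones(len(edges))  # unit edge weights (assumption)
    fidelity = np.sum((g - f) ** 2)
    # An edge is "cut" when the two endpoint values differ in any component.
    cut = np.any(g[edges[:, 0]] != g[edges[:, 1]], axis=1)
    return fidelity + mu * np.sum(weights * cut)
```

A larger μ penalizes cut edges more heavily, so the minimizer has fewer and larger constant components, which matches the role of μ as the control on partition coarseness.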

Wherein, in the superpoint graph construction, the SPG is a structured representation of the point cloud, defined as a directed attributed graph whose nodes are the set of superpoints S and whose superedges ε represent the adjacency relationships between superpoints; the superedges are annotated with a set of d_f features. To define adjacency between superpoints, let G_vor = (C, E_vor) be the symmetric Voronoi adjacency graph of the complete input point cloud. Two superpoints S and T are adjacent if E_vor contains at least one edge with one endpoint in S and the other in T.

A superedge (S, T) thus joins two superpoints connected through E_vor. Important spatial features of the superedge are obtained from the set of offsets δ(S, T) over the edges of E_vor linking the two superpoints:

δ(S, T) = {(p_i - p_j) | (i, j) ∈ E_vor ∩ (S × T)} (3)

Superedge features are also derived by comparing the shapes and sizes of adjacent superpoints. |S| denotes the number of points contained in superpoint S, and λ₁, λ₂, λ₃ denote the eigenvalues of the covariance of the positions of the points in each superpoint, sorted in decreasing order. The shape features are derived from them: length(S) = λ₁, surface(S) = λ₁λ₂ and volume(S) = λ₁λ₂λ₃.
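The offset set of equation (3) and the shape features above can be sketched as follows. This is an illustrative sketch only: the adjacency edges are passed in as a plain list of index pairs (computing the Voronoi adjacency graph itself is out of scope here), and the names `superedge_offsets`, `shape_features`, `labels` and `adjacency_edges` are hypothetical.

```python
import numpy as np

def superedge_offsets(positions, labels, adjacency_edges, s, t):
    """delta(S, T): offsets p_i - p_j over edges (i, j) with i in S and j in T."""
    offsets = [positions[i] - positions[j]
               for i, j in adjacency_edges
               if labels[i] == s and labels[j] == t]
    return np.array(offsets, dtype=float).reshape(-1, 3)

def shape_features(points):
    """Point count, length, surface and volume of one superpoint."""
    # Eigenvalues of the position covariance, sorted in decreasing order.
    lam = np.sort(np.linalg.eigvalsh(np.cov(points.T)))[::-1]
    lam = np.maximum(lam, 0.0)  # clip tiny negative values from round-off
    return {
        "count": len(points),
        "length": lam[0],
        "surface": lam[0] * lam[1],
        "volume": lam[0] * lam[1] * lam[2],
    }
```

Superedge attributes can then be formed by comparing the two endpoints, for example through statistics of the offsets or ratios of the counts and shape features; which comparisons are used is not fixed by this sketch.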

Wherein, in the superpoint embedding, each superpoint S_i is embedded into a vector z_i of fixed dimension d_z, which serves as its descriptor; each superpoint is embedded independently. A deep learning point network, PointNet, is chosen for this purpose. In this network, the input points are first aligned by a spatial transformer network, then processed independently by a multi-layer perceptron, and finally aggregated by max-pooling to summarize the input shape. Because the input shape is a geometrically simple object, it can be represented with a small number of points and embedded by a compact PointNet. Superpoints are sampled down to n_p = 128 points to keep batch computation efficient and to provide data augmentation; superpoints with fewer than n_p points are sampled with replacement, which in principle does not affect the evaluation of the PointNet max-pooling. Experiments show, however, that including superpoints with n_minp = 40 points or fewer in training harms the overall performance of the network, so the embedding of such superpoints is set to zero and their classification relies entirely on contextual information. So that PointNet learns the spatial distribution of different shapes, each superpoint is rescaled to the unit sphere before embedding. Points are represented by their normalized position p'_i, observations o_i and geometric features f_i. To retain information about shape size, the original metric diameter of the superpoint is concatenated as an additional feature after the PointNet max-pooling.
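The sampling and normalization steps described above can be sketched as follows. The values n_p = 128 and n_minp = 40 come from the text; everything else is an assumption made for illustration, including the function names, the centering on the centroid, and taking the diameter as twice the largest distance to the centroid. The embedding network itself is represented by a placeholder callable `embed_fn`.

```python
import numpy as np

N_P = 128     # points sampled per superpoint (from the text)
N_MINP = 40   # superpoints with this many points or fewer get a zero embedding

def prepare_superpoint(points, rng, n_p=N_P):
    """Sample exactly n_p points and rescale them to the unit sphere."""
    n = len(points)
    # Without replacement when enough points exist, with replacement otherwise.
    idx = rng.choice(n, size=n_p, replace=n < n_p)
    sample = points[idx]
    centered = sample - sample.mean(axis=0)
    radius = np.linalg.norm(centered, axis=1).max()
    diameter = 2.0 * radius  # metric size kept as an extra feature (assumed definition)
    normalized = centered / radius if radius > 0 else centered
    return normalized, diameter

def embed_or_zero(points, embed_fn, d_z, n_minp=N_MINP):
    """Return a zero descriptor for very small superpoints, else embed_fn(points)."""
    if len(points) <= n_minp:
        return np.zeros(d_z)
    return embed_fn(points)
```

Because max-pooling over points ignores duplicates, sampling a small superpoint with replacement leaves the pooled feature unchanged, which is why the text notes that it does not affect the max-pooling in principle.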

Wherein, in the semantic segmentation, because the superpoint graph is much smaller than a graph built on the original point cloud, a deep learning algorithm based on graph convolutions can classify its nodes using rich edge features, which facilitates long-range interactions.

Further, regarding the segmentation: the superpoint embeddings are refined according to the information passed along the superedges. Specifically, each superpoint S_i maintains a hidden state in a gated recurrent unit (GRU). The hidden state is initialized with the embedding z_i and is then processed over iterations t = 1, …, T. At each iteration t, the GRU takes its hidden state and an incoming message as input and computes its new hidden state. The incoming message of superpoint i is computed as a weighted sum of the hidden states of the neighboring superpoints j. The actual weighting of a superedge (j, i) depends on its attributes F_ji, which are transformed by a multi-layer perceptron Θ:

where ⊙ is element-wise multiplication, σ(·) is the sigmoid function, and W. and b. are trainable parameters shared among all GRUs. Equation (4) gives the update gate of the standard GRU rules and equation (5) gives the reset gate r_i^(t). To improve stability during training, a normalization defined as ρ(a) := (a - mean(a))/(std(a) + ε) is applied separately to the linearly transformed input in equation (8) and to the transformed hidden state in equation (7), where ε is a small constant.

Further, regarding the gating: in equation (9) the incoming message is gated by the hidden state before it is used as input, so that the GRU can attenuate its input vector according to its hidden state. Θ could return a weight matrix in order to perform a matrix-vector multiplication for each edge. Although this has been shown to reproduce conventional convolution on a grid, it leads to longer run time, higher memory use and more parameters. Equation (10) is therefore used to return an edge-specific weight vector, and element-wise multiplication is performed.
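One refinement iteration of the kind described above can be sketched as follows. This is a sketch under stated assumptions rather than the exact equations (4) to (10), which are not reproduced in this text: the parameter names and shapes are hypothetical, a single `tanh` layer stands in for the multi-layer perceptron Θ, and the gate arrangement follows the standard GRU form. What the sketch takes from the text is the element-wise edge weighting, the gating of the incoming message by the hidden state, and the normalization ρ.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def rho(a, eps=1e-5):
    # Normalization from the text: (a - mean(a)) / (std(a) + eps).
    return (a - a.mean()) / (a.std() + eps)

def init_params(rng, d, d_f, scale=0.1):
    # Hypothetical parameter layout, shared by all superpoints.
    return {
        "W_e": rng.normal(0, scale, (d_f, d)), "b_e": np.zeros(d),
        "W_g": rng.normal(0, scale, (d, d)), "b_g": np.zeros(d),
        "W_x": rng.normal(0, scale, (3 * d, d)), "b_x": np.zeros(3 * d),
        "W_h": rng.normal(0, scale, (3 * d, d)), "b_h": np.zeros(3 * d),
    }

def refine_step(h, edges, edge_attr, p):
    """h: (k, d) hidden states; edges: list of (j, i); edge_attr: (len(edges), d_f)."""
    k, d = h.shape
    # Edge-specific weight vectors (one-layer stand-in for the MLP Theta).
    theta = np.tanh(edge_attr @ p["W_e"] + p["b_e"])
    msg = np.zeros_like(h)
    for (j, i), w in zip(edges, theta):
        msg[i] += w * h[j]  # element-wise product instead of a matrix-vector product
    # Input gating: the hidden state decides how much of the message gets through.
    x = sigmoid(h @ p["W_g"].T + p["b_g"]) * msg
    h_new = np.empty_like(h)
    for i in range(k):
        xs = rho(p["W_x"] @ x[i] + p["b_x"]).reshape(3, d)
        hs = rho(p["W_h"] @ h[i] + p["b_h"]).reshape(3, d)
        u = sigmoid(xs[1] + hs[1])        # update gate
        r = sigmoid(xs[2] + hs[2])        # reset gate
        q = np.tanh(xs[0] + r * hs[0])    # candidate state
        h_new[i] = (1.0 - u) * q + u * h[i]
    return h_new
```

Returning a weight vector per edge keeps the cost of each message at d multiplications, whereas a full weight matrix per edge would cost d² multiplications and require the network to output d² values per edge.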

Further, regarding the hidden state: the hidden states of all time steps are concatenated and linearly transformed to produce the segmentation output y_i, given by the following equation:

As a result, the final classification can exploit the dynamics of the hidden state as the receptive field grows.
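The concatenation step can be sketched in a few lines. This is illustrative only: the output matrix `W_o` and the function name are hypothetical, and whether the initial embedding counts as one of the concatenated states is an assumption of the sketch.

```python
import numpy as np

def classify_from_states(states, W_o):
    """states: list of (k, d) arrays, one per time step; returns (k, n_classes) scores."""
    stacked = np.concatenate(states, axis=1)  # shape (k, len(states) * d)
    return stacked @ W_o                      # linear transform to class scores
```

Each later state has seen messages from one hop further away, so the concatenation exposes features at several receptive field sizes to the classifier at once.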

Wherein, in the training and testing: for training, although the geometric partition step is unsupervised, the superpoint embedding and the semantic segmentation are trained in a supervised manner with a cross-entropy loss. Superpoints are assumed to be semantically homogeneous, so each superpoint is assigned the ground-truth label held by the majority of the points it contains. Neighborhoods of order 3 in the graph are sampled so that at most 512 superpoints with more than n_minp points are selected per SPG. For testing, the complete SPG is labeled. To compensate for the randomness introduced by subsampling the point clouds in PointNet, the segmentation predictions obtained from 10 runs with different sampled points are averaged.
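The majority labeling used for training and the averaging used at test time can be sketched as follows. The figure of 10 runs comes from the text; the function names and the `predict_fn` callable, which stands in for one full forward pass with its own random point sampling, are hypothetical.

```python
import numpy as np

def majority_label(point_labels, n_classes):
    """Ground-truth label of a superpoint: the most frequent label among its points."""
    counts = np.bincount(point_labels, minlength=n_classes)
    return int(counts.argmax())  # ties resolve to the lowest class index

def averaged_prediction(predict_fn, n_runs=10, seed=0):
    """Average class scores over several runs that each sample different points."""
    rng = np.random.default_rng(seed)
    runs = [predict_fn(rng) for _ in range(n_runs)]
    return np.mean(runs, axis=0)
```

Averaging the scores before taking the arg-max reduces the chance that one unlucky subsample flips the label of a superpoint.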

Brief description of the drawings

Fig. 1 is a system framework diagram of the large-scale point cloud semantic segmentation method based on a superpoint graph according to the present invention.

Fig. 2 is a visualized flow diagram of the large-scale point cloud semantic segmentation method based on a superpoint graph according to the present invention.

Fig. 3 is a segmentation example of the large-scale point cloud semantic segmentation method based on a superpoint graph according to the present invention.

Detailed description of the embodiments

It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another. The invention is described in further detail below with reference to the drawings and specific embodiments.

Fig. 1 is a system framework diagram of the large-scale point cloud semantic segmentation method based on a superpoint graph according to the present invention. The method mainly comprises geometrically homogeneous partition, superpoint graph construction, superpoint embedding, semantic segmentation, and training and testing.

Fig. 2 is a visualized flow diagram of the large-scale point cloud semantic segmentation method based on a superpoint graph according to the present invention. The nodes of the superpoint graph represent simple shapes, and the edges describe their adjacency relationships with rich features. The input point cloud (a) is partitioned into geometrically simple shapes, called superpoints; (b) shows the geometric partition. Based on this preprocessing, the superpoint graph (c) is constructed by linking nearby superpoints with attributed superedges. Finally, the superpoints are compactly embedded, processed with graph convolutions to exploit contextual information, and classified into semantic labels; (d) shows the semantic segmentation.

Fig. 3 is a segmentation example of the large-scale point cloud semantic segmentation method based on a superpoint graph according to the present invention. The example is obtained from a scan of a table and a chair. In (a), the geometric partition is performed on the point cloud, from which the superpoint graph (b) is constructed. Each superpoint is embedded by a PointNet; the network in (c) then refines the embeddings in GRUs and generates the final labels.

For those skilled in the art, the present invention is not limited to the details of the above embodiments, and it can be implemented in other specific forms without departing from its spirit and scope. In addition, those skilled in the art may make various modifications and variations to the invention without departing from its spirit and scope, and such improvements and variations shall also be regarded as falling within the scope of protection of the invention. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Claims

1. A large-scale point cloud semantic segmentation method based on a superpoint graph, characterized in that it mainly comprises: geometrically homogeneous partition (1); superpoint graph construction (2); superpoint embedding (3); semantic segmentation (4); and training and testing (5).

2. The geometrically homogeneous partition (1) according to claim 1, characterized in that the point cloud is partitioned into geometrically simple shapes, called superpoints; this unsupervised step takes the whole point cloud as input, and the superpoint graph (SPG) is computed from the geometric partition; each node of the SPG corresponds to a small part of the point cloud belonging to a geometrically simple object, which is expected to be semantically homogeneous and can be represented by down-sampling its small point cloud to a few hundred points.

3. The partition according to claim 2, characterized in that a global energy model can partition the point cloud while adapting to local geometric complexity, so a global energy model is used for computational efficiency; the input point cloud C is treated as a set of n 3D points; each point i ∈ C is defined by its 3D position p_i and, where available, other observations o_i such as color or intensity; for each point, a set of d_g geometric features f_i is computed to characterize the shape of its local neighborhood, variable sampling density being compensated by using an adaptive neighborhood; three dimensionality values are used, namely linearity, planarity and scattering, and verticality is introduced in addition; the elevation of each point is also computed, defined as the z coordinate of p_i normalized over the whole input cloud; the geometrically homogeneous partition is defined as the constant connected components of the solution of the optimization problem given by the following formula:

where [· ≠ 0] is an indicator function that equals 0 when its argument is 0 and 1 otherwise; the coefficient μ is the regularization strength and determines the coarseness of the resulting partition; the constant connected components S = {S₁, …, S_k} of the solution of equation (1) define the geometrically simple elements.

4. The superpoint graph construction (2) according to claim 1, characterized in that the SPG is a structured representation of the point cloud, defined as a directed attributed graph whose nodes are the set of superpoints S and whose superedges ε represent the adjacency relationships between superpoints, the superedges being annotated with a set of d_f features; to define adjacency between superpoints, G_vor = (C, E_vor) is defined as the symmetric Voronoi adjacency graph of the complete input point cloud, and two superpoints S and T are adjacent if E_vor contains at least one edge with one endpoint in S and the other in T;

a superedge (S, T) thus joins two superpoints connected through E_vor, and important spatial features of the superedge are obtained from the set of offsets δ(S, T) over the edges of E_vor linking the two superpoints:

δ(S, T) = {(p_i - p_j) | (i, j) ∈ E_vor ∩ (S × T)} (3)

superedge features are also derived by comparing the shapes and sizes of adjacent superpoints; |S| denotes the number of points contained in superpoint S, and λ₁, λ₂, λ₃ denote the eigenvalues of the covariance of the positions of the points in each superpoint, sorted in decreasing order; the shape features are derived from them: length(S) = λ₁, surface(S) = λ₁λ₂ and volume(S) = λ₁λ₂λ₃.

5. The superpoint embedding (3) according to claim 1, characterized in that each superpoint S_i is embedded into a vector z_i of fixed dimension d_z, which serves as its descriptor, each superpoint being embedded independently; a deep learning point network, PointNet, is chosen, in which the input points are first aligned by a spatial transformer network, then processed independently by a multi-layer perceptron, and finally aggregated by max-pooling to summarize the input shape; because the input shape is a geometrically simple object, it can be represented with a small number of points and embedded by a compact PointNet; superpoints are sampled down to n_p = 128 points to keep batch computation efficient and to provide data augmentation, and superpoints with fewer than n_p points are sampled with replacement, which in principle does not affect the evaluation of the PointNet max-pooling; experiments show, however, that including superpoints with n_minp = 40 points or fewer in training harms the overall performance of the network, so the embedding of such superpoints is set to zero and their classification relies entirely on contextual information; so that PointNet learns the spatial distribution of different shapes, each superpoint is rescaled to the unit sphere before embedding; points are represented by their normalized position p'_i, observations o_i and geometric features f_i; to retain information about shape size, the original metric diameter of the superpoint is concatenated as an additional feature after the PointNet max-pooling.

6. The semantic segmentation (4) according to claim 1, characterized in that, because the superpoint graph is much smaller than a graph built on the original point cloud, a deep learning algorithm based on graph convolutions can classify its nodes using rich edge features, which facilitates long-range interactions.

7. The segmentation according to claim 6, characterized in that the superpoint embeddings are refined according to the information passed along the superedges; specifically, each superpoint S_i maintains a hidden state in a gated recurrent unit (GRU); the hidden state is initialized with the embedding z_i and is then processed over iterations t = 1, …, T; at each iteration t, the GRU takes its hidden state and an incoming message as input and computes its new hidden state; the incoming message of superpoint i is computed as a weighted sum of the hidden states of the neighboring superpoints j; the actual weighting of a superedge (j, i) depends on its attributes F_ji, which are transformed by a multi-layer perceptron Θ:

where ⊙ is element-wise multiplication, σ(·) is the sigmoid function, and W. and b. are trainable parameters shared among all GRUs; equation (4) gives the update gate of the standard GRU rules and equation (5) gives the reset gate r_i^(t); to improve stability during training, a normalization defined as ρ(a) := (a - mean(a))/(std(a) + ε) is applied separately to the linearly transformed input in equation (8) and to the transformed hidden state in equation (7), where ε is a small constant.

8. The gating according to claim 7, characterized in that in equation (9) the incoming message is gated by the hidden state before it is used as input, so that the GRU can attenuate its input vector according to its hidden state; Θ could return a weight matrix in order to perform a matrix-vector multiplication for each edge, and although this has been shown to reproduce conventional convolution on a grid, it leads to longer run time, higher memory use and more parameters; equation (10) is therefore used to return an edge-specific weight vector, and element-wise multiplication is performed.

9. The hidden state according to claim 8, characterized in that the hidden states of all time steps are concatenated and linearly transformed to produce the segmentation output y_i, given by the following equation:

10. The training and testing (5) according to claim 1, characterized in that, for training: although the geometric partition step is unsupervised, the superpoint embedding and the semantic segmentation are trained in a supervised manner with a cross-entropy loss; superpoints are assumed to be semantically homogeneous, so each superpoint is assigned the ground-truth label held by the majority of the points it contains; neighborhoods of order 3 in the graph are sampled so that at most 512 superpoints with more than n_minp points are selected per SPG; for testing: the complete SPG is labeled, and to compensate for the randomness introduced by subsampling the point clouds in PointNet, the segmentation predictions obtained from 10 runs with different sampled points are averaged.