CN113191361B - A Shape Recognition Method - Google Patents

A Shape Recognition Method

Info

Publication number
CN113191361B
CN113191361B (application CN202110418108.2A)
Authority
CN
China
Prior art keywords
shape
segmentation
layer
points
contour
Prior art date
Legal status
Active
Application number
CN202110418108.2A
Other languages
Chinese (zh)
Other versions
CN113191361A (en)
Inventor
杨剑宇
李一凡
闵睿朋
黄瑶
Current Assignee
Suzhou University
Original Assignee
Suzhou University
Priority date
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN202110418108.2A priority Critical patent/CN113191361B/en
Publication of CN113191361A publication Critical patent/CN113191361A/en
Application granted granted Critical
Publication of CN113191361B publication Critical patent/CN113191361B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

本发明提出一种形状识别方法,提取形状样本的轮廓关键点;定义各关键点处的近似偏置曲率值并判断关键点处的凹凸性,以获取候选分割点;调整曲率筛选阈值,得到形状分割点;计算最小分割代价进行形状分割,得到若干子形状部分;构建形状样本的拓扑结构;使用形状的全尺度可视化表示方法,得到对应子形状部分的特征表达图像;将各特征表达图像输入卷积神经网络进行训练,学习得到各子形状部分的特征向量;构造形状样本的特征矩阵;构建图卷积神经网络;训练图卷积神经网络,获取测试样本的特征矩阵和邻接矩阵,并输入至训练好的图卷积网络模型中,实现形状分类识别。

The present invention proposes a shape recognition method: extract the contour key points of shape samples; define an approximate offset curvature value at each key point and judge the concavity or convexity at the key point to obtain candidate segmentation points; adjust the curvature screening threshold to obtain the shape segmentation points; perform shape segmentation at minimum segmentation cost to obtain several sub-shape parts; construct the topological structure of the shape sample; use the full-scale visual representation of the shape to obtain the feature expression image of each corresponding sub-shape part; input each feature expression image into a convolutional neural network for training and learn the feature vector of each sub-shape part; construct the feature matrix of the shape sample; build a graph convolutional neural network; train the graph convolutional network, obtain the feature matrix and adjacency matrix of a test sample, and input them into the trained graph convolutional network model to realize shape classification and recognition.

Description

一种形状识别方法A shape recognition method

技术领域Technical Field

本发明涉及一种形状识别方法,属于形状识别技术领域。The invention relates to a shape recognition method and belongs to the technical field of shape recognition.

背景技术Background Art

轮廓形状识别是机器视觉领域的一个重要研究方向,利用物体形状特征进行目标识别是机器视觉的主要研究课题,这项研究的主要成果是通过改进形状匹配算法或设计有效的形状描述符来充分提取目标形状特征用以进行更好的相似性度量。这在工程中得到了广泛应用,如雷达、红外成像检测、图像及视频的匹配与检索、机器人自动导航、场景语义分割、纹理识别和数据挖掘等多个领域中。Contour shape recognition is an important research direction in the field of machine vision. Using object shape features for target recognition is the main research topic of machine vision. The main result of this research is to fully extract the target shape features by improving the shape matching algorithm or designing effective shape descriptors for better similarity measurement. This has been widely used in engineering, such as radar, infrared imaging detection, image and video matching and retrieval, robot automatic navigation, scene semantic segmentation, texture recognition and data mining.

通常,对于轮廓形状的表达和检索基于手工设计的形状描述子来提取目标轮廓特征,如Shape Contexts,Shape Vocabulary和Bag of contour fragments等。但是通过手工描述子提取所得的形状信息通常不完备,无法保证描述子对目标形状的局部变化、遮挡和整体变形等变化具有鲁棒性。而设计过多种描述子则会导致特征提取冗余,计算复杂度较高。因此,识别准确率和效率较低。近年来,随着卷积神经网络在图像识别任务中取得较好的成绩,其在形状识别任务开始得以应用。而由于轮廓形状缺少表面纹理、色彩等图像具备的信息,直接应用卷积神经网络的识别效果较低。Usually, the expression and retrieval of contour shapes are based on manually designed shape descriptors to extract target contour features, such as Shape Contexts, Shape Vocabulary, and Bag of contour fragments. However, the shape information extracted by manual descriptors is usually incomplete, and it cannot be guaranteed that the descriptors are robust to changes in local changes, occlusions, and overall deformation of the target shape. Designing too many descriptors will lead to redundant feature extraction and high computational complexity. Therefore, the recognition accuracy and efficiency are low. In recent years, as convolutional neural networks have achieved good results in image recognition tasks, they have begun to be applied in shape recognition tasks. However, since contour shapes lack information such as surface texture and color that images have, the recognition effect of directly applying convolutional neural networks is low.

针对以上形状识别算法的问题,如何提供一种能够对目标轮廓形状进行准确分类的目标识别方法,是目前本领域技术人员亟待解决的问题。In view of the above problems of shape recognition algorithms, how to provide a target recognition method that can accurately classify the target contour shape is a problem that urgently needs to be solved by those skilled in the art.

发明内容Summary of the invention

本发明是为解决现有技术中的问题而提出的,技术方案如下,The present invention is proposed to solve the problems in the prior art, and the technical solution is as follows:

一种形状识别方法,该方法包括以下步骤:A shape recognition method, the method comprising the following steps:

步骤一、提取形状样本的轮廓关键点;Step 1: Extract the contour key points of the shape sample;

步骤二、定义各关键点处的近似偏置曲率值并判断关键点处的曲线凹凸性,以获取候选形状分割点;Step 2: define the approximate bias curvature value at each key point and determine the concavity and convexity of the curve at the key point to obtain candidate shape segmentation points;

步骤三、调整曲率筛选阈值,得到形状分割点;Step 3: Adjust the curvature screening threshold to obtain the shape segmentation point;

步骤四、基于分割线段位于形状内且互相不交叉的原则进行形状分割,并以最小分割代价分割得到若干子形状部分;Step 4: segment the shape based on the principle that the segmentation line segments are located within the shape and do not cross each other, and obtain several sub-shape parts with the minimum segmentation cost;

步骤五、构建形状样本的拓扑结构;Step 5: construct the topological structure of the shape sample;

步骤六、使用形状的全尺度可视化表示方法,得到对应子形状部分的特征表达图像;Step 6: Use the full-scale visualization method of the shape to obtain a feature expression image of the corresponding sub-shape part;

步骤七、将各特征表达图像输入卷积神经网络进行训练,学习得到各子形状部分的特征向量;Step 7: Input each feature expression image into the convolutional neural network for training, and learn to obtain the feature vector of each sub-shape part;

步骤八、构造形状样本的特征矩阵;Step 8: construct a feature matrix of shape samples;

步骤九、构建图卷积神经网络;Step 9: Construct a graph convolutional neural network;

步骤十、训练图卷积神经网络,对测试样本进行形状分割,获取各子形状部分的特征向量,计算测试样本的特征矩阵和邻接矩阵,并输入至训练好的图卷积网络模型中,实现形状分类识别。Step 10: Train the graph convolutional neural network, perform shape segmentation on the test sample, obtain the feature vector of each sub-shape part, calculate the feature matrix and adjacency matrix of the test sample, and input them into the trained graph convolutional network model to realize shape classification and recognition.

优选的,所述步骤一中,提取轮廓关键点的方法为:Preferably, in step 1, the method for extracting contour key points is:

每一个形状样本的轮廓是由一系列抽样点组成的,对于任一形状样本S来说,对轮廓抽样n个点得到:The contour of each shape sample is composed of a series of sampling points. For any shape sample S, sampling n points of the contour yields:

S = {(p_x(i), p_y(i)) | i ∈ [1, n]},

其中,px(i),py(i)为轮廓抽样点p(i)在二维平面内的横、纵坐标,n为轮廓长度,即轮廓抽样点的个数;Where p x (i), p y (i) are the horizontal and vertical coordinates of the contour sampling point p (i) in the two-dimensional plane, and n is the contour length, that is, the number of contour sampling points;

对形状样本的轮廓曲线进行演化来提取关键点,在每一次演化过程中,对目标识别起到贡献最小的点被删除,其中每个点p(i)的贡献定义为:The contour curve of the shape sample is evolved to extract key points. In each evolution process, the point that contributes the least to target recognition is deleted, where the contribution of each point p(i) is defined as:

其中,h(i,i-1)为点p(i)与p(i-1)间的曲线长度,h(i,i+1)为点p(i)与p(i+1)间的曲线长度,H1(i)为线段p(i)p(i-1)与线段p(i)p(i+1)间的角度,长度h根据轮廓周长进行归一化;Con(i)值越大表示该点p(i)对形状特征的贡献越大;Among them, h(i, i-1) is the length of the curve between points p(i) and p(i-1), h(i, i+1) is the length of the curve between points p(i) and p(i+1), H 1 (i) is the angle between line segment p(i)p(i-1) and line segment p(i)p(i+1), and the length h is normalized according to the contour perimeter; the larger the Con(i) value, the greater the contribution of the point p(i) to the shape feature;

本方法引用了一个基于区域的自适应结束函数F(t)克服轮廓关键点提取过多或过少的问题:This method uses a region-based adaptive end function F(t) to overcome the problem of extracting too many or too few contour key points:

其中S0为原始形状的面积,Si为经过i次演变后的形状面积,n0为原始形状轮廓上的总点数;当此结束函数值F(t)超过设定阈值后,轮廓关键点提取结束并得到n*个轮廓关键点。Where S_0 is the area of the original shape, S_i is the area of the shape after i evolutions, and n_0 is the total number of points on the original shape contour; when the value of the end function F(t) exceeds the set threshold, contour key point extraction ends and n* contour key points are obtained.
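
The evolution loop above can be sketched in Python. The exact formulas for Con(i) and the end function F(t) are not reproduced in the text, so the DCE-style combination of turning angle and normalized segment lengths, and the relative-area stopping rule, are assumptions for illustration only:

```python
import numpy as np

def polygon_area(pts):
    # Shoelace formula for the area of a closed polygon.
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def contribution(p_prev, p, p_next, perimeter):
    # Con(i): turning angle H1(i) at p weighted by the perimeter-normalized
    # adjacent lengths h(i,i-1), h(i,i+1) (assumed DCE-style combination).
    h1 = np.linalg.norm(p - p_prev) / perimeter
    h2 = np.linalg.norm(p - p_next) / perimeter
    v1, v2 = p_prev - p, p_next - p
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)
    turn = np.pi - np.arccos(np.clip(cosang, -1.0, 1.0))  # 0 for collinear points
    return turn * h1 * h2 / (h1 + h2 + 1e-12)

def extract_key_points(points, f_stop=0.1):
    # Delete the least-contributing point per iteration; stop once the
    # area-based end function (here: relative area loss, an assumption
    # standing in for the patent's F(t)) would exceed f_stop.
    pts = points.copy()
    s0 = polygon_area(pts)
    while len(pts) > 3:
        perim = np.sum(np.linalg.norm(pts - np.roll(pts, 1, axis=0), axis=1))
        cons = [contribution(pts[i - 1], pts[i], pts[(i + 1) % len(pts)], perim)
                for i in range(len(pts))]
        trial = np.delete(pts, int(np.argmin(cons)), axis=0)
        if abs(s0 - polygon_area(trial)) / s0 > f_stop:
            break
        pts = trial
    return pts
```

On a unit square with one collinear midpoint, the midpoint contributes nothing to the shape and is the first to be removed, while the four corners survive.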

进一步的,所述步骤二中,定义各关键点处的近似偏置曲率值并判断关键点处的曲线凹凸性,以获取候选分割点的具体方法为:Furthermore, in step 2, the specific method of defining the approximate bias curvature value at each key point and judging the concavity and convexity of the curve at the key point to obtain the candidate segmentation point is:

为了计算形状样本S中任意一处关键点p(i)的近似偏置曲率值,取p(i)前后临近的轮廓点p(i-ε),p(i+ε),其中ε为经验取值;由于:In order to calculate the approximate bias curvature value of any key point p(i) in the shape sample S, take the contour points p(i-ε) and p(i+ε) before and after p(i), where ε is an empirical value; because:

cosHε(i) ∝ cur(p(i)),

其中,Hε(i)为线段p(i)p(i-ε)与线段p(i)p(i+ε)间的角度,cur(p(i))为点p(i)处的曲率;定义点p(i)处的近似偏置曲率值cur~(p(i))为:Where Hε(i) is the angle between the line segment p(i)p(i-ε) and the line segment p(i)p(i+ε), cur(p(i)) is the curvature at point p(i); the approximate biased curvature value cur~(p(i)) at point p(i) is defined as:

cur~(p(i)) = cosHε(i) + 1,

其中Hε(i)为线段p(i)p(i-ε)与线段p(i)p(i+ε)间的角度,cosHε(i)取值范围在-1到1之间,cur~(p(i))的取值范围在0到2之间;Where Hε(i) is the angle between the line segment p(i)p(i-ε) and the line segment p(i)p(i+ε), cosHε(i) ranges from -1 to 1, and cur~(p(i)) ranges from 0 to 2;
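
The quantity cur~(p(i)) = cos Hε(i) + 1 follows directly from the two chord vectors at p(i); a minimal sketch:

```python
import numpy as np

def approx_offset_curvature(contour, i, eps=5):
    # cur~(p(i)) = cos(H_eps(i)) + 1, where H_eps(i) is the angle at p(i)
    # between the chords to p(i-eps) and p(i+eps); the result lies in [0, 2].
    n = len(contour)
    p = contour[i]
    a, b = contour[(i - eps) % n], contour[(i + eps) % n]
    v1, v2 = a - p, b - p
    cos_h = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)
    return float(np.clip(cos_h, -1.0, 1.0) + 1.0)
```

A straight stretch of contour gives cur~ ≈ 0 (angle π), and a right-angle corner gives cur~ = 1 (angle π/2).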

根据符合视觉自然性的形状分割方法,形状分割点均位于轮廓凹曲线处;因此在筛选用于形状分割的候选分割点时,定义了一种判断关键点p(i)处曲线凹凸性的方法:According to the shape segmentation method that conforms to visual naturalness, the shape segmentation points are all located at the concave curve of the contour; therefore, when screening the candidate segmentation points for shape segmentation, a method for judging the concavity of the curve at the key point p(i) is defined:

对于形状的二值化图像,形状样本S轮廓内部的像素点的数值均为255,形状样本S轮廓外部的像素点的数值均为0;等距抽样线段p(i-ε)p(i+ε)得到R个离散点,若此R个离散点的像素值均为255,则线段p(i-ε)p(i+ε)全部在形状轮廓内,即p(i)处曲线表现为凸;若此R个离散点的像素值均为0,则线段p(i-ε)p(i+ε)全部在形状轮廓外,即p(i)处曲线表现为凹;记所有曲线表现为凹处的关键点p(i)为候选分割点P(j)。For the binarized image of the shape, pixels inside the contour of shape sample S all have value 255, and pixels outside the contour all have value 0. Sampling the segment p(i-ε)p(i+ε) equidistantly yields R discrete points. If all R points have pixel value 255, the segment p(i-ε)p(i+ε) lies entirely inside the shape contour, i.e. the curve at p(i) is convex; if all R points have pixel value 0, the segment lies entirely outside the shape contour, i.e. the curve at p(i) is concave. Every key point p(i) at which the curve is concave is recorded as a candidate segmentation point P(j).
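
The concavity test above (sample R equidistant points on the chord and inspect their pixel values) can be sketched as:

```python
import numpy as np

def is_concave(binary_img, p_a, p_b, R=20):
    # Sample R equidistant points on the segment p(i-eps)p(i+eps).
    # All background (0)  -> segment outside the shape -> concave key point.
    # All foreground (255)-> segment inside the shape  -> convex key point.
    # p_a, p_b are (x, y); binary_img is indexed as [row, col] = [y, x].
    t = np.linspace(0.0, 1.0, R)
    xs = np.round(p_a[0] + t * (p_b[0] - p_a[0])).astype(int)
    ys = np.round(p_a[1] + t * (p_b[1] - p_a[1])).astype(int)
    vals = binary_img[ys, xs]
    if np.all(vals == 0):
        return True       # entirely outside: concave
    if np.all(vals == 255):
        return False      # entirely inside: convex
    return None           # mixed samples: undetermined at this scale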

进一步的,所述步骤三中,调整曲率筛选阈值Th并得到形状分割点的步骤如下:Furthermore, in step 3, the steps of adjusting the curvature screening threshold Th and obtaining the shape segmentation points are as follows:

(1)对于步骤二中得到的所有候选分割点P(j),将它们的平均近似偏置曲率值作为初始阈值Th0(1) For all candidate segmentation points P(j) obtained in step 2, their average approximate bias curvature value is used as the initial threshold Th 0 :

其中J为候选分割点总个数;Where J is the total number of candidate segmentation points;

(2)对于第τ次调整时的阈值Thτ,根据各候选分割点P(j)的近似偏置曲率值与Thτ的大小关系,可将P(j)分为两类:近似偏置曲率值大于Thτ的候选分割点和近似偏置曲率值小于等于Thτ的候选分割点;计算并记录当前阈值下的分割区分度Dτ:(2) For the threshold Thτ at the τ-th adjustment, according to the relation between the approximate offset curvature value of each candidate segmentation point P(j) and Thτ, the points P(j) can be divided into two classes: candidate segmentation points whose approximate offset curvature value is greater than Thτ, and those whose value is less than or equal to Thτ. Calculate and record the segmentation discrimination Dτ under the current threshold:

其中,Where,

其中分别表示阈值Thτ下各候选分割点P(j)的正负曲率偏差,表示所有候选分割点正曲率偏差的最小值,表示所有候选分割点负曲率偏差的最大值;Where the first quantities denote the positive and negative curvature deviations of each candidate segmentation point P(j) under threshold Thτ, the next denotes the minimum of the positive curvature deviations over all candidate segmentation points, and the last denotes the maximum of the negative curvature deviations over all candidate segmentation points;

判断是否存在近似偏置曲率值大于阈值Thτ的候选分割点,如果不存在,则不再调整,转到步骤(4);如果存在近似偏置曲率值大于阈值Thτ的候选分割点,则转到步骤(3)继续调整阈值;Determine whether there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ . If not, no adjustment is made and the process goes to step (4). If there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ , the process goes to step (3) and continues to adjust the threshold.

(3)继续调整阈值,新的阈值Thτ+1为上一次阈值调整过程中所有候选分割点正曲率偏差的最小值,用公式表示如下:(3) Continue to adjust the threshold. The new threshold Th τ+1 is the minimum value of the positive curvature deviation of all candidate segmentation points in the last threshold adjustment process, which can be expressed by the following formula:

根据阈值Thτ+1计算第τ+1次调整下的各候选分割点的正负曲率偏差以及分割区分度Dτ+1并记录;判断是否存在近似偏置曲率值大于阈值Thτ+1的候选分割点,如果不存在,则不再调整,转到步骤(4);如果存在近似偏置曲率值大于阈值Thτ+1的候选分割点,则令τ=τ+1,重复当前步骤继续调整阈值;According to the threshold Th τ+1, the positive and negative curvature deviations of each candidate segmentation point under the τ+1th adjustment are calculated. and the segmentation discrimination D τ+1 and record them; determine whether there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ+1. If not, no adjustment is made and go to step (4); if there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ+1 , set τ=τ+1 and repeat the current step to continue adjusting the threshold;

(4)多次调整阈值则有多个分割区分度,最大分割区分度对应的阈值为最终的曲率筛选阈值Th,近似偏置曲率值小于该阈值Th的点为最终的形状分割点。(4) If the threshold is adjusted multiple times, there will be multiple segmentation distinctions. The threshold corresponding to the maximum segmentation distinction is the final curvature screening threshold Th, and the point whose approximate bias curvature value is less than the threshold Th is the final shape segmentation point.
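
Steps (1)-(4) can be sketched as a small loop. The exact definitions of the curvature deviations and of the discrimination Dτ are not given above, so the versions below are assumptions: Dτ is taken as the gap between the smallest curvature above the threshold and the largest at or below it, and the next threshold is the smallest curvature above the current one (the minimum positive deviation):

```python
import numpy as np

def select_threshold(curvatures):
    # Iterative refinement of the curvature screening threshold Th.
    cur = np.asarray(curvatures, dtype=float)
    th = cur.mean()                       # initial threshold Th0
    best_th, best_d = th, -np.inf
    while True:
        above, below = cur[cur > th], cur[cur <= th]
        if len(above) == 0 or len(below) == 0:
            break                         # no points above Th: stop adjusting
        d = above.min() - below.max()     # assumed discrimination D_tau
        if d > best_d:
            best_d, best_th = d, th       # keep threshold of max discrimination
        th = above.min()                  # Th_{tau+1}
    return best_th

def final_segmentation_points(curvatures):
    # Candidate points whose approximate offset curvature is below the
    # final threshold Th become the shape segmentation points.
    th = select_threshold(curvatures)
    return [j for j, c in enumerate(curvatures) if c < th]
```

For a curvature list with a clear low/high split, the mean already maximizes the assumed discrimination, and only the low-curvature (deeply concave) candidates survive.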

进一步的,所述步骤四中,基于分割线段位于形状内且互相不交叉的原则进行形状分割,并以最小分割代价分割得到若干子形状部分的具体方法为:Furthermore, in the step 4, the shape segmentation is performed based on the principle that the segmentation line segments are located within the shape and do not intersect each other, and the specific method of segmenting to obtain a plurality of sub-shape parts with the minimum segmentation cost is:

(1)对于任意两个形状分割点P(e1),P(e2),等距抽样分割线段P(e1)P(e2)得到C个离散点,若此C个点中存在像素值为0的离散点,则线段P(e1)P(e2)存在形状轮廓外的部分,不选择作为分割线段;(1) For any two shape segmentation points P(e1), P(e2), sample the segmentation segment P(e1)P(e2) equidistantly to obtain C discrete points. If any of these C points has pixel value 0, part of the segment P(e1)P(e2) lies outside the shape contour, and it is not selected as a segmentation segment;

(2)对于任意两个形状分割点P(e3),P(e4),若已存在一条形状分割线段P(e5)P(e6)使得:(2) For any two shape segmentation points P(e 3 ) and P(e 4 ), if there exists a shape segmentation line segment P(e 5 ) and P(e 6 ) such that:

or

则线段P(e3)P(e4)与已有分割线段P(e5)P(e6)相交,不选择线段P(e3)P(e4)作为分割线段;Then the line segment P(e 3 )P(e 4 ) intersects with the existing segmentation line segment P(e 5 )P(e 6 ), and the line segment P(e 3 )P(e 4 ) is not selected as the segmentation line segment;

(3)对于满足以上两个原则的分割线段集进一步筛选,通过定义三种评价分割线段优劣的度量指标I,在最小分割代价下实现分割:(3) The segmentation line set that meets the above two principles is further screened, and segmentation is achieved at the minimum segmentation cost by defining three metrics I to evaluate the quality of the segmentation line segments:

其中D*(u,v)、L*(u,v)、S*(u,v),分别为归一化的分割长度、分割弧长、分割剩余面积三种分割度量指标,u,v为任意两个形状分割点序号,为分割点总数;Among them, D * (u,v), L * (u,v), and S * (u,v) are three segmentation metrics, namely, normalized segmentation length, segmentation arc length, and segmentation residual area. u and v are the sequence numbers of any two shape segmentation points. is the total number of split points;

对于任意一条形状分割线段P(u)P(v),三种分割评价指标计算方式如下:For any shape segmentation line segment P(u)P(v), the three segmentation evaluation indicators are calculated as follows:

其中Dmax为所有分割线段中长度最大的分割线段的长度,D*(u,v)的取值范围应当在0到1之间,且数值越小分割效果越显著;Where D max is the length of the longest segment among all segmentation segments, and the value range of D * (u, v) should be between 0 and 1, and the smaller the value, the more significant the segmentation effect;

其中为从P(u)到P(v)两点间的轮廓曲线的长度,L*(u,v)的取值范围应当在0到1之间,且数值越小分割效果越显著;Where the arc length is the length of the contour curve between the two points P(u) and P(v); L*(u,v) should range between 0 and 1, and the smaller the value, the more significant the segmentation effect;

其中Sd为分割线段P(u)P(v)分割出的形状面积,即由线段P(u)P(v)和轮廓曲线构成的封闭区域面积,S*(u,v)的取值范围应当在0到1之间,且数值越小分割效果越显著;Where S_d is the shape area cut off by the segmentation segment P(u)P(v), that is, the area of the closed region bounded by the segment P(u)P(v) and the contour curve; S*(u,v) should range between 0 and 1, and the smaller the value, the more significant the segmentation effect;

依据上述步骤计算得到对于分割线段P(u)P(v)的分割代价Cost:According to the above steps, the segmentation cost Cost for the segmentation line segment P(u)P(v) is calculated as:

Cost = αD*(u,v) + βL*(u,v) + γS*(u,v),

其中α,β,γ为各度量指标的权重;Among them, α, β, and γ are the weights of each metric;

计算筛选出的分割线段集中的分割线段的分割代价Cost;对计算得到的全部Cost从小到大进行排序,最终根据形状样本S所属类别设置的分割子形状部分数量N选取Cost最小的N-1条分割线段,从而实现最优分割,得到N个子形状部分;分割子形状部分数量N取决于当前形状样本S所属的类别,对于不同类别的形状,手工设置了对应的分割子形状部分数量。Calculate the segmentation cost Cost of the segmentation line segments in the filtered segmentation line segment set; sort all the calculated costs from small to large, and finally select N-1 segmentation line segments with the smallest Cost according to the number N of segmentation sub-shape parts set for the category to which the shape sample S belongs, so as to achieve optimal segmentation and obtain N sub-shape parts; the number N of segmentation sub-shape parts depends on the category to which the current shape sample S belongs. For shapes of different categories, the corresponding number of segmentation sub-shape parts is manually set.
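
Given the three normalized metrics for each admissible segment, picking the N-1 cheapest cuts reduces to a sort; a sketch (equal weights α = β = γ = 1 are an assumption, as the patent leaves them as free parameters):

```python
def segmentation_cost(d_star, l_star, s_star, alpha=1.0, beta=1.0, gamma=1.0):
    # Cost = alpha*D* + beta*L* + gamma*S* over the three normalized metrics.
    return alpha * d_star + beta * l_star + gamma * s_star

def choose_cut_segments(candidates, n_parts):
    # candidates: list of (segment_id, D*, L*, S*) for segments that already
    # passed the inside-the-shape and no-crossing tests. Returns the N-1
    # lowest-cost segments needed to split the shape into n_parts parts.
    ranked = sorted(candidates, key=lambda c: segmentation_cost(c[1], c[2], c[3]))
    return [c[0] for c in ranked[:n_parts - 1]]
```

With three candidate cuts and a target of N = 3 parts, the two cheapest cuts are kept.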

进一步的,所述步骤五中,构建形状样本的拓扑结构的具体方法为:对于任一形状样本S分割得到的N个子形状部分,将中心形状部分记作起始顶点v1,并将其余邻接形状部分按顺时针方向进行顶点排序,记作顶点{vo|o∈[2,N]};记连接v1到其余各顶点vo的边为(v1,vo),进而构成满足拓扑次序的形状有向图:Furthermore, in step 5, the specific method of constructing the topological structure of the shape sample is as follows: for any N sub-shape parts obtained by segmenting the shape sample S, the central shape part is recorded as the starting vertex v 1 , and the remaining adjacent shape parts are sorted in a clockwise direction and recorded as vertices {v o |o∈[2,N]}; the edge connecting v 1 to the remaining vertices v o is recorded as (v 1 ,v o ), thereby forming a shape directed graph that satisfies the topological order:

G_1 = (V_1, E_1),

其中V1={vo|o∈[1,N]},E1={(v1,vo)|o∈[2,N]};Where V 1 ={v o |o∈[1, N]}, E 1 ={(v 1 , v o )|o∈[2, N]};

对所有训练形状样本全部进行最优分割后,将训练形状样本分割所得子形状部分的最大数量记为对于任一形状样本S,其邻接矩阵的计算方式为:After all training shape samples have been optimally segmented, record the maximum number of sub-shape parts obtained over all training shape samples; for any shape sample S, its adjacency matrix is computed as:

其中表示阶实数矩阵,Where the notation denotes a real square matrix of that maximal order,
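
A minimal sketch of the star-shaped adjacency matrix described above, assuming binary directed edges from the central part v1 to its neighbours and zero-padding to the maximum part count over the training set:

```python
import numpy as np

def star_adjacency(n_parts, n_max):
    # Adjacency matrix of the shape's directed star graph: vertex v1 (the
    # central sub-shape part, index 0) connects to each neighbouring part
    # v2..vN; the matrix is zero-padded to the fixed size n_max x n_max
    # shared by all samples (binary entries are an assumption).
    a = np.zeros((n_max, n_max), dtype=float)
    a[0, 1:n_parts] = 1.0   # edges (v1, vo), o = 2..N
    return a
```

For a shape split into N = 4 parts padded to n_max = 6, only the three edges out of v1 are set.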

进一步的,所述步骤六中,使用形状的全尺度可视化表示方法,得到对应子形状部分的彩色特征表达图像的具体方法为:Furthermore, in step 6, the specific method of using the full-scale visualization representation method of the shape to obtain the color feature expression image of the corresponding sub-shape part is:

对于任一形状样本S的子形状部分S1For any sub-shape part S 1 of a shape sample S:

其中,为该子形状部分的轮廓抽样点p1(i)在二维平面内的横、纵坐标,n1为轮廓长度,即轮廓抽样点的个数;Where the pair of coordinates are the horizontal and vertical coordinates of the contour sampling point p1(i) of this sub-shape part in the two-dimensional plane, and n1 is the contour length, that is, the number of contour sampling points;

首先使用三种形状描述子构成的特征函数M描述该子形状部分S1的轮廓:First, the feature function M composed of three shape descriptors is used to describe the outline of the sub-shape part S1:

M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]},

其中,sk,lk,ck为尺度k中归一化的面积s、弧长l和重心距c三个不变量参数,k为尺度标签,m为总尺度数;分别定义这三个形状不变量描述子:Among them, s k , l k , c k are the three invariant parameters of normalized area s, arc length l and centroid distance c in scale k, k is the scale label, and m is the total number of scales; these three shape invariant descriptors are defined respectively:

以一轮廓抽样点p1(i)为圆心,以初始半径作预设圆C1(i),该预设圆即为计算对应轮廓点参数的初始半全局尺度;依据上述步骤得到预设圆C1(i)后,尺度k=1下三种形状描述子计算方式如下:Taking a contour sampling point p1(i) as the center and the initial radius, draw a preset circle C1(i); this preset circle is the initial semi-global scale for computing the parameters of the corresponding contour point. After the preset circle C1(i) is obtained by the above step, the three shape descriptors at scale k = 1 are computed as follows:

在计算s1(i)描述子时,将预设圆C1(i)中的与目标轮廓点p1(i)具有直接连接关系的区域Z1(i)的面积记为则有:When computing the descriptor s1(i), record the area of the region Z1(i) in the preset circle C1(i) that is directly connected to the target contour point p1(i); then:

其中,B(Z1(i),z)为一指示函数,定义为Where B(Z 1 (i), z) is an indicator function, defined as

将Z1(i)的面积与预设圆C1(i)面积的比值作为目标轮廓点描述子的面积参数s1(i):The ratio of the area of Z 1 (i) to the area of the preset circle C 1 (i) is used as the area parameter s 1 (i) of the target contour point descriptor:

s1(i)的取值范围应当在0到1之间;The value range of s 1 (i) should be between 0 and 1;

在计算c1(i)描述子时,首先计算与目标轮廓点p1(i)具有直接连接关系的区域的重心,具体为对该区域中所有像素点的坐标值求取平均数,所得结果即为该区域的重心的坐标值,可以表示为:When calculating the c 1 (i) descriptor, the centroid of the region directly connected to the target contour point p 1 (i) is first calculated. Specifically, the coordinate values of all pixels in the region are averaged. The result is the coordinate value of the centroid of the region, which can be expressed as:

其中,w1(i)即为上述区域的重心;Among them, w 1 (i) is the centroid of the above area;

然后计算目标轮廓点p1(i)与重心w1(i)的距离,可以表示为:Then compute the distance between the target contour point p1(i) and the centroid w1(i), which can be expressed as:

最后将与目标轮廓点p1(i)的预设圆C1(i)的半径的比值作为该目标轮廓点描述子的重心距参数c1(i):Finally, take the ratio of this distance to the radius of the preset circle C1(i) of the target contour point p1(i) as the centroid-distance parameter c1(i) of that contour point descriptor:

c1(i)的取值范围应当在0到1之间;The value range of c 1 (i) should be between 0 and 1;

在计算l1(i)描述子时,将预设圆C1(i)内与目标轮廓点p1(i)具有直接连接关系的弧段的长度记为并将与预设圆C1(i)周长的比值作为该目标轮廓点描述子的弧长参数l1(i):When computing the descriptor l1(i), record the length of the arc segment inside the preset circle C1(i) that is directly connected to the target contour point p1(i), and take its ratio to the circumference of the preset circle C1(i) as the arc-length parameter l1(i) of that contour point descriptor:

其中,l1(i)的取值范围应当在0到1之间;Among them, the value range of l 1 (i) should be between 0 and 1;

依据上述步骤计算得到在尺度标签k=1,初始半径的半全局尺度下形状样本S的子形状部分S1的特征函数M1:Following the above steps, the feature function M1 of the sub-shape part S1 of shape sample S at the semi-global scale with scale label k = 1 and the initial radius is obtained:

M_1 = {s_1(i), l_1(i), c_1(i) | i ∈ [1, n_1]},
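
The three descriptors at one scale can be sketched together. The "directly connected region" Z is approximated here by a 4-neighbour flood fill from the contour point restricted to the circle, and the connected arc length by the pixels of Z lying on the circle's rim; both discretizations are assumptions, not the patent's exact construction:

```python
import numpy as np
from collections import deque

def descriptors_at_scale(img, cx, cy, r):
    # Scale-k descriptors (s, l, c) at contour point (cx, cy) with circle
    # radius r. img is the binary shape image (255 inside), indexed [y, x].
    h, w = img.shape
    seen = set()
    q = deque([(cx, cy)])
    while q:                              # flood fill of Z inside the circle
        x, y = q.popleft()
        if (x, y) in seen or not (0 <= x < w and 0 <= y < h):
            continue
        if (x - cx) ** 2 + (y - cy) ** 2 > r * r or img[y, x] != 255:
            continue
        seen.add((x, y))
        q.extend([(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])
    area_z = float(len(seen))
    s = area_z / (np.pi * r * r)          # area parameter s_k(i), in [0, 1]
    if seen:
        gx = sum(x for x, _ in seen) / area_z
        gy = sum(y for _, y in seen) / area_z
        c = np.hypot(gx - cx, gy - cy) / r   # centroid-distance c_k(i)
    else:
        c = 0.0
    ring = [p for p in seen
            if (p[0] - cx) ** 2 + (p[1] - cy) ** 2 > (r - 1) ** 2]
    l = len(ring) / (2 * np.pi * r)       # arc-length parameter l_k(i)
    return s, l, c
```

On a fully white image, a circle centered inside the shape gives s ≈ 1 and c ≈ 0; at a straight half-plane boundary, s drops to about 0.5 and the centroid shifts off the contour point.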

由于数字图像以一个像素为最小单位,故选择以单个像素作为全尺度空间下的连续尺度变化间隔;即对于第k个尺度标签,设定圆Ck的半径rkSince a digital image uses a pixel as the smallest unit, a single pixel is selected as the continuous scale change interval in the full scale space; that is, for the kth scale label, the radius r k of the circle C k is set as:

即在初始尺度k=1时,此后半径rk以一个像素为单位等幅缩小m-1次,直至最小尺度k=m时为止;按照计算尺度k=1下的特征函数M1的方式,计算其它尺度下的特征函数,最终得到全部尺度下形状样本S的子形状部分S1的特征函数:That is, the radius takes its initial value at the initial scale k = 1; thereafter the radius rk is reduced by one pixel at a time, m-1 times in total, until the minimum scale k = m is reached. The feature functions at the other scales are computed in the same way as the feature function M1 at scale k = 1, finally yielding the feature function of the sub-shape part S1 of shape sample S over all scales:

M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]},

将各个尺度下的特征函数分别存入矩阵SM、LM、CM,SM用于存储sk(i),SM的第k行第i列存储的为点p1(i)在第k个尺度下的面积参数sk(i);LM用于存储lk(i),LM的第k行第i列存储的为点p1(i)在第k个尺度下弧长参数lk(i);CM用于存储ck(i),CM的第k行第i列存储的为点p1(i)在第k个尺度下重心距参数ck(i);SM、LM、CM最终作为全尺度空间下形状样本S的子形状部分S1的三种形状特征的灰度图表达:The characteristic functions at each scale are stored in matrices S M , L M , and C M respectively. S M is used to store sk (i). The k-th row and i-th column of S M stores the area parameter sk (i) of point p 1 (i) at the k-th scale. L M is used to store l k (i). The k-th row and i-th column of L M stores the arc length parameter l k (i) of point p 1 (i) at the k-th scale. C M is used to store c k (i). The k-th row and i-th column of C M stores the centroid distance parameter c k (i) of point p 1 (i) at the k-th scale. S M , L M , and C M are finally used as grayscale images of the three shape features of the sub-shape part S 1 of the shape sample S in the full-scale space:

GM_1 = {S_M, L_M, C_M},

其中,SM、LM、CM均为尺寸大小是m×n1的矩阵,各代表一幅灰度图像;Where S_M, L_M, and C_M are all matrices of size m×n1, each representing a grayscale image;

接着将该子形状部分S1的三幅灰度图像作为RGB三个通道得到一张彩色图像,作为该子形状部分S1的特征表达图像 Then, the three grayscale images of the sub-shape part S1 are used as the three RGB channels to obtain a color image as the feature expression image of the sub-shape part S1.
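
Stacking the three descriptor matrices as the R, G, B channels can be sketched as follows (the descriptors lie in [0, 1], and the 8-bit quantization to 0-255 is an assumption):

```python
import numpy as np

def feature_expression_image(s_m, l_m, c_m):
    # Stack the m x n1 descriptor matrices S_M, L_M, C_M (values in [0, 1])
    # as the R, G, B channels of one colour image: the sub-shape part's
    # feature expression image.
    assert s_m.shape == l_m.shape == c_m.shape
    rgb = np.stack([s_m, l_m, c_m], axis=-1)   # shape (m, n1, 3)
    return (rgb * 255).astype(np.uint8)
```

For example, constant channels at 0.5, 0.0 and 1.0 yield an image whose pixels are (127, 0, 255).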

进一步的,所述步骤七中将所有训练形状样本各子形状部分的特征表达图像样本输入至卷积神经网络,对卷积神经网络模型进行训练;每类形状的不同子形状部分有不同的类别标签;卷积神经网络训练至收敛后,对于任一形状样本S,将其分割形成的N个子形状部分对应的特征表达图像{Tnum|num∈[1,N]}分别输入训练好的卷积神经网络,网络第二层全连接层的输出为对应子形状部分的特征向量其中Vec为第二层全连接层中神经元的个数;Furthermore, in step seven, the feature expression image samples of each sub-shape part of all training shape samples are input into the convolutional neural network to train the convolutional neural network model; different sub-shape parts of each type of shape have different category labels; after the convolutional neural network is trained to convergence, for any shape sample S, the feature expression images {T num | num∈[1, N]} corresponding to the N sub-shape parts formed by segmentation are respectively input into the trained convolutional neural network, and the output of the second fully connected layer of the network is the feature vector of the corresponding sub-shape part Where Vec is the number of neurons in the second fully connected layer;

其中卷积神经网络的结构包括输入层、预训练层和全连接层;所述预训练层由VGG16网络模型前4个模块组成,将该4个模块在imagenet数据集中训练后所得的参数作为初始化参数,预训练层后连接三个全连接层;The structure of the convolutional neural network includes an input layer, a pre-training layer and a fully connected layer; the pre-training layer is composed of the first four modules of the VGG16 network model, and the parameters obtained after training the four modules in the imagenet dataset are used as initialization parameters, and three fully connected layers are connected after the pre-training layer;

The first block of the pre-trained layer comprises 2 convolutional layers and 1 max-pooling layer, with 64 convolution kernels of size 3×3 per convolutional layer and a 2×2 pooling window; the second block comprises 2 convolutional layers and 1 max-pooling layer, with 128 kernels of size 3×3 and a 2×2 pooling window; the third block comprises 3 convolutional layers and 1 max-pooling layer, with 256 kernels of size 3×3 and a 2×2 pooling window; the fourth block comprises 3 convolutional layers and 1 max-pooling layer, with 512 kernels of size 3×3 and a 2×2 pooling window. Each convolutional layer is computed as:

CO = φrelu(WC · CI + θC),

where θC is the bias vector of the convolutional layer, WC is its weight, CI is its input and CO is its output;

The fully connected module comprises three fully connected layers: the first contains 512 nodes, the second contains Vec nodes and the third contains NT nodes, where NT is the total number of segmented sub-shape parts over all shape classes. The first two fully connected layers are computed as:

FO = φtanh(WF · FI + θF),

where φtanh is the tanh activation function, θF is the bias vector of the fully connected layer, WF is its weight, FI is its input and FO is its output;

The last fully connected layer is the output layer, whose output is computed as:

YO = φsoftmax(WY · YI + θY),

where φsoftmax is the softmax activation function, θY is the bias vector of the output layer, each output neuron represents one sub-shape-part class, WY is the weight of the output layer, YI is its input and YO is its output.
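The three per-layer formulas (relu convolution unit, tanh fully connected layer, softmax output layer) can be checked with a small numpy sketch. For clarity the convolution formula is shown on a flattened input, i.e. as an affine map followed by relu; in the real network WC acts as a bank of 3×3 kernels. Shapes (512 → 200 → 770) follow the embodiment; the weights here are random placeholders.

```python
import numpy as np

def conv_unit(W, x, theta):
    # C_O = relu(W . C_I + theta): the per-layer formula, on a flat input
    return np.maximum(W @ x + theta, 0.0)

def fc_tanh(W, x, theta):
    # F_O = tanh(W . F_I + theta): the first two fully connected layers
    return np.tanh(W @ x + theta)

def fc_softmax(W, x, theta):
    # Y_O = softmax(W . Y_I + theta): the output layer
    z = W @ x + theta
    e = np.exp(z - z.max())        # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.standard_normal(512)                       # first FC layer output
W2, b2 = rng.standard_normal((200, 512)), np.zeros(200)
W3, b3 = rng.standard_normal((770, 200)), np.zeros(770)
vec = fc_tanh(W2, x, b2)       # 200-d sub-shape feature vector
y = fc_softmax(W3, vec, b3)    # scores over the 770 sub-shape-part classes
```

The softmax output is a proper probability vector, and the 200-d tanh activation is exactly what step eight later copies into the feature matrix.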

Further, the specific method of constructing the feature matrix of a shape sample in step eight is:

For any shape sample S split into N sub-shape parts, the corresponding feature matrix F is computed as:

Fa = fa for a ∈ [1, N],  Fa = 0Vec for a > N,

where Fa denotes the a-th row vector of the matrix F, fa is the feature vector of the a-th sub-shape part output in step seven, and 0Vec denotes the zero vector of dimension Vec.
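A zero-padded feature matrix of this kind is straightforward to build. The sketch below assumes the embodiment's sizes (at most 11 parts, 200-dimensional vectors); both are parameters, not fixed by the method itself.

```python
import numpy as np

def build_feature_matrix(part_vectors, max_parts=11, vec=200):
    """Rows 1..N hold the CNN feature vectors of the N sub-shape parts;
    the remaining rows stay zero, so every sample yields a fixed
    max_parts x vec matrix regardless of its own part count."""
    F = np.zeros((max_parts, vec))
    for a, f in enumerate(part_vectors):
        F[a] = f
    return F

# a sample with N = 3 parts (placeholder vectors)
parts = [np.ones(200), 2 * np.ones(200), 3 * np.ones(200)]
F = build_feature_matrix(parts)
```

The fixed row count is what lets samples with different part counts share one graph-convolution input size.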

Further, in step nine the structure of the graph convolutional neural network is built, comprising a preprocessing input layer, a hidden layer and a classification output layer. In the preprocessing input layer the adjacency matrix A is normalized, specifically:

A~ = A + IN,  D~aa = Σb A~ab,  A^ = D~^(-1/2) · A~ · D~^(-1/2),

where IN is the identity matrix, D~ is the degree matrix, and A^ is the adjacency matrix after normalization preprocessing.
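The patent's own normalization formulas are elided in this text; the sketch below uses the standard graph-convolution preprocessing (self-loops plus symmetric degree normalization), which matches the symbols that do survive (IN, a degree matrix, a normalized adjacency).

```python
import numpy as np

def normalize_adjacency(A):
    """A~ = A + I_N; D~ = diag(row sums of A~); A^ = D~^-1/2 A~ D~^-1/2."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)               # degree of each vertex
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

# the 3-part example digraph, zero-padded to the 11x11 training-set size
A = np.zeros((11, 11))
A[0, 1] = A[0, 2] = 1                     # edges (v1, v2), (v1, v3)
A_hat = normalize_adjacency(A)
```

Self-loops keep isolated (padding) vertices well defined: their degree is 1, so their diagonal entry in A^ is exactly 1.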

The hidden layer contains two graph convolution layers, each computed as:

HO = φrelu(A^ · HI · WH),

where WH is the weight of the graph convolution layer, HI is its input (the input of the first graph convolution layer is the feature matrix F of the shape sample) and HO is its output;

The classification output layer is computed as:

GO = φsoftmax(GI · GW),

where φsoftmax is the softmax activation function, GI is the input of the output layer, i.e. the output of the second graph convolution layer, GW is the weight of the output layer and GO is the output of the output layer; each neuron of the output layer represents one shape class.
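A forward pass through this two-layer graph network can be sketched as follows. The text does not spell out how the per-vertex output of the second graph convolution is reduced to a single class vector, so this sketch simply flattens it before the softmax layer; that reduction, all dimensions, and the random weights are assumptions for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gcn_forward(A_hat, F, W1, W2, W_out):
    H1 = relu(A_hat @ F @ W1)    # graph conv layer 1 (input = feature matrix F)
    H2 = relu(A_hat @ H1 @ W2)   # graph conv layer 2
    g = H2.reshape(-1) @ W_out   # flatten vertices, then classification layer
    return softmax(g)            # one probability per shape class

rng = np.random.default_rng(1)
A_hat = np.eye(11)                                 # placeholder normalized adjacency
F = rng.standard_normal((11, 200))                 # placeholder feature matrix
W1 = 0.1 * rng.standard_normal((200, 64))
W2 = 0.1 * rng.standard_normal((64, 32))
W_out = 0.1 * rng.standard_normal((11 * 32, 70))   # 70 shape classes
probs = gcn_forward(A_hat, F, W1, W2, W_out)
```

Note how small the trainable state is compared with the image CNN: only W1, W2 and W_out, which is the efficiency argument the text makes for the graph network.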

Further, the specific method of contour-shape classification and recognition in step ten is: train the graph convolutional neural network model to convergence. For any test shape sample, first extract the contour key points, compute the curvature value at each key point and judge its concavity to obtain the candidate segmentation points, then adjust the curvature screening threshold to obtain the shape segmentation points. Following principles (1) and (2) of step five, obtain the set of segmentation line segments and compute the segmentation cost of each segment in the set. If the number of segmentation line segments does not exceed the preset number, all of them are used to segment the shape; otherwise the preset number of segments with the smallest segmentation costs are used. Compute the color feature-expression image of every sub-shape part and feed it into the trained convolutional neural network, taking the output of its second fully connected layer as the feature vector of that part. Construct the shape digraph of the test sample, compute its adjacency matrix and feature matrix, and input them into the trained graph convolutional neural network model; the shape class corresponding to the maximum value of the output vector is taken as the shape type of the test sample, completing shape classification and recognition.

The present invention proposes a new shape recognition method and designs a new shape classification scheme based on a graph convolutional neural network. The proposed topological-graph expression of shape features is a directed graph structure built from shape segmentation: it not only distinguishes shape levels but also exploits the stable topological feature relations between the parts at each level in place of geometric position relations. Compared with the background-art method, which only computes and matches the features of corresponding salient points, the present invention is more robust to disturbances such as articulated deformation, partial occlusion and rigid-body transformation. The full-scale visual representation of the shape expresses the complete information of every sub-shape part, from which the successive convolutions of the neural network extract each part's features over the full scale space. Compared with applying a convolutional neural network directly, the designed graph convolutional neural network greatly reduces the number of training parameters and is more efficient to compute.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1是本发明一种形状识别方法的工作流程图。FIG. 1 is a flowchart of a shape recognition method of the present invention.

图2是形状样本集中目标形状的部分样本示意图。FIG. 2 is a schematic diagram of some samples of target shapes in the shape sample set.

图3是形状样本的分割示意图。FIG3 is a schematic diagram of segmentation of shape samples.

图4是全尺度空间的示意图。FIG4 is a schematic diagram of the full-scale space.

图5是目标形状被预设尺度截取后的示意图。FIG. 5 is a schematic diagram of a target shape after being cut off at a preset scale.

图6是目标形状被预设尺度分割后的示意图。FIG. 6 is a schematic diagram of a target shape segmented at a preset scale.

图7是目标形状的一个子形状部分在单一尺度下的特征函数示意图。FIG. 7 is a schematic diagram of a characteristic function of a sub-shape portion of a target shape at a single scale.

图8是目标形状的一个子形状部分在全尺度空间下的特征矩阵示意图。FIG8 is a schematic diagram of a feature matrix of a sub-shape portion of a target shape in the full-scale space.

图9是目标形状的一个子形状部分计算得到的三幅灰度图像以及合成的彩色图像示意图。FIG. 9 is a schematic diagram of three grayscale images and a synthesized color image calculated from a sub-shape portion of a target shape.

图10是用于训练各子形状部分特征表达图像的卷积神经网络结构图。FIG10 is a diagram showing the structure of a convolutional neural network used to train images expressing the features of each sub-shape portion.

图11是目标形状的各子形状部分的特征结构图。FIG. 11 is a diagram showing the characteristic structure of each sub-shape portion of the target shape.

具体实施方式DETAILED DESCRIPTION

下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The following will be combined with the drawings in the embodiments of the present invention to clearly and completely describe the technical solutions in the embodiments of the present invention. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by ordinary technicians in this field without creative work are within the scope of protection of the present invention.

如图1所示,一种形状识别方法,包括如下流程:As shown in FIG1 , a shape recognition method includes the following process:

1、形状样本集总样本数为1400,共70个形状类别,每个形状类别有20个形状样本。如图2所示为形状样本集中目标形状的部分样本示意图。每个形状类别中随机选取一半的样本划入训练集,剩下的一半划入测试集,得到共700个训练样本,700个测试样本。对每一个形状样本抽样得到100个轮廓点,以一个形状样本S为例:1. The total number of samples in the shape sample set is 1400, with a total of 70 shape categories, and each shape category has 20 shape samples. Figure 2 shows a schematic diagram of some samples of the target shape in the shape sample set. Half of the samples in each shape category are randomly selected to be included in the training set, and the remaining half are included in the test set, resulting in a total of 700 training samples and 700 test samples. 100 contour points are sampled for each shape sample. Take a shape sample S as an example:

S = {px(i), py(i) | i ∈ [1, 100]},

其中,px(i),py(i)为轮廓抽样点p(i)在二维平面内的横、纵坐标。Among them, p x (i), py (i) are the horizontal and vertical coordinates of the contour sampling point p(i) in the two-dimensional plane.

The contour curve of a shape sample is evolved to extract its key points; in each evolution step the point that contributes least to target recognition is deleted. The contribution of each point p(i) is defined as:

Con(i) = H1(i) · h(i, i−1) · h(i, i+1) / (h(i, i−1) + h(i, i+1)),

where h(i, i−1) is the curve length between points p(i) and p(i−1), h(i, i+1) is the curve length between p(i) and p(i+1), and H1(i) is the angle between segments p(i)p(i−1) and p(i)p(i+1); the lengths h are normalized by the contour perimeter. A larger Con(i) means that point p(i) contributes more to the shape feature.
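The contribution measure can be sketched for a single point as below. The exact formula is elided in this text, so the sketch assumes the standard discrete-curve-evolution relevance measure (turn angle times the harmonic combination of the two neighboring lengths); the turn angle π − H is used so that sharp corners score high and near-straight points score low, consistent with "larger Con(i) means greater contribution".

```python
import numpy as np

def contribution(prev, p, nxt, perimeter):
    """Con for one contour point, assuming the classic DCE relevance
    K = beta * h1 * h2 / (h1 + h2), with h normalized by the perimeter."""
    h1 = np.linalg.norm(np.subtract(p, prev)) / perimeter
    h2 = np.linalg.norm(np.subtract(p, nxt)) / perimeter
    v1, v2 = np.subtract(prev, p), np.subtract(nxt, p)
    cosH = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    H = np.arccos(np.clip(cosH, -1.0, 1.0))  # angle between the two segments
    beta = np.pi - H                          # turn angle: 0 on a straight run
    return beta * h1 * h2 / (h1 + h2)

# a right-angle corner contributes far more than a nearly straight point
c_corner = contribution((0, 1), (0, 0), (1, 0), perimeter=4.0)
c_flat = contribution((-1, 0.01), (0, 0), (1, 0.01), perimeter=4.0)
```

Deleting the minimum-contribution point per step and re-evaluating its neighbors yields the evolution loop the text describes.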

This method adopts a region-based adaptive stopping function F(t) to overcome the problem of extracting too many or too few contour key points:

where S0 is the area of the original shape, Si is the area after i evolution steps, and n0 is the total number of points on the original shape contour. When the value of the stopping function F(t) exceeds the set threshold, key-point extraction ends. For the shape sample S shown in Fig. 3, a total of 24 contour key points are extracted.

2.计算形状样本各关键点处的近似偏置曲率值以及曲线凹凸性。以形状样本S为例,其轮廓关键点p(i)的近似偏置曲率值cur(p(i))的计算公式如下:2. Calculate the approximate bias curvature value and curve concavity at each key point of the shape sample. Taking the shape sample S as an example, the calculation formula of the approximate bias curvature value cur (p(i)) of its contour key point p(i) is as follows:

cur~(p(i)) = cos Hε(i) + 1,

where Hε(i) is the angle between segments p(i)p(i−ε) and p(i)p(i+ε), with ε = 3.
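The approximate bias curvature is easy to compute directly from the sampled contour; the sketch below wraps indices around the closed contour and evaluates cur~ = cos Hε + 1, which lies in [0, 2]: about 0 on straight runs and larger at corners.

```python
import numpy as np

def approx_bias_curvature(points, i, eps=3):
    """cur~(p(i)) = cos(H_eps(i)) + 1, H_eps being the angle between
    segments p(i)p(i-eps) and p(i)p(i+eps) on a closed contour."""
    n = len(points)
    p = np.asarray(points[i % n], float)
    a = np.asarray(points[(i - eps) % n], float) - p
    b = np.asarray(points[(i + eps) % n], float) - p
    cosH = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.clip(cosH, -1.0, 1.0) + 1.0)

# square contour sampled at unit spacing: edges are flat, corners are not
square = ([(x, 0) for x in range(10)] + [(9, y) for y in range(1, 10)]
          + [(x, 9) for x in range(8, -1, -1)] + [(0, y) for y in range(8, 0, -1)])
flat = approx_bias_curvature(square, 4)     # middle of an edge -> 0
corner = approx_bias_curvature(square, 9)   # at the corner (9, 0) -> 1
```

With ε = 3 the two probe segments at the corner are perpendicular, so cos Hε = 0 and cur~ = 1, while mid-edge points give exactly 0.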

轮廓关键点p(i)处的曲线凹凸性判断方法如下:The method for judging the concavity of the curve at the contour key point p(i) is as follows:

等距抽样线段p(i-ε)p(i+ε)得到R个离散点,若此R个离散点的像素值均为255,则线段p(i-ε)p(i+ε)全部在形状轮廓内,即p(i)处曲线表现为凸;若此R个离散点的像素值均为0,则线段p(i-ε)p(i+ε)全部在形状轮廓外,即p(i)处曲线表现为凹。记所有曲线表现为凹处的关键点p(i)为候选分割点P(j)。对于形状样本S,共提取得11个候选分割点。The line segments p(i-ε)p(i+ε) are sampled equidistantly to obtain R discrete points. If the pixel values of these R discrete points are all 255, then the line segments p(i-ε)p(i+ε) are all inside the shape contour, that is, the curve at p(i) is convex; if the pixel values of these R discrete points are all 0, then the line segments p(i-ε)p(i+ε) are all outside the shape contour, that is, the curve at p(i) is concave. The key point p(i) where all curves are concave is recorded as the candidate segmentation point P(j). For the shape sample S, a total of 11 candidate segmentation points are extracted.

3. Adjust the curvature screening threshold Th and obtain the shape segmentation points. For the 11 candidate segmentation points P(j) of shape sample S, the mean of their approximate bias curvature values is taken as the initial threshold Th0:

Th0 = (1/11) Σj cur~(P(j)),

where the approximate bias curvature values cur~(P(j)) of the 11 candidate segmentation points P(j) are 0.1, 0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.64, 0.7, 0.7 and 0.8 respectively.

依照以下方法依次增大阈值:Increase the threshold in the following order:

(1) For the threshold Thτ at the τ-th adjustment, the candidate segmentation points P(j) are divided into two classes according to how their approximate bias curvature values compare with Thτ: points whose value is greater than Thτ and points whose value is less than or equal to Thτ. Compute and record the segmentation discrimination Dτ under the current threshold:

Dτ = minj δτ+(j) − maxj δτ−(j),

where

δτ+(j) = cur~(P(j)) − Thτ for cur~(P(j)) > Thτ,  δτ−(j) = cur~(P(j)) − Thτ for cur~(P(j)) ≤ Thτ,

δτ+(j) and δτ−(j) denote the positive and negative curvature deviations of the candidate segmentation points P(j) under the threshold Thτ, minj δτ+(j) is the minimum positive curvature deviation over all candidate segmentation points, and maxj δτ−(j) is the maximum negative curvature deviation over all candidate segmentation points.

判断是否存在近似偏置曲率值大于阈值Thτ的候选分割点,如果不存在,则不再调整,转到步骤(3)。如果存在近似偏置曲率值大于阈值Thτ的候选分割点,则转到步骤(2)继续调整阈值。Determine whether there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ . If not, no adjustment is made and the process goes to step (3). If there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ , the process goes to step (2) and continues to adjust the threshold.

(2) Continue adjusting the threshold: the new threshold Thτ+1 is obtained by adding the minimum positive curvature deviation of the previous adjustment, i.e. the smallest curvature value among the candidate points that exceeded the previous threshold:

Thτ+1 = Thτ + minj δτ+(j).

According to the threshold Thτ+1, compute the positive and negative curvature deviations of each candidate segmentation point at the (τ+1)-th adjustment together with the segmentation discrimination Dτ+1, and record them. Determine whether any candidate segmentation point has an approximate bias curvature value greater than Thτ+1; if not, stop adjusting and go to step (3). If such a point exists, set τ = τ + 1 and repeat the current step.

(3)多次调整阈值则有多个分割区分度,最大分割区分度对应的阈值为最终的曲率筛选阈值Th,近似偏置曲率值大于该阈值Th的点为最终的形状分割点。(3) If the threshold is adjusted multiple times, there will be multiple segmentation distinctions. The threshold corresponding to the maximum segmentation distinction is the final curvature screening threshold Th, and the point whose approximate bias curvature value is greater than the threshold Th is the final shape segmentation point.

For the shape sample S, the segmentation discriminations and thresholds recorded over the four threshold adjustments are:

(Th0, D0) ≈ (0.47, 0.10),  (Th1, D1) = (0.5, 0.14),  (Th2, D2) = (0.64, 0.06),  (Th3, D3) = (0.7, 0.10).

Therefore the threshold Th1 corresponding to the maximum segmentation discrimination D1 is the final curvature screening threshold, i.e. Th = 0.5. The five candidate segmentation points whose approximate bias curvature values are smaller than Th are the final shape segmentation points; their approximate bias curvatures are 0.1, 0.2, 0.25, 0.35 and 0.4 respectively.
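The whole threshold-adjustment loop can be replayed on the worked example's eleven curvature values. The discrimination formula is elided in this text, so the sketch assumes D is the gap between the smallest positive deviation and the largest non-positive deviation at each threshold; under that assumption the loop reproduces the example exactly (four recorded adjustments, best threshold 0.5).

```python
def select_threshold(curvatures):
    """Iterative curvature-threshold adjustment (assumed formulas)."""
    th = sum(curvatures) / len(curvatures)  # Th0 = mean curvature value
    records = []                            # (threshold, discrimination) pairs
    while True:
        above = [c for c in curvatures if c > th]
        if not above:                       # no point exceeds the threshold: stop
            break
        below = [c for c in curvatures if c <= th]
        D = (min(above) - th) - (max(below) - th)
        records.append((th, D))
        th = min(above)                     # Th_{tau+1} = Th_tau + min positive dev.
    best_th, _ = max(records, key=lambda r: r[1])
    return best_th, records

curs = [0.1, 0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.64, 0.7, 0.7, 0.8]
best_th, records = select_threshold(curs)   # best_th == 0.5, four records
```

Running it gives thresholds ≈0.47, 0.5, 0.64, 0.7 with discriminations 0.10, 0.14, 0.06, 0.10, so the maximum discrimination indeed selects Th = 0.5.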

4.对于形状样本S,依次两两连接5个形状分割点构成10条线段,保留其中位于形状内且互相不交叉的7条线段作为分割线段集。依照以下方法使用度量指标I计算各分割线段的分割代价Cost:4. For the shape sample S, connect the five shape segmentation points in pairs to form 10 line segments, and retain the seven line segments that are located within the shape and do not intersect each other as the segmentation line segment set. Use the metric I to calculate the segmentation cost of each segmentation line segment according to the following method:

where D*(u,v), L*(u,v) and S*(u,v) are the three normalized segmentation metrics — segmentation length, segmentation arc length and segmentation remaining area — and u, v are the indices of any two shape segmentation points among the total number of segmentation points.

对于任意一条形状分割线段P(u)P(v),三种分割评价指标计算方式如下:For any shape segmentation line segment P(u)P(v), the three segmentation evaluation indicators are calculated as follows:

D*(u,v) = |P(u)P(v)| / Dmax, where Dmax is the length of the longest segment among all segmentation line segments; D*(u,v) ranges from 0 to 1, and the smaller the value, the more significant the segmentation.

where L*(u,v) is computed from the length of the contour curve between the two points P(u) and P(v); L*(u,v) ranges from 0 to 1, and the smaller the value, the more significant the segmentation.

where Sd is the shape area cut off by the segmentation line segment P(u)P(v), i.e. the area of the closed region bounded by the segment P(u)P(v) and the contour curve; S*(u,v) ranges from 0 to 1, and the smaller the value, the more significant the segmentation.

依据上述步骤计算得到对于分割线段P(u)P(v)的分割代价Cost:According to the above steps, the segmentation cost Cost for the segmentation line segment P(u)P(v) is calculated as:

Cost = αD*(u,v) + βL*(u,v) + γS*(u,v),

where α, β and γ are the weights of the respective metrics.

如图3所示,对于形状样本S,选择2条具有最小和次小分割代价的分割线段作为最终的最优分割线段,并得到3个子形状部分。As shown in FIG3 , for the shape sample S, two segmentation line segments with the minimum and second minimum segmentation costs are selected as the final optimal segmentation line segments, and three sub-shape parts are obtained.

5.对于形状样本S,将中心形状部分记作起始顶点v1,并将其余邻接的2个形状部分按顺时针方向进行顶点排序,分别记作顶点{v2,v3}。记连接v1到顶点v2,v3的边分别为(v1,v2),(v1,v3),进而构成满足拓扑次序的形状有向图:5. For the shape sample S, record the central shape part as the starting vertex v 1 , and sort the other two adjacent shape parts in a clockwise direction, and record them as vertices {v 2 , v 3 }. The edges connecting v 1 to vertices v 2 and v 3 are recorded as (v 1 , v 2 ) and (v 1 , v 3 ), respectively, thus forming a shape directed graph that satisfies the topological order:

G1 = (V1, E1),

where V1 = {v1, v2, v3} and E1 = {(v1, v2), (v1, v3)}.
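The digraph construction above (central part as v1, directed edges to the clockwise-ordered neighboring parts) maps directly to an adjacency matrix. The zero-padding to 11×11 follows the next paragraph's training-set maximum; the helper name is ours.

```python
import numpy as np

def shape_digraph_adjacency(n_parts, max_parts=11):
    """Adjacency matrix of the shape digraph: v1 (the central part) has a
    directed edge to each of v2..vN; rows/columns beyond N stay zero so all
    samples share the same max_parts x max_parts size."""
    A = np.zeros((max_parts, max_parts), dtype=int)
    A[0, 1:n_parts] = 1          # edges (v1, v2), ..., (v1, vN)
    return A

A = shape_digraph_adjacency(3)   # the 3-part example: edges (v1,v2), (v1,v3)
```

Because the edges are directed, A is not symmetric: only row 1 (vertex v1) carries nonzero entries here.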

Since the maximum number of sub-shape parts obtained by segmenting the training samples of the contour shape set is 11, the adjacency matrix A of the shape sample S is expressed as an 11 × 11 matrix:

A(a, b) = 1 if (va, vb) ∈ E1, and A(a, b) = 0 otherwise,

where a ∈ [1, 11], b ∈ [1, 11].

6.对分割得到的3个子形状部分分别进行全尺度可视化表示,其中全尺度可视化表示的具体方法为:6. Perform full-scale visualization on the three sub-shape parts obtained by segmentation, where the specific method of full-scale visualization is as follows:

(1)对于任一子形状部分的轮廓,对该轮廓抽样得到100个轮廓抽样点。如图4所示,设置全尺度空间中总尺度数为100,并基于100个轮廓抽样点的坐标计算每个轮廓点对应于每层尺度下的归一化面积、弧长及重心距。以子形状部分S1为例,具体计算方式如下:(1) For the contour of any sub-shape part, sample the contour to obtain 100 contour sampling points. As shown in Figure 4, the total number of scales in the full-scale space is set to 100, and the normalized area, arc length and centroid distance of each contour point corresponding to each layer of scale are calculated based on the coordinates of the 100 contour sampling points. Taking the sub-shape part S1 as an example, the specific calculation method is as follows:

A preset circle C1(i) is drawn with the sampling point p1(i) of the contour of sub-shape part S1 as its center and the initial radius r1; this preset circle is the initial semi-global scale for computing the parameters of the corresponding contour point. Once the preset circle C1(i) is obtained, part of the target shape necessarily falls inside it, as illustrated in Fig. 5. If the part of the target shape falling inside the preset circle is a single region, that region is the one directly connected to the target contour point p1(i), denoted Z1(i); if it splits into several mutually disconnected regions, such as region A and region B in Fig. 5, the region whose contour contains the target contour point p1(i) is taken as the region directly connected to p1(i), i.e. region A in Fig. 5, denoted Z1(i). On this basis, the area of the region Z1(i) inside the preset circle C1(i) that is directly connected to p1(i) is given by:

area(Z1(i)) = Σz B(Z1(i), z),

where B(Z1(i), z) is an indicator function that equals 1 if pixel z belongs to Z1(i) and 0 otherwise.

The ratio of the area of Z1(i) to the area of the preset circle C1(i) is taken as the area parameter s1(i) of the descriptor of the target contour point p1(i):

s1(i) = area(Z1(i)) / (π r1²),

where the value of s1(i) lies between 0 and 1.

The centroid of the region directly connected to the target contour point p1(i) is obtained by averaging the coordinates of all pixels in that region:

w1(i) = (1 / |Z1(i)|) Σz∈Z1(i) z,

where w1(i) is the centroid of the region.

The distance between the target contour point and the centroid w1(i) is

d1(i) = ||p1(i) − w1(i)||,

and the ratio of d1(i) to the radius of the preset circle of the target contour point is taken as the centroid-distance parameter c1(i) of the descriptor of p1(i):

c1(i) = d1(i) / r1,

where the value of c1(i) lies between 0 and 1.

After the contour of the target shape is cut by the preset circle, one or more arc segments necessarily fall inside the circle, as shown in Fig. 6. If only one arc segment of the target shape lies inside the preset circle, that segment is the arc directly connected to the target contour point p1(i); if several arc segments lie inside the circle, such as Segment A, Segment B and Segment C in Fig. 6, the segment on which p1(i) lies is taken as the one directly connected to it, i.e. Segment A in Fig. 6. On this basis, the length of the arc segment inside the preset circle C1(i) that is directly connected to p1(i) is denoted arc1(i), and its ratio to the circumference of C1(i) is taken as the arc-length parameter l1(i) of the descriptor of the target contour point:

l1(i) = arc1(i) / (2π r1),

where the value of l1(i) lies between 0 and 1.
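The area and centroid-distance parameters can be sketched on a binary mask with a flood fill standing in for "the region directly connected to p". This is an illustrative sketch under assumed conventions (4-neighbour connectivity, p given as an inside pixel); the arc-length parameter is analogous — the length of the contour piece connected to p inside the circle over the circle's circumference — and is omitted to keep the sketch short.

```python
import numpy as np
from collections import deque

def area_and_centroid_params(mask, p, r):
    """s and c for contour point p = (row, col) at radius r on a binary
    mask (1 = inside the shape). Z = flood-filled connected region
    containing p within (shape AND circle)."""
    rows, cols = np.indices(mask.shape)
    in_circle = (rows - p[0]) ** 2 + (cols - p[1]) ** 2 <= r * r
    candidate = (mask > 0) & in_circle
    region = np.zeros_like(candidate)
    region[p] = True                       # assumes mask[p] == 1
    q = deque([p])
    while q:                               # 4-neighbour flood fill from p
        x, y = q.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            u, v = x + dx, y + dy
            if (0 <= u < mask.shape[0] and 0 <= v < mask.shape[1]
                    and candidate[u, v] and not region[u, v]):
                region[u, v] = True
                q.append((u, v))
    s = region.sum() / (np.pi * r * r)     # area of Z over circle area
    w = np.argwhere(region).mean(axis=0)   # centroid of Z
    c = np.linalg.norm(np.asarray(p, float) - w) / r
    return s, c

# toy shape: a 30x30 filled square; p sits on the middle of its top edge
mask = np.zeros((40, 40), dtype=int)
mask[5:35, 5:35] = 1
s1, c1 = area_and_centroid_params(mask, (5, 20), r=10)
```

On this toy shape roughly half the circle is filled, so s1 comes out near 0.5, and the half-disc centroid puts c1 near 0.4 — both inside the stated [0, 1] ranges.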

According to the above steps, the characteristic function M1 of the sub-shape part S1 of shape sample S at the semi-global scale with scale label k = 1 and initial radius r1 is obtained:

M1={s1(i),l1(i),c1(i)|i∈[1,100]},M 1 ={s 1 (i), l 1 (i), c 1 (i)|i∈[1, 100]},

As shown in Fig. 7, the characteristic functions at each of the 100 scales of the full-scale space are computed, where for the k-th scale label the radius rk of the circle Ck is set as

rk = r1 − (k − 1),

in pixel units: starting from the initial scale k = 1 with radius r1, the radius rk shrinks by one pixel at a time for 99 further steps, down to the minimum scale k = 100. The characteristic functions of the sub-shape part S1 of shape sample S over the whole scale space are thus obtained:

M = {sk(i), lk(i), ck(i) | k ∈ [1, 100], i ∈ [1, 100]},

(2)如图8所示,将子形状部分S1全尺度空间中100个尺度下的特征函数按尺度顺序合并为三个全尺度空间下的特征矩阵:(2) As shown in FIG8 , the feature functions of the 100 scales in the full-scale space of the sub-shape part S 1 are merged into three feature matrices in the full-scale space in scale order:

GM1 = {SM, LM, CM},

where SM, LM and CM are grayscale matrices of size m×n, each representing one grayscale image.

(3)如图9所示,将该子形状部分S1的三幅灰度图像作为RGB三个通道合成一张彩色图像,作为该子形状部分S1的特征表达图像 (3) As shown in FIG. 9 , the three grayscale images of the sub-shape portion S 1 are used as the three RGB channels to synthesize a color image as the feature expression image of the sub-shape portion S 1

7. Build the convolutional neural network, comprising an input layer, a pre-trained layer and fully connected layers. All feature-expression image samples of the sub-shape parts of all training shape samples are input into the convolutional neural network to train the model; different sub-shape parts of each shape class carry different class labels. After the network is trained to convergence, taking the shape sample S as an example, the feature-expression images {Tnum | num ∈ [1, 3]} of size 100 × 100 corresponding to its three sub-shape parts are fed into the trained network, and the output of the second fully connected layer is taken as the feature vector of the corresponding sub-shape part, where Vec, the number of neurons in the second fully connected layer, is set to 200.

The present invention uses the SGD optimizer with learning rate 0.001 and decay 1e-6; cross-entropy is chosen as the loss function and the batch size is 128. As shown in Fig. 10, the pre-trained layer consists of the first four blocks of the VGG16 network model, initialized with the parameters obtained by training these four blocks on the ImageNet dataset; three fully connected layers follow the pre-trained layer.

The first block of the pre-trained layer comprises 2 convolutional layers and 1 max-pooling layer, with 64 convolution kernels of size 3×3 per convolutional layer and a 2×2 pooling window; the second block comprises 2 convolutional layers and 1 max-pooling layer, with 128 kernels of size 3×3 and a 2×2 pooling window; the third block comprises 3 convolutional layers and 1 max-pooling layer, with 256 kernels of size 3×3 and a 2×2 pooling window; the fourth block comprises 3 convolutional layers and 1 max-pooling layer, with 512 kernels of size 3×3 and a 2×2 pooling window. Each convolutional layer is computed as:

CO = φrelu(WC · CI + θC),

where θC is the bias vector of the convolutional layer, WC is its weight, CI is its input and CO is its output;

The fully connected module comprises 3 fully connected layers, of which the first contains 512 nodes, the second contains 200 nodes, and the third contains 770 nodes. The computation of the first two fully connected layers is:

F_O = φ_tanh(W_F · F_I + θ_F),

where φ_tanh is the tanh activation function, θ_F is the bias vector of the fully connected layer, W_F is its weight, F_I is its input, and F_O is its output;

The last fully connected layer is the output layer, whose output is computed as:

Y_O = φ_softmax(W_Y · Y_I + θ_Y),

where φ_softmax is the softmax activation function and θ_Y is the bias vector of the output layer; each neuron of the output layer represents one corresponding sub-shape part category; W_Y is the weight of the output layer, Y_I is its input, and Y_O is its output;
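The three layer equations above can be sketched as a plain numpy forward pass. The weights are random stand-ins and the flattened input size `in_dim` is an illustrative assumption; only the layer widths (512, 200, 770) come from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def fc_tanh(x, w, b):
    # F_O = tanh(W_F . F_I + theta_F)
    return np.tanh(w @ x + b)

def fc_softmax(x, w, b):
    # Y_O = softmax(W_Y . Y_I + theta_Y)
    z = w @ x + b
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

in_dim = 1024  # illustrative stand-in for the flattened backbone output size
x = rng.standard_normal(in_dim)
W1, b1 = rng.standard_normal((512, in_dim)) * 0.01, np.zeros(512)
W2, b2 = rng.standard_normal((200, 512)) * 0.01, np.zeros(200)
W3, b3 = rng.standard_normal((770, 200)) * 0.01, np.zeros(770)

f1 = fc_tanh(x, W1, b1)   # first fully connected layer, 512 nodes
f2 = fc_tanh(f1, W2, b2)  # second layer, 200 nodes (later used as the feature vector)
y = fc_softmax(f2, W3, b3)  # output layer, 770 sub-shape-part categories
```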

8. As shown in Figure 11, the feature matrix F of the shape sample is constructed from the 3 sub-shape feature vectors of that shape sample,

where F_a denotes the a-th row vector of the matrix F; for rows corresponding to actual sub-shape parts, F_a equals f_a, the feature vector of the a-th sub-shape part output by the step above, and each remaining row is a zero vector of dimension 200.
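The padding rule can be sketched as follows; the maximum row count of 5 is an illustrative assumption (the text fixes it as the largest sub-shape count over all training samples):

```python
import numpy as np

def build_feature_matrix(sub_feats, max_parts, dim=200):
    # Row a holds the feature vector f_a of the a-th sub-shape part;
    # the remaining rows stay zero vectors of the same dimension.
    F = np.zeros((max_parts, dim))
    for a, f in enumerate(sub_feats):
        F[a] = f
    return F

# 3 dummy 200-dimensional feature vectors standing in for CNN outputs
feats = [np.ones(200) * (a + 1) for a in range(3)]
F = build_feature_matrix(feats, max_parts=5)
```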

9. Construct a graph convolutional neural network comprising a preprocessing input layer, a hidden layer, and a classification output layer. The present invention inputs the adjacency matrix and the feature matrix of the shape-sample topology graph into the graph convolutional neural network model for training, using the SGD optimizer with the learning rate set to 0.001, the decay rate set to 1e-6, cross entropy as the loss function, and a batch size of 128.

The preprocessing input layer applies normalization preprocessing to the adjacency matrix A, specifically:

Ā = A + I_N,  D̄_ii = Σ_j Ā_ij,  Â = D̄^(−1/2) · Ā · D̄^(−1/2),

where I_N is the identity matrix, D̄ is the degree matrix, and Â is the adjacency matrix after normalization preprocessing.
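This normalization matches the standard graph-convolutional preprocessing (add self-loops, then symmetrically normalize by the degree matrix). A minimal numpy sketch; the star-shaped adjacency matrix is an illustrative example in the spirit of the topology of step 5:

```python
import numpy as np

def normalize_adjacency(A):
    # A_bar = A + I (self-loops); D is the degree matrix of A_bar;
    # return D^(-1/2) . A_bar . D^(-1/2).
    A_bar = A + np.eye(A.shape[0])
    d = A_bar.sum(axis=1)                     # degree of each node
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_bar @ D_inv_sqrt

# Illustrative star topology: central part v1 connected to v2 and v3
A = np.array([[0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
A_hat = normalize_adjacency(A)
```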

The hidden layer contains 2 graph convolutional layers, each computed as:

H_O = φ(Â · H_I · W_H),

where φ is the activation function, W_H is the weight of the graph convolutional layer, H_I is its input (the input of the first graph convolutional layer is the feature matrix F of the shape sample), and H_O is its output;

The classification output layer is computed as:

G_O = φ_softmax(G_I · G_W),

where φ_softmax is the softmax activation function, G_I is the input of the output layer, i.e. the output of the second graph convolutional layer, G_W is the weight of the output layer, and G_O is its output; each neuron of the output layer represents one corresponding shape category.
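The two graph convolutional layers and the softmax output can be sketched in numpy as follows. The ReLU activation for the hidden layers, the random stand-in weights, and the sizes (5 nodes, 64 hidden units, 70 shape classes) are assumptions of this illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def graph_conv(A_hat, H, W):
    # H_O = relu(A_hat . H_I . W); A_hat is the normalized adjacency matrix
    return np.maximum(A_hat @ H @ W, 0.0)

def classify(G_in, G_w):
    # G_O = softmax(G_I . G_W), row-wise over the class dimension
    z = G_in @ G_w
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

n_nodes, feat_dim, hidden, n_classes = 5, 200, 64, 70
A_hat = np.eye(n_nodes)                        # placeholder normalized adjacency
F = rng.standard_normal((n_nodes, feat_dim))   # feature matrix from step 8
H1 = graph_conv(A_hat, F, rng.standard_normal((feat_dim, hidden)) * 0.1)
H2 = graph_conv(A_hat, H1, rng.standard_normal((hidden, hidden)) * 0.1)
G_out = classify(H2, rng.standard_normal((hidden, n_classes)) * 0.1)
```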

10. Input all training samples into the graph convolutional neural network and train the graph convolutional neural network model. For any test shape sample, first extract the key points of the shape contour, compute the curvature value at each key point, judge its concavity or convexity to obtain the candidate segmentation points, and then adjust the curvature screening threshold to obtain the shape segmentation points. Connect the shape segmentation points pairwise to form candidate segments, retain those that lie inside the shape and do not cross one another as segmentation segments to obtain the segmentation segment set, and compute the segmentation cost of each segment in the set. If the number of segmentation segments is less than 10, all of them are used to segment the shape; otherwise, the shape is segmented with the 10 segments of smallest segmentation cost. Compute the color feature expression image of each sub-shape part and input it into the trained convolutional neural network; the output of the second fully connected layer of the convolutional neural network serves as the feature vector of that sub-shape part. Construct the shape directed graph of the test shape sample, compute its adjacency matrix and feature matrix, and input them into the trained graph convolutional neural network model; the shape category corresponding to the maximum value of the output vector is judged to be the shape type of the test sample, realizing shape classification and recognition.
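The segment-selection rule in this test pipeline (use all segments if fewer than 10, otherwise the 10 of smallest cost) can be sketched as:

```python
def select_segments(segments_with_cost, limit=10):
    # segments_with_cost: list of (segment, cost) pairs; the segment
    # representation itself is an illustrative placeholder here.
    if len(segments_with_cost) < limit:
        return [seg for seg, _ in segments_with_cost]
    ranked = sorted(segments_with_cost, key=lambda sc: sc[1])
    return [seg for seg, _ in ranked[:limit]]

# Fewer than 10 candidates: all of them are kept, in their original order.
segs = [(("P1", "P5"), 0.42), (("P2", "P7"), 0.17), (("P3", "P9"), 0.88)]
chosen = select_segments(segs)
```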

Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of their technical features. Any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (11)

1.一种形状识别方法,其特征在于:该方法包括以下步骤:1. A shape recognition method, characterized in that the method comprises the following steps: 步骤一、提取形状样本的轮廓关键点;Step 1: Extract the contour key points of the shape sample; 步骤二、定义各关键点处的近似偏置曲率值并判断关键点处的曲线凹凸性,以获取候选形状分割点;Step 2: define the approximate bias curvature value at each key point and determine the concavity and convexity of the curve at the key point to obtain candidate shape segmentation points; 步骤三、调整曲率筛选阈值,得到形状分割点;Step 3: Adjust the curvature screening threshold to obtain the shape segmentation point; 步骤四、基于分割线段位于形状内且互相不交叉的原则进行形状分割,并以最小分割代价分割得到若干子形状部分;Step 4: segment the shape based on the principle that the segmentation line segments are located within the shape and do not cross each other, and obtain several sub-shape parts with the minimum segmentation cost; 步骤五、构建形状样本的拓扑结构;Step 5: Construct the topological structure of the shape sample; 步骤六、使用形状的全尺度可视化表示方法,得到对应子形状部分的特征表达图像;Step 6: Use the full-scale visualization method of the shape to obtain a feature expression image of the corresponding sub-shape part; 步骤七、将各特征表达图像输入卷积神经网络进行训练,学习得到各子形状部分的特征向量;Step 7: Input each feature expression image into the convolutional neural network for training, and learn to obtain the feature vector of each sub-shape part; 步骤八、构造形状样本的特征矩阵;Step 8: construct a feature matrix of shape samples; 步骤九、构建图卷积神经网络;Step 9: Construct a graph convolutional neural network; 步骤十、训练图卷积神经网络,对测试样本进行形状分割,获取各子形状部分的特征向量,计算测试样本的特征矩阵和邻接矩阵,并输入至训练好的图卷积网络模型中,实现形状分类识别。Step 10: Train the graph convolutional neural network, perform shape segmentation on the test sample, obtain the feature vector of each sub-shape part, calculate the feature matrix and adjacency matrix of the test sample, and input them into the trained graph convolutional network model to realize shape classification and recognition. 2.根据权利要求1所述的一种形状识别方法,其特征在于:所述步骤一中,提取轮廓关键点的方法为:2. 
A shape recognition method according to claim 1, characterized in that: in the step 1, the method for extracting contour key points is: 每一个形状样本的轮廓是由一系列抽样点组成的,对于任一形状样本S来说,对轮廓抽样n个点得到:The contour of each shape sample is composed of a series of sampling points. For any shape sample S, sampling n points of the contour yields: S={(px(i),py(i))|i∈[1,n]},S={(p x (i), p y (i))|i∈[1, n]}, 其中,px(i),py(i)为轮廓抽样点p(i)在二维平面内的横、纵坐标,n为轮廓长度,即轮廓抽样点的个数;Where p x (i), p y (i) are the horizontal and vertical coordinates of the contour sampling point p (i) in the two-dimensional plane, and n is the contour length, that is, the number of contour sampling points; 对形状样本的轮廓曲线进行演化来提取关键点,在每一次演化过程中,对目标识别起到贡献最小的点被删除,其中每个点p(i)的贡献定义为:The contour curve of the shape sample is evolved to extract key points. In each evolution process, the point that contributes the least to target recognition is deleted, where the contribution of each point p(i) is defined as: 其中,h(i,i-1)为点p(i)与p(i-1)间的曲线长度,h(i,i+1)为点p(i)与p(i+1)间的曲线长度,H1(i)为线段p(i)p(i-1)与线段p(i)p(i+1)间的角度,长度h根据轮廓周长进行归一化;Con(i)值越大表示该点p(i)对形状特征的贡献越大;Among them, h(i, i-1) is the length of the curve between points p(i) and p(i-1), h(i, i+1) is the length of the curve between points p(i) and p(i+1), H 1 (i) is the angle between line segment p(i)p(i-1) and line segment p(i)p(i+1), and the length h is normalized according to the contour perimeter; the larger the Con(i) value, the greater the contribution of the point p(i) to the shape feature; 本方法引用了一个基于区域的自适应结束函数F(t)克服轮廓关键点提取过多或过少的问题:This method uses a region-based adaptive end function F(t) to overcome the problem of extracting too many or too few contour key points: 其中S0为原始形状的面积,Si为经过i次演变后的形状面积,n0为原始形状轮廓上的总点数;当此结束函数值F(t)超过设定阈值后,轮廓关键点提取结束并得到n*个轮廓关键点。Where S0 is the area of the original shape, S1 is the area of the shape after i evolutions, and n0 is the total number of points on the original shape contour; when the end function value F(t) exceeds the set threshold, the contour key point extraction ends and n 
* contour key points are obtained. 3.根据权利要求2所述的一种形状识别方法,其特征在于:所述步骤二中,定义各关键点处的近似偏置曲率值并判断关键点处的曲线凹凸性,以获取候选分割点的具体方法为:3. A shape recognition method according to claim 2, characterized in that: in the step 2, the specific method of defining the approximate bias curvature value at each key point and judging the concavity and convexity of the curve at the key point to obtain the candidate segmentation point is: 为了计算形状样本S中任意一处关键点p(i)的近似偏置曲率值,取p(i)前后临近的轮廓点p(i-ε),p(i+ε),其中ε为经验取值;由于:In order to calculate the approximate bias curvature value of any key point p(i) in the shape sample S, take the contour points p(i-ε) and p(i+ε) before and after p(i), where ε is an empirical value; due to: cosHε(i)∝cur(p(i)),cosH ε (i)∝cur(p(i)), 其中,Hε(i)为线段p(i)p(i-ε)与线段p(i)p(i+ε)间的角度,cur(p(i))为点p(i)处的曲率;Where H ε (i) is the angle between the line segment p(i)p(i-ε) and the line segment p(i)p(i+ε), and cur(p(i)) is the curvature at point p(i); 定义点p(i)处的近似偏置曲率值cur~(p(i))为:The approximate bias curvature value cur~(p(i)) at the definition point p(i) is: cur~(p(i))=cosHε(i)+1,cur~(p(i))=cosH ε (i)+1, 其中Hε(i)为线段p(i)p(i-ε)与线段p(i)p(i+ε)间的角度,cosHε(i)取值范围在-1到1之间,cur~(p(i))的取值范围在0到2之间;Where H ε (i) is the angle between the line segment p(i)p(i-ε) and the line segment p(i)p(i+ε), cosH ε (i) ranges from -1 to 1, and cur~(p(i)) ranges from 0 to 2; 根据符合视觉自然性的形状分割方法,形状分割点均位于轮廓凹曲线处;因此在筛选用于形状分割的候选分割点时,定义了一种判断关键点p(i)处曲线凹凸性的方法:According to the shape segmentation method that conforms to visual naturalness, the shape segmentation points are all located at the concave curve of the contour; therefore, when screening the candidate segmentation points for shape segmentation, a method for judging the concavity of the curve at the key point p(i) is defined: 对于形状的二值化图像,形状样本S轮廓内部的像素点的数值均为255,形状样本S轮廓外部的像素点的数值均为0;等距抽样线段p(i-ε)p(i+ε)得到R个离散点,若此R个离散点的像素值均为255,则线段p(i-ε)p(i+ε)全部在形状轮廓内,即p(i)处曲线表现为凸;若此R个离散点的像素值均为0,则线段p(i-ε)p(i+ε)全部在形状轮廓外,即p(i)处曲线表现为凹;记所有曲线表现为凹处的关键点p(i)为候选分割点P(j)。For the binary image of the shape, the values of the 
pixels inside the contour of the shape sample S are all 255, and the values of the pixels outside the contour of the shape sample S are all 0; the equidistant sampling line segments p(i-ε)p(i+ε) obtain R discrete points. If the pixel values of these R discrete points are all 255, then the line segments p(i-ε)p(i+ε) are all inside the shape contour, that is, the curve at p(i) is convex; if the pixel values of these R discrete points are all 0, then the line segments p(i-ε)p(i+ε) are all outside the shape contour, that is, the curve at p(i) is concave; the key point p(i) where all curves are concave is recorded as the candidate segmentation point P(j). 4.根据权利要求3所述的一种形状识别方法,其特征在于:所述步骤三中,调整曲率筛选阈值Th并得到形状分割点的步骤如下:4. A shape recognition method according to claim 3, characterized in that: in the step 3, the step of adjusting the curvature screening threshold Th and obtaining the shape segmentation point is as follows: (1)对于步骤二中得到的所有候选分割点P(j),将它们的平均近似偏置曲率值作为初始阈值Th0(1) For all candidate segmentation points P(j) obtained in step 2, their average approximate bias curvature value is used as the initial threshold Th 0 : 其中J为候选分割点总个数;Where J is the total number of candidate segmentation points; (2)对于第τ次调整时的阈值Thτ,根据各候选分割点P(j)的近似偏置曲率值与Thτ的大小关系,可将P(j)分为两类:近似偏置曲率值大于Thτ的候选分割点和近似偏置曲率值小于等于Thτ的候选分割点计算并记录当前阈值下的分割区分度Dτ(2) For the threshold Th τ during the τth adjustment, according to the relationship between the approximate bias curvature value of each candidate segmentation point P(j) and Th τ , P(j) can be divided into two categories: candidate segmentation points with approximate bias curvature values greater than Th τ and candidate segmentation points with approximate bias curvature values greater than Th τ . 
Candidate segmentation points with approximate bias curvature values less than or equal to Th τ Calculate and record the segmentation discrimination D τ under the current threshold: 其中,in, 其中分别表示阈值Thτ下各候选分割点P(j)的正负曲率偏差,表示所有候选分割点正曲率偏差的最小值,表示所有候选分割点负曲率偏差的最大值;in They represent the positive and negative curvature deviations of each candidate segmentation point P(j) under the threshold Th τ , represents the minimum value of the positive curvature deviation of all candidate segmentation points, Represents the maximum value of the negative curvature deviation of all candidate segmentation points; 判断是否存在近似偏置曲率值大于阈值Thτ的候选分割点,如果不存在,则不再调整,转到步骤(4);如果存在近似偏置曲率值大于阈值Thτ的候选分割点,则转到步骤(3)继续调整阈值;Determine whether there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ . If not, no adjustment is made and the process goes to step (4). If there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ , the process goes to step (3) and continues to adjust the threshold. (3)继续调整阈值,新的阈值Thτ+1为上一次阈值调整过程中所有候选分割点正曲率偏差的最小值,用公式表示如下:(3) Continue to adjust the threshold. The new threshold Th τ+1 is the minimum value of the positive curvature deviation of all candidate segmentation points in the last threshold adjustment process, which can be expressed by the following formula: 根据阈值Thτ+1计算第τ+1次调整下的各候选分割点的正负曲率偏差以及分割区分度Dτ+1并记录;判断是否存在近似偏置曲率值大于阈值Thτ+1的候选分割点,如果不存在,则不再调整,转到步骤(4);如果存在近似偏置曲率值大于阈值Thτ+1的候选分割点,则令τ=τ+1,重复当前步骤继续调整阈值;According to the threshold Th τ+1, the positive and negative curvature deviations of each candidate segmentation point under the τ+1th adjustment are calculated. and the segmentation discrimination D τ+1 and record them; determine whether there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ+1. 
If not, no adjustment is made and go to step (4); if there is a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th τ+1 , set τ=τ+1 and repeat the current step to continue adjusting the threshold; (4)多次调整阈值则有多个分割区分度,最大分割区分度对应的阈值为最终的曲率筛选阈值Th,近似偏置曲率值小于该阈值Th的点为最终的形状分割点。(4) If the threshold is adjusted multiple times, there will be multiple segmentation distinctions. The threshold corresponding to the maximum segmentation distinction is the final curvature screening threshold Th, and the point whose approximate bias curvature value is less than the threshold Th is the final shape segmentation point. 5.根据权利要求4所述的一种形状识别方法,其特征在于:所述步骤四中,基于分割线段位于形状内且互相不交叉的原则进行形状分割,并以最小分割代价分割得到若干子形状部分的具体方法为:5. A shape recognition method according to claim 4, characterized in that: in said step 4, the shape segmentation is performed based on the principle that the segmentation line segments are located within the shape and do not intersect each other, and the specific method of segmenting to obtain a plurality of sub-shape parts with the minimum segmentation cost is: (1)对于任意两个形状分割点P(e1),P(e2),等距抽样分割线段P(e1)P(e2)得到C个离散点,若此C个点中存在像素值为0的离散点,则线段P(e1)P(e2)存在形状轮廓外的部分,不选择作为分割线段;(1) For any two shape segmentation points P(e 1 ) and P(e 2 ), the segmentation line segments P(e 1 ) and P(e 2 ) are sampled at equal intervals to obtain C discrete points. If there is a discrete point with a pixel value of 0 among the C points, the segment P(e 1 ) and P(e 2 ) exist outside the shape contour and are not selected as segmentation line segments. 
(2)对于任意两个形状分割点P(e3),P(e4),若已存在一条形状分割线段P(e5)P(e6)使得:(2) For any two shape segmentation points P(e 3 ) and P(e 4 ), if there exists a shape segmentation line segment P(e 5 ) and P(e 6 ) such that: or 则线段P(e3)P(e4)与已有分割线段P(e5)P(e6)相交,不选择线段P(e3)P(e4)作为分割线段;Then the line segment P(e 3 )P(e 4 ) intersects with the existing segmentation line segment P(e 5 )P(e 6 ), and the line segment P(e 3 )P(e 4 ) is not selected as the segmentation line segment; (3)对于满足以上两个原则的分割线段集进一步筛选,通过定义三种评价分割线段优劣的度量指标I,在最小分割代价下实现分割:(3) The segmentation line set that meets the above two principles is further screened, and segmentation is achieved at the minimum segmentation cost by defining three metrics I to evaluate the quality of the segmentation line segments: 其中D*(u,v)、L*(u,v)、S*(u,v),分别为归一化的分割长度、分割弧长、分割剩余面积三种分割度量指标,u,v为任意两个形状分割点序号,为分割点总数;Among them, D * (u,v), L * (u,v), and S * (u,v) are three segmentation metrics, namely, normalized segmentation length, segmentation arc length, and segmentation residual area. u and v are the sequence numbers of any two shape segmentation points. 
is the total number of split points; 对于任意一条形状分割线段P(u)P(v),三种分割评价指标计算方式如下:For any shape segmentation line segment P(u)P(v), the three segmentation evaluation indicators are calculated as follows: 其中Dmax为所有分割线段中长度最大的分割线段的长度,D*(u,v)的取值范围应当在0到1之间,且数值越小分割效果越显著;Where D max is the length of the longest segment among all segmentation segments, and the value range of D * (u, v) should be between 0 and 1, and the smaller the value, the more significant the segmentation effect; 其中为从P(u)到P(v)两点间的轮廓曲线的长度,L*(u,v)的取值范围应当在0到1之间,且数值越小分割效果越显著;in is the contour curve between points P(u) and P(v) The length of L * (u,v) should be between 0 and 1, and the smaller the value, the more significant the segmentation effect; 其中Sd为分割线段P(u)P(v)分割出的形状面积,即由线段P(u)P(v)和轮廓曲线构成的封闭区域面积,S*(u,v)的取值范围应当在0到1之间,且数值越小分割效果越显著;Where Sd is the area of the shape segmented by the segmentation line segment P(u)P(v), that is, the area of the shape segmented by the segment P(u)P(v) and the contour curve The area of the closed region formed, the value range of S * (u,v) should be between 0 and 1, and the smaller the value, the more significant the segmentation effect; 依据上述步骤计算得到对于分割线段P(u)P(v)的分割代价Cost:According to the above steps, the segmentation cost Cost for the segmentation line segment P(u)P(v) is calculated as follows: Cost=αD*(u,v)+βL*(u,v)+γS*(u,v),Cost=αD * (u, v) + βL * (u, v) + γS * (u, v), 其中α,β,γ为各度量指标的权重;Among them, α, β, and γ are the weights of each metric; 计算筛选出的分割线段集中的分割线段的分割代价Cost;对计算得到的全部Cost从小到大进行排序,最终根据形状样本S所属类别设置的分割子形状部分数量N选取Cost最小的N-1条分割线段,从而实现最优分割,得到N个子形状部分;分割子形状部分数量N取决于当前形状样本S所属的类别,对于不同类别的形状,手工设置了对应的分割子形状部分数量。Calculate the segmentation cost Cost of the segmentation line segments in the filtered segmentation line segment set; sort all the calculated costs from small to large, and finally select N-1 segmentation line segments with the smallest Cost according to the number N of segmentation sub-shape parts set for the category to which the shape sample S belongs, so as to achieve optimal 
segmentation and obtain N sub-shape parts; the number N of segmentation sub-shape parts depends on the category to which the current shape sample S belongs. For shapes of different categories, the corresponding number of segmentation sub-shape parts is manually set. 6.根据权利要求5所述的一种形状识别方法,其特征在于:所述步骤五中,构建形状样本的拓扑结构的具体方法为:对于任一形状样本S分割得到的N个子形状部分,将中心形状部分记作起始顶点v1,并将其余邻接形状部分按顺时针方向进行顶点排序,记作顶点{vo|o∈[2,N]};记连接v1到其余各顶点vo的边为(v1,vo),进而构成满足拓扑次序的形状有向图:6. A shape recognition method according to claim 5, characterized in that: in said step 5, the specific method of constructing the topological structure of the shape sample is: for any N sub-shape parts obtained by segmenting the shape sample S, the central shape part is recorded as the starting vertex v 1 , and the remaining adjacent shape parts are sorted in a clockwise direction and recorded as vertices {v o |o∈[2,N]}; the edges connecting v 1 to the remaining vertices v o are recorded as (v 1 ,v o ), thereby forming a shape directed graph that satisfies the topological order: G1=(V1,E1),G 1 =(V 1 ,E 1 ), 其中V1={vo|o∈[1,N]},E1={(v1,vo)|o∈[2,N]};Where V 1 ={v o |o∈[1, N]}, E 1 ={(v 1 , v o )|o∈[2, N]}; 对所有训练形状样本全部进行最优分割后,将训练形状样本分割所得子形状部分的最大数量记为对于任一形状样本S,其邻接矩阵的计算方式为:After all training shape samples are optimally segmented, the maximum number of sub-shape parts obtained by segmenting the training shape samples is recorded as For any shape sample S, its adjacency matrix The calculation method is: 其中表示阶实数矩阵, in express A real matrix of order, 7.根据权利要求6所述的一种形状识别方法,其特征在于:所述步骤六中,使用形状的全尺度可视化表示方法,得到对应子形状部分的彩色特征表达图像的具体方法为:7. 
A shape recognition method according to claim 6, characterized in that: in the step 6, the specific method of using the full-scale visual representation method of the shape to obtain the color feature expression image of the corresponding sub-shape part is: 对于任一形状样本S的子形状部分S1For any sub-shape part S 1 of a shape sample S: 其中,为该子形状部分的轮廓抽样点p1(i)在二维平面内的横、纵坐标,n1为轮廓长度,即轮廓抽样点的个数;in, are the horizontal and vertical coordinates of the contour sampling point p 1 (i) of the sub-shape part in the two-dimensional plane, n 1 is the contour length, that is, the number of contour sampling points; 首先使用三种形状描述子构成的特征函数M描述该子形状部分S1的轮廓:First, the feature function M composed of three shape descriptors is used to describe the outline of the sub-shape part S1 : M={sk(i),lk(i),ck(i)|k∈[1,m],i∈[1,n1]},M={s k (i), l k (i), c k (i)|k∈[1, m], i∈[1, n 1 ]}, 其中,sk,lk,ck为尺度k中归一化的面积s、弧长l和重心距c三个不变量参数,k为尺度标签,m为总尺度数;分别定义这三个形状不变量描述子:Among them, s k , l k , c k are the three invariant parameters of normalized area s, arc length l and centroid distance c in scale k, k is the scale label, and m is the total number of scales; these three shape invariant descriptors are defined respectively: 以一轮廓抽样点p1(i)为圆心,以初始半径作预设圆C1(i),该预设圆即为计算对应轮廓点参数的初始半全局尺度;依据上述步骤得到预设圆C1(i)后,尺度k=1下三种形状描述子计算方式如下:With a contour sampling point p 1 (i) as the center and an initial radius A preset circle C 1 (i) is made, which is the initial semi-global scale for calculating the parameters of the corresponding contour points. 
After the preset circle C 1 (i) is obtained according to the above steps, the calculation methods of the three shape descriptors under the scale k=1 are as follows: 在计算s1(i)描述子时,将预设圆C1(i)中的与目标轮廓点p1(i)具有直接连接关系的区域Z1(i)的面积记为则有:When calculating the descriptor s 1 (i), the area of the region Z 1 (i) in the preset circle C 1 (i) that is directly connected to the target contour point p 1 (i) is recorded as Then we have: 其中,B(Z1(i),z)为一指示函数,定义为Where B(Z 1 (i), z) is an indicator function, defined as 将Z1(i)的面积与预设圆C1(i)面积的比值作为目标轮廓点描述子的面积参数s1(i):The ratio of the area of Z 1 (i) to the area of the preset circle C 1 (i) is used as the area parameter s 1 (i) of the target contour point descriptor: s1(i)的取值范围应当在0到1之间;The value range of s 1 (i) should be between 0 and 1; 在计算c1(i)描述子时,首先计算与目标轮廓点p1(i)具有直接连接关系的区域的重心,具体为对该区域中所有像素点的坐标值求取平均数,所得结果即为该区域的重心的坐标值,可以表示为:When calculating the c 1 (i) descriptor, the centroid of the region directly connected to the target contour point p 1 (i) is first calculated. Specifically, the coordinate values of all pixels in the region are averaged. 
The result is the coordinate value of the centroid of the region, which can be expressed as: 其中,w1(i)即为上述区域的重心;Among them, w 1 (i) is the centroid of the above area; 然后计算目标轮廓点p1(i)与重心w1(i)的距离可以表示为:Then calculate the distance between the target contour point p 1 (i) and the center of gravity w 1 (i) It can be expressed as: 最后将与目标轮廓点p1(i)的预设圆C1(i)的半径的比值作为该目标轮廓点描述子的重心距参数c1(i):Finally The ratio of the radius of the preset circle C 1 (i) of the target contour point p 1 (i) is used as the centroid distance parameter c 1 (i) of the target contour point descriptor: c1(i)的取值范围应当在0到1之间;The value range of c 1 (i) should be between 0 and 1; 在计算l1(i)描述子时,将预设圆C1(i)内与目标轮廓点p1(i)具有直接连接关系的弧段的长度记为并将与预设圆C1(i)周长的比值作为该目标轮廓点描述子的弧长参数l1(i):When calculating the l 1 (i) descriptor, the length of the arc segment directly connected to the target contour point p 1 (i) within the preset circle C 1 (i) is recorded as and will The ratio to the circumference of the preset circle C 1 (i) is used as the arc length parameter l 1 (i) of the target contour point descriptor: 其中,l1(i)的取值范围应当在0到1之间;Among them, the value range of l 1 (i) should be between 0 and 1; 依据上述步骤计算得到在尺度标签k=1,初始半径的半全局尺度下形状样本S的子形状部分S1的特征函数M1According to the above steps, we can calculate the initial radius at scale label k = 1. 
The characteristic function M 1 of the sub-shape part S 1 of the shape sample S at the semi-global scale is: M1={s1(i),l1(i),c1(i)|i∈[1,n1]},M 1 ={s 1 (i), l 1 (i), c 1 (i)|i∈[1, n 1 ]}, 由于数字图像以一个像素为最小单位,故选择以单个像素作为全尺度空间下的连续尺度变化间隔;即对于第k个尺度标签,设定圆Ck的半径rkSince a digital image uses a pixel as the smallest unit, a single pixel is selected as the continuous scale change interval in the full scale space; that is, for the kth scale label, the radius r k of the circle C k is set as: 即在初始尺度k=1时,此后半径rk以一个像素为单位等幅缩小m-1次,直至最小尺度k=m时为止;按照计算尺度k=1下的特征函数M1的方式,计算其它尺度下的特征函数,最终得到全部尺度下形状样本S的子形状部分S1的特征函数:That is, when the initial scale k = 1, After that, the radius r k is reduced m-1 times in units of one pixel until the minimum scale k=m is reached; the characteristic functions at other scales are calculated in the same way as the characteristic function M 1 at scale k=1, and finally the characteristic functions of the sub-shape part S 1 of the shape sample S at all scales are obtained: M={sk(i),lk(i),ck(i)|k∈[1,m],i∈[1,n1]},M={s k (i), l k (i), c k (i)|k∈[1, m], i∈[1, n 1 ]}, 将各个尺度下的特征函数分别存入矩阵SM、LM、CM,SM用于存储sk(i),SM的第k行第i列存储的为点p1(i)在第k个尺度下的面积参数sk(i);LM用于存储lk(i),LM的第k行第i列存储的为点p1(i)在第k个尺度下弧长参数lk(i);CM用于存储ck(i),CM的第k行第i列存储的为点p1(i)在第k个尺度下重心距参数ck(i);SM、LM、CM最终作为全尺度空间下形状样本S的子形状部分S1的三种形状特征的灰度图表达:The characteristic functions at each scale are stored in matrices S M , L M , and C M respectively. S M is used to store sk (i). The k-th row and i-th column of S M stores the area parameter sk (i) of point p 1 (i) at the k-th scale. L M is used to store l k (i). The k-th row and i-th column of L M stores the arc length parameter l k (i) of point p 1 (i) at the k-th scale. C M is used to store c k (i). The k-th row and i-th column of C M stores the centroid distance parameter c k (i) of point p 1 (i) at the k-th scale. 
S M , L M , and C M are finally used as grayscale images of the three shape features of the sub-shape part S 1 of the shape sample S in the full-scale space: GM1={SM,LM,CM},GM 1 ={S M , L M , C M }, 其中,SM、LM、CM均为尺寸大小是m×n的矩阵,各代表一幅灰度图像;Among them, S M , L M , and C M are matrices of size m×n, each representing a grayscale image; 接着将该子形状部分S1的三幅灰度图像作为RGB三个通道得到一张彩色图像,作为该子形状部分S1的特征表达图像 Then, the three grayscale images of the sub-shape part S1 are used as the three RGB channels to obtain a color image as the feature expression image of the sub-shape part S1. 8.根据权利要求7所述的一种形状识别方法,其特征在于:所述步骤七中将所有训练形状样本各子形状部分的特征表达图像样本输入至卷积神经网络,对卷积神经网络模型进行训练;每类形状的不同子形状部分有不同的类别标签;卷积神经网络训练至收敛后,对于任一形状样本S,将其分割形成的N个子形状部分对应的特征表达图像{Tnum|num∈[1,N]}分别输入训练好的卷积神经网络,网络第二层全连接层的输出为对应子形状部分的特征向量其中Vec为第二层全连接层中神经元的个数;8. A shape recognition method according to claim 7, characterized in that: in the step 7, the feature expression image samples of each sub-shape part of all training shape samples are input into the convolutional neural network to train the convolutional neural network model; different sub-shape parts of each type of shape have different category labels; after the convolutional neural network is trained to convergence, for any shape sample S, the feature expression images {T num | num∈[1, N]} corresponding to the N sub-shape parts formed by segmentation are respectively input into the trained convolutional neural network, and the output of the second fully connected layer of the network is the feature vector of the corresponding sub-shape part Where Vec is the number of neurons in the second fully connected layer; 其中卷积神经网络的结构包括输入层、预训练层和全连接层;所述预训练层由VGG16网络模型前4个模块组成,将该4个模块在imagenet数据集中训练后所得的参数作为初始化参数,预训练层后连接三个全连接层;The structure of the convolutional neural network includes an input layer, a pre-training layer and a fully connected layer; the pre-training layer is composed of the first four modules of the VGG16 network model, and the parameters obtained after training the four modules in the imagenet 
dataset are used as initialization parameters, and three fully connected layers are connected after the pre-training layer; 预训练层中第1个模块具体包括2层卷积层和1层最大池化层,其中卷积层卷积核数目为64,尺寸大小为3×3,池化层尺寸大小为2×2;第2个模块具体包括2层卷积层和1层最大池化层,其中卷积层卷积核数目为128,尺寸大小为3×3,池化层尺寸大小为2×2;第3个模块具体包括3层卷积层和1层最大池化层,其中卷积层卷积核数目为256,尺寸大小为3×3,池化层尺寸大小为2×2;第4个模块具体包括3层卷积层和1层最大池化层,其中卷积层卷积核数目为512,尺寸大小为3×3,池化层尺寸大小为2×2;每一层卷积层的计算公式为:The first module in the pre-training layer specifically includes 2 convolutional layers and 1 maximum pooling layer, where the number of convolutional kernels in the convolutional layer is 64, the size is 3×3, and the size of the pooling layer is 2×2; the second module specifically includes 2 convolutional layers and 1 maximum pooling layer, where the number of convolutional kernels in the convolutional layer is 128, the size is 3×3, and the size of the pooling layer is 2×2; the third module specifically includes 3 convolutional layers and 1 maximum pooling layer, where the number of convolutional kernels in the convolutional layer is 256, the size is 3×3, and the size of the pooling layer is 2×2; the fourth module specifically includes 3 convolutional layers and 1 maximum pooling layer, where the number of convolutional kernels in the convolutional layer is 512, the size is 3×3, and the size of the pooling layer is 2×2; the calculation formula for each convolutional layer is: CO=φrelu(WC·CIC),C Orelu (W C ·C IC ), 其中,θC是该卷积层的偏置向量;WC是该卷积层的权重;CI是该卷积层的输入;CO是该卷积层的输出;Among them, θ C is the bias vector of the convolutional layer; W C is the weight of the convolutional layer; C I is the input of the convolutional layer; C O is the output of the convolutional layer; 全连接层模块具体包括3层全连接层,其中第1层全连接层包含512个节点,第2层全连接层包含Vec个节点,第3层全连接层包含NT个节点;NT为所有类别的形状的分割子形状部分的数量总和;其中前2层全连接层的计算公式为:The fully connected layer module specifically includes 3 fully connected layers, where the first fully connected layer contains 512 nodes, the second fully connected layer contains Vec nodes, and the third fully connected layer contains 
NT nodes; NT is the sum of the number of segmented sub-shape parts of all categories of shapes; the calculation formula for the first two fully connected layers is: FO=φtanh(WF·FIF),F Otanh (W F ·F IF ), 其中,φtanh为tanh激活函数,θF是该全连接层的偏置向量;WF是该全连接层的权重;FI是该全连接层的输入;FO是该全连接层的输出;Among them, φ tanh is the tanh activation function, θ F is the bias vector of the fully connected layer; W F is the weight of the fully connected layer; FI is the input of the fully connected layer; FO is the output of the fully connected layer; 最后一层全连接层即为输出层,其输出的计算公式为:The last fully connected layer is the output layer, and its output calculation formula is: YO=φsoftmax(WY·YIY),Y Osoftmax (W Y ·Y IY ), 其中,φsoftmax为softmax激活函数,θY是输出层的偏置向量,每一个输出层的神经元都表示对应的一个子形状部分类别,WY是输出层的权重,YI是输出层的输入;YO是输出层的输出。Among them, φ softmax is the softmax activation function, θ Y is the bias vector of the output layer, each neuron in the output layer represents a corresponding sub-shape part category, W Y is the weight of the output layer, Y I is the input of the output layer; Y O is the output of the output layer. 9.根据权利要求8所述的一种形状识别方法,其特征在于:所述步骤八中构造形状样本的特征矩阵具体方法为:9. A shape recognition method according to claim 8, characterized in that: the specific method of constructing the feature matrix of the shape sample in step eight is: 对于任一形状样本S,其分割形成的N个子形状部分,对应的特征矩阵表达的计算公式为:For any shape sample S, the N sub-shape parts formed by segmentation have the corresponding feature matrix expression The calculation formula is: 其中,Fa表示矩阵F的第a行向量,fa为所述步骤七输出的第a个子形状部分的特征向量,表示维度大小为Vec的零向量。Wherein, Fa represents the a-th row vector of the matrix F, and fa is the eigenvector of the a-th sub-shape part outputted in step 7. Represents a zero vector of dimension size Vec. 10.根据权利要求9所述的一种形状识别方法,其特征在于:所述步骤九中,构建图卷积神经网络的结构,包括预处理输入层,隐藏层和分类输出层,所述预处理输入层中进行邻接矩阵归一化预处理,具体为:10. 
A shape recognition method according to claim 9, characterized in that in step 9 a graph convolutional neural network is constructed, comprising a preprocessing input layer, a hidden layer, and a classification output layer. The preprocessing input layer normalizes the adjacency matrix, specifically:

Â = D̃^(-1/2)(A + I_N)D̃^(-1/2),

where I_N is the identity matrix, D̃ is the degree matrix of A + I_N, and Â is the normalized adjacency matrix.

The hidden layer contains 2 graph convolutional layers, each computed as:

H_O = φ_relu(Â·H_I·H_W),

where H_W is the weight of the graph convolutional layer, H_I is its input (the input of the first graph convolutional layer is the feature matrix F of the shape sample), and H_O is its output.

The classification output layer is computed as:

G_O = φ_softmax(G_I·G_W),

where φ_softmax is the softmax activation function, G_I is the input of the output layer, i.e. the output of the second graph convolutional layer, G_W is the weight of the output layer, and G_O is the output of the output layer; each neuron of the output layer represents one shape category.

11.
A shape recognition method according to claim 10, characterized in that the specific method for implementing contour shape classification and recognition in step 10 is: train the graph convolutional neural network model to convergence; for any test shape sample, first extract the key points of the shape contour, compute the curvature value at each key point, judge its concavity or convexity, and obtain the candidate segmentation points, then adjust the curvature screening threshold to obtain the shape segmentation points; obtain the segmentation line segment set according to principles (1) and (2) in step 5, and compute the segmentation cost of each segmentation line segment in the set; if the number of segmentation line segments is less than the preset number, all segmentation line segments are used to segment the shape; otherwise, the preset number of segmentation line segments with the smallest segmentation cost are used to segment the shape; compute the color feature expression image of each sub-shape part and input it into the trained convolutional neural network, taking the output of the second fully connected layer of the convolutional neural network as the feature vector of that sub-shape part; construct the shape directed graph of the test shape sample, compute its adjacency matrix and feature matrix, and input them into the trained graph convolutional neural network model; the shape category corresponding to the maximum value in the output vector is judged to be the shape type of the test sample, thereby realizing shape classification and recognition.
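The graph-network stages of claims 9 and 10 can be sketched in NumPy as follows. This is an illustrative sketch, not the patented implementation: the function and weight names (`build_feature_matrix`, `H_W1`, `G_W`, etc.) are assumptions, and the ReLU activation in the hidden graph convolutional layers is assumed by analogy with the convolutional layers, since the claim text does not name it.

```python
import numpy as np

def normalize_adjacency(A):
    """Preprocessing input layer: A_hat = D~^(-1/2) (A + I_N) D~^(-1/2)."""
    A_tilde = A + np.eye(A.shape[0])         # add self-loops: A + I_N
    d = A_tilde.sum(axis=1)                  # diagonal of the degree matrix D~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def build_feature_matrix(part_features, n_rows):
    """Claim 9: stack the Vec-dimensional feature vectors of the N
    sub-shape parts as rows; all remaining rows stay zero vectors."""
    n, vec = part_features.shape
    F = np.zeros((n_rows, vec))
    F[:n] = part_features
    return F

def gcn_forward(A, F, H_W1, H_W2, G_W):
    """Claim 10: two graph convolutional layers H_O = relu(A_hat @ H_I @ H_W),
    followed by the classification output layer G_O = softmax(G_I @ G_W)."""
    A_hat = normalize_adjacency(A)
    H1 = relu(A_hat @ F @ H_W1)    # first graph convolutional layer
    H2 = relu(A_hat @ H1 @ H_W2)   # second graph convolutional layer
    return softmax(H2 @ G_W)       # classification output layer
```

In use, `A` would be the adjacency matrix of the shape directed graph, `F` the padded feature matrix built from the CNN's second fully connected layer outputs, and the weights would be learned by training to convergence; the predicted shape category is the index of the maximum value in the output.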
CN202110418108.2A 2021-04-19 2021-04-19 A Shape Recognition Method Active CN113191361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110418108.2A CN113191361B (en) 2021-04-19 2021-04-19 A Shape Recognition Method


Publications (2)

Publication Number Publication Date
CN113191361A CN113191361A (en) 2021-07-30
CN113191361B true CN113191361B (en) 2023-08-01

Family

ID=76977535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110418108.2A Active CN113191361B (en) 2021-04-19 2021-04-19 A Shape Recognition Method

Country Status (1)

Country Link
CN (1) CN113191361B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392819B (en) * 2021-08-17 2022-03-08 北京航空航天大学 Batch academic image automatic segmentation and labeling device and method
CN116486265B (en) * 2023-04-26 2023-12-19 北京卫星信息工程研究所 Airplane fine granularity identification method based on target segmentation and graph classification

Citations (6)

Publication number Priority date Publication date Assignee Title
CN104834922A (en) * 2015-05-27 2015-08-12 电子科技大学 Hybrid neural network-based gesture recognition method
CN106934419A (en) * 2017-03-09 2017-07-07 西安电子科技大学 Polarimetric SAR image classification method based on complex contourlet convolutional neural networks
CN108139334A (en) * 2015-08-28 2018-06-08 株式会社佐竹 Device provided with an optical unit
WO2020199468A1 (en) * 2019-04-04 2020-10-08 平安科技(深圳)有限公司 Image classification method and device, and computer readable storage medium
CN111898621A (en) * 2020-08-05 2020-11-06 苏州大学 A method of contour shape recognition
CN112464942A (en) * 2020-10-27 2021-03-09 南京理工大学 Computer vision-based overlapped tobacco leaf intelligent grading method


Non-Patent Citations (2)

Title
Chord-angle feature description for occluded shape matching; Yang Jianyu et al.; Optics and Precision Engineering; Vol. 23, No. 06; 1758-1767 *
3D palmprint recognition fusing local features and deep learning; Yang Bing et al.; Journal of Zhejiang University (Engineering Science); Vol. 54, No. 03; 540-545 *


Similar Documents

Publication Publication Date Title
CN111898621B (en) A Contour Shape Recognition Method
CN108304873B (en) Target detection method and system based on high-resolution optical satellite remote sensing image
CN111178432A (en) Weak supervision fine-grained image classification method of multi-branch neural network model
WO2018052587A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
CN105574534A (en) Salient object detection method based on sparse subspace clustering and low-rank representation
CN110516533B (en) Pedestrian re-identification method based on depth measurement
CN106228177A (en) Daily life subject image recognition methods based on convolutional neural networks
CN109635708B (en) Unsupervised pedestrian re-identification method based on three-data-set cross migration learning
CN104850890A (en) Method for adjusting parameter of convolution neural network based on example learning and Sadowsky distribution
CN109543723B (en) Robust image clustering method
CN110414616B (en) A Remote Sensing Image Dictionary Learning Classification Method Using Spatial Relationship
CN113191361B (en) A Shape Recognition Method
CN111967511A (en) Ground-based cloud image classification method based on heterogeneous feature fusion network
CN111161244B (en) Industrial product surface defect detection method based on FCN + FC-WXGboost
CN114863198B (en) Crayfish quality grading method based on neural network
CN110046565A (en) Face detection method based on Adaboost algorithm
CN114818931B (en) Fruit image classification method based on small sample element learning
CN115100406B (en) Weight information entropy fuzzy C-means clustering method based on superpixel processing
CN114067314B (en) Neural network-based peanut mildew identification method and system
CN115439405A (en) A Classification Method for Steel Plate Surface Defects
CN112560824B (en) A Facial Expression Recognition Method Based on Multi-feature Adaptive Fusion
Quispe et al. Automatic building change detection on aerial images using convolutional neural networks and handcrafted features
CN112801950B (en) Image adaptation quality evaluation method based on geometric distortion measurement
CN112418358A (en) Vehicle multi-attribute classification method for strengthening deep fusion network
CN113011506A (en) Texture image classification method based on depth re-fractal spectrum network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant