CN113191361B - Shape recognition method - Google Patents

Shape recognition method

Info

Publication number
CN113191361B
CN113191361B
Authority
CN
China
Prior art keywords
shape
segmentation
points
layer
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110418108.2A
Other languages
Chinese (zh)
Other versions
CN113191361A (en)
Inventor
杨剑宇
李一凡
闵睿朋
黄瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN202110418108.2A
Publication of CN113191361A
Application granted
Publication of CN113191361B
Legal status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a shape recognition method which extracts the contour key points of a shape sample; defines an approximate bias curvature value at each key point and judges the concavity or convexity at the key point to obtain candidate segmentation points; adjusts a curvature screening threshold to obtain the shape segmentation points; performs shape segmentation at minimum segmentation cost to obtain a plurality of sub-shape parts; constructs the topological structure of the shape sample; obtains the feature expression image of each sub-shape part using a full-scale visual representation of the shape; inputs each feature expression image into a convolutional neural network for training and learns the feature vector of each sub-shape part; constructs the feature matrix of the shape sample; constructs a graph convolutional neural network; and trains the graph convolutional neural network, obtains the feature matrix and adjacency matrix of a test sample, and inputs them into the trained graph convolutional network model to realize shape classification and recognition.

Description

Shape recognition method
Technical Field
The invention relates to a shape recognition method, and belongs to the technical field of shape recognition.
Background
Contour shape recognition is an important research direction in the field of machine vision. Recognition of object shape features is a central research topic, and its main goal is to fully extract object shape features for better similarity measurement, either by improving shape matching algorithms or by designing effective shape descriptors. It is widely applied in engineering, for example in radar, infrared imaging detection, image and video matching and retrieval, automatic robot navigation, scene semantic segmentation, texture recognition, data mining and other fields.
Typically, the expression and retrieval of contour shapes is based on manually designed shape descriptors to extract target contour features, such as Shape Contexts, Shape Vocabulary and Bag of Contour Fragments. However, the shape information extracted by manual descriptors is usually incomplete, and the robustness of such descriptors to local changes, occlusion, overall deformation and other variations of the target shape cannot be guaranteed. Designing too many descriptors also leads to redundant feature extraction and higher computational complexity, so both recognition accuracy and efficiency are low. In recent years, as convolutional neural networks have achieved better performance in image recognition tasks, they have begun to be applied to shape recognition tasks. However, a contour shape lacks image information such as surface texture and color, so the recognition performance of directly applying a convolutional neural network is low.
In view of these problems of shape recognition algorithms, how to provide a target recognition method capable of accurately classifying target contour shapes is a problem to be solved by those skilled in the art.
Disclosure of Invention
The invention is provided to solve the above problems in the prior art. The technical solution is as follows:
A method of shape recognition, the method comprising the steps of:
Step one, extracting the contour key points of a shape sample;
Step two, defining an approximate bias curvature value at each key point and judging the curve convexity at the key point, so as to obtain candidate shape segmentation points;
Step three, adjusting a curvature screening threshold to obtain the shape segmentation points;
Step four, performing shape segmentation based on the principle that segmentation segments lie inside the shape and do not cross each other, segmenting the shape at minimum segmentation cost to obtain a plurality of sub-shape parts;
Step five, constructing the topological structure of the shape sample;
Step six, obtaining the feature expression image of each corresponding sub-shape part using a full-scale visual representation of the shape;
Step seven, inputting each feature expression image into a convolutional neural network for training and learning the feature vector of each sub-shape part;
Step eight, constructing the feature matrix of the shape sample;
Step nine, constructing a graph convolutional neural network;
Step ten, training the graph convolutional neural network, performing shape segmentation on the test sample, obtaining the feature vectors of its sub-shape parts, calculating the feature matrix and adjacency matrix of the test sample, and inputting them into the trained graph convolutional network model to realize shape classification and recognition.
Preferably, in the first step, the method for extracting the contour key points includes:
The contour of each shape sample is made up of a series of sampling points; for any shape sample S, sampling the contour with n points yields:
S = {(p_x(i), p_y(i)) | i ∈ [1, n]},
where p_x(i), p_y(i) are the horizontal and vertical coordinates of the contour sampling point p(i) in the two-dimensional plane, and n is the contour length, i.e., the number of contour sampling points;
the key points are extracted by evolving the contour curve of the shape sample; in each evolution step, the point with the smallest contribution to target recognition is deleted, and the contribution Con(i) of each point p(i) is defined as follows:
where H(i, i-1) is the curve length between points p(i) and p(i-1), H(i, i+1) is the curve length between points p(i) and p(i+1), H_1(i) is the angle between segment p(i)p(i-1) and segment p(i)p(i+1), and the lengths are normalized by the contour perimeter; the larger the value of Con(i), the greater the contribution of point p(i) to the shape feature;
the method uses a region-based adaptive ending function F(t) to overcome the problem of extracting too many or too few contour key points:
where S_0 is the area of the original shape, S_i is the shape area after i evolutions, and n_0 is the total number of points on the original shape contour; when the ending function value F(t) exceeds the set threshold, contour key point extraction ends and n* contour key points are obtained.
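A minimal sketch of this extraction step is given below. The exact formulas for Con(i) and F(t) appear only as images in the source, so the discrete-curve-evolution-style contribution and the relative-area stopping criterion used here are stand-in assumptions, and all function and parameter names are illustrative:

```python
import numpy as np

def contribution(pts, i):
    """Con(i) for contour point i (assumed DCE-style form combining the angle
    H_1(i) with the perimeter-normalised lengths H(i, i-1) and H(i, i+1))."""
    n = len(pts)
    prev, nxt = pts[(i - 1) % n], pts[(i + 1) % n]
    perim = np.sum(np.linalg.norm(np.roll(pts, -1, axis=0) - pts, axis=1))
    h1 = np.linalg.norm(pts[i] - prev) / perim      # H(i, i-1), normalised
    h2 = np.linalg.norm(pts[i] - nxt) / perim       # H(i, i+1), normalised
    v1, v2 = prev - pts[i], nxt - pts[i]
    ang = np.arccos(np.clip(np.dot(v1, v2) /
                            (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12), -1, 1))
    return ang * h1 * h2 / (h1 + h2 + 1e-12)        # larger value = larger contribution

def polygon_area(pts):
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def extract_keypoints(pts, f_threshold=0.3):
    """Iteratively delete the lowest-contribution point until the region-based
    ending function F(t) (here: relative area change, an assumption) exceeds
    the set threshold; the remaining points are the n* contour key points."""
    pts = np.asarray(pts, dtype=float)
    s0 = polygon_area(pts)
    while len(pts) > 3:
        cons = [contribution(pts, i) for i in range(len(pts))]
        candidate = np.delete(pts, int(np.argmin(cons)), axis=0)
        if abs(s0 - polygon_area(candidate)) / s0 > f_threshold:   # F(t) check
            break
        pts = candidate
    return pts
```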
Further, in the second step, the specific method for defining the approximate bias curvature value at each key point and judging the curve convexity at the key point to obtain the candidate segmentation points is as follows:
to calculate the approximate bias curvature value of any key point p(i) in shape sample S, the contour points p(i-ε) and p(i+ε) adjacent to p(i) on either side are taken, where ε is an empirical value; since
cos H_ε(i) ∝ cur(p(i)),
where H_ε(i) is the angle between segment p(i)p(i-ε) and segment p(i)p(i+ε) and cur(p(i)) is the curvature at point p(i), the approximate bias curvature value cur~(p(i)) at point p(i) is defined as:
cur~(p(i)) = cos H_ε(i) + 1,
where cos H_ε(i) ranges from -1 to 1, so cur~(p(i)) ranges from 0 to 2;
in a shape segmentation that conforms to visual nature, shape segmentation points are all located on concave sections of the contour; therefore, when screening candidate segmentation points for shape segmentation, a method for judging the curve convexity at a key point p(i) is defined:
for the binarized image of the shape, the pixels inside the contour of shape sample S have value 255 and the pixels outside the contour have value 0; the segment p(i-ε)p(i+ε) is sampled equidistantly to obtain R discrete points; if the pixel values of all R discrete points are 255, the segment p(i-ε)p(i+ε) lies entirely inside the shape contour, i.e., the curve at p(i) is convex; if the pixel values of all R discrete points are 0, the segment p(i-ε)p(i+ε) lies entirely outside the shape contour, i.e., the curve at p(i) is concave; all key points p(i) at which the curve is concave are recorded as candidate segmentation points P(j).
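A minimal sketch of these two computations, assuming the contour is an (n, 2) array of (x, y) coordinates and `mask` is the binarized shape image (255 inside, 0 outside); ε = 3 follows the embodiment, R is an illustrative value:

```python
import numpy as np

def approx_bias_curvature(contour, i, eps=3):
    """cur~(p(i)) = cos(H_eps(i)) + 1, using the neighbours p(i-eps), p(i+eps)."""
    n = len(contour)
    v1 = contour[(i - eps) % n] - contour[i]
    v2 = contour[(i + eps) % n] - contour[i]
    cos_h = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)
    return cos_h + 1.0                               # value in [0, 2]

def is_concave(mask, contour, i, eps=3, r=10):
    """Equidistantly sample R points on segment p(i-eps)p(i+eps); if all of
    them fall outside the shape (pixel value 0) the curve at p(i) is concave."""
    n = len(contour)
    a, b = contour[(i - eps) % n], contour[(i + eps) % n]
    for t in np.linspace(0.0, 1.0, r):
        x, y = a + t * (b - a)
        if mask[int(round(y)), int(round(x))] != 0:  # sample lies inside the shape
            return False
    return True
```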
Further, in the third step, the steps of adjusting the curvature screening threshold Th and obtaining the shape segmentation points are as follows:
(1) For all candidate segmentation points P(j) obtained in step two, their average approximate bias curvature value is taken as the initial threshold Th_0,
where J is the total number of candidate segmentation points;
(2) For the threshold Th_τ at the τ-th adjustment, each candidate segmentation point P(j) can be divided into one of two classes according to its approximate bias curvature value: points whose approximate bias curvature value is greater than Th_τ, and points whose approximate bias curvature value is less than or equal to Th_τ. The division degree D_τ under the current threshold is calculated and recorded,
where the positive and negative curvature deviations of each candidate segmentation point P(j) under threshold Th_τ are used, together with the minimum of the positive curvature deviations and the maximum of the negative curvature deviations over all candidate segmentation points;
it is then determined whether any approximate bias curvature value is greater than the threshold Th_τ; if not, no further adjustment is performed and the procedure moves to step (4); if such a value exists, the procedure moves to step (3) to continue adjusting the threshold;
(3) The threshold is adjusted again: the new threshold Th_{τ+1} is determined from the minimum positive curvature deviation over all candidate segmentation points in the previous threshold adjustment, expressed as follows:
according to the threshold Th_{τ+1}, the positive and negative curvature deviations of each candidate segmentation point under the (τ+1)-th adjustment and the division degree D_{τ+1} are calculated and recorded; it is then determined whether any approximate bias curvature value is greater than the threshold Th_{τ+1}; if not, no adjustment is performed and the procedure moves to step (4); otherwise, let τ = τ+1 and repeat the current step to continue adjusting the threshold;
(4) The repeated threshold adjustments yield several division degrees; the threshold corresponding to the maximum division degree is the final curvature screening threshold Th, and the points whose approximate bias curvature values are smaller than Th are the final shape segmentation points.
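The formula for the division degree D_τ is given only as an image in the source, so the sketch below takes it as a caller-supplied function; the loop structure (average value as Th_0, re-thresholding at the minimum positive deviation, keeping the threshold with the maximum division degree) follows the steps above, and all names are illustrative:

```python
def select_segmentation_points(curvatures, division_degree):
    """Iterative curvature-threshold screening, steps (1)-(4).
    `curvatures` holds cur~(P(j)) for the J candidate points; `division_degree`
    is a callable D(curvatures, threshold) standing in for the patent's D_tau."""
    th = sum(curvatures) / len(curvatures)       # Th_0: average curvature value
    history = []
    while True:
        history.append((th, division_degree(curvatures, th)))
        above = [c for c in curvatures if c > th]
        if not above:                            # nothing exceeds Th_tau: stop adjusting
            break
        th = min(above)                          # Th_{tau+1}: minimum positive deviation
    best_th, _ = max(history, key=lambda rec: rec[1])
    return [j for j, c in enumerate(curvatures) if c < best_th]
```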
In step four, the specific method for performing shape segmentation based on the principle that segmentation segments lie inside the shape and do not cross each other, segmenting the shape at minimum segmentation cost into a plurality of sub-shape parts, is as follows:
(1) For any two shape segmentation points P(e_1), P(e_2), the segment P(e_1)P(e_2) is sampled equidistantly to obtain C discrete points; if any of the C discrete points has pixel value 0, part of the segment P(e_1)P(e_2) lies outside the shape contour, and the segment is not selected as a segmentation segment;
(2) For any two shape segmentation points P(e_3), P(e_4), if there exists an already selected segmentation segment P(e_5)P(e_6) such that segment P(e_3)P(e_4) intersects the existing segment P(e_5)P(e_6), segment P(e_3)P(e_4) is not selected as a segmentation segment;
(3) The set of segmentation segments satisfying the two principles above is further screened, and segmentation at minimum segmentation cost is realized by defining three metrics I for evaluating segment quality:
where D*(u,v), L*(u,v), S*(u,v) are the three segmentation metrics of normalized segmentation length, segmentation arc length and segmentation residual area, respectively, and u, v are the sequence numbers of any two shape segmentation points out of the total number of segmentation points;
for any candidate segmentation segment P(u)P(v), the three segmentation metrics are calculated as follows:
where D_max is the length of the longest segment among all candidate segments; D*(u,v) takes values between 0 and 1, and the smaller the value, the more significant the segmentation effect;
where the contour curve between P(u) and P(v) and its length are used; L*(u,v) takes values between 0 and 1, and the smaller the value, the more significant the segmentation effect;
where S_d is the area of the shape cut off by segmentation segment P(u)P(v), i.e., the area of the closed region formed by segment P(u)P(v) and the contour curve; S*(u,v) takes values between 0 and 1, and the smaller the value, the more significant the segmentation effect;
according to the above, the segmentation cost of segmentation segment P(u)P(v) is calculated as:
Cost = αD*(u,v) + βL*(u,v) + γS*(u,v),
where α, β and γ are the weights of the respective metrics;
the segmentation cost of every segment in the screened segmentation segment set is calculated; all costs are sorted from small to large, and according to the number N of sub-shape parts set for the category to which shape sample S belongs, the N-1 segmentation segments with the smallest cost are selected, realizing optimal segmentation and yielding N sub-shape parts; the number of sub-shape parts N depends on the category of the current shape sample S, and for shapes of different categories the corresponding number of sub-shape parts is set manually.
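A sketch of the cost combination and segment selection follows. The exact normalisations behind L* and S* are shown only as images in the source, so dividing by the total contour length and the total shape area is an assumption, and all names are illustrative:

```python
import numpy as np

def segmentation_cost(p_u, p_v, arc_len, total_arc, cut_area, total_area,
                      d_max, alpha=1.0, beta=1.0, gamma=1.0):
    """Cost = alpha*D* + beta*L* + gamma*S* for one candidate segment P(u)P(v)."""
    d_star = np.linalg.norm(np.asarray(p_u) - np.asarray(p_v)) / d_max   # in [0, 1]
    l_star = arc_len / total_arc      # assumed normalisation of the cut-off arc length
    s_star = cut_area / total_area    # assumed normalisation of the cut-off area S_d
    return alpha * d_star + beta * l_star + gamma * s_star

def pick_segments(valid_segments, costs, n_parts):
    """Keep the N-1 cheapest valid segments, which yields N sub-shape parts."""
    order = np.argsort(costs)
    return [valid_segments[k] for k in order[:n_parts - 1]]
```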
Further, in step five, the specific method for constructing the topological structure of the shape sample is as follows: for the N sub-shape parts obtained by segmenting any shape sample S, the central shape part is designated as the starting vertex v_1, and the remaining adjacent shape parts are ordered clockwise and recorded as vertices {v_o | o ∈ [2, N]}; the edges (v_1, v_o) connecting v_1 to the other vertices v_o are recorded, forming a shape directed graph that satisfies the topological order:
G_1 = (V_1, E_1),
where V_1 = {v_o | o ∈ [1, N]} and E_1 = {(v_1, v_o) | o ∈ [2, N]};
after all training shape samples have been optimally segmented, the maximum number of sub-shape parts obtained over the training shape samples is recorded; for any shape sample S, its adjacency matrix, a square real matrix whose order equals this maximum sub-shape count, is calculated as follows:
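A sketch of the star-shaped directed graph and its zero-padded adjacency matrix; whether edges are recorded only from v_1 outward or symmetrically is shown only as an image in the source, so the one-directional form here is an assumption, and `n_max` is an illustrative name for the maximum sub-shape count:

```python
import numpy as np

def build_adjacency(num_parts, n_max):
    """Adjacency matrix of G_1 = (V_1, E_1) with edges (v_1, v_o), o = 2..N,
    zero-padded to the maximum sub-shape count over the training set."""
    adj = np.zeros((n_max, n_max))
    adj[0, 1:num_parts] = 1.0      # central part v_1 connects to every other part
    return adj
```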
Further, in step six, a full-scale visual representation of the shape is used, and the specific method for obtaining the color feature expression image of the corresponding sub-shape part is as follows:
for the sub-shape part S_1 of any shape sample S,
its contour consists of the sampling points p_1(i), whose horizontal and vertical coordinates in the two-dimensional plane are given, and n_1 is the contour length, i.e., the number of contour sampling points;
the contour of the sub-shape part S_1 is first described using a feature function M composed of three shape descriptors:
M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]},
where s_k, l_k, c_k are the three invariant parameters of normalized area s, arc length l and center-of-gravity distance c at scale k, k is the scale label, and m is the total number of scales; the three shape-invariant descriptors are defined as follows:
with a contour sampling point p_1(i) as the center and an initial radius, a preset circle C_1(i) is drawn; this preset circle is the initial semi-global scale for calculating the corresponding contour point parameters; after the preset circle C_1(i) is obtained as above, the three shape descriptors at scale k = 1 are calculated as follows:
when calculating the s_1(i) descriptor, the area of the region Z_1(i) inside the preset circle C_1(i) that is directly connected to the target contour point p_1(i) is expressed as:
where B(Z_1(i), z) is an indicator function defined as
the ratio of the area of Z_1(i) to the area of the preset circle C_1(i) is taken as the area parameter s_1(i) of the descriptor of the target contour point:
the value of s_1(i) lies in the range 0 to 1;
when calculating the c_1(i) descriptor, the center of gravity of the region directly connected to the target contour point p_1(i) is first calculated; specifically, the coordinate values of all pixels in the region are averaged, and the result is the coordinate of the region's center of gravity, which can be expressed as:
where w_1(i) is the center of gravity of the region;
then the distance between the target contour point p_1(i) and the center of gravity w_1(i) is calculated, which can be expressed as:
finally, the ratio of this distance to the radius of the preset circle C_1(i) of the target contour point p_1(i) is taken as the center-of-gravity distance parameter c_1(i) of the descriptor:
the value of c_1(i) lies in the range 0 to 1;
when calculating the l_1(i) descriptor, the length of the arc segment inside the preset circle C_1(i) that is directly connected to the target contour point p_1(i) is recorded, and its ratio to the circumference of the preset circle C_1(i) is taken as the arc-length parameter l_1(i) of the descriptor:
where the value of l_1(i) lies in the range 0 to 1;
according to the above steps, the feature function M_1 of the sub-shape part S_1 of shape sample S at the semi-global scale with scale label k = 1 and the initial radius is obtained:
M_1 = {s_1(i), l_1(i), c_1(i) | i ∈ [1, n_1]},
since a digital image takes one pixel as its minimum unit, a single pixel is chosen as the scale-change interval in the full-scale space; that is, for the k-th scale label, the radius r_k of circle C_k is set such that,
starting from the initial scale k = 1, the radius r_k is reduced uniformly m-1 times in steps of one pixel until the minimum scale k = m; the feature functions at the other scales are calculated in the same way as the feature function M_1 at scale k = 1, finally giving the feature function of the sub-shape part S_1 of shape sample S at all scales:
M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]},
the feature functions at each scale are stored in the matrices S_M, L_M and C_M respectively: S_M stores s_k(i), its entry in the k-th row and i-th column being the area parameter s_k(i) of point p_1(i) at the k-th scale; L_M stores l_k(i), its entry in the k-th row and i-th column being the arc-length parameter l_k(i) of point p_1(i) at the k-th scale; C_M stores c_k(i), its entry in the k-th row and i-th column being the center-of-gravity distance parameter c_k(i) of point p_1(i) at the k-th scale; S_M, L_M and C_M finally serve as the grayscale representation of the three shape features of the sub-shape part S_1 of shape sample S in the full-scale space:
GM_1 = {S_M, L_M, C_M},
where S_M, L_M and C_M are all matrices of size m × n_1, each representing one grayscale image;
the three grayscale images of the sub-shape part S_1 are then used as the three RGB channels to obtain a color image, which serves as the feature expression image of the sub-shape part S_1.
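A sketch of the per-point, per-scale computation and of stacking the results into the three m × n_1 matrices that become the RGB channels. The handling of the "directly connected" region via connected-component labelling, the use of scipy, and the initial radius r0 (given only as an image in the source) are assumptions; all names are illustrative:

```python
import numpy as np
from scipy import ndimage

def descriptors_at_scale(mask, contour, i, radius):
    """s, l, c descriptors of contour point i at one scale.
    `mask` is the binarised shape (255 inside, 0 outside) and `contour` an
    (n, 2) array of (x, y) points assumed to lie on the foreground."""
    h, w = mask.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cx, cy = contour[i]
    disk = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2

    # region Z(i): the connected part of the shape inside the circle containing p(i)
    labels, _ = ndimage.label(np.logical_and(mask > 0, disk))
    region = labels == labels[int(round(cy)), int(round(cx))]

    s = region.sum() / (np.pi * radius ** 2)              # area parameter
    gy, gx = np.argwhere(region).mean(axis=0)             # centre of gravity w(i)
    c = np.hypot(gx - cx, gy - cy) / radius               # gravity-distance parameter

    # arc segment inside the circle that contains p(i): walk along the contour
    inside = np.hypot(contour[:, 0] - cx, contour[:, 1] - cy) <= radius
    n, j1, j2 = len(contour), i, i
    while inside[(j1 - 1) % n] and (i - j1) < n - 1:
        j1 -= 1
    while inside[(j2 + 1) % n] and (j2 - j1) < n - 1:
        j2 += 1
    pts = contour[[k % n for k in range(j1, j2 + 1)]]
    arc = np.hypot(*np.diff(pts, axis=0).T).sum() if len(pts) > 1 else 0.0
    l = arc / (2 * np.pi * radius)                        # arc-length parameter
    return s, l, c

def full_scale_image(mask, contour, r0, m=100):
    """Stack the descriptors over m scales (radius shrinking one pixel per scale
    from r0, so r0 should be at least m) into three m x n matrices and merge
    them as the three channels of the feature-expression image."""
    n = len(contour)
    S, L, C = np.zeros((m, n)), np.zeros((m, n)), np.zeros((m, n))
    for k in range(m):
        for i in range(n):
            S[k, i], L[k, i], C[k, i] = descriptors_at_scale(mask, contour, i, r0 - k)
    return np.stack([S, L, C], axis=-1)                   # m x n x 3 "colour" image
```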
In step seven, the feature expression images of all sub-shape parts of the training shape samples are input into a convolutional neural network, and the convolutional neural network model is trained; different sub-shape parts of each shape class have different class labels; after the convolutional neural network has been trained to convergence, for any shape sample S, the N feature expression images {T_num | num ∈ [1, N]} corresponding to the sub-shape parts obtained by segmenting S are input into the trained convolutional neural network, and the output of the second fully connected layer of the network is the feature vector of the corresponding sub-shape part, where Vec is the number of neurons in the second fully connected layer;
the structure of the convolutional neural network comprises an input layer, a pre-training layer and fully connected layers; the pre-training layer consists of the first 4 modules of the VGG16 network model, the parameters obtained after these 4 modules have been trained on an image dataset are used as initialization parameters, and three fully connected layers follow the pre-training layer;
the 1st module of the pre-training layer comprises 2 convolutional layers and 1 max-pooling layer, where the convolutional layers have 64 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2; the 2nd module comprises 2 convolutional layers and 1 max-pooling layer, where the convolutional layers have 128 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2; the 3rd module comprises 3 convolutional layers and 1 max-pooling layer, where the convolutional layers have 256 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2; the 4th module comprises 3 convolutional layers and 1 max-pooling layer, where the convolutional layers have 512 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2; each convolutional layer is calculated as:
C_O = φ_relu(W_C · C_I + θ_C),
where φ_relu is the ReLU activation function, θ_C is the bias vector of the convolutional layer, W_C is the weight of the convolutional layer, C_I is the input of the convolutional layer, and C_O is the output of the convolutional layer;
the fully connected module comprises 3 fully connected layers, where the 1st fully connected layer has 512 nodes, the 2nd fully connected layer has Vec nodes, and the 3rd fully connected layer has N_T nodes; N_T is the sum of the numbers of segmented sub-shape parts over all shape classes; the first 2 fully connected layers are calculated as:
F_O = φ_tanh(W_F · F_I + θ_F),
where φ_tanh is the tanh activation function, θ_F is the bias vector of the fully connected layer, W_F is the weight of the fully connected layer, F_I is the input of the fully connected layer, and F_O is the output of the fully connected layer;
the last fully connected layer is the output layer, whose output is calculated as:
Y_O = φ_softmax(W_Y · Y_I + θ_Y),
where φ_softmax is the softmax activation function, θ_Y is the bias vector of the output layer, each neuron of the output layer represents a corresponding sub-shape part category, W_Y is the weight of the output layer, Y_I is the input of the output layer, and Y_O is the output of the output layer.
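A sketch of this architecture in Keras; the flattening between the VGG16 backbone and the fully connected layers, and the training configuration (SGD, learning rate 0.001, cross entropy, following the embodiment), are assumptions beyond what the text states, and the layer/function names are illustrative:

```python
import tensorflow as tf

def build_part_cnn(vec=200, n_t=770, input_shape=(100, 100, 3)):
    """First 4 VGG16 blocks (ImageNet weights as initialisation) followed by
    three fully connected layers of 512, Vec and N_T nodes."""
    vgg = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                      input_shape=input_shape)
    backbone = tf.keras.Model(vgg.input, vgg.get_layer("block4_pool").output)

    x = tf.keras.layers.Flatten()(backbone.output)
    x = tf.keras.layers.Dense(512, activation="tanh")(x)
    feat = tf.keras.layers.Dense(vec, activation="tanh", name="part_feature")(x)
    out = tf.keras.layers.Dense(n_t, activation="softmax")(feat)

    model = tf.keras.Model(backbone.input, out)
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# After training, the Vec-dimensional output of the second fully connected layer
# is the feature vector of a sub-shape part:
# extractor = tf.keras.Model(model.input, model.get_layer("part_feature").output)
```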
Further, the specific method for constructing the feature matrix of the shape sample in step eight is as follows:
for any shape sample S, the N sub-shape parts formed by segmenting the sample are expressed by a corresponding feature matrix, calculated as follows:
where F_a denotes the a-th row vector of the matrix F; F_a is the feature vector of the a-th sub-shape part output in step seven, and the remaining rows are zero vectors of dimension Vec.
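A sketch of the zero-padded feature matrix, assuming (as the embodiment suggests) that rows beyond the N segmented parts are filled with Vec-dimensional zero vectors; Vec = 200 follows the embodiment and `n_max` is the illustrative name used earlier:

```python
import numpy as np

def build_feature_matrix(part_vectors, n_max, vec=200):
    """Stack the sub-shape feature vectors into an n_max x Vec matrix,
    zero-padding the rows beyond the N segmented parts."""
    F = np.zeros((n_max, vec))
    for a, v in enumerate(part_vectors):
        F[a] = v
    return F
```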
Further, in step nine, the structure of the graph convolutional neural network is constructed, comprising a preprocessing input layer, a hidden layer and a classification output layer, where the preprocessing input layer normalizes the adjacency matrix as follows:
where I_N is the identity matrix and the degree matrix of the self-loop-augmented adjacency matrix is used for the symmetric normalization;
the hidden layer comprises 2 graph convolution layers, each calculated as:
where W is the weight of the graph convolution layer; H_I is the input of the graph convolution layer, the input of the 1st graph convolution layer being the feature matrix of the shape sample; and H_O is the output of the graph convolution layer;
the classification output layer is calculated as:
where φ_softmax is the softmax activation function, G_I is the input of the output layer, i.e., the output of the second graph convolution layer, G_W is the weight of the output layer, and G_O is the output of the output layer; each neuron of the output layer represents a corresponding shape class.
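A numpy sketch of the forward pass; the hidden-layer activation (ReLU here) and the flattening of the node features before the softmax classifier are assumptions, since the corresponding formulas appear only as images in the source:

```python
import numpy as np

def normalize_adjacency(adj):
    """Preprocessing input layer: A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(axis=1)))
    return d_inv_sqrt @ a_tilde @ d_inv_sqrt

def gcn_forward(adj, features, w1, w2, w_out):
    """Two graph-convolution layers followed by a softmax classification layer."""
    a_hat = normalize_adjacency(adj)
    h1 = np.maximum(a_hat @ features @ w1, 0.0)   # 1st graph convolution
    h2 = np.maximum(a_hat @ h1 @ w2, 0.0)         # 2nd graph convolution
    logits = h2.reshape(-1) @ w_out               # flatten node features, then classify
    e = np.exp(logits - logits.max())
    return e / e.sum()                            # one probability per shape class
```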
Further, the specific method for realizing contour shape classification and recognition in step ten is as follows: the graph convolutional neural network model is trained until convergence; for any test shape sample, the key points of the shape contour are first extracted, the curvature values at the key points are calculated and their convexity judged to obtain candidate segmentation points, and the curvature screening threshold is then adjusted to obtain the shape segmentation points; the set of segmentation segments is obtained according to the two principles of step (1) and step (2) of step four, and the segmentation cost of each segment in the set is calculated; if the number of segmentation segments is smaller than the set limit, all segmentation segments are used to segment the shape; otherwise, the segmentation segments with the minimum segmentation cost, up to that limit, are used to segment the shape; the color feature expression image of each sub-shape part is calculated and input into the trained convolutional neural network, and the output of its second fully connected layer is taken as the feature vector of the sub-shape part; the shape directed graph of the test shape sample is constructed, its adjacency matrix and feature matrix are calculated and input into the trained graph convolutional neural network model, and the shape class of the test sample is judged to be the class corresponding to the maximum value in the output vector, realizing shape classification and recognition.
The invention provides a new shape recognition method and designs a new shape classification scheme using a graph convolutional neural network. The proposed topological-graph expression of shape features is a directed graph structure constructed from the shape segmentation; it not only distinguishes shape levels but also makes full use of the stable topological relations among the parts at each level of the shape in place of geometric position relations. Compared with prior art that matches shapes only by computing and comparing corresponding salient-point features, the method is more robustly adapted to interference such as articulated (hinge) transformation, partial occlusion and rigid-body transformation of the shape. The applied full-scale visual representation of the shape can comprehensively express all information of each sub-shape part, and the features of each part in the full-scale space are then extracted by the successive convolution computations of the neural network. Compared with directly applying a convolutional neural network, the designed graph convolutional neural network greatly reduces the number of training parameters and has higher computational efficiency.
Drawings
Fig. 1 is a workflow diagram of a shape recognition method of the present invention.
FIG. 2 is a partial sample schematic of a target shape in a shape sample set.
Fig. 3 is a schematic diagram of segmentation of a shape sample.
Fig. 4 is a schematic diagram of a full scale space.
Fig. 5 is a schematic diagram of the target shape after being intercepted by a preset scale.
Fig. 6 is a schematic diagram of a target shape divided by a preset scale.
FIG. 7 is a schematic diagram of a characteristic function of a sub-shape portion of a target shape at a single scale.
FIG. 8 is a schematic representation of a feature matrix of a sub-shape portion of a target shape in full scale space.
Fig. 9 is a schematic diagram of three gray-scale images calculated from a sub-shape portion of the target shape and a composite color image.
FIG. 10 is a structural diagram of the convolutional neural network used to train the feature expression images of the sub-shape parts.
Fig. 11 is a feature structure diagram of each sub-shape portion of the target shape.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, a shape recognition method includes the following steps:
1. The total number of shape samples is 1400, with 70 shape classes and 20 shape samples per class. A partial sample schematic of the target shapes in the shape sample set is shown in fig. 2. Half of the samples in each shape class are randomly assigned to the training set and the remaining half to the test set, giving 700 training samples and 700 test samples. Each shape sample is sampled to obtain 100 contour points; taking one shape sample S as an example:
S = {(p_x(i), p_y(i)) | i ∈ [1, 100]},
where p_x(i), p_y(i) are the horizontal and vertical coordinates of the contour sampling point p(i) in the two-dimensional plane.
The contour curve of the shape sample is evolved to extract the key points, and in each evolution step the point contributing least to target recognition is deleted. The contribution Con(i) of each point p(i) is defined as follows:
where H(i, i-1) is the curve length between points p(i) and p(i-1), H(i, i+1) is the curve length between points p(i) and p(i+1), H_1(i) is the angle between segment p(i)p(i-1) and segment p(i)p(i+1), and the lengths are normalized by the contour perimeter. The larger the value of Con(i), the greater the contribution of point p(i) to the shape feature.
The method uses a region-based adaptive ending function F(t) to overcome the problem of extracting too many or too few contour key points:
where S_0 is the area of the original shape, S_i is the area after i evolutions, and n_0 is the total number of points on the original shape contour. When the ending function value F(t) exceeds the set threshold, contour key point extraction ends. For the shape sample S shown in fig. 3, 24 contour key points are extracted in total.
2. The approximate bias curvature value and the curve convexity at each key point of the shape sample are calculated. Taking shape sample S as an example, the approximate bias curvature value cur~(p(i)) of a contour key point p(i) is calculated as:
cur~(p(i)) = cos H_ε(i) + 1,
where H_ε(i) is the angle between segment p(i)p(i-ε) and segment p(i)p(i+ε), and ε = 3.
The curve convexity at a contour key point p(i) is judged as follows:
the segment p(i-ε)p(i+ε) is sampled equidistantly to obtain R discrete points; if the pixel values of all R discrete points are 255, the segment p(i-ε)p(i+ε) lies entirely inside the shape contour, i.e., the curve at p(i) is convex; if the pixel values of all R discrete points are 0, the segment p(i-ε)p(i+ε) lies entirely outside the shape contour, i.e., the curve at p(i) is concave. The key points p(i) at which the curve is concave are recorded as candidate segmentation points P(j). For shape sample S, 11 candidate segmentation points are extracted in total.
3. The curvature screening threshold Th is adjusted to obtain the shape segmentation points. For the 11 candidate segmentation points P(j) of shape sample S, their average approximate bias curvature value is taken as the initial threshold Th_0,
where the approximate bias curvature values cur~(P(j)) of the 11 candidate segmentation points are 0.1, 0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.64, 0.7, 0.7 and 0.8 respectively.
The threshold is then increased in sequence according to the following method:
(1) For the threshold Th_τ at the τ-th adjustment, each candidate segmentation point P(j) can be divided into one of two classes according to its approximate bias curvature value: points whose approximate bias curvature value is greater than Th_τ, and points whose approximate bias curvature value is less than or equal to Th_τ. The division degree D_τ under the current threshold is calculated and recorded,
where the positive and negative curvature deviations of each candidate segmentation point P(j) under threshold Th_τ are used, together with the minimum of the positive curvature deviations and the maximum of the negative curvature deviations over all candidate segmentation points.
It is then determined whether any approximate bias curvature value is greater than the threshold Th_τ; if not, no further adjustment is made and the procedure moves to step (3); if such a value exists, the procedure moves to step (2) to continue adjusting the threshold.
(2) The threshold is adjusted again: the new threshold Th_{τ+1} is determined from the minimum positive curvature deviation over all candidate segmentation points in the previous threshold adjustment, expressed as follows:
according to the threshold Th_{τ+1}, the positive and negative curvature deviations of each candidate segmentation point under the (τ+1)-th adjustment and the division degree D_{τ+1} are calculated and recorded. It is then determined whether any approximate bias curvature value is greater than the threshold Th_{τ+1}; if not, no adjustment is made and the procedure moves to step (3); otherwise, let τ = τ+1 and repeat the current step to continue adjusting the threshold.
(3) The repeated threshold adjustments yield several division degrees; the threshold corresponding to the maximum division degree is the final curvature screening threshold Th, and the points whose approximate bias curvature values are smaller than Th are the final shape segmentation points.
For shape sample S, the division degrees and the thresholds recorded during the 4 threshold adjustments show that the maximum division degree D_1 corresponds to threshold Th_1, which is therefore the final curvature screening threshold, i.e., Th = 0.5. The 5 candidate segmentation points whose approximate bias curvature values are smaller than Th are the final shape segmentation points; their approximate bias curvatures are 0.1, 0.2, 0.25, 0.35 and 0.4 respectively.
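As a quick check of these numbers, the threshold rule from the step-three sketch can be replayed on the embodiment's 11 curvature values (the division degree itself is not reproduced in the source, so only the threshold sequence and the selected points are verified):

```python
curv = [0.1, 0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.64, 0.7, 0.7, 0.8]
th0 = sum(curv) / len(curv)               # Th_0 ≈ 0.467
th1 = min(c for c in curv if c > th0)     # Th_1 = 0.5, the embodiment's final Th
points = [c for c in curv if c < th1]     # [0.1, 0.2, 0.25, 0.35, 0.4], i.e. 5 points
```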
4. For shape sample S, the 5 shape segmentation points are connected pairwise in order to form 10 segments, and the 7 segments that lie inside the shape and do not intersect each other are retained as the segmentation segment set. The segmentation cost of each segment is calculated using the metric I according to the following method:
where D*(u,v), L*(u,v), S*(u,v) are the three segmentation metrics of normalized segmentation length, segmentation arc length and segmentation residual area, respectively, and u, v are the sequence numbers of any two of the shape segmentation points.
For any candidate segmentation segment P(u)P(v), the three segmentation metrics are calculated as follows:
where D_max is the length of the longest segment among all candidate segments; D*(u,v) takes values between 0 and 1, and the smaller the value, the more significant the segmentation effect.
where the contour curve between P(u) and P(v) and its length are used; L*(u,v) takes values between 0 and 1, and the smaller the value, the more significant the segmentation effect.
where S_d is the area of the shape cut off by segmentation segment P(u)P(v), i.e., the area of the closed region formed by segment P(u)P(v) and the contour curve; S*(u,v) takes values between 0 and 1, and the smaller the value, the more significant the segmentation effect.
According to the above, the segmentation cost of segmentation segment P(u)P(v) is calculated as:
Cost = αD*(u,v) + βL*(u,v) + γS*(u,v),
where α, β and γ are the weights of the respective metrics.
As shown in fig. 3, for the shape sample S, 2 segmentation line segments with minimum and next-smallest segmentation costs are selected as final optimal segmentation line segments, and 3 sub-shape portions are obtained.
5. For shape sample S, the central shape part is recorded as the starting vertex v_1, and the remaining 2 adjacent shape parts are vertex-ordered clockwise and recorded as vertices {v_2, v_3}. The edges connecting v_1 to vertices v_2, v_3 are recorded as (v_1, v_2) and (v_1, v_3), forming a shape directed graph that satisfies the topological order:
G_1 = (V_1, E_1),
where V_1 = {v_1, v_2, v_3} and E_1 = {(v_1, v_2), (v_1, v_3)}.
Since the maximum number of sub-shape parts obtained by segmenting the training samples of the contour shape set is 11, the adjacency matrix of shape sample S is an 11 × 11 matrix with indices a ∈ [1, 11], b ∈ [1, 11].
6. A full-scale visual representation is computed for each of the 3 sub-shape parts obtained by segmentation; the specific method is as follows:
(1) The contour of each sub-shape part is sampled to obtain 100 contour sampling points. As shown in fig. 4, the total number of scales in the full-scale space is set to 100, and the normalized area, arc length and center-of-gravity distance at each scale are calculated for each contour point based on the coordinates of the 100 contour sampling points. Taking sub-shape part S_1 as an example, the specific calculation is as follows:
with a contour sampling point p_1(i) of sub-shape part S_1 as the center and an initial radius, a preset circle C_1(i) is drawn; this preset circle is the initial semi-global scale for calculating the corresponding contour point parameters. After the preset circle C_1(i) is obtained, part of the target shape necessarily falls within it, as illustrated in fig. 5. If the part of the target shape falling within the preset circle is a single region, that region is the region directly connected to the target contour point p_1(i) and is denoted Z_1(i); if the part falling within the preset circle is divided into several regions that do not communicate with each other, as shown by regions A and B in fig. 5, then the region on whose contour the target contour point p_1(i) lies is the region directly connected to p_1(i), i.e., region A in fig. 5, denoted Z_1(i). On this basis, the area of the region Z_1(i) inside the preset circle C_1(i) that is directly connected to the target contour point p_1(i) is expressed as:
where B(Z_1(i), z) is an indicator function defined as
the ratio of the area of Z_1(i) to the area of the preset circle C_1(i) is taken as the area parameter s_1(i) of the descriptor of the target contour point p_1(i):
the value of s_1(i) lies in the range 0 to 1.
When calculating the center of gravity of the region directly connected to the target contour point p_1(i), the coordinate values of all pixels in the region are averaged, and the result is the coordinate of the region's center of gravity, which can be expressed as:
where w_1(i) is the center of gravity of the region.
The distance between the target contour point and the center of gravity w_1(i) is then calculated, which can be expressed as:
and the ratio of this distance to the radius of the preset circle of the target contour point is taken as the center-of-gravity distance parameter c_1(i) of the descriptor of target contour point p_1(i):
the value of c_1(i) lies in the range 0 to 1.
After being cut by the preset circle, the contour of the target shape necessarily has one or more arc segments within the preset circle, as shown in fig. 6. If only one arc segment of the target shape falls within the preset circle, that arc segment is the arc segment directly connected to the target contour point p_1(i); if several arc segments of the target shape fall within the preset circle, such as arc Segment A, arc Segment B and arc Segment C in fig. 6, then the arc segment on which the target contour point p_1(i) lies is the arc segment directly connected to p_1(i), i.e., arc Segment A in fig. 6. On this basis, the length of the arc segment inside the preset circle C_1(i) that is directly connected to the target contour point p_1(i) is recorded, and its ratio to the circumference of the preset circle C_1(i) is taken as the arc-length parameter l_1(i) of the descriptor of the target contour point:
where the value of l_1(i) lies in the range 0 to 1.
According to the above steps, the feature function M_1 of the sub-shape part S_1 of shape sample S at the semi-global scale with scale label k = 1 and the initial radius is obtained:
M_1 = {s_1(i), l_1(i), c_1(i) | i ∈ [1, 100]},
As shown in fig. 7, the feature functions at each of the 100 scales in the full-scale space are calculated separately, where for the k-th scale label the radius r_k of circle C_k is set such that,
starting from the initial scale k = 1, the radius r_k is reduced 99 times in steps of one pixel until the minimum scale k = 100. The feature function of the sub-shape part S_1 of shape sample S over the whole full-scale space is thus obtained:
M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, 100], i ∈ [1, 100]},
(2) As shown in fig. 8, the feature functions of sub-shape part S_1 at the 100 scales in the full-scale space are combined, in scale order, into three feature matrices in the full-scale space:
GM_1 = {S_M, L_M, C_M},
where S_M, L_M and C_M are grayscale matrices of size m × n, each representing one grayscale image.
(3) As shown in fig. 9, the three grayscale images of sub-shape part S_1 are used as the three RGB channels to synthesize a color image, which serves as the feature expression image of sub-shape part S_1.
7. A convolutional neural network is constructed, comprising an input layer, a pre-training layer and fully connected layers. The feature expression images of all sub-shape parts of the training shape samples are input into the convolutional neural network, and the convolutional neural network model is trained. Different sub-shape parts of each shape class have different class labels. After the convolutional neural network has been trained to convergence, taking shape sample S as an example, the 3 feature expression images of size 100 × 100, {T_num | num ∈ [1, 3]}, corresponding to the sub-shape parts obtained by segmenting S are input into the trained convolutional neural network, and the output of the second fully connected layer of the network is the feature vector of the corresponding sub-shape part, where Vec is the number of neurons in the second fully connected layer and Vec is set to 200.
An SGD optimizer is used, the learning rate is set to 0.001, the decay rate is set to 1e-6, cross entropy is selected as the loss function, and the batch size is 128. As shown in fig. 10, the pre-training layer consists of the first 4 modules of the VGG16 network model; the parameters obtained after these 4 modules have been trained on the ImageNet dataset are used as initialization parameters, and three fully connected layers follow the pre-training layer.
The 1st module of the pre-training layer comprises 2 convolutional layers and 1 max-pooling layer, where the convolutional layers have 64 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2; the 2nd module comprises 2 convolutional layers and 1 max-pooling layer, where the convolutional layers have 128 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2; the 3rd module comprises 3 convolutional layers and 1 max-pooling layer, where the convolutional layers have 256 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2; the 4th module comprises 3 convolutional layers and 1 max-pooling layer, where the convolutional layers have 512 convolution kernels of size 3 × 3 and the pooling layer has size 2 × 2. Each convolutional layer is calculated as:
C_O = φ_relu(W_C · C_I + θ_C),
where φ_relu is the ReLU activation function, θ_C is the bias vector of the convolutional layer, W_C is the weight of the convolutional layer, C_I is the input of the convolutional layer, and C_O is the output of the convolutional layer.
The fully connected module comprises 3 fully connected layers, where the 1st fully connected layer has 512 nodes, the 2nd fully connected layer has 200 nodes, and the 3rd fully connected layer has 770 nodes. The first 2 fully connected layers are calculated as:
F_O = φ_tanh(W_F · F_I + θ_F),
where φ_tanh is the tanh activation function, θ_F is the bias vector of the fully connected layer, W_F is the weight of the fully connected layer, F_I is the input of the fully connected layer, and F_O is the output of the fully connected layer.
The last fully connected layer is the output layer, whose output is calculated as:
Y_O = φ_softmax(W_Y · Y_I + θ_Y),
where φ_softmax is the softmax activation function, θ_Y is the bias vector of the output layer, each neuron of the output layer represents a corresponding sub-shape part category, W_Y is the weight of the output layer, Y_I is the input of the output layer, and Y_O is the output of the output layer.
8. As shown in fig. 11, the feature matrix of the shape sample is constructed from the 3 sub-shape feature vectors of the shape sample,
where F_a denotes the a-th row vector of the matrix F; F_a is the feature vector of the a-th sub-shape part output in the above step, and the remaining rows are zero vectors of dimension 200.
9. The graph convolutional neural network is constructed, comprising a preprocessing input layer, a hidden layer and a classification output layer. The adjacency matrix and the feature matrix of the shape-sample topological graph are input into the graph convolutional neural network structure model for training. An SGD optimizer is used, the learning rate is set to 0.001, the decay rate is set to 1e-6, cross entropy is selected as the loss function, and the batch size is 128.
The preprocessing input layer normalizes the adjacency matrix as follows:
where I_N is the identity matrix and the degree matrix of the self-loop-augmented adjacency matrix is used for the symmetric normalization.
The hidden layer comprises 2 graph convolution layers, each calculated as:
where W is the weight of the graph convolution layer; H_I is the input of the graph convolution layer, the input of the 1st graph convolution layer being the feature matrix of the shape sample; and H_O is the output of the graph convolution layer.
The classification output layer is calculated as:
where φ_softmax is the softmax activation function, G_I is the input of the output layer, i.e., the output of the second graph convolution layer, G_W is the weight of the output layer, and G_O is the output of the output layer. Each neuron of the output layer represents a corresponding shape class.
10. All training samples are input into the graph convolutional neural network, and the graph convolutional neural network model is trained. For any test shape sample, the key points of the shape contour are first extracted, the curvature values at the key points are calculated and their convexity judged to obtain candidate segmentation points, and the curvature screening threshold is then adjusted to obtain the shape segmentation points. The shape segmentation points are connected pairwise in order to form candidate segmentation segments; the segments that lie inside the shape and do not intersect each other are retained as the segmentation segment set, and the segmentation cost of each segment in the set is calculated. If the number of segmentation segments is less than 10, all segmentation segments are used to segment the shape; otherwise, the shape is segmented using the 10 segmentation segments with the minimum segmentation cost. The color feature expression image of each sub-shape part is calculated and input into the trained convolutional neural network, and the output of its second fully connected layer is taken as the feature vector of the sub-shape part. The shape directed graph of the test shape sample is constructed, its adjacency matrix and feature matrix are calculated and input into the trained graph convolutional neural network model, and the shape class of the test sample is judged to be the class corresponding to the maximum value in the output vector, realizing shape classification and recognition.
Although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements and changes may be made without departing from the spirit and principles of the present invention.

Claims (11)

1. A method of shape recognition, characterized by: the method comprises the following steps:
step one, extracting outline key points of a shape sample;
step two, defining approximate offset curvature values at all key points and judging curve convexity at the key points so as to obtain candidate shape division points;
step three, adjusting a curvature screening threshold value to obtain shape division points;
step four, performing shape segmentation based on the principle that segmentation line segments lie inside the shape and do not intersect each other, segmenting the shape into a plurality of sub-shape parts with minimum segmentation cost;
step five, constructing a topological structure of the shape sample;
step six, obtaining a characteristic expression image of the corresponding sub-shape part by using a full-scale visual representation method of the shape;
step seven, inputting each characteristic expression image into a convolutional neural network for training, and learning to obtain feature vectors of each sub-shape part;
step eight, constructing a feature matrix of the shape sample;
step nine, constructing a graph convolution neural network;
and step ten, training the graph convolution neural network, performing shape segmentation on the test sample, obtaining feature vectors of all sub-shape parts, calculating a feature matrix and an adjacent matrix of the test sample, and inputting the feature matrix and the adjacent matrix into a trained graph convolution network model to realize shape classification and identification.
2. A method of shape recognition according to claim 1, wherein: in the first step, the method for extracting the contour key points comprises the following steps:
the profile of each shape sample is made up of a series of sampling points, and for any shape sample S, sampling the profile by n points yields:
S = {(p_x(i), p_y(i)) | i ∈ [1, n]},
where p_x(i), p_y(i) are the horizontal and vertical coordinates of contour sampling point p(i) in the two-dimensional plane, and n is the contour length, i.e. the number of contour sampling points;
extracting key points by evolving a contour curve of the shape sample, wherein in each evolution process, points with the smallest contribution to target identification are deleted, and the contribution of each point p (i) is defined as:
Con(i) = H_1(i) · h(i, i−1) · h(i, i+1) / (h(i, i−1) + h(i, i+1)),
where h(i, i−1) is the curve length between points p(i) and p(i−1), h(i, i+1) is the curve length between points p(i) and p(i+1), H_1(i) is the angle between segment p(i)p(i−1) and segment p(i)p(i+1), and the lengths h are normalized by the contour perimeter; the larger the value of Con(i), the greater the contribution of point p(i) to the shape feature;
the method uses a region-based adaptive ending function F(t) to avoid extracting too many or too few contour key points:
where S_0 is the area of the original shape, S_i is the shape area after i evolutions, and n_0 is the total number of points on the original shape contour; when the value of the ending function F(t) exceeds the set threshold, contour key point extraction ends and n* contour key points are obtained.
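A minimal sketch of the evolution loop described in claim 2, assuming the contribution measure given above, chord lengths as an approximation of the curve lengths, and a fixed minimum point count standing in for the region-based ending function F(t):

import numpy as np

def contribution(pts, i, perimeter):
    # Con(i): angle at p(i) weighted by the two normalized adjacent lengths
    prev_pt, cur, nxt = pts[i - 1], pts[i], pts[(i + 1) % len(pts)]
    a, b = prev_pt - cur, nxt - cur
    cos_angle = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    h1, h2 = np.linalg.norm(a) / perimeter, np.linalg.norm(b) / perimeter
    return angle * h1 * h2 / (h1 + h2)

def evolve_contour(pts, min_points=20):
    # Repeatedly delete the point with the smallest contribution; the stopping
    # rule here (a fixed minimum point count) is only a simplified stand-in
    # for the adaptive ending function F(t).
    pts = [np.asarray(p, dtype=float) for p in pts]
    while len(pts) > min_points:
        perimeter = sum(np.linalg.norm(pts[i] - pts[i - 1]) for i in range(len(pts)))
        scores = [contribution(pts, i, perimeter) for i in range(len(pts))]
        del pts[int(np.argmin(scores))]
    return pts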
3. A method of shape recognition according to claim 2, wherein: in the second step, approximate offset curvature values are defined at each key point and the curve convexity at the key points is judged to obtain candidate segmentation points; the specific method is as follows:
to calculate the approximate offset curvature value of a key point p(i) at any position in shape sample S, take the contour points p(i−ε) and p(i+ε) adjacent to p(i) before and after it, where ε is an empirical value; since:
cos H_ε(i) ∝ cur(p(i)),
where H_ε(i) is the angle between segment p(i)p(i−ε) and segment p(i)p(i+ε), and cur(p(i)) is the curvature at point p(i);
the approximate offset curvature value cur~(p(i)) at point p(i) is defined as:
cur~(p(i)) = cos H_ε(i) + 1,
where H_ε(i) is the angle between segment p(i)p(i−ε) and segment p(i)p(i+ε); cos H_ε(i) ranges from −1 to 1, so cur~(p(i)) ranges from 0 to 2;
according to a shape segmentation method conforming to visual nature, shape segmentation points are all located on concave portions of the contour; therefore, when screening candidate segmentation points for shape segmentation, a method for judging the curve convexity at a key point p(i) is defined:
for the binarized image of the shape, pixels inside the contour of shape sample S have the value 255 and pixels outside the contour have the value 0; sample the segment p(i−ε)p(i+ε) equidistantly to obtain R discrete points; if the pixel values of the R discrete points are all 255, the segment p(i−ε)p(i+ε) lies entirely inside the shape contour, i.e. the curve at p(i) is convex; if the pixel values of the R discrete points are all 0, the segment p(i−ε)p(i+ε) lies entirely outside the shape contour, i.e. the curve at p(i) is concave; all key points p(i) where the curve is concave are recorded as candidate segmentation points P(j).
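For illustration, a sketch of the two computations in claim 3, assuming a binary mask with 255 inside the shape and 0 outside, and contour points given as (x, y) pixel coordinates; R and the helper names are illustrative:

import numpy as np

def approx_offset_curvature(pts, i, eps):
    # cur~(p(i)) = cos(H_eps(i)) + 1, with H_eps(i) the angle between the
    # segments from p(i) to p(i-eps) and from p(i) to p(i+eps)
    n = len(pts)
    p, a, b = pts[i], pts[(i - eps) % n], pts[(i + eps) % n]
    u, v = a - p, b - p
    cos_h = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return cos_h + 1.0

def is_concave(mask, pts, i, eps, R=20):
    # Sample the chord p(i-eps)p(i+eps) at R points; if every sampled pixel of
    # the binary mask is 0, the curve at p(i) is concave and p(i) becomes a
    # candidate segmentation point.
    n = len(pts)
    a, b = pts[(i - eps) % n], pts[(i + eps) % n]
    ts = np.linspace(0.0, 1.0, R)
    samples = [(1 - t) * a + t * b for t in ts]
    return all(mask[int(round(y)), int(round(x))] == 0 for x, y in samples)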
4. A method of shape recognition according to claim 3, wherein: in the third step, the step of adjusting the curvature screening threshold Th and obtaining the shape division points is as follows:
(1) For all the candidate segmentation points P(j) obtained in the second step, take the average approximate offset curvature value as the initial threshold Th_0:
Th_0 = (1/J) Σ_{j=1}^{J} cur~(P(j)),
where J is the total number of candidate segmentation points;
(2) For the threshold Th_τ at the τ-th adjustment, based on the approximate offset curvature value of each candidate segmentation point P(j) and Th_τ, the points P(j) can be divided into two classes: candidate segmentation points whose approximate offset curvature value is greater than Th_τ, and candidate segmentation points whose approximate offset curvature value is less than or equal to Th_τ; calculate and record the division degree D_τ under the current threshold, which is computed from the positive and negative curvature deviations of each candidate segmentation point P(j) under threshold Th_τ, the minimum value of the positive curvature deviations of all candidate segmentation points, and the maximum value of the negative curvature deviations of all candidate segmentation points;
determine whether any candidate segmentation point has an approximate offset curvature value greater than the threshold Th_τ; if not, no further adjustment is performed and the procedure goes to step (4); if such candidate segmentation points exist, go to step (3) to continue adjusting the threshold;
(3) Continue adjusting the threshold; the new threshold Th_{τ+1} is the minimum of the positive curvature deviations of all candidate segmentation points from the previous threshold adjustment;
according to threshold Th_{τ+1}, calculate the positive and negative curvature deviations of each candidate segmentation point under the (τ+1)-th adjustment and the division degree D_{τ+1}, and record them; determine whether any candidate segmentation point has an approximate offset curvature value greater than the threshold Th_{τ+1}; if not, no further adjustment is performed and the procedure goes to step (4); if such candidate segmentation points exist, let τ = τ+1, repeat the current step and continue adjusting the threshold;
(4) The multiple threshold adjustments yield multiple division degrees; the threshold corresponding to the maximum division degree is the final curvature screening threshold Th, and the points whose approximate offset curvature value is smaller than the threshold Th are the final shape segmentation points.
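A sketch of the threshold-adjustment loop of claim 4; because the exact formula for the division degree is not reproduced here, it is passed in as a caller-supplied function, and the example scoring rule below is only an assumption:

def adjust_threshold(curvatures, division_degree):
    # `curvatures` are the approximate offset curvature values of the candidate
    # segmentation points; `division_degree(curvatures, th)` scores a threshold.
    th = sum(curvatures) / len(curvatures)          # initial threshold Th_0 (mean value)
    history = []
    while True:
        history.append((th, division_degree(curvatures, th)))
        above = [c for c in curvatures if c > th]
        if not above:                               # no curvature exceeds Th_tau: stop
            break
        # new threshold: smallest curvature above the current threshold
        # (current threshold plus the minimum positive deviation)
        th = min(above)
    best_th, _ = max(history, key=lambda x: x[1])   # threshold with maximum division degree
    return [c for c in curvatures if c < best_th]   # curvatures of the final segmentation points

# example usage with a placeholder division-degree function (assumption):
def gap_degree(cs, th):
    above = [c for c in cs if c > th]
    below = [c for c in cs if c <= th]
    return (min(above) if above else max(cs)) - (max(below) if below else min(cs))

print(adjust_threshold([0.3, 0.5, 0.9, 1.4, 1.8], gap_degree))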
5. A method of shape recognition according to claim 4, wherein: in the fourth step, the specific method of segmenting the shape into a plurality of sub-shape parts with minimum segmentation cost, based on the principle that segmentation line segments lie inside the shape and do not intersect each other, is as follows:
(1) For any two shape segmentation points P(e_1), P(e_2), sample the segment P(e_1)P(e_2) equidistantly to obtain C discrete points; if any of the C discrete points has a pixel value of 0, part of the segment P(e_1)P(e_2) lies outside the shape contour, and the segment is not selected as a segmentation line segment;
(2) For any two shape segmentation points P(e_3), P(e_4), if there is an existing segmentation line segment P(e_5)P(e_6) such that the segment P(e_3)P(e_4) intersects the existing segment P(e_5)P(e_6), the segment P(e_3)P(e_4) is not selected as a segmentation line segment;
(3) The set of segmentation line segments satisfying the above two principles is further screened, and segmentation with the minimum segmentation cost is achieved by defining three measurement indexes for evaluating segmentation quality:
where D*(u,v), L*(u,v), S*(u,v) are the three segmentation measurement indexes of normalized segmentation length, segmentation arc length and segmentation residual area, respectively; u, v are the sequence numbers of any two shape segmentation points among the total number of segmentation points;
for any segmentation line segment P(u)P(v), the three segmentation evaluation indexes are calculated as follows:
where D_max is the length of the longest segment among all segmentation line segments; D*(u,v) ranges between 0 and 1, and the smaller the value, the better the segmentation effect;
where the length of the corresponding contour curve segment is used; L*(u,v) ranges between 0 and 1, and the smaller the value, the better the segmentation effect;
where S_d is the area of the shape cut off by the segmentation line segment P(u)P(v), i.e. the area of the closed region formed by the segment P(u)P(v) and the contour curve; S*(u,v) ranges between 0 and 1, and the smaller the value, the better the segmentation effect;
according to the above steps, the segmentation cost of the segmentation line segment P(u)P(v) is calculated as:
Cost = αD*(u,v) + βL*(u,v) + γS*(u,v),
where α, β and γ are the weights of the respective measurement indexes;
calculate the segmentation cost of the segmentation line segments in the screened set of segmentation line segments; sort all the calculated costs from small to large, and finally select the N−1 segmentation line segments with the smallest cost according to the number N of sub-shape parts set for the category to which shape sample S belongs, thereby achieving optimal segmentation and obtaining N sub-shape parts; the number N of sub-shape parts depends on the category of the current shape sample S, and for shapes of different categories the corresponding number of sub-shape parts is set manually.
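A small sketch of the cost-based selection in claim 5; the three normalized indexes are assumed to be precomputed, and the weights are illustrative:

def segmentation_cost(d_star, l_star, s_star, alpha=1.0, beta=1.0, gamma=1.0):
    # Cost = alpha*D* + beta*L* + gamma*S*, with each index already in [0, 1]
    return alpha * d_star + beta * l_star + gamma * s_star

def select_segments(candidates, n_parts):
    # Keep the N-1 cheapest segmentation line segments for a shape whose
    # category prescribes N sub-shape parts; `candidates` is a list of
    # (segment, d_star, l_star, s_star) tuples.
    scored = [(segmentation_cost(d, l, s), seg) for seg, d, l, s in candidates]
    scored.sort(key=lambda x: x[0])
    return [seg for _, seg in scored[:n_parts - 1]]

print(select_segments([("s1", 0.2, 0.3, 0.1), ("s2", 0.9, 0.8, 0.7), ("s3", 0.1, 0.2, 0.2)], n_parts=3))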
6. A method of shape recognition according to claim 5, wherein: in the fifth step, the specific method for constructing the topological structure of the shape sample is as follows: for the N sub-shape parts obtained by segmenting any shape sample S, the central shape part is designated as the starting vertex v_1, and the remaining adjacent shape parts are ordered clockwise and recorded as vertices {v_o | o ∈ [2, N]}; the edges (v_1, v_o) connecting v_1 to the other vertices v_o are recorded, thus forming a shape directed graph satisfying the topological order:
G_1 = (V_1, E_1),
where V_1 = {v_o | o ∈ [1, N]}, E_1 = {(v_1, v_o) | o ∈ [2, N]};
after all training shape samples are optimally segmented, the maximum number of sub-shape parts obtained by segmentation among the training shape samples is recorded; for any shape sample S, its adjacency matrix is calculated as a square real matrix of that order, constructed from the edges of the shape directed graph of S, so that all samples share an adjacency matrix of the same size.
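An illustrative sketch of the adjacency matrix of claim 6, under the assumption that the shape directed graph is the star described above (edges from the central part v_1 to every other part) and that the matrix is zero-padded to the maximum sub-shape count recorded over the training set:

import numpy as np

def adjacency_matrix(n_parts, max_parts):
    # Row 0 carries the out-edges of the starting vertex v_1; the matrix is
    # zero-padded to `max_parts` so every sample has the same size (assumption).
    A = np.zeros((max_parts, max_parts), dtype=np.float32)
    A[0, 1:n_parts] = 1.0   # directed edges from the central part to the others
    return A

print(adjacency_matrix(n_parts=3, max_parts=10))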
7. A method of shape recognition according to claim 6, wherein: in the sixth step, a full-scale visual representation method of the shape is used, and the specific method for obtaining the color feature expression image of the corresponding sub-shape part is as follows:
for any shape sample S, consider a sub-shape part S_1 of S, whose contour is given by its sampling points:
S_1 = {(p_x(i), p_y(i)) | i ∈ [1, n_1]},
where p_x(i), p_y(i) are the horizontal and vertical coordinates of the contour sampling point p_1(i) of the sub-shape part in the two-dimensional plane, and n_1 is the contour length, i.e. the number of contour sampling points;
first, the contour of the sub-shape part S_1 is described using a characteristic function M composed of three shape descriptors:
M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]},
where s_k, l_k, c_k are the three invariant parameters of normalized area s, arc length l and center-of-gravity distance c at scale k, k is the scale label, and m is the total number of scales; the three shape invariant descriptors are defined as follows:
with contour sampling point p_1(i) as the center and an initial radius, draw a preset circle C_1(i); the preset circle is the initial semi-global scale used to calculate the parameters of the corresponding contour point; after obtaining the preset circle C_1(i) as above, the three shape descriptors at scale k = 1 are calculated as follows:
when calculating the s_1(i) descriptor, the area of the region Z_1(i) inside the preset circle C_1(i) that has a direct connection relation with the target contour point p_1(i) is computed; then:
where B(Z_1(i), z) is an indicator function that equals 1 when a pixel z belongs to Z_1(i) and 0 otherwise;
the ratio of the area of Z_1(i) to the area of the preset circle C_1(i) is taken as the area parameter s_1(i) of the target contour point descriptor:
the value of s_1(i) lies in the range 0 to 1;
when calculating the c_1(i) descriptor, first compute the center of gravity of the region directly connected to the target contour point p_1(i); specifically, the coordinates of all pixels in the region are averaged, and the result is the coordinate of the region's center of gravity, which can be expressed as:
where w_1(i) is the center of gravity of the region;
then calculate the distance between the target contour point p_1(i) and the center of gravity w_1(i), which can be expressed as:
finally, the ratio of this distance to the radius of the preset circle C_1(i) of the target contour point p_1(i) is taken as the center-of-gravity distance parameter c_1(i) of the target contour point descriptor:
the value of c_1(i) lies in the range 0 to 1;
when calculating the l_1(i) descriptor, the length of the arc segment inside the preset circle C_1(i) that is directly connected to the target contour point p_1(i) is recorded, and the ratio of this arc length to the circumference of the preset circle C_1(i) is taken as the arc length parameter l_1(i) of the target contour point descriptor:
the value of l_1(i) lies in the range 0 to 1;
according to the above steps, the characteristic function M_1 of the sub-shape part S_1 of shape sample S at the semi-global scale with scale label k = 1 and the initial radius is calculated:
M_1 = {s_1(i), l_1(i), c_1(i) | i ∈ [1, n_1]},
Since the digital image takes one pixel as the minimum unit, a single pixel is selected as a continuous scale change interval in a full scale space; that is, for the kth scale tag, circle C is set k Radius r of (2) k
I.e. at an initial scale k=1,thereafter radius r k Uniformly reducing m-1 times by taking one pixel as a unit until the minimum dimension k=m; according to a characteristic function M at a calculated scale k=1 1 In the way (1), the characteristic function under other scales is calculated, and finally the sub-shape part S of the shape sample S under all scales is obtained 1 Is a characteristic function of (a):
M={s k (i),l k (i),c k (i)|k∈[1,m],i∈[1,n 1 ]},
the characteristic functions at each scale are stored in matrices S_M, L_M and C_M respectively: S_M stores s_k(i), and the element in row k, column i of S_M is the area parameter s_k(i) of point p_1(i) at the k-th scale; L_M stores l_k(i), and the element in row k, column i of L_M is the arc length parameter l_k(i) of point p_1(i) at the k-th scale; C_M stores c_k(i), and the element in row k, column i of C_M is the center-of-gravity distance parameter c_k(i) of point p_1(i) at the k-th scale; S_M, L_M and C_M finally serve as gray-scale representations of the three shape features of the sub-shape part S_1 of shape sample S in the full-scale space:
GM_1 = {S_M, L_M, C_M},
where S_M, L_M and C_M are all matrices of size m × n_1, each representing one gray-scale image;
the three gray-scale images of the sub-shape part S_1 are then used as the three RGB channels to obtain a color image, which serves as the feature expression image of the sub-shape part S_1.
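A simplified sketch of the full-scale representation of claim 7; it assumes a binary mask of the sub-shape part and its contour points, approximates the "directly connected" region by the whole intersection of each preset circle with the shape, and approximates the arc length by counting contour points inside the circle, so it illustrates the structure (m scales × n_1 points × 3 descriptors) rather than the exact descriptors:

import numpy as np

def full_scale_image(mask, contour, r1, m):
    # mask: binary array (1 inside the sub-shape, 0 outside); contour: list of (x, y)
    H, W = mask.shape
    ys, xs = np.mgrid[0:H, 0:W]
    n = len(contour)
    S = np.zeros((m, n)); L = np.zeros((m, n)); C = np.zeros((m, n))
    for k in range(m):
        r = max(r1 - k, 1)                          # radius shrinks by one pixel per scale
        for i, (px, py) in enumerate(contour):
            inside_circle = (xs - px) ** 2 + (ys - py) ** 2 <= r ** 2
            region = inside_circle & (mask > 0)     # simplified "connected" region
            area = region.sum()
            S[k, i] = area / inside_circle.sum()    # normalized area descriptor
            if area > 0:
                cy, cx = ys[region].mean(), xs[region].mean()
                C[k, i] = np.hypot(cx - px, cy - py) / r   # normalized centroid distance
            on_arc = [(qx, qy) for qx, qy in contour
                      if (qx - px) ** 2 + (qy - py) ** 2 <= r ** 2]
            L[k, i] = len(on_arc) / (2 * np.pi * r)        # rough arc-length ratio
    return np.stack([S, L, C], axis=-1)             # m x n_1 x 3 "color" feature image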
8. A method of shape recognition according to claim 7, wherein: in step seven, the feature expression images of all sub-shape parts of all training shape samples are input into a convolutional neural network, and the convolutional neural network model is trained; different sub-shape parts of each shape class have different class labels; after the convolutional neural network is trained to convergence, for any shape sample S, the N feature expression images {T_num | num ∈ [1, N]} of the sub-shape parts obtained by segmenting S are respectively input into the trained convolutional neural network, and the output of the second fully connected layer of the network is the feature vector of the corresponding sub-shape part, where Vec is the number of neurons in the second fully connected layer;
the structure of the convolutional neural network comprises an input layer, a pre-training layer and a fully connected module; the pre-training layer consists of the first 4 modules of the VGG16 network model, the parameters obtained by training these 4 modules on an image dataset are used as initialization parameters, and three fully connected layers follow the pre-training layer;
the 1st module in the pre-training layer specifically comprises 2 convolution layers and 1 max-pooling layer, where the number of convolution kernels is 64, the convolution kernel size is 3 × 3, and the pooling size is 2 × 2; the 2nd module comprises 2 convolution layers and 1 max-pooling layer, with 128 convolution kernels of size 3 × 3 and pooling size 2 × 2; the 3rd module comprises 3 convolution layers and 1 max-pooling layer, with 256 convolution kernels of size 3 × 3 and pooling size 2 × 2; the 4th module comprises 3 convolution layers and 1 max-pooling layer, with 512 convolution kernels of size 3 × 3 and pooling size 2 × 2; the calculation formula of each convolution layer is:
C_O = φ_relu(W_C · C_I + θ_C),
where θ_C is the bias vector of the convolution layer; W_C is the weight of the convolution layer; C_I is the input of the convolution layer; C_O is the output of the convolution layer;
the fully connected module specifically comprises 3 fully connected layers, where the 1st fully connected layer has 512 nodes, the 2nd fully connected layer has Vec nodes, and the 3rd fully connected layer has N_T nodes; N_T is the sum of the numbers of segmented sub-shape parts over all shape classes; the calculation formula of the first 2 fully connected layers is:
F_O = φ_tanh(W_F · F_I + θ_F),
where φ_tanh is the tanh activation function, θ_F is the bias vector of the fully connected layer; W_F is the weight of the fully connected layer; F_I is the input of the fully connected layer; F_O is the output of the fully connected layer;
the last fully connected layer is the output layer, and its output is calculated as:
Y_O = φ_softmax(W_Y · Y_I + θ_Y),
where φ_softmax is the softmax activation function, θ_Y is the bias vector of the output layer, each neuron of the output layer represents a corresponding sub-shape part category, W_Y is the weight of the output layer, Y_I is the input of the output layer, and Y_O is the output of the output layer.
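For orientation only, a PyTorch sketch of the CNN of claim 8; the 224×224 input size, Vec = 256, the number of output classes, and the use of torchvision's pretrained VGG16 weights for the first four blocks are all assumptions:

import torch
import torch.nn as nn
from torchvision import models

class SubShapeCNN(nn.Module):
    # First 4 VGG16 blocks as a pre-trained feature extractor, followed by
    # 3 fully connected layers (512, Vec, N_T nodes).
    def __init__(self, vec=256, n_classes=100):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.pretrained = vgg.features[:24]          # blocks 1-4 of VGG16
        self.fc1 = nn.Linear(512 * 14 * 14, 512)
        self.fc2 = nn.Linear(512, vec)               # its output is the sub-shape feature vector
        self.fc3 = nn.Linear(vec, n_classes)

    def forward(self, x):
        x = self.pretrained(x).flatten(1)
        x = torch.tanh(self.fc1(x))
        feat = torch.tanh(self.fc2(x))               # feature vector of the sub-shape part
        return self.fc3(feat), feat                  # logits (softmax applied in the loss) and features

model = SubShapeCNN()
logits, feat = model(torch.randn(1, 3, 224, 224))
print(feat.shape)    # torch.Size([1, 256])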
9. A method of shape recognition according to claim 8, wherein: the specific method for constructing the feature matrix of the shape sample in the step eight is as follows:
for any shape sample S, the N sub-shape parts formed by segmenting S are expressed by the corresponding feature matrix F, calculated as follows:
where F_a represents the a-th row vector of matrix F; F_a is the feature vector of the a-th sub-shape part output in step seven, and the remaining row vectors are zero vectors of dimension Vec.
10. A method of shape recognition according to claim 9, wherein: in step nine, the structure of the graph convolutional neural network is constructed, comprising a preprocessing input layer, a hidden layer and a classification output layer; in the preprocessing input layer the adjacency matrix is normalized, where I_N is the identity matrix and the normalization uses the corresponding degree matrix, yielding the normalized adjacency matrix that is input to the hidden layer;
the hidden layer comprises 2 graph convolution layers, and the calculation formula of each graph convolution layer is:
where H_W is the weight of the graph convolution layer; H_I is the input of the graph convolution layer, and the input of the layer-1 graph convolution layer is the feature matrix of the shape sample; H_O is the output of the graph convolution layer;
the calculation formula of the classification output layer is:
where φ_softmax is the softmax activation function, G_I is the input of the output layer, i.e. the output of the second graph convolution layer, G_W is the weight of the output layer, and G_O is the output of the output layer; each neuron of the output layer represents a corresponding shape class.
11. A method of shape recognition according to claim 10, wherein: the specific method for realizing contour shape classification and recognition in step ten is as follows: train the graph convolutional neural network model until convergence; for any test shape sample, first extract the key points of the shape contour, calculate the curvature values of the key points, judge their convexity and obtain the candidate segmentation points, and then adjust the curvature screening threshold to obtain the shape segmentation points; obtain the set of segmentation line segments according to the two principles of steps (1) and (2), and calculate the segmentation cost of the segmentation line segments in the set; if the number of segmentation line segments is smaller than the set number, all segmentation line segments are used to segment the shape; otherwise, the shape is segmented according to that number of segmentation line segments with the minimum segmentation cost; calculate the color feature expression image of each sub-shape part and input it into the trained convolutional neural network, taking the output of the second fully connected layer of the convolutional neural network as the feature vector of the sub-shape part; construct the shape directed graph of the test shape sample, calculate its adjacency matrix and feature matrix, input them into the trained graph convolutional neural network model, and judge the shape class of the test sample as the class corresponding to the maximum value in the output vector, thereby realizing shape classification and recognition.
CN202110418108.2A 2021-04-19 2021-04-19 Shape recognition method Active CN113191361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110418108.2A CN113191361B (en) 2021-04-19 2021-04-19 Shape recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110418108.2A CN113191361B (en) 2021-04-19 2021-04-19 Shape recognition method

Publications (2)

Publication Number Publication Date
CN113191361A CN113191361A (en) 2021-07-30
CN113191361B true CN113191361B (en) 2023-08-01

Family

ID=76977535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110418108.2A Active CN113191361B (en) 2021-04-19 2021-04-19 Shape recognition method

Country Status (1)

Country Link
CN (1) CN113191361B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392819B (en) * 2021-08-17 2022-03-08 北京航空航天大学 Batch academic image automatic segmentation and labeling device and method
CN116486265B (en) * 2023-04-26 2023-12-19 北京卫星信息工程研究所 Airplane fine granularity identification method based on target segmentation and graph classification

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834922A (en) * 2015-05-27 2015-08-12 电子科技大学 Hybrid neural network-based gesture recognition method
CN106934419A (en) * 2017-03-09 2017-07-07 西安电子科技大学 Classification of Polarimetric SAR Image method based on plural profile ripple convolutional neural networks
CN108139334A (en) * 2015-08-28 2018-06-08 株式会社佐竹 Has the device of optical unit
WO2020199468A1 (en) * 2019-04-04 2020-10-08 平安科技(深圳)有限公司 Image classification method and device, and computer readable storage medium
CN111898621A (en) * 2020-08-05 2020-11-06 苏州大学 Outline shape recognition method
CN112464942A (en) * 2020-10-27 2021-03-09 南京理工大学 Computer vision-based overlapped tobacco leaf intelligent grading method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Chord-angle feature description for occluded shape matching; Yang Jianyu et al.; Optics and Precision Engineering; Vol. 23, No. 06; 1758-1767 *
3D palmprint recognition fusing local features and deep learning; Yang Bing et al.; Journal of Zhejiang University (Engineering Science); Vol. 54, No. 03; 540-545 *

Also Published As

Publication number Publication date
CN113191361A (en) 2021-07-30

Similar Documents

Publication Publication Date Title
CN106951825B (en) Face image quality evaluation system and implementation method
CN107066559B (en) Three-dimensional model retrieval method based on deep learning
CN111898621B (en) Contour shape recognition method
CN109190566B (en) Finger vein recognition method integrating local coding and CNN model
CN114220124A (en) Near-infrared-visible light cross-modal double-flow pedestrian re-identification method and system
WO2018052587A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
CN107633226B (en) Human body motion tracking feature processing method
CN104408469A (en) Firework identification method and firework identification system based on deep learning of image
CN113191361B (en) Shape recognition method
CN107169117B (en) Hand-drawn human motion retrieval method based on automatic encoder and DTW
CN107451594B (en) Multi-view gait classification method based on multiple regression
CN111783748A (en) Face recognition method and device, electronic equipment and storage medium
CN109840518B (en) Visual tracking method combining classification and domain adaptation
CN111968124B (en) Shoulder musculoskeletal ultrasonic structure segmentation method based on semi-supervised semantic segmentation
CN111539320B (en) Multi-view gait recognition method and system based on mutual learning network strategy
Lin et al. Determination of the varieties of rice kernels based on machine vision and deep learning technology
CN112488128A (en) Bezier curve-based detection method for any distorted image line segment
CN113128518B (en) Sift mismatch detection method based on twin convolution network and feature mixing
CN112329662B (en) Multi-view saliency estimation method based on unsupervised learning
CN109190505A (en) The image-recognizing method that view-based access control model understands
CN112418262A (en) Vehicle re-identification method, client and system
CN108765384B (en) Significance detection method for joint manifold sequencing and improved convex hull
CN116681742A (en) Visible light and infrared thermal imaging image registration method based on graph neural network
CN112560824B (en) Facial expression recognition method based on multi-feature adaptive fusion
CN109165586A (en) intelligent image processing method for AI chip

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant