Disclosure of Invention
To solve the above technical problems, an object of the present invention is to provide a groove characteristic parameter processing method based on hierarchical clustering analysis, which post-processes the characteristic parameter data detected by an instrument and selectively screens out singular-point data so as to achieve accurate measurement.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
a groove characteristic parameter processing method based on hierarchical clustering analysis, comprising the following steps:
step 1), extracting the edge contour of the image, and calculating the characteristic parameters of the groove;
step 2), comparing the obtained number of grooves $N$ with the standard value $N_{\mathrm{standard}}$ to judge whether the calculated number of grooves is accurate, and if not, performing hierarchical clustering screening on the depth, area and bottom width data of all holes;
step 2.1), for $X = \{x_1, x_2, x_3, \ldots, x_n\}$, performing Euclidean distance calculation and creating a Euclidean distance matrix table, wherein $x_i$ is the characteristic parameter vector of a single hole, $x_i = \{d_i, s_i, w_i\}$; $d$ is the groove depth, i.e. the vertical distance from the top of the groove hole to the bottom of the groove; $s$ is the groove area ratio, i.e. the ratio of the area of a single groove hole to the area of the whole groove; and $w$ is the width of the bottom of the groove hole, as shown in fig. 2; the Euclidean distance calculation formula is:
$$D(x_i, x_j) = \sqrt{(d_i - d_j)^2 + (s_i - s_j)^2 + (w_i - w_j)^2}$$
Euclidean distance matrix table
step 2.2), finding the two elements $x_i$ and $x_j$ with the minimum distance in the established Euclidean distance matrix table, merging them into a class $C(x_i, x_j)$, replacing $x_i$ and $x_j$ with $C(x_i, x_j)$ and substituting it back into the matrix table to form an iterative computation matrix table, calculating the spatial distance according to the inter-class average distance method, and iterating until all elements are grouped into one class, thereby generating a pedigree tree as shown in fig. 3;
Iterative computation matrix table
step 2.3), clustering the data points in the pedigree tree layer by layer from bottom to top until a complete set consisting of two major classes, C1 and C2, is finally formed, C2 being the class with fewer elements; counting the number of elements $|C2|$; if $|C2|$ is equal to the difference between $N$ and $N_{\mathrm{standard}}$, screening out all elements of C2; if $|C2|$ is greater than the difference between $N$ and $N_{\mathrm{standard}}$, calculating in turn the spatial distance between each element in C2 and the center of gravity G of C1, and screening out the A elements with the largest spatial distance, i.e. the lowest similarity, wherein A is the difference between $N$ and $N_{\mathrm{standard}}$.
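Preferably, the inter-class average distance is defined as the average of the Euclidean distances between all pairs of elements taken one from each class, and is calculated as:
$$D(C1, C2) = \frac{1}{|C1| \cdot |C2|} \sum_{x_i \in C1} \sum_{x_j \in C2} D(x_i, x_j)$$
wherein $|C1|$ and $|C2|$ denote the numbers of elements in C1 and C2, and $D(x_i, x_j)$ is the Euclidean distance between elements $x_i$ and $x_j$.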
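The inter-class average distance described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `average_class_distance` and the sample vectors are assumptions, not part of the original disclosure, and each vector is taken to be (depth, area ratio, bottom width).

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def average_class_distance(c1, c2):
    """Mean Euclidean distance over every pair with one element from each class."""
    total = sum(dist(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


# Hypothetical feature vectors (depth, area ratio, bottom width), not measured data.
class_a = [(0.0, 0.0, 0.0), (6.0, 8.0, 0.0)]
class_b = [(3.0, 4.0, 0.0)]

# Both pairwise distances are 5, so the average is 5 as well.
print(average_class_distance(class_a, class_b))
```

When both classes contain a single element, the value reduces to the plain Euclidean distance, so the same function covers the initial matrix table and every later iteration.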
Preferably, the center of gravity G of C1 is calculated as the mean of the characteristic parameter vectors in C1:
$$G = \frac{1}{|C1|} \sum_{x_i \in C1} x_i$$
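The screening rule of step 2.3 can be illustrated with the following minimal sketch. The names `centroid` and `screen_out` and the sample vectors are hypothetical. The text does not say what happens when $|C2|$ is smaller than the difference between $N$ and $N_{\mathrm{standard}}$, or when $N$ does not exceed $N_{\mathrm{standard}}$; returning an empty list in those cases is an assumption made here.

```python
from math import dist


def centroid(points):
    """Center of gravity G: component-wise mean of the vectors."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def screen_out(c1, c2, n_detected, n_standard):
    """Return the elements of the smaller class C2 that should be discarded."""
    surplus = n_detected - n_standard          # "A" in the text
    if surplus <= 0:
        return []                              # assumption: nothing to remove
    if len(c2) == surplus:
        return list(c2)                        # the whole class is interference
    if len(c2) > surplus:
        g = centroid(c1)
        ranked = sorted(c2, key=lambda x: dist(x, g), reverse=True)
        return ranked[:surplus]                # the A elements farthest from G
    return []                                  # assumption: case not covered by the text


# Hypothetical feature vectors (depth, area ratio, bottom width).
c1 = [(1.0, 1.0, 1.0), (1.2, 1.0, 1.0), (0.8, 1.0, 1.0)]
c2 = [(2.0, 1.0, 1.0), (5.0, 1.0, 1.0)]

print(screen_out(c1, c2, n_detected=5, n_standard=4))  # one surplus hole
print(screen_out(c1, c2, n_detected=6, n_standard=4))  # two surplus holes
```

With one surplus hole only the element farthest from G is returned; with two surplus holes the whole of C2 is returned.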
Compared with the prior art, the invention has the following beneficial effects:
Based on hierarchical clustering analysis, the method performs post-optimization processing on the groove characteristic parameters obtained by image processing, effectively classifies the data set, and accurately screens out interference data from the resulting classes, thereby ensuring the accuracy of the characteristic parameters. In practical application, the false detection rate of the method is as low as 0.7%, compared with 4.64% before the improvement, a reduction of 84.84%; the detection accuracy is high and the method performs well in application. In actual production, if the characteristic parameter vector is extended with additional analysis factors for higher-dimensional analysis, the accuracy of the method can be further improved.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, elements, and/or combinations thereof.
Further, in the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless explicitly defined otherwise.
In the present invention, unless otherwise expressly specified or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the present invention, unless otherwise expressly stated or limited, a first feature being "above" or "below" a second feature may mean that the first and second features are in direct contact, or that the first and second features are not in direct contact but are in contact via another feature therebetween. Also, the first feature being "on," "above" and "over" the second feature includes the first feature being directly above or obliquely above the second feature, or merely indicates that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature includes the first feature being directly below or obliquely below the second feature, or simply means that the first feature is at a lower level than the second feature.
The invention will be further illustrated with reference to the following examples and drawings:
As shown in fig. 1, a groove characteristic parameter processing method based on hierarchical clustering analysis includes the following steps:
step 1), extracting the edge contour of the image, and calculating the characteristic parameters of the groove;
step 2), comparing the obtained number of grooves $N$ with the standard value $N_{\mathrm{standard}}$ to judge whether the calculated number of grooves is accurate, and if not, performing hierarchical clustering screening on the depth, area and bottom width data of all holes;
step 2.1), for $X = \{x_1, x_2, x_3, \ldots, x_n\}$, performing Euclidean distance calculation and creating a Euclidean distance matrix table, wherein $x_i$ is the characteristic parameter vector of a single hole, $x_i = \{d_i, s_i, w_i\}$; $d$ is the groove depth, i.e. the vertical distance from the top of the groove hole to the bottom of the groove; $s$ is the groove area ratio, i.e. the ratio of the area of a single groove hole to the area of the whole groove; and $w$ is the width of the bottom of the groove hole; the Euclidean distance calculation formula is:
$$D(x_i, x_j) = \sqrt{(d_i - d_j)^2 + (s_i - s_j)^2 + (w_i - w_j)^2}$$
step 2.2), finding the two elements $x_i$ and $x_j$ with the minimum distance in the established Euclidean distance matrix table, merging them into a class $C(x_i, x_j)$, replacing $x_i$ and $x_j$ with $C(x_i, x_j)$ and substituting it back into the matrix table to form an iterative computation matrix table, calculating the spatial distance according to the inter-class average distance method, and iterating until all elements are grouped into one class, thereby generating a pedigree tree; the inter-class average distance is defined as the average of the Euclidean distances between all pairs of elements taken one from each class, and is calculated as:
$$D(C1, C2) = \frac{1}{|C1| \cdot |C2|} \sum_{x_i \in C1} \sum_{x_j \in C2} D(x_i, x_j)$$
wherein $|C1|$ and $|C2|$ denote the numbers of elements in C1 and C2, and $D(x_i, x_j)$ is the Euclidean distance between elements $x_i$ and $x_j$.
step 2.3), clustering the data points in the pedigree tree layer by layer from bottom to top until a complete set consisting of two major classes, C1 and C2, is finally formed, C2 being the class with fewer elements; counting the number of elements $|C2|$; if $|C2|$ is equal to the difference between $N$ and $N_{\mathrm{standard}}$, screening out all elements of C2; if $|C2|$ is greater than the difference between $N$ and $N_{\mathrm{standard}}$, calculating in turn the spatial distance between each element in C2 and the center of gravity G of C1, and screening out the A elements with the largest spatial distance, i.e. the lowest similarity, wherein A is the difference between $N$ and $N_{\mathrm{standard}}$; the center of gravity G of C1 is calculated as the mean of the characteristic parameter vectors in C1:
$$G = \frac{1}{|C1|} \sum_{x_i \in C1} x_i$$
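Steps 2.1 and 2.2 can be sketched as follows, starting from raw characteristic parameter vectors. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the names `distance_table`, `merge_closest` and `cluster_into_two` and the sample vectors are hypothetical, and the table is simply recomputed from the raw vectors after every merge.

```python
from math import dist


def distance_table(clusters):
    """Inter-class average distance for every pair of current clusters.

    While every cluster holds one vector this is the Euclidean distance
    matrix table of step 2.1; afterwards it is the iterative table of step 2.2.
    """
    def avg(c1, c2):
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    return {(i, j): avg(clusters[i], clusters[j])
            for i in range(len(clusters)) for j in range(i)}


def merge_closest(clusters):
    """One iteration of step 2.2: merge the two clusters with minimum distance."""
    table = distance_table(clusters)
    i, j = min(table, key=table.get)
    merged = clusters[j] + clusters[i]
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return rest + [merged]


def cluster_into_two(vectors):
    """Iterate until two classes remain; return (C1, C2) with C2 the smaller."""
    clusters = [[v] for v in vectors]
    while len(clusters) > 2:
        clusters = merge_closest(clusters)
    return sorted(clusters, key=len, reverse=True)


# Hypothetical (depth, area ratio, bottom width) vectors: four similar holes
# and one hole that differs strongly from the rest.
holes = [(1.00, 0.060, 0.50), (1.02, 0.061, 0.51), (0.98, 0.059, 0.49),
         (1.01, 0.060, 0.50), (0.40, 0.020, 0.90)]
c1, c2 = cluster_into_two(holes)
print(len(c1), c2)
```

Stopping at two classes corresponds to cutting the pedigree tree just below its root, which is the C1/C2 split used in step 2.3.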
The method of the present invention is explained in detail below by means of a specific example.
As can be seen from fig. 4, the calculated number of grooves is 17, whereas the theoretical design number of grooves is 16. The calculated value is larger than the design standard value, so it can be concluded that the image calculation result contains a deviation. An irregular dark area in the upper-left corner of the image, between marks 15 and 17, has been marked as a groove hole. Closer analysis shows that this is caused by uneven tow filling, a typical defect, which leads to abnormal groove characteristic parameters being calculated; the abnormal data point is marked with a box.
Step 1: The characteristic parameter data of the whole filter stick are calculated as follows:
Sample characteristic parameter data table
Step 2: The number of data points, 17, is larger than the design standard value of 16, so the data are abnormal; a Euclidean distance matrix table is built from the above table:
Sample Euclidean distance matrix table
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0 | | | | | | | | | | | | | | | | |
| 2 | 0.234 | 0 | | | | | | | | | | | | | | | |
| 3 | 0.285 | 0.057 | 0 | | | | | | | | | | | | | | |
| 4 | 0.241 | 0.032 | 0.075 | 0 | | | | | | | | | | | | | |
| 5 | 0.320 | 0.096 | 0.042 | 0.112 | 0 | | | | | | | | | | | | |
| 6 | 0.181 | 0.066 | 0.107 | 0.088 | 0.139 | 0 | | | | | | | | | | | |
| 7 | 0.297 | 0.073 | 0.036 | 0.086 | 0.033 | 0.119 | 0 | | | | | | | | | | |
| 8 | 0.321 | 0.090 | 0.045 | 0.096 | 0.032 | 0.144 | 0.030 | 0 | | | | | | | | | |
| 9 | 0.304 | 0.071 | 0.024 | 0.081 | 0.042 | 0.128 | 0.041 | 0.032 | 0 | | | | | | | | |
| 10 | 0.268 | 0.051 | 0.052 | 0.058 | 0.070 | 0.096 | 0.037 | 0.059 | 0.061 | 0 | | | | | | | |
| 11 | 0.326 | 0.102 | 0.046 | 0.117 | 0.010 | 0.145 | 0.042 | 0.036 | 0.044 | 0.079 | 0 | | | | | | |
| 12 | 0.112 | 0.312 | 0.355 | 0.325 | 0.382 | 0.249 | 0.361 | 0.389 | 0.377 | 0.336 | 0.388 | 0 | | | | | |
| 13 | 0.100 | 0.211 | 0.250 | 0.227 | 0.275 | 0.147 | 0.255 | 0.283 | 0.273 | 0.232 | 0.282 | 0.108 | 0 | | | | |
| 14 | 0.092 | 0.233 | 0.273 | 0.248 | 0.298 | 0.169 | 0.279 | 0.307 | 0.295 | 0.255 | 0.305 | 0.084 | 0.024 | 0 | | | |
| 15 | 0.240 | 0.060 | 0.081 | 0.071 | 0.098 | 0.077 | 0.068 | 0.094 | 0.095 | 0.037 | 0.108 | 0.304 | 0.198 | 0.222 | 0 | | |
| 16 | 0.285 | 0.461 | 0.492 | 0.481 | 0.514 | 0.397 | 0.503 | 0.530 | 0.515 | 0.488 | 0.518 | 0.196 | 0.271 | 0.250 | 0.459 | 0 | |
| 17 | 0.223 | 0.152 | 0.155 | 0.182 | 0.169 | 0.108 | 0.167 | 0.191 | 0.179 | 0.167 | 0.171 | 0.252 | 0.157 | 0.175 | 0.152 | 0.351 | 0 |
Step 3: As can be seen from the above table, data points 5 and 11 have the highest similarity, their spatial distance of 0.010 being the minimum. They are merged into the class (5, 11), which replaces data points 5 and 11 and is substituted back into the spatial distance matrix; the operation is iterated to generate the pedigree tree shown in fig. 5.
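The iteration of step 3 can be reproduced directly from the sample Euclidean distance matrix table. The sketch below is illustrative only: `ROWS` and `average_linkage` are hypothetical names, and the size-weighted update used here is one standard way of keeping each table entry equal to the inter-class average distance without returning to the raw characteristic parameters.

```python
# Lower triangle of the sample Euclidean distance matrix table:
# ROWS[k] holds the distances from point k + 2 to points 1 .. k + 1.
ROWS = [
    [0.234],
    [0.285, 0.057],
    [0.241, 0.032, 0.075],
    [0.320, 0.096, 0.042, 0.112],
    [0.181, 0.066, 0.107, 0.088, 0.139],
    [0.297, 0.073, 0.036, 0.086, 0.033, 0.119],
    [0.321, 0.090, 0.045, 0.096, 0.032, 0.144, 0.030],
    [0.304, 0.071, 0.024, 0.081, 0.042, 0.128, 0.041, 0.032],
    [0.268, 0.051, 0.052, 0.058, 0.070, 0.096, 0.037, 0.059, 0.061],
    [0.326, 0.102, 0.046, 0.117, 0.010, 0.145, 0.042, 0.036, 0.044, 0.079],
    [0.112, 0.312, 0.355, 0.325, 0.382, 0.249, 0.361, 0.389, 0.377, 0.336, 0.388],
    [0.100, 0.211, 0.250, 0.227, 0.275, 0.147, 0.255, 0.283, 0.273, 0.232, 0.282, 0.108],
    [0.092, 0.233, 0.273, 0.248, 0.298, 0.169, 0.279, 0.307, 0.295, 0.255, 0.305, 0.084, 0.024],
    [0.240, 0.060, 0.081, 0.071, 0.098, 0.077, 0.068, 0.094, 0.095, 0.037, 0.108, 0.304, 0.198, 0.222],
    [0.285, 0.461, 0.492, 0.481, 0.514, 0.397, 0.503, 0.530, 0.515, 0.488, 0.518, 0.196, 0.271, 0.250, 0.459],
    [0.223, 0.152, 0.155, 0.182, 0.169, 0.108, 0.167, 0.191, 0.179, 0.167, 0.171, 0.252, 0.157, 0.175, 0.152, 0.351],
]


def average_linkage(rows, stop_at=2):
    """Merge the closest pair of classes repeatedly until `stop_at` remain."""
    classes = [frozenset([p]) for p in range(1, len(rows) + 2)]
    table = {frozenset((classes[r], classes[c])): value
             for r, row in enumerate(rows, start=1)
             for c, value in enumerate(row)}
    merges = []
    while len(classes) > stop_at:
        pair = min(table, key=table.get)       # smallest entry in the table
        a, b = pair
        merged = a | b
        merges.append((sorted(merged), table[pair]))
        classes = [c for c in classes if c != a and c != b]
        updated = {k: v for k, v in table.items() if a not in k and b not in k}
        for c in classes:
            # Size-weighted mean equals the average over all element pairs.
            updated[frozenset((merged, c))] = (
                len(a) * table[frozenset((a, c))] + len(b) * table[frozenset((b, c))]
            ) / len(merged)
        table = updated
        classes.append(merged)
    return classes, merges


classes, merges = average_linkage(ROWS)
print(merges[0])                      # first merge: points 5 and 11 at distance 0.01
print(sorted(min(classes, key=len)))  # members of the smaller of the two final classes
```

The list `merges` records each merge together with the distance at which it happened, which is the information a pedigree tree displays.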
Step 4: It can be seen from the pedigree tree that class C2 contains five data points: 1, 12, 13, 14 and 16. Since only one point needs to be culled (17 − 16 = 1), the spatial distance between each data point in class C2 and the center of gravity of class C1 is calculated. The data show that data point 16 is the farthest from class C1 and is therefore the data point to be screened out.
Spatial distance table
| Serial number | Point 1 | Point 12 | Point 13 | Point 14 | Point 16 |
| --- | --- | --- | --- | --- | --- |
| C1 center of gravity G | 0.265 | 0.333 | 0.227 | 0.25 | 0.474 |
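The selection in step 4 amounts to ranking the tabulated distances and keeping the largest ones. A minimal sketch, using the values from the spatial distance table (the variable names are illustrative):

```python
# Distances from the C1 center of gravity G to each point of C2,
# copied from the spatial distance table.
distance_to_g = {1: 0.265, 12: 0.333, 13: 0.227, 14: 0.25, 16: 0.474}

n_detected, n_standard = 17, 16
surplus = n_detected - n_standard      # number of points to cull (A in the method)

# Rank the C2 points from farthest to nearest and keep the first A of them.
to_remove = sorted(distance_to_g, key=distance_to_g.get, reverse=True)[:surplus]
print(to_remove)  # [16]
```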
Step 5: through the above steps, a final processing diagram is obtained, as shown in fig. 6.
Calculation time comparison table
As can be seen from the above table, with the calculation repeated 20 times, the average time consumed per filter stick is 935.35 milliseconds before the hierarchical clustering algorithm is added to the program and 953.55 milliseconds after it is added, an increase of 18.2 milliseconds. The data set handled by the algorithm is small and the amount of calculation is small, so the algorithm has little effect on the computational cost of the whole program.
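The overhead quoted above can be checked with simple arithmetic. The snippet below uses only the two averages stated in the text; the relative overhead is a derived figure that the original does not state.

```python
before_ms = 935.35   # average time per filter stick without hierarchical clustering
after_ms = 953.55    # average time per filter stick with hierarchical clustering

extra_ms = after_ms - before_ms
relative = extra_ms / before_ms

print(f"{extra_ms:.2f} ms extra, {relative:.2%} relative overhead")
```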
The present application provides a groove characteristic parameter processing method based on hierarchical clustering analysis, which aims to reprocess and analyze the extracted characteristic parameter data, screen out suspicious data points, retain the correct data, and improve the authenticity and validity of the data.
Hierarchical clustering analysis is an algorithm that judges the similarity between every pair of data points according to the spatial distance between them (the smaller the spatial distance, the higher the similarity), clusters the two points with the highest similarity, and repeats this process iteratively. By using hierarchical clustering analysis, the data points with the greatest singularity can be quickly screened out by calculating the spatial distance between holes over parameters such as depth, area and bottom width.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although the embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and not to be construed as limiting the present invention, and those skilled in the art can make changes, modifications, substitutions and alterations to the above embodiments without departing from the principle and spirit of the present invention, and any simple modification, equivalent change and modification made to the above embodiments according to the technical spirit of the present invention still fall within the technical scope of the present invention.