Summary of the invention
Drawing on the research of Frey et al. and Wang Changdong et al., the present invention proposes a multi-exemplar (multiple subclass center) affinity propagation clustering algorithm based on neighbor relationship propagation (neighbor propagation based multi-exemplar affinity propagation, NP-MEAP) and combines it with the Shape context (SC) feature extraction algorithm to classify moire images automatically. The object of the invention is to overcome the very low efficiency of manual classification of moire images by providing an unsupervised automatic classification technique for moire images.
The key technique of the invention is: preprocess the moire images, extract the SC similarity matrix of the moire images, optimize the similarity matrix with the improved NP algorithm, and finally classify the images automatically with the MEAP clustering algorithm. The specific implementation steps are as follows:
(1) Preprocess the moire images: normalize the image size, remove background noise, and thin the lines.
(1a) Normalize the moire image size, which facilitates uniform processing of subsequent images without changing the distribution of the image lines.
(1b) Remove background noise, which also makes it convenient to thin the image with mathematical morphology methods.
(1c) Thin the moire image lines. Because the line shapes of moire images of different classes differ while those of the same class are basically similar, attention is focused mainly on the line shape.
(2) Extract the Shape context similarity matrix of the moire images.
(2a) The Shape context algorithm assumes that the object in each image can be approximately described by a finite number of discrete points uniformly distributed on the shape boundary, so the discrete points on the boundary of the moire image must be extracted.
(2b) Calculate the Shape context of each discrete point.
(2c) Calculate the Shape context difference between any two points in two moire images.
(2d) Calculate the tangent angle difference between any two points in two moire images.
(2e) Combine the Shape context difference and the tangent angle difference between any two points in two moire images.
(2f) Calculate the Shape context distance value between any two moire images.
(3) Optimize the $S_{sc}$ matrix with the improved neighbor propagation (NP) algorithm.
(3a) Calculate the Shape context distance matrix D.
(3b) Calculate the neighbor relationship propagation threshold ε.
(3c) Calculate the similarity matrix S between moire images.
(3d) Calculate the neighbor relationship matrix N.
(3e) Optimize the similarity matrix with the neighbor relationship propagation algorithm.
(4) Use the optimized similarity matrix as the input matrix of the MEAP algorithm and, by adjusting the preference value, obtain the correct number of classes, thereby realizing automatic classification of the moire images.
The present invention preprocesses the moire images, selects a suitable image feature extraction algorithm, optimizes the similarity matrix with the improved neighbor relationship propagation algorithm based on the idea of manifold learning, and finally performs automatic classification with the recent multi-exemplar affinity propagation clustering algorithm. It thereby fills the gap in automatic classification of moire images while ensuring classification accuracy.
The present invention has the following advantages:
(1) Preprocessing the moire images eliminates the influence of image size, background noise and line thickness, so that the accuracy of automatic classification is not affected by them.
(2) Because the principal feature of a moire image is its line shape, the Shape context descriptor, currently one of the better shape feature extraction algorithms, is selected, which helps ensure a satisfactory clustering result.
(3) The invention optimizes the Shape context similarity matrix with the neighbor relationship propagation algorithm, which further raises the accuracy of automatic classification of the moire images.
Embodiment
One, introduction to the basic theory
1. Multi-exemplar affinity propagation (MEAP) clustering algorithm
The MEAP algorithm is a clustering algorithm with a two-layer structure. As shown in Fig. 2, the algorithm assigns every data object to its most suitable subclass center and assigns each subclass center to its most suitable super cluster center, thereby modeling multi-subclass problems.
Similar to the AP algorithm, the MEAP algorithm establishes for each data object the similarity information s(i, j) with the other data objects and the linkage information l(i, j). The algorithm sets the preference parameters p=s(k, k) and pp=l(k, k) for each data object. The larger the values of p and pp, the more likely the corresponding data object is to become a candidate subclass center and super cluster center, and the more clusters are obtained. The values of p and pp are usually set to the medians of the similarity matrix and the linkage matrix, respectively. The core of the MEAP algorithm is the alternating update of 7 formulas in 4 classes; the update formulas are as follows:
For the specific meaning of the parameters in the above formulas, see the document "Multi-exemplar affinity propagation"; all newly introduced variables are initialized to 0. Throughout the iterative update process of the MEAP algorithm, the data objects compete with one another so that the subclass centers and super cluster centers are produced automatically. The other data objects are assigned to the nearest subclass center, and the super cluster centers combine the subclass centers to form the final clustering result.
Two, the automatic classification method for traditional moire (cloud pattern) images of the present invention
With reference to Fig. 1, the specific implementation of the invention comprises the following steps:
Step 1. Preprocessing of the moire images
Fig. 3 shows several example moire patterns. As can be seen from Fig. 3, moire patterns gathered from various sources differ in size and in line thickness, and some of the collected moire images contain gray background noise. The moire images therefore need to be preprocessed to obtain higher clustering accuracy.
(1.1) Because the moire images differ in size, all images are first normalized to a size of 85*45, which facilitates uniform processing of subsequent images without changing the distribution of the image lines. The moire images normalized to 85*45 are shown in Fig. 4.
(1.2) To deal with the background noise contained in some moire images, the images are binarized, which eliminates the background noise and also makes it convenient to thin the images with mathematical morphology methods. The binarization threshold is calculated with Otsu's method, and the binarized moire images are shown in Fig. 5.
(1.3) Because the line shapes of moire images of different classes differ while those of the same class are basically similar, the invention focuses mainly on the line shape of the moire pattern. The thickness of the lines is unrelated to the class of the moire pattern and may even reduce clustering accuracy, so mathematical morphology is used to thin the lines to a width of one pixel. The thinned moire images are shown in Fig. 6.
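The binarization in step (1.2) can be illustrated with a minimal sketch of Otsu's method. It assumes an 8-bit grayscale image flattened into a list of integers; the function names `otsu_threshold` and `binarize` are hypothetical, and whether the lines end up as 1 or 0 depends on whether they are brighter or darker than the background, which the text does not state.

```python
def otsu_threshold(pixels):
    """Return the gray level (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    w0 = 0    # number of pixels with gray level <= t
    sum0 = 0  # sum of the gray levels of those pixels
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(pixels, t):
    """Map pixels above the threshold to 1 and the rest to 0 (assumed polarity)."""
    return [1 if v > t else 0 for v in pixels]
```

For example, `binarize(pixels, otsu_threshold(pixels))` yields a two-level image on which the thinning of step (1.3) could then operate.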
Step 2. Calculate the Shape context similarity matrix of the moire image set
(2.1) The SC algorithm assumes that the object in each image can be approximately described by a finite number of discrete points. These discrete points need not be key points of the figure such as inflection points or extreme points; they are points uniformly distributed on the shape boundary. Fig. 7 is a schematic diagram of the discrete points extracted from the lines of the preprocessed moire images, where sub-figure a of Fig. 7 corresponds to the middle sub-figure in the upper row of Fig. 6 and sub-figure b of Fig. 7 corresponds to the rightmost sub-figure in the upper row of Fig. 6. As can be seen from Fig. 7, the discrete points on the moire lines describe the corresponding line shape fairly accurately, and the more boundary points are extracted, the more accurate the approximate description of the pattern. However, extracting too many boundary points makes the running time of the algorithm too long. Usually 100-150 boundary points suffice to describe the line shape fairly accurately; the present invention uses n=100 boundary discrete points to describe the line shape.
One point in each of the two sub-figures of Fig. 7 is marked with a small box. For a given discrete point of the contour point set $p=\{p_1, p_2, \ldots, p_n\}$, $n=100$, in Fig. 7, consider the vectors from this point to the other n-1 points; these n-1 vectors describe the shape information of the moire image fairly accurately. Fig. 8 shows the vectors from the discrete points marked in Fig. 7 to all other points.
(2.2) In Fig. 8, each point can be described by the n-1 vectors that start at this point and end at the remaining points, so each moire line is described by n vectors of dimension n-1, which gives a fairly rich feature description matrix for every moire image. However, using all of these vectors to describe a moire image would require a very large amount of computation and is impractical. For shape description, it suffices to know the positions of all other discrete points on the moire contour relative to this point. The rectangular coordinate system containing the moire lines is therefore transformed into a log-polar coordinate system whose origin is the discrete point to be calculated. The polar coordinate system is divided equally into 12 parts in direction from 0 to 2π, and the radius is divided into 5 parts by a log-space function from the origin out to 2r, where r is the mean Euclidean distance over the point set, so that the whole polar coordinate system is divided into 60 parts (bins). Counting the number of discrete points of the moire image that fall in each bin forms a 60-dimensional vector; this 60-dimensional vector is called the Shape context of the corresponding discrete point, i.e. the log-polar histogram of the discrete point. The histogram is computed as follows:
$$h_i(k) = \#\{\, q \neq p_i : (q - p_i) \in \mathrm{bin}(k) \,\}$$
where k denotes the k-th bin of the polar coordinate system and takes values from 1 to 60, $p_i$ is the boundary point of the moire image whose histogram is to be calculated, q ranges over the other n-1 boundary points apart from $p_i$, $q - p_i$ is the position of q relative to $p_i$, and $h_i(k)$ is the number of boundary points that fall in the k-th bin.
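The histogram of step (2.2) can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the invention: the inner radial edge `r_mean / 8` is an assumed value (the text only gives the outer radius 2r), points farther than 2r are simply ignored, and the function names are hypothetical.

```python
import math


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all pairs of points (the r of step 2.2)."""
    n = len(points)
    total = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            total += math.hypot(points[a][0] - points[b][0],
                                points[a][1] - points[b][1])
    return total / (n * (n - 1) / 2)


def shape_context(points, i, r_mean, n_theta=12, n_r=5):
    """Log-polar histogram (n_theta * n_r bins) of all points relative to points[i]."""
    # Log-spaced radial edges ending at 2 * r_mean; the inner edge is an assumption.
    r_inner, r_outer = r_mean / 8.0, 2.0 * r_mean
    ratio = (r_outer / r_inner) ** (1.0 / (n_r - 1))
    edges = [r_inner * ratio ** j for j in range(n_r)]
    hist = [0] * (n_theta * n_r)
    px, py = points[i]
    for j, (qx, qy) in enumerate(points):
        if j == i:
            continue
        dx, dy = qx - px, qy - py
        dist = math.hypot(dx, dy)
        if dist > r_outer:
            continue  # outside the outermost ring
        # First ring whose outer edge contains the point; fall back to the last ring.
        r_bin = next((b for b, e in enumerate(edges) if dist <= e), n_r - 1)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(int(angle / (2 * math.pi / n_theta)), n_theta - 1)
        hist[r_bin * n_theta + t_bin] += 1
    return hist
```

With n=100 boundary points, calling `shape_context(points, i, mean_pairwise_distance(points))` for every i would give the n histograms of 60 bins each that the later steps compare.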
(2.3) Calculate the Shape context difference between any two points in two moire images. For a boundary point $p_i$ in moire image P and a boundary point $q_j$ in moire image Q, the Shape context difference of the two points is calculated as follows, where $h_i(k)$ and $h_j(k)$ denote the number of boundary points in the k-th bin of the histograms of $p_i$ and $q_j$, respectively.
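The formula for step (2.3) is not reproduced in this text. A minimal sketch, assuming the chi-square statistic $\frac{1}{2}\sum_k [h_i(k)-h_j(k)]^2 / [h_i(k)+h_j(k)]$ that is commonly used to compare Shape context histograms, is given below; the function name `chi2_cost` is hypothetical, and the raw bin counts are used as described in the text (normalizing each histogram first is a common variant).

```python
def chi2_cost(h_i, h_j):
    """Chi-square difference between two histograms of equal length."""
    cost = 0.0
    for a, b in zip(h_i, h_j):
        if a + b > 0:  # bins that are empty in both histograms contribute nothing
            cost += (a - b) ** 2 / (a + b)
    return 0.5 * cost
```

Identical histograms give a difference of 0, and the value grows as the two point distributions diverge.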
(2.4) Calculate the tangent angle difference between any two points in two moire images. The Shape context difference captures the overall shape difference between discrete points of different moire patterns fairly well. To make the difference between discrete points of moire shapes more accurate, the tangent angle difference of the discrete points is added. The formula is as follows, where $\theta_i$ and $\theta_j$ are the tangent angles at points $p_i$ and $q_j$, respectively.
(2.5) Combine the Shape context difference and the tangent angle difference between any two points in two moire images, so that the Shape context distance between any two points on different moire images can be measured more accurately. The formula is as follows:
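The formulas for steps (2.4) and (2.5) are not reproduced in this text. The sketch below assumes the form usually paired with Shape context: a tangent angle difference of $\frac{1}{2}(1-\cos(\theta_i-\theta_j))$ and a weighted sum of the two differences. The weight `beta` and its default value are hypothetical and are not taken from the source.

```python
import math


def tangent_cost(theta_i, theta_j):
    """Tangent angle difference in [0, 1]; 0 when the tangents are parallel."""
    return 0.5 * (1.0 - math.cos(theta_i - theta_j))


def combined_cost(c_sc, c_tan, beta=0.1):
    """Weighted combination of the Shape context and tangent angle differences."""
    return (1.0 - beta) * c_sc + beta * c_tan
```

Evaluating `combined_cost` for every pair of boundary points of two images gives the n*n point-to-point distance matrix used in step (2.6).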
(2.6) Calculate the Shape context distance between two moire images. Using the above formula, calculate the Shape context distance between every boundary point $p_i$ in moire image P and every boundary point $q_j$ in moire image Q to obtain an n*n (n=100) distance matrix. The Shape context distance value between the two moire images is obtained by summing the mean of the row-wise minima and the mean of the column-wise minima of this matrix. The formula is as follows:
The smaller the value given by the above formula, the smaller the difference between the two images and the greater their similarity; conversely, the larger the value, the smaller the similarity. The negative of this value is taken as the Shape context similarity measure between the two images, denoted $S_{sc}(P, Q) = -D_{sc}(P, Q)$. Calculating the Shape context similarity measure between all pairs of images gives the similarity matrix $S_{sc}$ of the moire image set.
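The image-to-image distance of step (2.6) follows directly from the description: the mean of the row-wise minima plus the mean of the column-wise minima of the point-to-point distance matrix. The sketch below takes that matrix as a list of lists; the function names are hypothetical.

```python
def image_distance(cost):
    """Mean of the row minima plus mean of the column minima of a cost matrix."""
    row_min = [min(row) for row in cost]
    col_min = [min(col) for col in zip(*cost)]
    return sum(row_min) / len(row_min) + sum(col_min) / len(col_min)


def image_similarity(cost):
    """Shape context similarity, taken as the negative of the distance."""
    return -image_distance(cost)
```

Applying `image_similarity` to every pair of images would fill the similarity matrix of the whole image set.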
Step 3. Optimize the $S_{sc}$ matrix with the improved neighbor propagation (NP) algorithm
(3.1) Calculate the Shape context distance matrix $D=[d_{ij}]_{n \times n}$. This matrix is used to initialize the neighbor relationship matrix N described below. The element $d_{ij}$ of the matrix is the Shape context distance between moire images i and j; the negative of this value is used to update the similarity matrix S after a neighbor relationship has been propagated successfully.
(3.2) Calculate the neighbor relationship propagation threshold. Let $d_{ik}$ denote the distance between moire image $x_i$ and its k-th nearest neighbor; the mean of the k-th nearest neighbor distances over all moire images is taken as the threshold. This threshold weakens the influence of noisy data to some extent, and choosing different values of k for different data sets makes the neighbor relationship propagation more accurate. The new threshold is defined as follows:
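As described in step (3.2), the threshold is the mean over all images of the distance to the k-th nearest neighbor. A minimal sketch, assuming D is a symmetric list-of-lists distance matrix and that an image is not counted as its own neighbor (the function name is hypothetical):

```python
def kth_neighbor_threshold(D, k):
    """Mean distance from each object to its k-th nearest neighbor (k >= 1)."""
    n = len(D)
    total = 0.0
    for i in range(n):
        # Distances from object i to every other object, nearest first.
        others = sorted(D[i][j] for j in range(n) if j != i)
        total += others[k - 1]
    return total / n
```

A larger k gives a larger threshold and therefore more initial neighbor pairs.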
(3.3) Calculate the similarity matrix between moire images, $S=[s_{ij}]_{n \times n}$. The element $s_{ij}$ in row i and column j of the matrix is defined as follows:
$d_{ij}$ is the Shape context distance. Here the distances between all moire images are amplified by an exponential transform, mainly in order to amplify the distances between moire images lying on different manifolds and thereby reduce their similarity.
(3.4) Calculate the neighbor relationship matrix N. If the element $d_{ij}$ of the distance matrix D is less than the neighbor relationship propagation threshold ε, the data objects $x_i$ and $x_j$ are considered neighbors of each other, written $(x_i, x_j) \in R$, and the neighbor relationship matrix of all moire images is defined accordingly. That is, when the data objects $x_i$ and $x_j$ are neighbors of each other, the corresponding matrix element $n_{ij}$ is 1; otherwise it is 0, and the diagonal elements are 0.
(3.5) Optimize the similarity matrix by neighbor relationship propagation: if $n_{ij}=0$ while $n_{ik}=1$ and $n_{kj}=1$, set $n_{ij}=1$ and $n_{ji}=1$, and at the same time update $s_{ij}=s_{ji}=-\min(d_{ik}, d_{kj})$.
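Steps (3.4) and (3.5) can be sketched as below. The text does not say whether propagation is repeated or which intermediate neighbor k is used when several qualify, so this sketch makes two assumptions: a single pass that reads only the initial neighbor matrix, and the first qualifying k. The function names are hypothetical, and D and N are assumed symmetric.

```python
def neighbor_matrix(D, eps):
    """n_ij = 1 when d_ij < eps and i != j, otherwise 0."""
    n = len(D)
    return [[1 if i != j and D[i][j] < eps else 0 for j in range(n)]
            for i in range(n)]


def propagate_neighbors(D, S, N):
    """One pass of the rule in (3.5); returns updated copies of S and N."""
    n = len(D)
    S_new = [row[:] for row in S]
    N_new = [row[:] for row in N]
    for i in range(n):
        for j in range(i + 1, n):
            if N[i][j] == 1:
                continue  # already neighbors, nothing to propagate
            for k in range(n):
                if N[i][k] == 1 and N[k][j] == 1:
                    N_new[i][j] = N_new[j][i] = 1
                    S_new[i][j] = S_new[j][i] = -min(D[i][k], D[k][j])
                    break  # first qualifying k (assumption)
    return S_new, N_new
```

The returned similarity matrix is what step 4 would pass to the MEAP algorithm.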
Step 4. Run the MEAP algorithm with the optimized similarity matrix as its input matrix and, by adjusting the preference value, obtain the correct number of classes, thereby realizing automatic classification of the moire images.
The effect of the present invention is further illustrated by the following experiments.
1. Simulation conditions
The cirrus pattern, which originated in the Qin period, developed from the abstract hooked scrolls of the Yun Leiwen pattern into winding combinations of different structures. From the Spring and Autumn and Warring States periods to the Qin and Han dynasties, the cirrus pattern was widely used in the decoration of all kinds of implements; cirrus patterns of various forms can be seen on bronze ware, lacquerware, jade ornaments, eaves tiles and tile carvings, silk tapestry and embroidery, and so on. With its rich variety of styles and diverse decorative effects, the cirrus pattern occupies an important place in the history of decorative art and in contemporary design applications.
Here cirrus patterns of the Qin period are selected as sample patterns for testing the method of the invention. The data set comprises 230 moire patterns of 6 types. According to the shape of the cirrus curve, these are the single-hook cirrus pattern, cohesive cirrus pattern, divergent cirrus pattern, composite cirrus pattern, S-shaped cirrus pattern, and ruyi-type cirrus pattern. Fig. 9 shows example patterns of these 6 kinds of cirrus pattern.
To verify the feasibility and effectiveness of the proposed algorithm, the invention, namely the MEAP algorithm based on neighbor propagation and SC features (NP-MEAP), is compared with the MEAP algorithm based on SIFT features (SIFT-MEAP) and the MEAP algorithm based on Euclidean distance (ED-MEAP). Throughout the experiments, the initial values of p and pp are set to the median of the similarity matrix, the damping factor lam=0.9, the number of convergence iterations convits=50, the maximum number of iterations maxits=1000, and γ=3. The experimental environment is as follows: Core(TM) i5-3470 processor at 3.2GHz, 4GB of memory, a 500GB hard disk, the Windows 7 Ultimate 64-bit operating system, and Matlab R2013a as the programming language. The invention adopts the commonly used clustering evaluation indices NMI and FMI.
The normalized mutual information (NMI) index is calculated as follows:
where π is the set of cluster labels produced by the clustering algorithm, ζ is the set of true class labels of the data set, and $n_l^{(h)}$ denotes the number of data objects that are both in cluster l and in true class h. H(π) is the Shannon entropy of the cluster labels π, H(ζ) is the Shannon entropy of the true class labels ζ, and $n_i$ and $n^{(j)}$ are the numbers of sample points in cluster i and in true class j, respectively. The larger the NMI value, the closer the clustering result is to the true classification.
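The NMI formula is not reproduced in this text. The sketch below assumes the common definition in which the mutual information between π and ζ is divided by $\sqrt{H(\pi)H(\zeta)}$; other normalizations exist, so the choice here is an assumption, and the function name `nmi` is hypothetical.

```python
import math
from collections import Counter


def nmi(pred, truth):
    """Normalized mutual information between two label sequences."""
    n = len(pred)
    joint = Counter(zip(pred, truth))   # n_l^(h): objects in cluster l and class h
    n_pred = Counter(pred)              # cluster sizes
    n_true = Counter(truth)             # class sizes
    mutual = sum(c / n * math.log(n * c / (n_pred[l] * n_true[h]))
                 for (l, h), c in joint.items())
    h_pred = -sum(c / n * math.log(c / n) for c in n_pred.values())
    h_true = -sum(c / n * math.log(c / n) for c in n_true.values())
    if h_pred == 0 or h_true == 0:
        return 0.0  # a single cluster or a single class carries no information
    return mutual / math.sqrt(h_pred * h_true)
```

The value does not depend on how the clusters are numbered, only on how the objects are grouped.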
The FMI (Fowlkes-Mallows Index) index is calculated as follows:
Let the clustering result of the clustering algorithm be denoted $C=\{c_1, c_2, \ldots, c_m\}$ and the true classification of the data set be denoted $P=\{p_1, p_2, \ldots, p_l\}$. Let $x_i$ and $x_j$ be any two data objects in the data set. Here a is the number of pairs $x_i$, $x_j$ that belong to the same cluster in both C and P; b is the number of pairs that belong to the same cluster in C but to different clusters in P; c is the number of pairs that belong to different clusters in C but to the same cluster in P; and d is the number of pairs that belong to different clusters in both C and P, where a+b+c+d=n(n-1)/2. It follows that FMI takes values in [0, 1], and the larger the value, the higher the clustering accuracy of the algorithm.
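The FMI formula is not reproduced in this text. Using the pair counts a, b and c defined above, the standard Fowlkes-Mallows index is $a / \sqrt{(a+b)(a+c)}$; the sketch below counts the pairs directly, and the function name `fmi` is hypothetical.

```python
import math
from itertools import combinations


def fmi(pred, truth):
    """Fowlkes-Mallows index computed from pair counts."""
    a = b = c = 0
    for i, j in combinations(range(len(pred)), 2):
        same_c = pred[i] == pred[j]    # same cluster in the clustering result C
        same_p = truth[i] == truth[j]  # same class in the true classification P
        if same_c and same_p:
            a += 1
        elif same_c:
            b += 1
        elif same_p:
            c += 1
    if a == 0:
        return 0.0
    return a / math.sqrt((a + b) * (a + c))
```

The count d of pairs separated in both C and P does not enter the index, which is why it is not accumulated.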
For the MEAP algorithm based on SIFT features of the moire images, the one-pixel-wide moire lines are suitably dilated after preprocessing so that, with the line shape unchanged, the SIFT algorithm can extract suitable SIFT features more effectively. For the MEAP algorithm based on Euclidean distance, the one-pixel-wide moire lines are likewise suitably dilated after preprocessing, and the image is restored to the pixel gray values it had before binarization, so that the Euclidean distance reflects the distance between moire images more accurately. The discrete points extracted by the MEAP algorithm based on Shape context and neighbor relationship propagation have a certain randomness, so the discrete points detected differ somewhat from run to run, and the Shape context distance obtained from a single run fluctuates. The Shape context algorithm is therefore run 20 times, and the sum of the 20 Shape context distances is used as the input of the neighbor relationship propagation algorithm to optimize the similarity matrix, which ensures the stability of the algorithm.
2. Simulation results
The method of the invention (NP-MEAP) is compared with the ED-MEAP and SIFT-MEAP methods.
Fig. 10 is a schematic comparison of the clustering accuracy of NP-MEAP, SIFT-MEAP and ED-MEAP on the moire images. As can be seen from the figure, the clustering accuracies of the SIFT-MEAP and ED-MEAP algorithms are substantially the same, and both reach only about 40%. It follows that neither the number of matched features extracted by SIFT nor the similarity based on negative Euclidean distance reflects the similarity between moire images well. By contrast, the clustering accuracy of NP-MEAP exceeds 80%. For the very complex problem of clustering moire patterns, this accuracy is good and can greatly reduce the workload of manual classification and improve its efficiency. It also shows that the Shape context similarity matrix optimized by the neighbor relationship propagation algorithm better reflects the similarity between moire images, so the clustering effect is better.
Fig. 11 is a schematic diagram of the clustering result of the NP-MEAP algorithm on the moire images, in which each rounded rectangle represents one cluster, and the moire images marked with a grid inside each rounded rectangle are the images wrongly clustered into that cluster.