CN110991521A - Clustering discriminant analysis method - Google Patents

Info

Publication number
CN110991521A
Authority
CN
China
Prior art keywords
sub
sample data
cluster
clustering
analysis method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911201396.5A
Other languages
Chinese (zh)
Inventor
曲慧杨
蒲睿英
郭丽琴
邹珊珊
薛俊杰
周军华
施国强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Simulation Center
Original Assignee
Beijing Simulation Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Simulation Center
Priority to CN201911201396.5A
Publication of CN110991521A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Abstract

The scheme provides a clustering discriminant analysis method comprising the following steps: given sample data and a number of sub-clusters, extracting a feature vector of a target to be analyzed and kernelizing it to obtain sample data in a high-dimensional kernel space; and establishing a sub-cluster structure for the sample data using a fuzzy C-means clustering algorithm and a fuzzy cluster analysis method. The method addresses the classification of sample data of complex objects that linear discriminant methods cannot handle effectively, and when a class of sample data contains several sub-clusters, it can establish a multi-sub-cluster structure within each class.

Description

Clustering discriminant analysis method
Technical Field
The application relates to the technical field of pattern recognition, in particular to a clustering discriminant analysis method.
Background
Subspace analysis is an important branch of pattern recognition and has been applied successfully in fields such as feature recognition and image and video recovery. Among subspace analysis methods, linear discriminant analysis (LDA) is widely used for dimensionality reduction and for improving the separability of data. LDA takes both within-class and between-class scatter into account and offers an effective solution to many pattern classification problems, but it is limited by its linearity, its Gaussian assumption, and singularity problems. LDA attains the Bayes-optimal classification error when the two classes are Gaussian distributed with the same covariance matrix and different mean vectors. An alternative, called cluster discriminant analysis (CDA), divides each class into multiple clusters and uses an LDA-like criterion to find directions along which the projections of any two clusters from different classes are well separated while the scatter inside each cluster is minimal. CDA handles cases that LDA handles poorly, namely when the mean vectors of the two classes are close to each other or when the class distributions are multimodal.
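For reference, the LDA criterion discussed above is commonly written as the Fisher ratio below. The notation ($w$ for the projection direction, $S_B$ and $S_W$ for the between-class and within-class scatter matrices, $\mu_i$ and $n_i$ for the mean and size of class $i$, $\mu$ for the overall mean, $C$ for the number of classes) is standard textbook notation and is an assumption here, since the application gives no formula.

```latex
J(w) = \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w}, \qquad
S_B = \sum_{i=1}^{C} n_i\,(\mu_i - \mu)(\mu_i - \mu)^{\top}, \qquad
S_W = \sum_{i=1}^{C} \sum_{x \in \text{class } i} (x - \mu_i)(x - \mu_i)^{\top}
```

Because $S_B$ is built from the class means only, it carries little information when the class means nearly coincide or when a class is spread over several modes, which is the situation CDA is intended for.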
Disclosure of Invention
In order to solve one of the above problems, the present application provides a clustering discriminant analysis method.
According to a first aspect of embodiments of the present application, there is provided a clustering discriminant analysis method, including the steps of:
given sample data and a number of sub-clusters, extracting a feature vector of a target to be analyzed and kernelizing it to obtain sample data in a high-dimensional kernel space;
establishing a sub-cluster structure for the sample data using a fuzzy C-means clustering algorithm and a fuzzy cluster analysis method;
and classifying and identifying the target to be analyzed based on the center vectors of the obtained sub-cluster structure.
Preferably, the kernelization comprises processing the feature vector with a Gaussian kernel function.
Preferably, the extraction method comprises a Fourier transform method, a wavelet transform method, a least squares method, or a boundary direction histogram method.
Preferably, the target to be analyzed is an image of a complex object; the feature vector, after being processed by the Gaussian kernel function, is converted into sample data in a high-dimensional space, and the high-dimensional space formed by this sample data is the high-dimensional kernel space.
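As an illustration of the Fourier transform option for feature extraction, the following minimal sketch builds a feature vector from the low-frequency magnitudes of an image's 2-D FFT. The function name `fourier_features`, the `keep` parameter, and the choice to keep a centered low-frequency block are all assumptions made for illustration; the application does not specify how the Fourier features are formed.

```python
import numpy as np

def fourier_features(image, keep=4):
    """Return a unit-norm feature vector of low-frequency 2-D FFT magnitudes.

    `image` is a 2-D grayscale array. `keep` (hypothetical parameter) is the
    side length of the centered low-frequency block that is retained.
    """
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fftshift(np.fft.fft2(image))  # move zero frequency to the center
    magnitude = np.abs(spectrum)
    cy, cx = magnitude.shape[0] // 2, magnitude.shape[1] // 2
    half = keep // 2
    block = magnitude[cy - half:cy + half, cx - half:cx + half]
    vec = block.ravel()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec
```

Calling `fourier_features(img, keep=4)` on a 32 x 32 image yields a 16-element vector, which can then be passed to the kernelization step.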
Preferably, establishing a sub-cluster structure for the sample data using a fuzzy C-means clustering algorithm and a fuzzy cluster analysis method further comprises:
optimizing a distance-related function of the sample data in the high-dimensional kernel space to obtain an optimal clustering of the sample data;
repeating the optimization multiple times to obtain multiple optimal clusterings;
and performing cluster analysis on the optimal clusterings to obtain the sub-cluster structure.
Preferably, the distance-related function is a distance function, expressed through inner products, between the sample data converted from the feature vectors and the sub-cluster centers in the high-dimensional kernel space, and the optimal clustering in the high-dimensional kernel space is obtained by optimizing this distance function.
Preferably, the multiple optimal clusterings are obtained by running the fuzzy C-means clustering algorithm multiple times on the kernelized sample data in the high-dimensional kernel space.
Preferably, the cluster analysis is a fuzzy cluster analysis method.
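The application does not state the distance function explicitly. One standard way to express it through inner products, assumed here for illustration, is the kernel fuzzy C-means formulation in which each sub-cluster center $v_j$ is a membership-weighted mean of the mapped samples $\phi(x_k)$. The symbols $u_{ji}$ (membership of sample $i$ in sub-cluster $j$), $m > 1$ (fuzzifier), $c$ (number of sub-clusters), and $K(\cdot,\cdot)$ (kernel function) are not taken from the source.

```latex
J_m = \sum_{j=1}^{c} \sum_{i=1}^{n} u_{ji}^{m}\, \lVert \phi(x_i) - v_j \rVert^{2},
\qquad \sum_{j=1}^{c} u_{ji} = 1 \ \text{for every } i,
\qquad v_j = \sum_{k=1}^{n} a_{jk}\, \phi(x_k),
\quad a_{jk} = \frac{u_{jk}^{m}}{\sum_{l=1}^{n} u_{jl}^{m}}

\lVert \phi(x_i) - v_j \rVert^{2}
= K(x_i, x_i) - 2 \sum_{k=1}^{n} a_{jk}\, K(x_i, x_k)
+ \sum_{k=1}^{n} \sum_{l=1}^{n} a_{jk}\, a_{jl}\, K(x_k, x_l)
```

Under this formulation the mapping $\phi$ never has to be evaluated: every term is an entry of the kernel matrix, and for a Gaussian kernel $K(x_i, x_i) = 1$.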
Preferably, classifying and identifying the target to be analyzed based on the center vectors of the obtained sub-cluster structure further comprises:
establishing an inter-sub-cluster scatter matrix and an intra-sub-cluster scatter matrix in the high-dimensional kernel space using the sub-cluster center vectors obtained when the sub-cluster structure was established;
maximizing the ratio of the inter-sub-cluster scatter matrix to the intra-sub-cluster scatter matrix;
computing the eigenvalues and eigenvectors of the target to be analyzed in the high-dimensional kernel space;
and classifying and identifying the target to be analyzed based on the obtained eigenvectors.
Preferably, classifying and identifying the target to be analyzed based on the obtained eigenvectors comprises:
projecting the sample data in the high-dimensional kernel space onto the directions given by the eigenvectors, and classifying and identifying the target to be analyzed after projection.
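The scatter matrices are described only in words. A common way to write the cluster discriminant criterion in the kernel space, assumed here for illustration, is given below. The symbols are not from the source: $C$ is the number of classes, $d_i$ the number of sub-clusters in class $i$, $X_i^j$ the samples of sub-cluster $j$ of class $i$, and $\mu_i^j$ its center in the kernel space.

```latex
S_B^{\phi} = \sum_{i=1}^{C-1} \sum_{l=i+1}^{C} \sum_{j=1}^{d_i} \sum_{h=1}^{d_l}
  (\mu_i^{j} - \mu_l^{h})(\mu_i^{j} - \mu_l^{h})^{\top},
\qquad
S_W^{\phi} = \sum_{i=1}^{C} \sum_{j=1}^{d_i} \sum_{x \in X_i^{j}}
  (\phi(x) - \mu_i^{j})(\phi(x) - \mu_i^{j})^{\top}

\max_{w}\ \frac{w^{\top} S_B^{\phi}\, w}{w^{\top} S_W^{\phi}\, w}
\quad \Longrightarrow \quad
S_B^{\phi}\, w = \lambda\, S_W^{\phi}\, w
```

Note that $S_B^{\phi}$ sums only over pairs of sub-clusters from different classes, so sub-clusters of the same class are not pushed apart. The eigenvectors $w$ with the largest eigenvalues $\lambda$ give the projection directions.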
Advantageous effects
The method provided by the invention addresses the classification of sample data of complex objects that linear discriminant methods cannot handle effectively. When a class of sample data contains several sub-clusters, the method establishes a multi-sub-cluster structure within each class and classifies and identifies the sample data of complex objects by maximizing the distance between any two sub-clusters belonging to different classes while minimizing the scatter within the sub-clusters of each class. The method can be applied to fields such as complex target image recognition and the recognition of workpieces and fixtures by industrial robots in intelligent manufacturing.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 shows a schematic diagram of the clustering discriminant analysis method according to the present embodiment.
Detailed Description
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not an exhaustive list. The embodiments and their features may be combined with one another where no conflict arises.
The scheme provides a kernel clustering discriminant analysis method applicable to complex objects, which addresses the classification of sample data of complex objects that linear discriminant methods cannot handle effectively.
To this end, the method comprises the following steps:
given sample data and a number of sub-clusters, extracting a feature vector of the complex object using a Fourier transform method, a wavelet transform method, a least squares method, or a boundary direction histogram method, and processing the feature vector with a Gaussian kernel function, which converts it into sample data in a high-dimensional space; the high-dimensional space formed by this sample data is called the high-dimensional kernel space;
establishing a sub-cluster structure for the obtained sample data using a fuzzy C-means clustering algorithm and a fuzzy cluster analysis method;
expressing, through inner products, a distance function between the sample data converted from the feature vectors and the sub-cluster centers in the high-dimensional kernel space, and obtaining the optimal clustering in the high-dimensional kernel space by optimizing this distance function;
running the fuzzy C-means clustering algorithm multiple times on the kernelized sample data in the high-dimensional kernel space to obtain multiple optimal clusterings;
performing cluster analysis on the obtained optimal clusterings to obtain the sub-cluster structure;
establishing an inter-sub-cluster scatter matrix and an intra-sub-cluster scatter matrix in the high-dimensional kernel space using the sub-cluster center vectors obtained when the sub-cluster structure was established;
maximizing the ratio of the inter-sub-cluster scatter matrix to the intra-sub-cluster scatter matrix;
obtaining the eigenvalues and eigenvectors of the complex object in the high-dimensional kernel space by solving the resulting eigenvalue problem;
and projecting the converted sample data in the high-dimensional kernel space onto the directions given by the eigenvectors, and classifying and identifying the complex object after projection.
The method maps the feature data of a complex object from the original space into a high-dimensional kernel space using a Gaussian kernel function. In the kernel space, it applies a kernelized fuzzy C-means method combined with a fuzzy cluster analysis method to establish a multi-sub-cluster structure of the data, so that a problem that is nonlinear and inseparable in the original space becomes linearly separable in the high-dimensional kernel space. It then applies a kernelized cluster discriminant method in the kernel space to obtain eigenvectors in the high-dimensional kernel space, and classifies and identifies the complex object by projecting onto the eigenvector directions.
The following describes the steps of the method of the present embodiment by way of example with reference to fig. 1:
First step: extracting the feature data of the complex object and kernelizing it
A complex object is usually described by a feature vector. The feature vector can be extracted from an image of the complex object with a common feature extraction method, such as a Fourier transform method, a wavelet transform method, a least squares method, or a boundary direction histogram method. The feature vector is then kernelized with a Gaussian kernel function and converted into sample data in a high-dimensional space, so that the nonlinear discrimination problem for samples in the low-dimensional space becomes a linear discrimination problem for samples in the high-dimensional space.
In this embodiment, the feature vector is converted into sample data in a high-dimensional space after being processed by the Gaussian kernel function, and the high-dimensional space formed by this sample data is referred to as the high-dimensional kernel space.
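In practice the high-dimensional mapping is not computed explicitly; the later steps need only the pairwise kernel values. The following minimal sketch computes a Gaussian kernel matrix from feature vectors. The function name `gaussian_kernel_matrix` and the bandwidth parameter `sigma` are assumptions for illustration; the application does not state how the kernel width is chosen.

```python
import numpy as np

def gaussian_kernel_matrix(X, Y=None, sigma=1.0):
    """K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 * sigma^2)).

    X is (n, d), Y is (p, d); if Y is omitted the kernel of X with itself
    is returned. `sigma` is a hypothetical bandwidth parameter.
    """
    X = np.asarray(X, dtype=float)
    Y = X if Y is None else np.asarray(Y, dtype=float)
    # Squared Euclidean distances via the expansion |x|^2 + |y|^2 - 2 x.y
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    sq = np.maximum(sq, 0.0)  # clip small negative values caused by rounding
    return np.exp(-sq / (2.0 * sigma ** 2))
```

For training data the result is a symmetric matrix with ones on the diagonal; for new samples, passing them as `X` and the training set as `Y` gives the kernel rows needed for projection.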
Second step: establishing a multi-sub-cluster structure within each class
Once the feature vectors of the complex objects have been converted into the high-dimensional kernel space, the data are difficult to describe effectively. For this reason, the fuzzy C-means method is applied to the kernelized sample data in the high-dimensional kernel space and combined with a fuzzy cluster analysis method to find the multi-sub-cluster structure inside each class. The specific steps are as follows:
Given sample data and a number of sub-clusters, a distance function between the feature vectors and the sub-cluster centers in the high-dimensional kernel space is expressed through inner products for the kernelized sample data, and the optimal clustering in the high-dimensional kernel space is obtained by optimizing this distance function. Because the result of the fuzzy C-means method on the sample data in the high-dimensional kernel space is easily affected by the initial number of sub-clusters, the method is run multiple times on the sample data to obtain a result set. The fuzzy cluster analysis method is then applied to the result set, which improves the robustness and consistency of the fuzzy C-means computation and yields the sub-cluster structure of the sample data in the high-dimensional kernel space.
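The second step can be sketched as follows, under several assumptions that are not in the application: the runs differ only in their random initial memberships, the result set is combined through an averaged co-membership matrix, and "fuzzy cluster analysis" is read as the classical fuzzy equivalence relation approach (max-min transitive closure followed by a lambda-cut). The names `kernel_fcm`, `consensus_subclusters`, `m`, `lam`, and `n_runs` are hypothetical.

```python
import numpy as np

def kernel_fcm(K, c, m=2.0, n_iter=100, tol=1e-6, rng=None):
    """Fuzzy C-means in kernel space; returns memberships U of shape (c, n)."""
    rng = np.random.default_rng(rng)
    n = K.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)          # memberships of each sample sum to 1
    for _ in range(n_iter):
        A = U ** m
        A /= A.sum(axis=1, keepdims=True)      # weights that define each center
        AK = A @ K
        # ||phi(x_i) - v_j||^2 written purely with kernel (inner product) values
        d2 = np.diag(K)[None, :] - 2.0 * AK + np.einsum("jk,jk->j", AK, A)[:, None]
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        done = np.abs(U_new - U).max() < tol
        U = U_new
        if done:
            break
    return U

def consensus_subclusters(K, c, n_runs=10, lam=0.5, seed=0):
    """Combine several kernel FCM runs through a fuzzy equivalence relation."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    R = np.zeros((n, n))
    for _ in range(n_runs):
        U = kernel_fcm(K, c, rng=rng)
        R += U.T @ U                           # co-membership, unaffected by label order
    R /= n_runs
    np.fill_diagonal(R, 1.0)                   # make the fuzzy relation reflexive
    while True:                                # max-min transitive closure
        R2 = np.minimum(R[:, :, None], R[None, :, :]).max(axis=1)
        R2 = np.maximum(R, R2)
        if np.allclose(R2, R):
            break
        R = R2
    # lambda-cut: samples related at level >= lam fall into the same sub-cluster
    cut = (R >= lam).astype(int)
    _, labels = np.unique(cut, axis=0, return_inverse=True)
    return np.ravel(labels)
```

Two design points are worth noting. The co-membership matrix sidesteps the fact that different runs may number the same sub-cluster differently. The number of sub-clusters returned is governed by `lam` and may differ from `c`, and the closure step builds an n x n x n array, so this sketch suits only small sample sets.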
Third step: classifying and identifying the sample data in the high-dimensional kernel space
After the kernel transformation, the nonlinear discrimination problem for the feature vectors of complex-object samples in the original space becomes a linear discrimination problem in the high-dimensional space. Using the computed sub-cluster center vectors of each class, an inter-sub-cluster scatter matrix and an intra-sub-cluster scatter matrix are established in the high-dimensional kernel space. Maximizing the ratio of the two maximizes the inter-sub-cluster scatter while minimizing the intra-sub-cluster scatter. This maximization problem is converted into an eigenvalue problem, whose solution gives the eigenvalues and eigenvectors in the high-dimensional kernel space. The converted sample data in the high-dimensional kernel space are then projected onto the directions given by the eigenvectors, after which the complex object is classified and identified.
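The third step can be sketched in dual form, where each projection direction is a combination of the mapped training samples and all quantities are computed from the kernel matrix. This is a minimal sketch under assumptions not found in the application: the function name `kernel_cda`, the ridge term `reg` added to keep the within scatter invertible, and the Cholesky whitening used to solve the generalized eigenvalue problem are all illustrative choices.

```python
import numpy as np

def kernel_cda(K, y, sub, n_dirs=1, reg=1e-3):
    """Dual-form cluster discriminant analysis on a precomputed kernel matrix.

    K   : (n, n) symmetric kernel matrix of the training samples
    y   : class label of each sample
    sub : sub-cluster label of each sample within its class
    Returns (alpha, means). alpha is (n, n_dirs); a sample with kernel row k
    projects to k @ alpha. means maps (class, sub-cluster) to its kernel mean.
    """
    n = K.shape[0]
    groups = {}
    for i, key in enumerate(zip(y, sub)):
        groups.setdefault(key, []).append(i)

    means = {}
    N = reg * np.eye(n)                        # intra-sub-cluster scatter plus ridge
    for key, idx in groups.items():
        Ks = K[:, idx]
        means[key] = Ks.mean(axis=1)
        D = Ks - means[key][:, None]
        N += D @ D.T

    M = np.zeros((n, n))                       # inter-sub-cluster scatter
    keys = list(groups)
    for a in range(len(keys)):
        for b in range(a + 1, len(keys)):
            if keys[a][0] != keys[b][0]:       # only pairs from different classes
                d = means[keys[a]] - means[keys[b]]
                M += np.outer(d, d)

    # Solve M a = lambda N a by whitening with the Cholesky factor N = L L^T.
    L = np.linalg.cholesky(N)
    S = np.linalg.solve(L, np.linalg.solve(L, M).T)
    S = 0.5 * (S + S.T)                        # remove rounding asymmetry
    evals, evecs = np.linalg.eigh(S)           # eigenvalues in ascending order
    top = evecs[:, ::-1][:, :n_dirs]           # directions with largest eigenvalues
    alpha = np.linalg.solve(L.T, top)
    return alpha, means
```

After projection, `K @ alpha` gives the training samples in the discriminant space and `means[key] @ alpha` gives each projected sub-cluster center. One simple classification rule, again an assumption, is to assign a sample the class of its nearest projected sub-cluster center.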
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The present invention is not limited to the above embodiments; any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention falls within the scope of the claims of the present application.

Claims (10)

1. A clustering discriminant analysis method, characterized by comprising the following steps:
given sample data and a number of sub-clusters, extracting a feature vector of a target to be analyzed and kernelizing it to obtain sample data in a high-dimensional kernel space;
establishing a sub-cluster structure for the sample data using a fuzzy C-means clustering algorithm and a fuzzy cluster analysis method;
and classifying and identifying the target to be analyzed based on the center vectors of the obtained sub-cluster structure.
2. The clustering discriminant analysis method according to claim 1, wherein the kernelization comprises processing the feature vector with a Gaussian kernel function.
3. The clustering discriminant analysis method according to claim 1, wherein the extraction method comprises a Fourier transform method, a wavelet transform method, a least squares method, or a boundary direction histogram method.
4. The clustering discriminant analysis method according to claim 1, wherein the target to be analyzed is an image of a complex object; the feature vector, after being processed by the Gaussian kernel function, is converted into sample data in a high-dimensional space, and the high-dimensional space formed by this sample data is the high-dimensional kernel space.
5. The clustering discriminant analysis method according to claim 1, wherein establishing a sub-cluster structure for the sample data using a fuzzy C-means clustering algorithm and a fuzzy cluster analysis method further comprises:
optimizing a distance-related function of the sample data in the high-dimensional kernel space to obtain an optimal clustering of the sample data;
repeating the optimization multiple times to obtain multiple optimal clusterings;
and performing cluster analysis on the optimal clusterings to obtain the sub-cluster structure.
6. The clustering discriminant analysis method according to claim 5, wherein the distance-related function is a distance function, expressed through inner products, between the sample data converted from the feature vectors and the sub-cluster centers in the high-dimensional kernel space, and the optimal clustering in the high-dimensional kernel space is obtained by optimizing this distance function.
7. The clustering discriminant analysis method according to claim 5, wherein the multiple optimal clusterings are obtained by running the fuzzy C-means clustering algorithm multiple times on the kernelized sample data in the high-dimensional kernel space.
8. The clustering discriminant analysis method according to claim 5, wherein the cluster analysis is a fuzzy cluster analysis method.
9. The clustering discriminant analysis method according to claim 1, wherein classifying and identifying the target to be analyzed based on the center vectors of the obtained sub-cluster structure further comprises:
establishing an inter-sub-cluster scatter matrix and an intra-sub-cluster scatter matrix in the high-dimensional kernel space using the sub-cluster center vectors obtained when the sub-cluster structure was established;
maximizing the ratio of the inter-sub-cluster scatter matrix to the intra-sub-cluster scatter matrix;
computing the eigenvalues and eigenvectors of the target to be analyzed in the high-dimensional kernel space;
and classifying and identifying the target to be analyzed based on the obtained eigenvectors.
10. The clustering discriminant analysis method according to claim 9, wherein classifying and identifying the target to be analyzed based on the obtained eigenvectors comprises:
projecting the sample data in the high-dimensional kernel space onto the directions given by the eigenvectors, and classifying and identifying the target to be analyzed after projection.
CN201911201396.5A 2019-11-29 2019-11-29 Clustering discriminant analysis method Pending CN110991521A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911201396.5A CN110991521A (en) 2019-11-29 2019-11-29 Clustering discriminant analysis method

Publications (1)

Publication Number Publication Date
CN110991521A true CN110991521A (en) 2020-04-10

Family

ID=70088385

Country Status (1)

Country Link
CN (1) CN110991521A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115604013A (en) * 2022-10-21 2023-01-13 北京珞安科技有限责任公司(Cn) Industrial data interaction platform and interaction method

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090297046A1 (en) * 2008-05-29 2009-12-03 Microsoft Corporation Linear Laplacian Discrimination for Feature Extraction
CN102609693A (en) * 2012-02-14 2012-07-25 南昌航空大学 Human face recognition method based on fuzzy two-dimensional kernel principal component analysis
CN103093478A (en) * 2013-02-18 2013-05-08 南京航空航天大学 Different source image rough edge test method based on rapid nuclear spatial fuzzy clustering
CN104268553A (en) * 2014-09-11 2015-01-07 江苏大学 SAR image target recognition method based on kernel fuzzy Foley-Sammon transformation
CN104794482A (en) * 2015-03-24 2015-07-22 江南大学 Inter-class maximization clustering algorithm based on improved kernel fuzzy C mean value
CN106250821A (en) * 2016-07-20 2016-12-21 南京邮电大学 The face identification method that a kind of cluster is classified again
CN106408012A (en) * 2016-09-09 2017-02-15 江苏大学 Tea infrared spectrum classification method of fuzzy discrimination clustering
CN107220627A (en) * 2017-06-06 2017-09-29 南京邮电大学 Pose-varied face recognition method based on cooperation fuzzy mean discriminatory analysis
CN107247969A (en) * 2017-06-02 2017-10-13 常州工学院 The Fuzzy c-Means Clustering Algorithm of core is induced based on Gauss

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
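王珉; 胡茑庆; 秦国军: "Fuzzy cluster analysis method based on maximum scatter difference in LRE test-run data mining" *
袁运能; 吴央; 成功: "A simplified algorithm for kernel-space clustering in image texture classification" *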

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination