CN109034270A - Visual feature selection method based on fault binary-classification non-negative matrix factorization - Google Patents
Visual feature selection method based on fault binary-classification non-negative matrix factorization
- Publication number
- CN109034270A (application number CN201810968454.6A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- characteristic
- classification
- negative
- division
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Computational Mathematics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A visual feature selection method based on fault binary-classification non-negative matrix factorization. The multi-class problem is first divided into multiple binary classification problems by pairwise combination of classes. Each high-dimensional feature set is then decomposed by non-negative matrix factorization and the factor matrices are displayed as heat maps. Finally, the salient-expression principle of the heat map is used to select effective classification features, and the union of these classification features gives the sensitive features of the entire data set. The invention enables cooperation and complementary strengths between computer and human, and preserves good classification performance of the low-dimensional feature subset while reducing the dimensionality of the original high-dimensional features.
Description
Technical field
The invention belongs to the technical field of mechanical equipment condition monitoring and fault diagnosis, and in particular relates to a visual feature selection method based on fault binary-classification non-negative matrix factorization.
Background
As the complexity and integration level of electromechanical systems keep increasing, so does the risk that equipment fails during operation. To accurately identify faults that emerge and develop during operation, and to diagnose and handle abnormal components in time, condition monitoring and fault diagnosis have become necessary. As information acquisition technology advances, more and more features describing system state and operating parameters can be obtained, including redundant and irrelevant ones. This poses a major challenge for subsequent diagnosis and identification, and calls for effective feature selection and extraction on high-dimensional data. In addition to traditional dimensionality reduction methods, the non-negative matrix factorization (NMF) method yields a low-rank approximation of the original feature matrix. Its decomposition results have good interpretability and physical meaning, and the factor matrix associated with the features can be used to reduce the dimensionality of the data features; the method has therefore been promoted and applied in the field of monitoring and diagnosis.
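For reference, the low-rank approximation mentioned above is commonly written as the least-squares problem below. The symbols $V$, $W$, $H$ and the rank $r$ follow the usual NMF convention and are an assumption here, not notation taken from this publication.

```latex
\min_{W \ge 0,\; H \ge 0} \; \lVert V - W H \rVert_F^2,
\qquad
V \in \mathbb{R}_{\ge 0}^{m \times n},\quad
W \in \mathbb{R}_{\ge 0}^{m \times r},\quad
H \in \mathbb{R}_{\ge 0}^{r \times n},\quad
r < \min(m, n)
```

Because every entry of $W$ and $H$ is non-negative, each sample is an additive combination of the rows of $H$, which is what gives the factors their parts-based interpretability.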
However, current NMF-based feature analysis methods directly analyze the basis matrix or coefficient matrix obtained by decomposing the original multi-class fault sample matrix. They are algorithm-centered and generally lack human participation in the selection process, so the selection process is not sufficiently transparent or intuitive and the selection results are hard to interpret, which limits the effectiveness of feature analysis and selection.
Summary of the invention
To overcome the above shortcomings of the prior art, the object of the present invention is to provide a visual feature selection method based on fault binary-classification non-negative matrix factorization. It combines the physical meaning of the NMF results with the salient-expression advantage of heat maps, so that feature selection becomes visual and intuitive while the classification accuracy of the selected feature subset is preserved.
To achieve the above object, the technical solution adopted by the present invention is:
A visual feature selection method based on fault binary-classification non-negative matrix factorization, comprising the following steps:
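1) extract the data set $V_{m \times n}$ to be processed, where the $m$ rows of the data set represent samples and the $n$ columns represent features;
2) make the data set $V_{m \times n}$ non-negative and normalize it:
$$V_{ij} \leftarrow \frac{V_{ij} - \min_{k} V_{kj}}{\max_{k} V_{kj} - \min_{k} V_{kj}}$$
where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $\max_{k} V_{kj}$ is the maximum of column vector $V_j$ and $\min_{k} V_{kj}$ is the minimum of column vector $V_j$;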
3) divide the original multi-class classification problem into multiple binary classification problems by pairwise combination of classes; assuming $V_{m \times n}$ contains $N$ classes of samples, the feature set corresponding to each resulting binary classification problem is denoted $P_i$, where $i = 1, 2, \ldots, \binom{N}{2}$;
4) decompose each non-negative feature set $P_i$ using a least-squares iterative algorithm, i.e. $P_i = W_i H_i$;
Randomly initialize $W_i$ and $H_i$; the low-dimensional embedding dimension $r_i$ is preferentially chosen equal to the number of sample classes. Non-negative matrix factorization yields the basis matrix $W_i$ and the coefficient matrix $H_i$, with the following iteration rule:
$$(H_i)_{kj} \leftarrow (H_i)_{kj} \frac{(W_i^{T} P_i)_{kj}}{(W_i^{T} W_i H_i)_{kj}}, \qquad (W_i)_{ak} \leftarrow (W_i)_{ak} \frac{(P_i H_i^{T})_{ak}}{(W_i H_i H_i^{T})_{ak}}$$
where $W_i$ is the basis matrix obtained by non-negative matrix factorization of feature set $P_i$, $W_i^{T}$ denotes the transpose of basis matrix $W_i$, $H_i$ is the coefficient matrix obtained by non-negative matrix factorization of feature set $P_i$, and $H_i^{T}$ denotes the transpose of coefficient matrix $H_i$;
5) visualize the basis matrix $W_i$ and the coefficient matrix $H_i$ as heat maps; the rows of basis matrix $W_i$ correspond to samples and the columns of coefficient matrix $H_i$ correspond to the original features;
6) observe the clustering of the samples in basis matrix $W_i$; if the two classes of samples are clearly separated in one of the columns of $W_i$, proceed to step 7); otherwise reselect the low-dimensional embedding dimension $r_i$ and return to step 4);
7) select the classification feature $F_i$ using the salient-expression principle of the heat map, adjusting the heat-map threshold so that the number of classification features is 1;
8) take the union of the classification features $F_i$ obtained from all binary classification problems to obtain the final classification feature set $F = \bigcup_{i} F_i$.
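Steps 2) and 3) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the publication's implementation: the function names `min_max_normalize` and `split_into_binary_problems`, the handling of constant columns, and the toy data are all hypothetical.

```python
from itertools import combinations

import numpy as np


def min_max_normalize(V):
    """Scale every column (feature) of V to the range [0, 1] (step 2)."""
    V = np.asarray(V, dtype=float)
    col_min = V.min(axis=0)
    col_range = V.max(axis=0) - col_min
    # Assumption: a constant column carries no information and is mapped to 0.
    col_range[col_range == 0] = 1.0
    return (V - col_min) / col_range


def split_into_binary_problems(V, labels):
    """Return {(class_a, class_b): (P, y)} for every pair of classes (step 3)."""
    labels = np.asarray(labels)
    problems = {}
    for a, b in combinations(np.unique(labels).tolist(), 2):
        mask = (labels == a) | (labels == b)
        problems[(a, b)] = (V[mask], labels[mask])
    return problems


# Toy data: 6 samples, 2 features, 3 classes (hypothetical values).
V = min_max_normalize([[1, 10], [2, 20], [3, 30], [4, 40], [5, 50], [6, 60]])
problems = split_into_binary_problems(V, [0, 0, 1, 1, 2, 2])
print(sorted(problems))  # [(0, 1), (0, 2), (1, 2)]
```

With $N$ classes the dictionary holds $N(N-1)/2$ entries, one feature set $P_i$ per pair of classes.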
The beneficial effects of the invention are as follows: the method enables cooperation and complementary strengths between computer and human, the selection results are highly interpretable, and good classification performance of the low-dimensional feature subset is preserved while the dimensionality of the original high-dimensional features is reduced.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 is the heat-map visualization of the non-negative matrix factorization result for data set $P_1$ (samples of classes one and two) in the embodiment.
Fig. 3 is the heat-map visualization of the non-negative matrix factorization result for data set $P_2$ (samples of classes one and three) in the embodiment.
Fig. 4 is the heat-map visualization of the non-negative matrix factorization result for data set $P_3$ (samples of classes two and three) in the embodiment.
Fig. 5 is the three-dimensional visualization of features 1, 7 and 12 of the Wine data set in the embodiment.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and an embodiment. The embodiment uses the Wine data set from the UCI repository. The Wine data set comes from a chemical analysis of wines from three different cultivars, which determined the content of 13 constituents in each of the three wines. It contains three classes with 178 samples in total and 13 feature attributes (constituents); the class sizes are 59 (class one), 71 (class two) and 48 (class three). This embodiment performs feature selection on these 13 features to select those with good classification performance.
Referring to Fig. 1, a visual feature selection method based on fault binary-classification non-negative matrix factorization comprises the following steps:
1) extract the data set $V_{m \times n}$ to be processed, where the $m$ rows of the data set represent samples and the $n$ columns represent features; this embodiment uses the Wine data set;
2) make the data set $V_{m \times n}$ non-negative and normalize it:
$$V_{ij} \leftarrow \frac{V_{ij} - \min_{k} V_{kj}}{\max_{k} V_{kj} - \min_{k} V_{kj}}$$
where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $\max_{k} V_{kj}$ is the maximum of column vector $V_j$ and $\min_{k} V_{kj}$ is the minimum of column vector $V_j$;
3) divide the original multi-class classification problem into multiple binary classification problems by pairwise combination of classes; assuming $V_{m \times n}$ contains $N$ classes of samples, the feature set corresponding to each resulting binary classification problem is denoted $P_i$, where $i = 1, 2, \ldots, \binom{N}{2}$. This embodiment divides the Wine data set into 3 binary classification problems by pairwise combination: $P_1$ (samples of classes one and two), $P_2$ (samples of classes one and three) and $P_3$ (samples of classes two and three);
4) decompose each non-negative feature set $P_i$ using a least-squares iterative algorithm, i.e. $P_i = W_i H_i$;
Randomly initialize $W_i$ and $H_i$; the low-dimensional embedding dimension $r_i$ is preferentially chosen equal to the number of sample classes. Non-negative matrix factorization yields the basis matrix $W_i$ and the coefficient matrix $H_i$, with the following iteration rule:
$$(H_i)_{kj} \leftarrow (H_i)_{kj} \frac{(W_i^{T} P_i)_{kj}}{(W_i^{T} W_i H_i)_{kj}}, \qquad (W_i)_{ak} \leftarrow (W_i)_{ak} \frac{(P_i H_i^{T})_{ak}}{(W_i H_i H_i^{T})_{ak}}$$
where $W_i$ is the basis matrix obtained by non-negative matrix factorization of feature set $P_i$, $W_i^{T}$ denotes the transpose of basis matrix $W_i$, $H_i$ is the coefficient matrix obtained by non-negative matrix factorization of feature set $P_i$, and $H_i^{T}$ denotes the transpose of coefficient matrix $H_i$;
5) visualize the basis matrix $W_i$ and the coefficient matrix $H_i$ as heat maps; the rows of basis matrix $W_i$ correspond to samples and the columns of coefficient matrix $H_i$ correspond to the original features;
6) observe the clustering of the samples in basis matrix $W_i$; if the two classes of samples are clearly separated in one of the columns of $W_i$, as shown in Fig. 2, proceed to step 7); otherwise reselect the low-dimensional embedding dimension $r_i$ and return to step 4);
From a clustering point of view, the feature set of a binary classification problem should essentially contain only two kinds of features: those that can distinguish the two classes and those that cannot. All other features can be obtained as combinations of these two essential kinds. Therefore, when performing non-negative matrix factorization on $P_1$, $P_2$ and $P_3$, this embodiment preferentially sets the low-dimensional embedding dimension $r$ to 2; the visualizations of the final decomposition results are shown in Figs. 2, 3 and 4;
7) apply the salient-expression principle of the heat map: by the rule of matrix multiplication, a large number multiplied by a large number is still a large number, which stands out from the small values in the heat map. The original features corresponding to the large-value regions of coefficient matrix $H_i$ are selected as the classification features $F_i$; the number of classification features is controlled by adjusting the heat-map threshold and is usually set to 1;
8) take the union of the classification features $F_i$ obtained from all binary classification problems to obtain the final classification feature set $F = \bigcup_{i} F_i$.
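Step 4) above can be sketched in NumPy as follows. This is a hedged illustration: it assumes the standard multiplicative updates for the least-squares objective $\lVert P - WH \rVert_F^2$, and the function name `nmf_multiplicative`, the parameters `n_iter`, `eps` and `seed`, and the toy matrix are hypothetical choices that do not come from the publication.

```python
import numpy as np


def nmf_multiplicative(P, r, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix P (samples x features) as W @ H.

    Assumption: standard multiplicative updates for the least-squares
    objective ||P - W H||_F^2; n_iter, eps and seed are illustrative.
    """
    P = np.asarray(P, dtype=float)
    rng = np.random.default_rng(seed)
    m, n = P.shape
    W = rng.random((m, r)) + eps   # random non-negative initialisation
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        # eps in the denominators only guards against division by zero.
        H *= (W.T @ P) / (W.T @ W @ H + eps)
        W *= (P @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy binary problem: two groups of samples that differ mainly in feature 0.
P = np.array([[0.9, 0.5, 0.1],
              [1.0, 0.4, 0.2],
              [0.1, 0.5, 0.1],
              [0.0, 0.4, 0.2]])
W, H = nmf_multiplicative(P, r=2)
rel_error = np.linalg.norm(P - W @ H) / np.linalg.norm(P)
```

The rows of `W` line up with the samples and the columns of `H` with the original features, so both matrices can be passed directly to any heat-map plotting routine for steps 5) and 6).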
This embodiment applies the salient-expression principle of the heat map to the decomposition results of $P_1$, $P_2$ and $P_3$ in turn to select classification features. By adjusting the display threshold of the heat map, the number of classification features selected for each binary classification problem is kept at 1. From the coefficient matrices $H$, the selected classification features are, in order, feature 1, feature 7 and feature 12; taking their union gives the classification feature set {1, 7, 12}.
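The selection and union in steps 7) and 8) are performed visually in the method. A numeric stand-in might look like the sketch below, under two loudly stated assumptions: the column of $W_i$ whose class means differ most is treated as "the column in which the two classes separate", and the largest entry in the matching row of $H_i$ is treated as the single feature that survives when the heat-map threshold is raised. The function name `select_feature` and all matrix values are hypothetical.

```python
import numpy as np


def select_feature(W, H, y):
    """Pick one classification feature for a binary problem.

    Assumption: numeric proxies replace the visual inspection of the
    heat maps (largest class-mean gap in W, largest entry in that row of H).
    """
    y = np.asarray(y)
    a, b = np.unique(y)
    gap = np.abs(W[y == a].mean(axis=0) - W[y == b].mean(axis=0))
    k = int(np.argmax(gap))            # basis column that separates the classes
    return int(np.argmax(H[k])) + 1    # 1-based feature number, as in the text


# Hypothetical factor matrices for two binary problems
# (4 samples, r = 2, 3 original features).
y = [0, 0, 1, 1]
W1 = np.array([[0.9, 0.5], [1.0, 0.5], [0.1, 0.5], [0.0, 0.5]])
H1 = np.array([[1.0, 0.1, 0.0], [0.0, 0.8, 0.3]])
W2 = np.array([[0.5, 0.1], [0.5, 0.0], [0.5, 0.9], [0.5, 1.0]])
H2 = np.array([[0.7, 0.6, 0.1], [0.0, 0.2, 0.9]])

F = set()
for W, H in [(W1, H1), (W2, H2)]:
    F |= {select_feature(W, H, y)}     # step 8: union over binary problems
print(sorted(F))  # [1, 3]
```

Using a set for `F` means that a feature selected by several binary problems appears only once in the final subset.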
This embodiment extracts the 1st, 7th and 12th features of the Wine data set and draws a three-dimensional visualization of the data set, as shown in Fig. 5. It can be seen that the wines of the three cultivars are well separated by the method of the present invention; with a KNN classifier, the classification accuracy of subset F reaches 94.94%.
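The text does not state the value of k or the evaluation protocol behind the KNN figure. As a hedged illustration of how a selected feature subset could be scored, the sketch below uses leave-one-out evaluation with Euclidean distance; the function name `knn_loo_accuracy`, the choice of protocol and the toy data are assumptions, and the example is not expected to reproduce the reported accuracy.

```python
import numpy as np


def knn_loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a k-nearest-neighbour classifier."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)     # a sample may not vote for itself
    correct = 0
    for i in range(len(X)):
        neighbours = y[np.argsort(dist[i])[:k]]
        values, counts = np.unique(neighbours, return_counts=True)
        correct += int(values[np.argmax(counts)] == y[i])
    return correct / len(X)


# Toy stand-in for the selected feature columns (hypothetical values).
X = [[0.0, 0.1], [0.1, 0.0], [0.1, 0.1],
     [0.9, 1.0], [1.0, 0.9], [1.0, 1.0]]
y = [0, 0, 0, 1, 1, 1]
accuracy = knn_loo_accuracy(X, y, k=1)
```

On real data, `X` would hold only the columns of the normalized data set that belong to the selected subset F.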
As the above application shows, the method of the present invention makes feature selection visual and intuitive, yields highly interpretable selection results, and preserves the classification accuracy of the selected feature subset. It can help with high-dimensional data analysis in condition monitoring and fault diagnosis of complex electromechanical systems, improving the efficiency and accuracy of fault diagnosis.
Claims (1)
1. A visual feature selection method based on fault binary-classification non-negative matrix factorization, characterized by comprising the following steps:
1) extract the data set $V_{m \times n}$ to be processed, where the $m$ rows of the data set represent samples and the $n$ columns represent features;
2) make the data set $V_{m \times n}$ non-negative and normalize it:
$$V_{ij} \leftarrow \frac{V_{ij} - \min_{k} V_{kj}}{\max_{k} V_{kj} - \min_{k} V_{kj}}$$
where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $\max_{k} V_{kj}$ is the maximum of column vector $V_j$ and $\min_{k} V_{kj}$ is the minimum of column vector $V_j$;
3) divide the original multi-class classification problem into multiple binary classification problems by pairwise combination of classes; assuming $V_{m \times n}$ contains $N$ classes of samples, the feature set corresponding to each resulting binary classification problem is denoted $P_i$, where $i = 1, 2, \ldots, \binom{N}{2}$;
4) decompose each non-negative feature set $P_i$ using a least-squares iterative algorithm, i.e. $P_i = W_i H_i$;
Randomly initialize $W_i$ and $H_i$; the low-dimensional embedding dimension $r_i$ is preferentially chosen equal to the number of sample classes. Non-negative matrix factorization yields the basis matrix $W_i$ and the coefficient matrix $H_i$, with the following iteration rule:
$$(H_i)_{kj} \leftarrow (H_i)_{kj} \frac{(W_i^{T} P_i)_{kj}}{(W_i^{T} W_i H_i)_{kj}}, \qquad (W_i)_{ak} \leftarrow (W_i)_{ak} \frac{(P_i H_i^{T})_{ak}}{(W_i H_i H_i^{T})_{ak}}$$
where $W_i$ is the basis matrix obtained by non-negative matrix factorization of feature set $P_i$, $W_i^{T}$ denotes the transpose of basis matrix $W_i$, $H_i$ is the coefficient matrix obtained by non-negative matrix factorization of feature set $P_i$, and $H_i^{T}$ denotes the transpose of coefficient matrix $H_i$;
5) visualize the basis matrix $W_i$ and the coefficient matrix $H_i$ as heat maps; the rows of basis matrix $W_i$ correspond to samples and the columns of coefficient matrix $H_i$ correspond to the original features;
6) observe the clustering of the samples in basis matrix $W_i$; if the two classes of samples are clearly separated in one of the columns of $W_i$, proceed to step 7); otherwise reselect the low-dimensional embedding dimension $r_i$ and return to step 4);
7) select the classification feature $F_i$ using the salient-expression principle of the heat map, adjusting the heat-map threshold so that the number of classification features is 1;
8) take the union of the classification features $F_i$ obtained from all binary classification problems to obtain the final classification feature set $F = \bigcup_{i} F_i$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810968454.6A CN109034270A (en) | 2018-08-23 | 2018-08-23 | Visual feature selection method based on fault binary-classification non-negative matrix factorization |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109034270A (en) | 2018-12-18 |
Family
ID=64628235
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810968454.6A Pending CN109034270A (en) | Visual feature selection method based on fault binary-classification non-negative matrix factorization | 2018-08-23 | 2018-08-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109034270A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110673578A (en) * | 2019-09-29 | 2020-01-10 | 华北电力大学(保定) | Fault degradation degree determination method and device, computer equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104809475A (en) * | 2015-05-06 | 2015-07-29 | 西安电子科技大学 | Multi-labeled scene classification method based on incremental linear discriminant analysis |
CN105354593A (en) * | 2015-10-22 | 2016-02-24 | 南京大学 | NMF (Non-negative Matrix Factorization)-based three-dimensional model classification method |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20181218 |