CN109034270A - A visual feature selection method based on fault binary-classification non-negative matrix factorization - Google Patents

A visual feature selection method based on fault binary-classification non-negative matrix factorization

Info

Publication number
CN109034270A
Authority
CN
China
Prior art keywords
matrix
characteristic
classification
negative
division
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810968454.6A
Other languages
Chinese (zh)
Inventor
梁霖
牛奔
刘飞
山磊
何康康
徐光华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN201810968454.6A
Publication of CN109034270A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A visual feature selection method based on fault binary-classification non-negative matrix factorization. The multi-class problem is first divided into several binary classification problems by enumerating class pairs. The high-dimensional feature set of each binary problem is then decomposed by non-negative matrix factorization, and the resulting factor matrices are displayed as heat maps. Finally, the salient-display principle of the heat map is used to select effective classification features, from which the sensitive features of the whole dataset are extracted. The method lets the computer and the human analyst cooperate and complement each other, and it preserves good classification performance of the low-dimensional feature subset while reducing the dimensionality of the original high-dimensional features.

Description

A visual feature selection method based on fault binary-classification non-negative matrix factorization
Technical field
The invention belongs to the technical field of mechanical equipment condition monitoring and fault diagnosis, and in particular relates to a visual feature selection method based on fault binary-classification non-negative matrix factorization.
Background
As the complexity and integration level of electromechanical systems keep increasing, the risk of equipment failure during operation also keeps growing. To accurately identify faults that emerge and develop during operation, and to diagnose and handle abnormal components in time, condition monitoring and fault diagnosis have become necessary. With the continuous progress of information acquisition technology, more and more features describing system state and operating parameters can be obtained, including redundant and irrelevant feature information. This poses a great challenge to subsequent diagnosis and identification, and calls for effective feature selection and extraction on the high-dimensional data. In addition to traditional dimensionality reduction methods, non-negative matrix factorization (NMF) yields a low-rank approximation of the original feature matrix, and its decomposition result has good interpretability and physical meaning. Dimensionality reduction of the data features can be achieved through the factor matrix related to the features, and the method has been popularized and applied in the field of monitoring and diagnosis.
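For orientation, the low-rank approximation mentioned above is commonly written as the constrained least-squares problem below. The symbols $W$, $H$ and the rank $r$ are notation assumed here for illustration, and the Frobenius-norm objective is one common choice rather than something this passage specifies.

```latex
V \approx W H, \qquad
V \in \mathbb{R}_{\ge 0}^{m \times n},\;
W \in \mathbb{R}_{\ge 0}^{m \times r},\;
H \in \mathbb{R}_{\ge 0}^{r \times n},\;
r < \min(m, n)

\min_{W \ge 0,\; H \ge 0} \; \lVert V - W H \rVert_F^2
```

With samples stored as rows of $V$, each row of $W$ describes one sample in $r$ dimensions and each column of $H$ describes one original feature.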
However, in current feature analysis methods based on non-negative matrix factorization, the basis matrix or coefficient matrix obtained by decomposing the original multi-class fault sample matrix is analyzed directly. These methods are generally algorithm-centered and lack human participation in the selection process, so the selection process is not transparent or intuitive enough and the selection result is not very interpretable, which limits the effectiveness of feature analysis and selection.
Summary of the invention
To overcome the above shortcomings of the prior art, the object of the present invention is to provide a visual feature selection method based on fault binary-classification non-negative matrix factorization. It combines the physical meaning of the non-negative matrix factorization result with the salient-display advantage of heat maps, so that feature selection becomes visual and intuitive while the classification accuracy of the selected feature subset is ensured.
To achieve the above object, the technical solution adopted by the invention is:
A visual feature selection method based on fault binary-classification non-negative matrix factorization, comprising the following steps:
1) Extract the dataset $V_{m \times n}$ to be processed, where the $m$ rows of the dataset represent samples and the $n$ columns represent features;
2) Make the dataset $V_{m \times n}$ non-negative and normalize it:
$$V_{ij} \leftarrow \frac{V_{ij} - \min_k V_{kj}}{\max_k V_{kj} - \min_k V_{kj}}$$
where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $\max_k V_{kj}$ is the maximum of column vector $V_j$; and $\min_k V_{kj}$ is the minimum of column vector $V_j$;
3) Divide the original multi-class classification problem into several binary classification problems by enumerating class pairs. Assuming $V_{m \times n}$ contains $N$ classes of samples, the feature set corresponding to each resulting binary classification problem is denoted $P_i$, where $i = 1, 2, \ldots, C_N^2$ and $C_N^2 = N(N-1)/2$;
4) Decompose each non-negative feature set $P_i$ with a least-squares iterative algorithm, i.e. $P_i = W_i H_i$.
Randomly initialize $W_i$ and $H_i$. The low-dimensional embedding dimension $r_i$ is preferably chosen equal to the number of sample classes. Non-negative matrix factorization yields the basis matrix $W_i$ and the coefficient matrix $H_i$, and the iteration rule is as follows:
[update formulas for $W_i$ and $H_i$ not preserved in this text]
where $W_i$ is the basis matrix obtained by non-negative matrix factorization of feature set $P_i$, $W_i^T$ is the transpose of the basis matrix $W_i$, $H_i$ is the coefficient matrix obtained by non-negative matrix factorization of feature set $P_i$, and $H_i^T$ is the transpose of the coefficient matrix $H_i$;
5) Visualize the basis matrix $W_i$ and the coefficient matrix $H_i$ as heat maps; the rows of $W_i$ correspond to samples and the columns of $H_i$ correspond to the original features;
6) Inspect how the samples cluster in the basis matrix $W_i$. If the two classes of samples are clearly separated in one of the columns of $W_i$, proceed to step 7); otherwise reselect the low-dimensional embedding dimension $r_i$ and return to step 4);
7) Select the classification feature $F_i$ using the salient-display principle of the heat map, adjusting the heat map threshold so that the number of classification features is 1;
8) Take the union of the classification features $F_i$ obtained from all binary classification problems to obtain the final classification feature set $F = \bigcup_i F_i$.
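Steps 2) and 3) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the normalization is taken to be column-wise min-max scaling, the split is taken to be one subset per unordered class pair, and the function names `min_max_normalize` and `split_into_binary_problems` as well as the toy data are hypothetical.

```python
from itertools import combinations


def min_max_normalize(rows):
    """Scale every column of a row-major table to [0, 1] (step 2)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    # A constant column has zero span; map it to 0.0 instead of dividing by zero.
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def split_into_binary_problems(rows, labels):
    """Return {(class_a, class_b): (sub_rows, sub_labels)} for every class pair (step 3)."""
    problems = {}
    for a, b in combinations(sorted(set(labels)), 2):
        keep = [k for k, y in enumerate(labels) if y in (a, b)]
        problems[(a, b)] = ([rows[k] for k in keep], [labels[k] for k in keep])
    return problems


# Toy data: 5 samples, 2 features, 3 classes (values are made up).
V = [[1.0, -2.0], [3.0, 0.0], [5.0, 2.0], [2.0, 1.0], [4.0, -1.0]]
y = [1, 1, 2, 3, 3]

V_norm = min_max_normalize(V)
problems = split_into_binary_problems(V_norm, y)
print(sorted(problems))  # the class pairs that define the binary problems
```

Normalizing before splitting follows the order of the steps, so every binary subset shares the same feature scaling.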
The beneficial effects of the invention are as follows: the method enables the computer and the human analyst to cooperate and complement each other, the selection result is highly interpretable, and good classification performance of the low-dimensional feature subset is preserved while the dimensionality of the original high-dimensional features is reduced.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Fig. 2 is the heat map visualization of the non-negative matrix factorization result of dataset $P_1$ (samples of classes 1 and 2) in the embodiment.
Fig. 3 is the heat map visualization of the non-negative matrix factorization result of dataset $P_2$ (samples of classes 1 and 3) in the embodiment.
Fig. 4 is the heat map visualization of the non-negative matrix factorization result of dataset $P_3$ (samples of classes 2 and 3) in the embodiment.
Fig. 5 is the three-dimensional visualization of features 1, 7 and 12 of the Wine dataset in the embodiment.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and an embodiment. The embodiment uses the Wine dataset from the UCI repository. The Wine dataset comes from the chemical analysis of wines of three different cultivars; the analysis determined the content of 13 constituents in the three wines. The dataset contains three classes with 178 samples in total and 13 feature attributes (constituents). The numbers of samples per class are 59 (class 1), 71 (class 2) and 48 (class 3). This embodiment performs feature selection on these 13 features to select the features with good classification performance.
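A quick arithmetic check of the figures above, assuming each binary problem pairs two of the three classes so that its sample count is the sum of the two class sizes. The variable names are illustrative.

```python
from itertools import combinations

# Class sizes quoted in the text for the Wine dataset.
class_sizes = {1: 59, 2: 71, 3: 48}

# One binary problem per unordered class pair.
pair_sizes = {
    (a, b): class_sizes[a] + class_sizes[b]
    for a, b in combinations(sorted(class_sizes), 2)
}

for pair, size in pair_sizes.items():
    print(pair, size)
```

Each pairwise subset keeps all 13 feature columns; only the rows differ.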
Referring to Fig. 1, a visual feature selection method based on fault binary-classification non-negative matrix factorization comprises the following steps:
1) Extract the dataset $V_{m \times n}$ to be processed, where the $m$ rows of the dataset represent samples and the $n$ columns represent features; this embodiment uses the Wine dataset;
2) Make the dataset $V_{m \times n}$ non-negative and normalize it:
$$V_{ij} \leftarrow \frac{V_{ij} - \min_k V_{kj}}{\max_k V_{kj} - \min_k V_{kj}}$$
where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $\max_k V_{kj}$ is the maximum of column vector $V_j$; and $\min_k V_{kj}$ is the minimum of column vector $V_j$;
3) Divide the original multi-class classification problem into several binary classification problems by enumerating class pairs. Assuming $V_{m \times n}$ contains $N$ classes of samples, the feature set corresponding to each resulting binary classification problem is denoted $P_i$, where $i = 1, 2, \ldots, C_N^2$ and $C_N^2 = N(N-1)/2$. In this embodiment the Wine dataset is divided into 3 binary classification problems: $P_1$ (samples of classes 1 and 2), $P_2$ (samples of classes 1 and 3) and $P_3$ (samples of classes 2 and 3);
4) Decompose each non-negative feature set $P_i$ with a least-squares iterative algorithm, i.e. $P_i = W_i H_i$.
Randomly initialize $W_i$ and $H_i$. The low-dimensional embedding dimension $r_i$ is preferably chosen equal to the number of sample classes. Non-negative matrix factorization yields the basis matrix $W_i$ and the coefficient matrix $H_i$, and the iteration rule is as follows:
[update formulas for $W_i$ and $H_i$ not preserved in this text]
where $W_i$ is the basis matrix obtained by non-negative matrix factorization of feature set $P_i$, $W_i^T$ is the transpose of the basis matrix $W_i$, $H_i$ is the coefficient matrix obtained by non-negative matrix factorization of feature set $P_i$, and $H_i^T$ is the transpose of the coefficient matrix $H_i$;
5) Visualize the basis matrix $W_i$ and the coefficient matrix $H_i$ as heat maps; the rows of $W_i$ correspond to samples and the columns of $H_i$ correspond to the original features;
6) Inspect how the samples cluster in the basis matrix $W_i$. If the two classes of samples are clearly separated in one of the columns of $W_i$, as shown in Fig. 2, proceed to step 7); otherwise reselect the low-dimensional embedding dimension $r_i$ and return to step 4);
From a clustering point of view, the feature set of a binary classification problem essentially contains only two kinds of features: those that can distinguish the two classes and those that cannot. Other features can be obtained as combinations of these two kinds of essential features. Therefore, in this embodiment the low-dimensional embedding dimension $r$ is preferably set to 2 when performing non-negative matrix factorization on $P_1$, $P_2$ and $P_3$, and the visualizations of the final decomposition results are shown in Figs. 2, 3 and 4;
7) Apply the salient-display principle of the heat map: by the rule of matrix multiplication, a large number times a large number is still a large number, which stands out from small values in the heat map. The original features corresponding to the large-value regions of the coefficient matrix $H_i$ are selected as the classification features $F_i$. The number of classification features is controlled by adjusting the heat map threshold, and is usually set to 1;
8) Take the union of the classification features $F_i$ obtained from all binary classification problems to obtain the final classification feature set $F = \bigcup_i F_i$.
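Step 4) can be sketched as below. The update formulas are an assumption: the sketch uses the multiplicative updates commonly paired with the least-squares objective, which fit the description (random non-negative start, iteration involving $W_i^T$ and $H_i^T$) but are not confirmed by the text. The helper names `matmul`, `transpose`, `frobenius_error` and `nmf`, the iteration count, the seed and the toy matrix are all illustrative. Plain Python lists keep the sketch dependency-free; an array library would be the usual choice in practice.

```python
import random


def matmul(a, b):
    """Product of two row-major matrices given as lists of lists."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Transpose of a row-major matrix."""
    return [list(col) for col in zip(*a)]


def frobenius_error(p, w, h):
    """Squared Frobenius norm of P - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rp, rwh in zip(p, wh) for x, y in zip(rp, rwh))


def nmf(p, r, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative P (m x n) into W (m x r) and H (r x n).

    Assumption: multiplicative updates for the least-squares objective;
    the text only says "least-squares iterative algorithm".
    """
    rng = random.Random(seed)
    m, n = len(p), len(p[0])
    w = [[rng.random() + eps for _ in range(r)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(r)]
    for _ in range(iterations):
        # H <- H * (W^T P) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, p), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)] for a in range(r)]
        # W <- W * (P H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(p, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(r)] for i in range(m)]
    return w, h


# Toy binary problem: rows 0-2 are one class, rows 3-5 the other (made-up values).
P = [
    [0.9, 0.1, 0.5], [1.0, 0.0, 0.4], [0.8, 0.2, 0.6],
    [0.1, 0.9, 0.5], [0.0, 1.0, 0.6], [0.2, 0.8, 0.4],
]
W, H = nmf(P, r=2)
print("reconstruction error:", frobenius_error(P, W, H))
```

Comparing the columns of `W` across the two groups of rows corresponds to the visual check in step 6); the rows of `H` are what step 7) inspects for large values.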
In this embodiment, the salient-display principle of the heat map is applied in turn to the decomposition results of $P_1$, $P_2$ and $P_3$ to select classification features. By adjusting the display threshold of the heat map, the number of classification features selected for each binary classification problem is kept at 1. From the coefficient matrices $H$, the selected classification features are found to be feature 1, feature 7 and feature 12 in turn, and their union gives the classification feature set {1, 7, 12}.
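The threshold adjustment and the union can be imitated numerically as follows. This is only a rough stand-in for what the text describes as a visual, manual step: it assumes the relevant row of each coefficient matrix has already been picked out, and the helper names, the step size and the coefficient values are all made up for illustration (they are not the Wine results).

```python
def features_above(h_row, threshold):
    """1-based indices of features whose coefficient exceeds the threshold."""
    return [j + 1 for j, v in enumerate(h_row) if v > threshold]


def raise_threshold_until_one(h_row, step=0.01):
    """Raise the threshold until at most one feature stays visible.

    Hypothetical helper: the text gives no numeric procedure. If the two
    largest values are closer than `step`, the result may be empty.
    """
    threshold = 0.0
    while len(features_above(h_row, threshold)) > 1:
        threshold += step
    return threshold, features_above(h_row, threshold)


# Made-up coefficient rows, one per binary problem, over 4 features.
h_rows = {
    "P1": [0.90, 0.20, 0.10, 0.30],
    "P2": [0.15, 0.05, 0.80, 0.25],
    "P3": [0.10, 0.70, 0.20, 0.05],
}

selected = {}
for name, row in h_rows.items():
    _, feats = raise_threshold_until_one(row)
    selected[name] = set(feats)

# Step 8: union over all binary problems.
F = set().union(*selected.values())
print(sorted(F))
```

Because the same feature can win in more than one binary problem, the union may contain fewer features than there are problems.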
This embodiment extracts the 1st, 7th and 12th features of the Wine dataset and draws a three-dimensional visualization of the dataset, as shown in Fig. 5. It can be seen that the method distinguishes the wines of the three cultivars well; with a KNN classifier, the classification rate of the subset F reaches 94.94%.
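A check of this kind can be sketched with a small k-nearest-neighbour classifier. The text does not state the value of k, the distance measure or the evaluation protocol, so Euclidean distance and leave-one-out evaluation are assumptions here, and the data below are made-up toy values rather than the Wine measurements.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to query (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda xy: math.dist(xy[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Classify each sample from all the others and report the hit rate."""
    hits = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, x, k) == y
    return hits / len(xs)


def take_features(rows, feature_ids):
    """Keep only the listed 1-based feature columns."""
    return [[row[j - 1] for j in feature_ids] for row in rows]


# Made-up, already normalized samples with 4 features and 3 classes.
X = [
    [0.10, 0.50, 0.10, 0.90], [0.15, 0.40, 0.12, 0.20], [0.12, 0.60, 0.08, 0.50],
    [0.50, 0.50, 0.90, 0.80], [0.55, 0.45, 0.85, 0.10], [0.52, 0.55, 0.88, 0.40],
    [0.90, 0.50, 0.50, 0.70], [0.95, 0.60, 0.45, 0.30], [0.88, 0.40, 0.55, 0.60],
]
y = [1, 1, 1, 2, 2, 2, 3, 3, 3]

subset = take_features(X, [1, 3])
print("leave-one-out accuracy:", leave_one_out_accuracy(subset, y, k=1))
```

Running the same function on the full feature table and on the selected subset gives a direct comparison of how much discriminative information the subset retains.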
As the above application shows, the method makes feature selection visual and intuitive, gives highly interpretable selection results, and ensures the classification accuracy of the selected feature subset. It can help with high-dimensional data analysis in the monitoring and fault diagnosis of complex electromechanical systems and improve the efficiency and accuracy of fault diagnosis.

Claims (1)

1. A visual feature selection method based on fault binary-classification non-negative matrix factorization, characterized in that it comprises the following steps:
1) Extract the dataset $V_{m \times n}$ to be processed, where the $m$ rows of the dataset represent samples and the $n$ columns represent features;
2) Make the dataset $V_{m \times n}$ non-negative and normalize it:
$$V_{ij} \leftarrow \frac{V_{ij} - \min_k V_{kj}}{\max_k V_{kj} - \min_k V_{kj}}$$
where $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$; $\max_k V_{kj}$ is the maximum of column vector $V_j$; and $\min_k V_{kj}$ is the minimum of column vector $V_j$;
3) Divide the original multi-class classification problem into several binary classification problems by enumerating class pairs. Assuming $V_{m \times n}$ contains $N$ classes of samples, the feature set corresponding to each resulting binary classification problem is denoted $P_i$, where $i = 1, 2, \ldots, C_N^2$ and $C_N^2 = N(N-1)/2$;
4) Decompose each non-negative feature set $P_i$ with a least-squares iterative algorithm, i.e. $P_i = W_i H_i$.
Randomly initialize $W_i$ and $H_i$. The low-dimensional embedding dimension $r_i$ is preferably chosen equal to the number of sample classes. Non-negative matrix factorization yields the basis matrix $W_i$ and the coefficient matrix $H_i$, and the iteration rule is as follows:
[update formulas for $W_i$ and $H_i$ not preserved in this text]
where $W_i$ is the basis matrix obtained by non-negative matrix factorization of feature set $P_i$, $W_i^T$ is the transpose of the basis matrix $W_i$, $H_i$ is the coefficient matrix obtained by non-negative matrix factorization of feature set $P_i$, and $H_i^T$ is the transpose of the coefficient matrix $H_i$;
5) Visualize the basis matrix $W_i$ and the coefficient matrix $H_i$ as heat maps; the rows of $W_i$ correspond to samples and the columns of $H_i$ correspond to the original features;
6) Inspect how the samples cluster in the basis matrix $W_i$. If the two classes of samples are clearly separated in one of the columns of $W_i$, proceed to step 7); otherwise reselect the low-dimensional embedding dimension $r_i$ and return to step 4);
7) Select the classification feature $F_i$ using the salient-display principle of the heat map, adjusting the heat map threshold so that the number of classification features is 1;
8) Take the union of the classification features $F_i$ obtained from all binary classification problems to obtain the final classification feature set $F = \bigcup_i F_i$.
CN201810968454.6A 2018-08-23 2018-08-23 A visual feature selection method based on fault binary-classification non-negative matrix factorization Pending CN109034270A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810968454.6A CN109034270A (en) 2018-08-23 2018-08-23 A visual feature selection method based on fault binary-classification non-negative matrix factorization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810968454.6A CN109034270A (en) 2018-08-23 2018-08-23 A visual feature selection method based on fault binary-classification non-negative matrix factorization

Publications (1)

Publication Number Publication Date
CN109034270A true CN109034270A (en) 2018-12-18

Family

ID=64628235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810968454.6A Pending CN109034270A (en) 2018-08-23 2018-08-23 A visual feature selection method based on fault binary-classification non-negative matrix factorization

Country Status (1)

Country Link
CN (1) CN109034270A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110673578A (en) * 2019-09-29 2020-01-10 华北电力大学(保定) Fault degradation degree determination method and device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809475A (en) * 2015-05-06 2015-07-29 西安电子科技大学 Multi-labeled scene classification method based on incremental linear discriminant analysis
CN105354593A (en) * 2015-10-22 2016-02-24 南京大学 NMF (Non-negative Matrix Factorization)-based three-dimensional model classification method

Non-Patent Citations (11)

* Cited by examiner, † Cited by third party
Title
DIEGO GARCÍA等: ""Interactive visualization for NILM in large buildings using non-negative matrix factorization"", 《ENERGY & BUILDINGS》 *
GAO HUIZHONG等: ""Feature Extraction and Recognition for Rolling Element Bearing Fault Utilizing Short-Time Fourier Transform and Non-negative Matrix Factorization"", 《CHINESE JOURNAL OF MECHANICAL ENGINEERING》 *
LIANG LIN等: ""Feature selection for machine fault diagnosis using clustering of non-negation matrix factorization"", 《MEASUREMENT》 *
QUAN GU等: ""Bi-clustering of metabolic data using matrix factorization tools"", 《METHODS》 *
SAAD Y SAIT等: ""Multi-level anomaly detection: Relevance of big data analytics in networks"", 《SADHANA》 *
YUANMING CHEN等: ""Feature Extraction for Fault Diagnosis Utilizing Supervised Nonnegative Matrix Factorization Combined Statistical Model"", 《9TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2016)》 *
刘荣: ""带协变量的非负矩阵分解的社区发现模型"", 《中国优秀硕士学位论文全文数据库 社会科学Ⅱ辑》 *
唐曦凌等: ""结合连续小波变换和多约束非负矩阵分解的故障特征提取方法"", 《振动与冲击》 *
栗茂林等: ""基于聚类优化的非负矩阵分解方法及其应用"", 《中国机械工程》 *
梁霖等: ""基于非负矩阵分解的单通道故障特征分离方法"", 《振动、测试与诊断》 *
王艺舒等: ""非负矩阵算法在遗传相互作用数据中的应用"", 《生物数学学报》 *

Similar Documents

Publication Publication Date Title
CN107563385B (en) License plate character recognition method based on depth convolution production confrontation network
CN103632168B (en) Classifier integration method for machine learning
CN101661559A (en) Digital image training and detecting methods
CN105389480B (en) Multiclass imbalance genomics data iteration Ensemble feature selection method and system
CN103839078A (en) Hyperspectral image classifying method based on active learning
CN103942562A (en) Hyperspectral image classifying method based on multi-classifier combining
CN109063649A (en) Pedestrian's recognition methods again of residual error network is aligned based on twin pedestrian
CN104966075B (en) A kind of face identification method and system differentiating feature based on two dimension
CN113298184B (en) Sample extraction and expansion method and storage medium for small sample image recognition
CN101251896A (en) Object detecting system and method based on multiple classifiers
CN112926045A (en) Group control equipment identification method based on logistic regression model
CN112200263B (en) Self-organizing federal clustering method applied to power distribution internet of things
CN104200134A (en) Tumor gene expression data feature selection method based on locally linear embedding algorithm
CN109034270A (en) A visual feature selection method based on fault binary-classification non-negative matrix factorization
CN112529901A (en) Crack identification method in complex environment
CN108508319B (en) Transformer fault type identification method based on correlation characteristics among fault characteristic gases
CN101706876A (en) Hybrid subspace learning selective ensemble based method for detecting micro-calcification clusters
CN110443318A (en) A kind of deep neural network method based on principal component analysis and clustering
CN106251004A (en) The Target cluster dividing method divided based on room for improvement distance
CN114170634A (en) Gesture image feature extraction method based on DenseNet network improvement
da Silva HYBRID) SORT–A PATTERN-FOCUSED MATRIX REORDERING APPROACH BASED ON CLASSIFICATION
CN115063692B (en) Remote sensing image scene classification method based on active learning
CN113221973B (en) Interpretable air conditioning system deep neural network fault diagnosis method
CN101499133B (en) Handwriting identification method based on multi-categorizer integration
Pati et al. HVS inspired system for script identification in Indian multi-script documents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20181218
