CN104966105A - Robust machine error retrieving method and system - Google Patents

Publication number
CN104966105A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510408404.9A
Other languages
Chinese (zh)
Inventor
张召
江威明
张莉
李凡长
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Abstract

The invention discloses a robust machine error retrieval method and system. First, the training set data is pre-processed with a label estimation method: the labels of the unlabeled machine data are estimated and a projection classifier is initialized. Based on the class information of the training samples, label-consistent dictionary learning is carried out; the resulting discriminative sparse codes are used to construct the adaptive reconstruction weights in the label prediction model, and the class information of the unlabeled training data is updated by computing a new projection classifier. Multiple training iterations output a discriminative reconstructive dictionary, a sparse coding matrix and an optimal multi-class classifier. The trained classifier can be used for inductive class prediction on new data: the class of a test sample is determined by the position of the maximum probability value in its soft label, which completes the robust classification of machine error data. By proposing a semi-supervised label-consistent dictionary learning method, the supervised prior information is enriched and the precision of machine error retrieval is effectively improved.

Description

A robust machine error retrieval method and system
Technical field
The present invention relates to the technical fields of data mining and computer vision, and in particular to a robust machine error retrieval method and system.
Background
With the continuing development of computer technology and intelligent systems, machine error classification has become an important research topic in data mining. Machine error classification digitizes machine data by computer, then analyzes the data structure and extracts data features. It is of great significance in fields such as mechanical fault diagnosis, and once successfully developed and put into application it is expected to produce substantial social and economic benefits.
Most current research concentrates on fully supervised or unsupervised methods for extracting machine data features and classifying machine errors, and has achieved some results. However, machine data in the real world typically has only a small number of labeled samples, with most samples unlabeled. Most studies show that fully supervised methods classify data better than unsupervised methods, but obtaining labels for all data, as fully supervised methods require, is very costly. How to use the labels available in machine data to effectively improve classification accuracy is therefore a problem that needs further investigation.
In recent years, classical dictionary learning algorithms such as K-SVD and D-KSVD (Discriminative K-SVD) learn a reconstructive dictionary, obtain sparse codes of the training data set to characterize the data features, and compute a linear classifier with which data can be classified. However, when few training samples are selected, the data features are not characterized accurately, so classification accuracy is very low. To overcome this shortcoming, their extension LC-KSVD (Label Consistent K-SVD) was proposed. When the labels of all data samples are known, LC-KSVD learns a discriminative reconstructive dictionary while effectively preserving the relationship between each dictionary atom and the data labels, so that even with few training samples the sparse codes obtained from the learned dictionary characterize the data features as well as possible, and the computed linear classifier can accurately classify machine error data. LC-KSVD classifies data in a fully supervised manner, however, and obtaining labels for all data is very costly.
Therefore, providing a machine error classification method that obtains data labels conveniently and thus reduces cost is an urgent problem for those skilled in the art.
Summary of the invention
In view of this, the present invention provides a robust machine error retrieval method and system to overcome the high cost of obtaining data labels in the prior art.
To achieve the above object, the present invention provides the following technical solution:
A robust machine error retrieval method, including:
estimating, in a transductive manner using a label prediction method, the class labels of the unlabeled sample data in the training set, and generating a new training set in which all sample data are labeled;
performing label-consistent dictionary learning on the machine error data and label information in the new training set to obtain discriminative sparse codes; constructing adaptive weight coefficients from the discriminative sparse codes to obtain an adaptive reconstruction coefficient matrix; obtaining a projection classifier from the adaptive reconstruction coefficient matrix; and updating the class information of the unlabeled sample data in the training set using the projection classifier;
obtaining, through multiple training iterations, a discriminative reconstructive dictionary, the discriminative sparse codes of the machine error data, and an optimal projection classifier;
performing class prediction and retrieval classification of the machine error data under test using the optimal projection classifier: obtaining the soft class label of each sample in the test set, locating the position of the maximum probability value in the soft class label, and thereby determining the class of the test sample, which gives the robust classification of the machine error data.
Preferably, before estimating the class labels of the unlabeled sample data in the training set in a transductive manner using the label prediction method, the method further includes:
obtaining an original sample data set and dividing it into a training set and a test set. The training set contains labeled and unlabeled training samples, which together form the machine data vector set $X = [x_1, x_2, \ldots, x_{l+u}] \in \mathbb{R}^{n \times (l+u)}$, where n is the dimension of the machine data, l is the number of labeled training samples and u is the number of unlabeled training samples. It comprises the training sample set $X_L = [x_1, \ldots, x_l] \in \mathbb{R}^{n \times l}$ with c (c > 2) class labels and the training sample set $X_U = [x_{l+1}, \ldots, x_{l+u}] \in \mathbb{R}^{n \times u}$ without any labels, where each vector $x_i \in \mathbb{R}^n$ is one machine data sample and l + u = N. The test samples in the test set are all unlabeled.
Preferably, performing label-consistent dictionary learning on the machine error data and label information in the new training set includes solving the following problem, where D is the reconstructive dictionary to be learned, S is the discriminative sparse code of the training set, AS is the adaptive weight code, and P denotes a projection classifier:
s.t. $\|s_i\|_0 \le T_1$, $i \in \{1, 2, \ldots, N\}$
where $\|X - DS\|_F^2$ is the reconstruction error, $T_1$ is the sparsity constraint,
and $s_i$ is the sparse code of sample $x_i$. The remaining terms are defined as follows:
$\alpha\|Q - AS\|_F^2$ is the discriminative sparse-code error, where $Q$ is the discriminative sparse code of the training data set and $\alpha$ is the trade-off parameter of this term;
when training samples $x_i$ and $x_j$ belong to different classes, the entry of $Q$ corresponding to $x_i$ and $x_j$ is 0; otherwise it is $\cos(x_i, x_j)$;
the term weighted by $\beta$ is the accumulated neighborhood reconstruction error, where $\beta$ is the trade-off parameter of this term;
the remaining term represents the classification error. The position of the largest element of $f_i = P^T x_i$ gives the soft label of $x_i$, and $\mu_i$ is the adjusting parameter of $x_i$: when the label of $x_i$ in the training set is known, $\mu_i = 10^{10}$; otherwise $\mu_i = 0$.
Preferably, performing class prediction and retrieval classification of the machine error data under test using the optimal projection classifier, and obtaining the soft class label of each sample in the test set, includes:
given a test sample $x_{new}$, embedding it into the learned multi-class projection classifier by computing $f_{new} = P^T x_{new}$. The position of the largest element of the resulting vector gives the soft label of the sample under test $x_{new}$, and the hard label of each test sample is $\arg\max_{i \le c} (f_{new})_i$, where $(f_{new})_i$ denotes the i-th element of the predicted soft label vector $f_{new}$.
The present invention also provides a robust machine error retrieval system, including:
a training pre-processing module, configured to estimate, in a transductive manner using a label prediction method, the class labels of the unlabeled sample data in the training set, and to generate a new training set in which all sample data are labeled;
a training module, configured to perform label-consistent dictionary learning on the machine error data and label information in the new training set to obtain discriminative sparse codes, construct adaptive weight coefficients from the discriminative sparse codes to obtain an adaptive reconstruction coefficient matrix, obtain a projection classifier from the adaptive reconstruction coefficient matrix, and update the class information of the unlabeled sample data in the training set using the projection classifier;
an iteration module, configured to obtain, through multiple training iterations, a discriminative reconstructive dictionary, the discriminative sparse codes of the machine error data, and an optimal projection classifier;
a test module, configured to perform class prediction and retrieval classification of the machine error data under test using the optimal projection classifier: obtaining the soft class label of each sample in the test set, locating the position of the maximum probability value in the soft class label, and thereby determining the class of the test sample, which gives the robust classification of the machine error data.
With the robust machine error retrieval method and system provided by the present invention, the training set data is first pre-processed with a label estimation method: the labels of the unlabeled machine data are estimated and a projection classifier is initialized. Based on the class information of the training samples, label-consistent dictionary learning is carried out; the resulting discriminative sparse codes are used to construct the adaptive reconstruction weights in the label prediction model, and the class information of the unlabeled training data is updated by computing a new projection classifier. Multiple training iterations output a discriminative reconstructive dictionary, a sparse coding matrix and an optimal multi-class classifier. The trained classifier can be used for inductive class prediction on new data: the class of a test sample is determined by the position of the maximum probability value in its soft label, which completes the robust classification of machine error data. By proposing a semi-supervised label-consistent dictionary learning method, the supervised prior information is enriched and the precision of machine error retrieval is effectively improved.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for the description are briefly introduced below. The drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a machine error data classification method disclosed in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a machine error data classification system disclosed in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The invention discloses a robust machine error retrieval method and system. The training set data is first pre-processed with a label estimation method: the labels of the unlabeled machine data are estimated and an initial projection classifier is obtained. Based on the class label information of the training samples, label-consistent dictionary learning is carried out; the resulting discriminative sparse codes are used to construct the adaptive reconstruction weights in label estimation, which in turn update the class information of the unlabeled data in the training set and the projection classifier. Multiple training iterations output a discriminative reconstructive dictionary, a sparse coding matrix of the machine data and an optimal multi-class classifier. The trained multi-class classifier can be used for inductive class prediction on new data: the class of a test sample is determined by the position of the maximum probability value in its soft label, which completes the machine error classification. Proposing a semi-supervised label-consistent dictionary learning method increases the number of labeled samples and enriches the supervised prior information, and therefore effectively improves the precision of machine error retrieval.
The present invention was tested on three machine data sets: the Rolling bearing database, the Gearbox dataset and the Motor electrical dataset. The Rolling bearing database contains 4 machine data sets, of which 0HP and 2HP were chosen for testing. 0HP contains 400 samples in 10 classes with 40 samples per class; 2HP contains 800 samples in 10 classes with 80 samples per class. The Gearbox dataset contains 72 samples in 3 classes with 24 samples per class. The Motor electrical dataset contains 90 samples in 3 classes with 30 samples per class. These databases were collected from several sources, so the test results are broadly representative.
Referring to Fig. 1, which is a flow chart of a machine error retrieval method disclosed in an embodiment of the present invention, the method is implemented in the following steps:
Step S101: Estimate, in a transductive manner using a label prediction method, the class labels of the unlabeled sample data in the training set, and generate a new training set in which all sample data are labeled;
Using all training samples in the training set, an existing prediction method (the Laplacian discriminant analysis model) transductively estimates the class labels of the unlabeled sample data in the training set, generating a new training set in which all sample data are labeled;
The original sample data set is divided into a training set and a test set. The training set contains labeled and as yet unlabeled training samples, which together form the machine data vector set $X = [x_1, x_2, \ldots, x_{l+u}] \in \mathbb{R}^{n \times (l+u)}$ (where n is the dimension of the machine data, l is the number of labeled training samples and u is the number of unlabeled training samples). It comprises the training sample set $X_L = [x_1, \ldots, x_l] \in \mathbb{R}^{n \times l}$ with c (c > 2) class labels and the training sample set $X_U = [x_{l+1}, \ldots, x_{l+u}] \in \mathbb{R}^{n \times u}$ without any labels (where each vector $x_i \in \mathbb{R}^n$ is one machine data sample), with l + u = N. The test set contains the test samples, all of which are unlabeled.
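The partition of the training set into labeled and unlabeled samples can be sketched as follows. This is a minimal illustration rather than the procedure used in the patent: the function name `split_training_set`, the per-class count `n_labeled_per_class` and the random seed are hypothetical, and samples are assumed to be stored as the columns of an n x N array.

```python
import numpy as np

def split_training_set(X, y, n_labeled_per_class, seed=0):
    """Split training data into a labeled part (X_L, y_L) and an unlabeled part X_U.

    X: (n, N) array whose columns are machine data samples.
    y: (N,) integer class ids; the ids of the unlabeled part are simply withheld.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros(y.shape[0], dtype=bool)
    for cls in np.unique(y):
        idx = np.flatnonzero(y == cls)
        # Hypothetical choice: keep the same number of labeled samples per class.
        chosen = rng.choice(idx, size=n_labeled_per_class, replace=False)
        mask[chosen] = True
    return X[:, mask], y[mask], X[:, ~mask]
```

With l labeled and u unlabeled columns returned, l + u equals the number N of training samples, matching the notation above.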
Based on the labeled samples of the training set and the intrinsic geometric structure between the labeled and unlabeled sample data, label prediction is carried out using the Laplacian discriminant analysis method. The specific model is:
where $\lambda_m$ is the parameter balancing the terms and $E_m$ is the weight matrix that estimates the similarity between samples, which can be defined with a Gaussian function. $L_m = Z_m - E_m$ is the Laplacian matrix, $Z_m$ is a diagonal matrix, and the model uses the pseudo-inverse of a matrix $M$. It can be defined as:
where $l_j$ denotes the number of samples in class j.
P is the initial projection matrix, and the soft label of sample $x_i$ can be obtained as $f_i = P^T x_i$.
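A Gaussian similarity matrix and the associated Laplacian can be sketched as below. Several details are assumptions, because the defining formulas are not legible in this text: the bandwidth `sigma` is hypothetical, self-similarities are set to zero, and the diagonal matrix Z is taken to hold the row sums of E, which is the usual graph Laplacian convention.

```python
import numpy as np

def gaussian_laplacian(X, sigma=1.0):
    """Return (E, Z, L) for samples stored as the columns of X.

    E: Gaussian similarity weights, Z: diagonal degree matrix, L = Z - E.
    """
    sq_norms = np.sum(X ** 2, axis=0)
    # Pairwise squared Euclidean distances between columns.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    E = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(E, 0.0)              # assumption: no self-loops
    Z = np.diag(E.sum(axis=1))            # assumption: Z_ii = sum_j E_ij
    return E, Z, Z - E
```

Under these assumptions every row of L sums to zero, which is a quick sanity check on the construction.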
Step S102: Perform label-consistent dictionary learning on the machine error data and label information in the new training set to obtain discriminative sparse codes; construct adaptive weight coefficients from the discriminative sparse codes to obtain an adaptive reconstruction coefficient matrix; obtain a projection classifier from the adaptive reconstruction coefficient matrix; and update the class information of the unlabeled sample data in the training set using the projection classifier;
Step S103: Obtain, through multiple training iterations, a discriminative reconstructive dictionary, the discriminative sparse codes of the machine error data, and an optimal projection classifier;
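Label-consistent dictionary learning is performed on the machine data. The problem is:
s.t. $\|s_i\|_0 \le T_1$, $i \in \{1, 2, \ldots, N\}$
where $\|X - DS\|_F^2$ is the reconstruction error, $D$ is the learned dictionary, $T_1$ is the sparsity constraint, $S$ is the sparse coding of the training data set, and $s_i$ is the sparse code of sample $x_i$. The remaining terms can be defined as follows:
$\alpha\|Q - AS\|_F^2$ is the discriminative sparse-code error, where $Q$ is the discriminative sparse code of the training data set and $\alpha$ is the trade-off parameter of this term. When training samples $x_i$ and $x_j$ belong to different classes, the entry of $Q$ corresponding to these two samples is 0; otherwise it is $\cos(x_i, x_j)$. For example, suppose $X = [x_1, \ldots, x_6]$ contains data of 3 classes, where $x_1, x_2$ belong to class 1, $x_3, x_4$ belong to class 2 and $x_5, x_6$ belong to class 3. Then $Q$ may be defined as:
\[
Q = \begin{pmatrix}
\cos(x_1,x_1) & \cos(x_1,x_2) & 0 & 0 & 0 & 0 \\
\cos(x_2,x_1) & \cos(x_2,x_2) & 0 & 0 & 0 & 0 \\
0 & 0 & \cos(x_3,x_3) & \cos(x_3,x_4) & 0 & 0 \\
0 & 0 & \cos(x_4,x_3) & \cos(x_4,x_4) & 0 & 0 \\
0 & 0 & 0 & 0 & \cos(x_5,x_5) & \cos(x_5,x_6) \\
0 & 0 & 0 & 0 & \cos(x_6,x_5) & \cos(x_6,x_6)
\end{pmatrix}
\]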
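The rule for filling Q (cosine similarity for same-class pairs, zero otherwise) can be sketched in a few lines. Treating Q as an N x N matrix indexed by sample pairs follows the wording of the text; the function name `build_Q` and the small constant that guards against zero-norm columns are hypothetical.

```python
import numpy as np

def build_Q(X, labels):
    """Q[i, j] = cos(x_i, x_j) if x_i and x_j share a class, else 0.

    X: (n, N) array whose columns are samples; labels: (N,) class ids.
    """
    labels = np.asarray(labels)
    norms = np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    Xn = X / norms                       # unit-length columns
    cos = Xn.T @ Xn                      # all pairwise cosines
    same_class = labels[:, None] == labels[None, :]
    return np.where(same_class, cos, 0.0)
```

For six samples ordered as two per class, the result is block diagonal with three 2 x 2 blocks, as in the example described in the text.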
The term weighted by $\beta$ is the accumulated neighborhood reconstruction error, where $\beta$ is the trade-off parameter of this term. The remaining term represents the classification error, where $P$ denotes a multi-class projection classifier. The position of the largest element of $f_i = P^T x_i$ gives the soft label of $x_i$, and $\mu_i$ is the adjusting parameter of $x_i$: when the label of $x_i$ in the training set is known, $\mu_i = 10^{10}$; otherwise $\mu_i = 0$.
Based on the matrix expressions given above, the problem can be rewritten as:
s.t. $\|s_i\|_0 \le T_1$, $i \in \{1, 2, \ldots, N\}$
where $U$ is a diagonal matrix with $U_{ii} = \mu_i$.
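The diagonal matrix of adjusting parameters can be built directly from a mask of which training samples have known labels. This is only a sketch: the function name `build_U` and the boolean-mask input are hypothetical, while the values 10^10 and 0 are the ones stated in the text.

```python
import numpy as np

def build_U(is_labeled, mu_known=1e10):
    """Diagonal matrix with mu_known for samples whose label is known, 0 otherwise."""
    mu = np.where(np.asarray(is_labeled, dtype=bool), mu_known, 0.0)
    return np.diag(mu)
```

The very large weight on labeled samples effectively pins their predicted labels to the known ones, while unlabeled samples contribute nothing to the classification error term.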
Because the model contains several main variables (D, S, A, P) that influence one another, it cannot be solved directly. An iterative strategy is therefore used to seek the optimal solution, specifically:
After the projection classifier P is initialized in step S101, removing the terms independent of D, A and S gives the following objective function:
s.t. $\|s_i\|_0 \le T_1$, $i \in \{1, 2, \ldots, N\}$
where the data matrix is the training data set after sorting, and its i-th block contains all training data belonging to class i. For computation, the problem can be converted into the following problem:
s.t. $\|s_i\|_0 \le T_1$, $i \in \{1, 2, \ldots, N\}$
With the assumed substitution, the original problem is converted into:
This problem reduces to a K-SVD problem, and the K-SVD algorithm can efficiently find the optimal solution: the atom $d_k$ and its corresponding coefficients (the k-th row of S) are updated simultaneously. After the zero entries are discarded, the coefficient row and $E_k$ are replaced by their restricted versions, and $d_k$ and the restricted coefficient row can be obtained as follows:
where the factorization is obtained using the SVD.
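The simultaneous update of one atom and its coefficient row is the standard K-SVD step, sketched below for a plain reconstruction problem. In the patent the data and dictionary passed to this step would be the converted (stacked) versions, whose exact form is not legible here, so `X` and `D` are generic placeholders and the function name is hypothetical.

```python
import numpy as np

def ksvd_update_atom(X, D, S, k):
    """Standard K-SVD update of atom k of D and row k of S (modified in place).

    X: (n, N) data, D: (n, K) dictionary, S: (K, N) sparse codes.
    """
    omega = np.flatnonzero(S[k, :])      # samples that actually use atom k
    if omega.size == 0:
        return D, S                      # unused atom: nothing to update
    # Residual without atom k, restricted to the samples in omega.
    E_k = X[:, omega] - D @ S[:, omega] + np.outer(D[:, k], S[k, omega])
    U, sing, Vt = np.linalg.svd(E_k, full_matrices=False)
    D[:, k] = U[:, 0]                    # new unit-norm atom
    S[k, omega] = sing[0] * Vt[0, :]     # new coefficients on the same support
    return D, S
```

Because the rank-one SVD approximation is optimal for the restricted residual and the support of row k is unchanged, the reconstruction error cannot increase and the sparsity of S is preserved.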
After A and S are obtained, AS is used as the weight matrix that estimates the similarity between samples in the label estimation method, specifically:
Removing the terms independent of P gives the following objective function:
Solving gives:
After this projection matrix is obtained, the soft label of sample $x_i$ can be updated as $f_i = P^T x_i$.
When the change between successive iterations falls below the threshold (where $\delta = 10^{-6}$), the iteration terminates.
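A termination test with the threshold δ = 10^-6 might look like the sketch below. The quantity compared with δ is not legible in this text, so the squared Frobenius norm of the change in P between successive iterations is an assumption, as is the function name `has_converged`.

```python
import numpy as np

def has_converged(P_new, P_old, delta=1e-6):
    """Assumed stopping rule: squared Frobenius change of P is at most delta."""
    change = float(np.linalg.norm(P_new - P_old, "fro") ** 2)
    return change <= delta
```

Any other monotone measure of change (for example the change in objective value) could be substituted without altering the structure of the iteration.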
The specific algorithm is as follows:
A machine error retrieval algorithm
Input: raw data matrix $X$, control parameters $\alpha$, $\beta$, $U$, sparsity constraint $T_1$, dictionary size $K$, and $Y$
Output: D, A, S, P, Q
1) Transductively predict the class information of the unlabeled samples in the training set using the Laplacian discriminant analysis method, completing the initialization;
2) Compute $P^{(0)}, Q^{(0)}, D^{(0)}, A^{(0)}, S^{(0)}$:
Compute $D^{(0)}$ with the LC-KSVD dictionary training method, preserving the correlation between each sample class and the dictionary atoms;
Update the raw data matrix
Compute the sparse codes $S^{(0)}$ with the OMP algorithm
Initialize $Q^{(0)}$ with the method used to define Q
Initialize $A^{(0)}$ with the method LC-KSVD uses to initialize A
Initialize $P^{(0)}$ with the method for computing the projection matrix described above
3) Compute D, A, S
For t = 0 : (number of K-SVD iterations - 1)
Initialize
Update $D_{new}^{(t+1)}$ and $S_{new}^{(t+1)}$ with the K-SVD algorithm by solving the following problem
Obtain $A^{(t+1)}$ and $D^{(t+1)}$ from $D_{new}^{(t+1)}$
Update the projection classifier $P^{(t+1)}$ by solving the following problem
Update the soft labels of the unlabeled data $x_i$ by solving the following problem: $\arg\max_{i \le c+1} f_i$, $f_i = P^{(t+1)T} x_i$
If the termination condition holds, stop iterating; otherwise continue iterative training with t = t + 1.
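The sparse coding step in the initialization uses OMP (orthogonal matching pursuit). A compact NumPy version of the textbook algorithm is sketched below for a single sample; it is not taken from the patent, and the function name `omp` and the numerical tolerance are hypothetical.

```python
import numpy as np

def omp(D, x, T1):
    """Greedy sparse code of x over dictionary D with at most T1 nonzeros."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(0)
    s = np.zeros(D.shape[1])
    for _ in range(T1):
        corr = np.abs(D.T @ residual)
        corr[support] = 0.0               # never pick the same atom twice
        j = int(np.argmax(corr))
        if corr[j] < 1e-12:
            break                         # residual already explained
        support.append(j)
        # Least-squares refit of x on all selected atoms.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    if support:
        s[support] = coef
    return s
```

Applying `omp` to every column of the data matrix with the same `T1` yields a code matrix whose columns each satisfy the sparsity constraint of the learning problem.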
Step S104: Perform class prediction and retrieval classification of the machine error data under test using the optimal projection classifier: obtain the soft class label of each sample in the test set, locate the position of the maximum probability value in the soft class label, and thereby determine the class of the test sample, which gives the robust classification of the machine error data.
Given a test sample $x_{new}$, it is embedded into the learned multi-class projection classifier by computing $f_{new} = P^T x_{new}$. The position of the largest element of the resulting vector gives the soft label of the sample under test $x_{new}$, and the hard label of each test sample is $\arg\max_{i \le c} (f_{new})_i$, where $(f_{new})_i$ denotes the i-th element of the predicted soft label vector $f_{new}$.
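The prediction step amounts to one matrix-vector product followed by an argmax. A minimal sketch, assuming P is stored as an n x c array and classes are indexed from zero (the function name `predict` is hypothetical):

```python
import numpy as np

def predict(P, x_new):
    """Return the soft label vector f_new = P^T x_new and the hard label index."""
    f_new = P.T @ x_new
    hard_label = int(np.argmax(f_new))   # position of the largest element
    return f_new, hard_label
```

Because only the position of the maximum matters, the soft label vector does not need to be normalized before the hard label is taken.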
The invention discloses a machine error retrieval method and system. A label estimation method first performs transductive classification of the machine data, quickly estimating the labels of the unlabeled machine data and computing an initial projection matrix. Based on the machine data samples in the training set and their label information, label-consistent dictionary learning is carried out; the resulting discriminative sparse codes serve as the adaptive reconstruction weights in the label estimation step, which updates the class information of the unlabeled data in the training set and the projection matrix. Multiple training iterations yield a discriminative reconstructive dictionary, the sparse codes of the machine data and an optimal multi-class classifier. The machine data to be classified is then fed to the computed multi-class classifier for prediction, the class of the sample under test is determined, and the error classification of the machine data is achieved.
The method is described in detail in the embodiment disclosed above. The method of the present invention can be implemented by systems of various forms, so the invention also discloses a system, which is described in detail in the specific embodiment below.
Referring to Fig. 2, which is a schematic structural diagram of a robust machine error retrieval system disclosed in an embodiment of the present invention, the system specifically includes:
A training pre-processing module 101, configured to estimate, in a transductive manner using a label prediction method, the class labels of the unlabeled sample data in the training set, and to generate a new training set in which all sample data are labeled;
A training module 102, configured to perform label-consistent dictionary learning on the machine error data and label information in the new training set to obtain discriminative sparse codes, construct adaptive weight coefficients from the discriminative sparse codes to obtain an adaptive reconstruction coefficient matrix, obtain a projection classifier from the adaptive reconstruction coefficient matrix, and update the class information of the unlabeled sample data in the training set using the projection classifier;
An iteration module 103, configured to obtain, through multiple training iterations, a discriminative reconstructive dictionary, the discriminative sparse codes of the machine error data, and an optimal projection classifier;
A test module 104, configured to perform class prediction and retrieval classification of the machine error data under test using the optimal projection classifier: obtaining the soft class label of each sample in the test set, locating the position of the maximum probability value in the soft class label, and thereby determining the class of the test sample, which gives the robust classification of the machine error data.
The training pre-processing module 101 performs transductive classification of all training samples in the training set using the label estimation method, estimates the class labels of all unlabeled data, and outputs an initial linear projection classifier;
The original sample data set is divided into a training set and a test set. The training set contains labeled and as yet unlabeled training samples, which together form the machine data vector set $X = [x_1, x_2, \ldots, x_{l+u}] \in \mathbb{R}^{n \times (l+u)}$ (where n is the dimension of the machine data, l is the number of labeled training samples and u is the number of unlabeled training samples). It comprises the training sample set $X_L = [x_1, \ldots, x_l] \in \mathbb{R}^{n \times l}$ with c (c > 2) class labels and the training sample set $X_U = [x_{l+1}, \ldots, x_{l+u}] \in \mathbb{R}^{n \times u}$ without any labels (where each vector $x_i \in \mathbb{R}^n$ is one machine data sample), with l + u = N. The test set contains the test samples, all of which are unlabeled.
The labels are computed from the labeled samples of the training set with the label estimation method, specifically:
where $\lambda_m$ is the parameter balancing the terms and $E_m$ is the weight matrix that estimates the similarity between samples, which can be defined with a Gaussian function. $L_m = Z_m - E_m$ is the Laplacian matrix, $Z_m$ is a diagonal matrix, and the model uses the pseudo-inverse of a matrix $M$. It can be defined as:
where $l_j$ denotes the number of samples in class j.
The training module 102 performs label-consistent dictionary learning based on the machine error data and label information in the new training set to obtain discriminative sparse codes, which are used to construct the adaptive weight coefficients in the label estimation step. A projection classifier is obtained by updating with the adaptive reconstruction coefficient matrix, and the class information of the unlabeled samples in the training set is updated.
Enter the study of row label consistent phonogram allusion quotation to machine data, described problem is:
Subj||si||0≤T1, i ∈ j | j=1,2 ..., N }
Wherein,It is reconstructed error,To learn obtained dictionary, T1It is sparse constraint,It is the sparse coding of training dataset, siIt can be defined as follows:
To differentiate sparse coding error, whereinFor the differentiation sparse coding of training dataset, α is the balance parameter of this.As training sample xiAnd xjWhen belonging to a different category, the two training samples corresponding item in Q is 0, conversely, being cos (xi,xj) for example,Include the data of 3 classifications, x1,x2Belong to classification 1, x3,x4Belong to classification 2, x5,x6Belong to classification 3, Q may be defined as:
It is the neighborhood reconstructed error of accumulation, β is the balance parameter of this.Presentation class error,A multiclass projection grader is represented,The corresponding positional representation x of greatest memberiSoft label, μiRepresent xiAdjusting parameter, as x in training setiLabel known to when, corresponding μi=1010, otherwise μi=0.
Based on the matrix expression having pointed out, above mentioned problem can be rewritten as:
Subj||si||0≤T1, i ∈ j | j=1,2 ..., N }
WhereinIt is a diagonal matrix, Uiii。
Due in the model, comprising multiple primary variables (D, S, A, P), and each variable influences each other, therefore can not directly solve.Optimal solution strategy is sought solving the problem and needing to use iteration, is specially:
After the projection matrix P initialized, by removing independently of D, A, S items can obtain following object function:
Subj||si||0≤T1, i ∈ j | j=1,2 ..., N }
WhereinIt is the training dataset after sequence,Expression belongs to classification i all training datas.During calculating, the problem can be converted into following problem:
Subj||si||0≤T1, i ∈ j | j=1,2 ..., N }
It is assumed thatFormer problem is converted into:
The problem can be attributed to KSVD problems, effectively can find optimal solution, i.e. d using KSVD algorithmskWith its corresponding coefficient(S line ks) updates in synchronization,And EkIn give up being expressed as after 0WithD can be obtained by the following methodkWith
Wherein,Decompose and obtain using SVD
After obtaining A, S, AS is used as estimating in label method of estimation the weight matrix of similitude between each sample, is specially:
The items independently of P are removed, following object function is can obtain:
Solution can be obtained:
Once this projection matrix is obtained, the soft label of sample $x_i$ can be updated through $f_i = P^T x_i$.
This iterative process terminates once the convergence criterion with tolerance $\delta = 10^{-6}$ is satisfied.
The specific algorithm is as follows:
Algorithm: machine error retrieval
Input: raw data matrix $X$; control parameters $\alpha$, $\beta$, $U$; sparsity constraint $T_1$; dictionary size $K$; and $Y$
Output: $D$, $A$, $S$, $P$, $Q$
1) Transductively predict the class information of the unlabeled samples in the training set with the Laplacian discriminant analysis method, completing the initialization.
2) Compute $P^{(0)}$, $Q^{(0)}$, $D^{(0)}$, $A^{(0)}$, $S^{(0)}$:
   Compute $D^{(0)}$ with the LC-KSVD dictionary training method, preserving the correspondence between each sample class and the dictionary items.
   Update the raw data matrix.
   Compute the sparse code $S^{(0)}$ with the OMP algorithm.
   Initialize $Q^{(0)}$ according to the definition of $Q$.
   Initialize $A^{(0)}$ with the method LC-KSVD uses to initialize $A$.
   Initialize $P^{(0)}$ with the projection matrix computation described above.
3) Compute $D$, $A$, $S$:
   For $t = 0$ to (number of K-SVD iterations) $- 1$:
      Perform the initialization for this iteration.
      Update $D_{new}^{(t+1)}$ and $S_{new}^{(t+1)}$ with the K-SVD algorithm by solving the corresponding subproblem described above.
      Obtain $A^{(t+1)}$ and $D^{(t+1)}$ from $D_{new}^{(t+1)}$.
      Update the projection classifier $P^{(t+1)}$ by solving the corresponding subproblem described above.
      Update the soft labels of the unlabeled data $x_i$: $\arg\max_{i \le c+1} f_i$, with $f_i = P^{(t+1)T} x_i$.
      If the stopping criterion is met, stop iterating; otherwise continue training with $t = t + 1$.
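The sparse coding step of the algorithm above relies on OMP (orthogonal matching pursuit) under the constraint $\|s\|_0 \le T_1$. The sketch below is a minimal textbook OMP for a single sample, offered as an illustration only; the function name `omp`, the early-exit tolerance, and the identity-dictionary toy data are assumptions.

```python
import numpy as np

def omp(D, x, T1):
    """Greedy sparse code s with at most T1 nonzeros such that D @ s approximates x."""
    residual = x.astype(float).copy()
    support = []                              # indices of selected atoms
    coef = np.zeros(0)
    for _ in range(T1):
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = -np.inf   # never select an atom twice
        support.append(int(np.argmax(correlations)))
        # Least-squares refit of x on all atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-12:
            break                             # x is already reconstructed
    s = np.zeros(D.shape[1])
    if support:
        s[support] = coef
    return s

D = np.eye(4)                                 # toy dictionary with orthonormal atoms
x = np.array([0.0, 3.0, 0.0, -2.0])
s2 = omp(D, x, T1=2)
s1 = omp(D, x, T1=1)
```

With this orthonormal toy dictionary, `T1=2` recovers `x` exactly and `T1=1` keeps only the largest-magnitude coefficient.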
The test module 104 feeds the machine data under test into the linear projection classifier for prediction, obtains the soft labels of the data under test, and determines their classes, thereby realizing the error classification of the machine data.
The process of inputting a machine data sample under test into the multi-class classifier for classification is as follows:
Given a test sample $x_{new}$, it is embedded into the computed multi-class projection classifier through $P^T x_{new}$. The position of the largest element of the resulting vector is the soft label of the test sample $x_{new}$, and the hard label of each test sample can be written as $\arg\max_{i \le c} (f_{new})_i$, where $(f_{new})_i$ denotes the $i$-th element of the predicted soft label vector $f_{new}$.
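The prediction rule above amounts to a matrix product followed by an argmax. A minimal sketch, assuming samples are stored as columns and classes are indexed from 0; the function name `predict_labels` and the toy values of `P` and `X_new` are hypothetical.

```python
import numpy as np

def predict_labels(P, X_new, c):
    """Return soft labels F = P^T X_new and hard labels from the first c entries."""
    F = P.T @ X_new                           # one soft label vector per column
    hard = np.argmax(F[:c, :], axis=0)        # 0-based class index, restricted to i <= c
    return F, hard

# Toy classifier with n = 2 features and three output entries; only the
# first c = 2 entries are used for the hard label, following i <= c.
P = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5]])
X_new = np.array([[2.0, 0.0],
                  [1.0, 3.0]])
F, hard = predict_labels(P, X_new, c=2)
```

Here the first test sample is assigned class index 0 and the second class index 1.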
Refer to Table 1, which compares the recognition results of the method of the present invention with those of SRC (Sparse Representation-based Classification), D-KSVD (Discriminative K-SVD), LC-KSVD1, LC-KSVD2 (Label Consistent K-SVD), and Lap-LDA, giving the mean and the highest recognition rate of each method over the experiments. In this example, the D-KSVD and LC-KSVD methods used for comparison (with the default parameters of the algorithms in their respective papers) use their own sparse codes for feature extraction of the machine data and classify with a standard linear classifier. For the Rolling bearing dataset and the Gearbox dataset, four training samples per class are randomly selected from each dataset, two as labeled data and two as unlabeled data, and the remaining samples are used as the test set. For the Motor electrical dataset, six training samples per class are randomly selected, three as labeled data and three as unlabeled data, and the remaining samples are used as the test set.
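The sampling protocol described above (a fixed number of labeled and unlabeled training samples per class, with the rest used for testing) can be sketched as follows. The function name `split_per_class`, the seed, and the toy label vector are assumptions made for illustration; the datasets themselves are not reproduced here.

```python
import numpy as np

def split_per_class(labels, n_labeled, n_unlabeled, seed=0):
    """Return index arrays (labeled, unlabeled, test) drawn separately per class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    labeled, unlabeled, test = [], [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        labeled.extend(idx[:n_labeled])
        unlabeled.extend(idx[n_labeled:n_labeled + n_unlabeled])
        test.extend(idx[n_labeled + n_unlabeled:])   # everything left over
    return np.array(labeled), np.array(unlabeled), np.array(test)

labels = np.repeat([0, 1, 2], 10)             # toy data: 3 classes, 10 samples each
lab, unl, tst = split_per_class(labels, n_labeled=2, n_unlabeled=2)
```

Setting `n_labeled=3, n_unlabeled=3` gives the six-per-class variant described for the Motor electrical dataset.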
Table 1. Comparison of recognition results (%) of the present invention and the SRC, D-KSVD, LC-KSVD1, and LC-KSVD2 methods
In summary, the invention discloses a robust machine error retrieval method and system. The training set data is first preprocessed with a label estimation method, which estimates the labels of the unlabeled machine data and initializes a projection classifier. Based on the class information of the training samples, label consistent dictionary learning is performed; the resulting discriminative sparse codes are used to construct the adaptive reconstruction weights in the label prediction model; and the class information of the unlabeled training data is updated by computing a new projection classifier. Through multiple training iterations, the method outputs a discriminative reconstructive dictionary, a sparse coding matrix, and an optimal multi-class classifier. The trained classifier can be used for induction and class prediction on new data: the class of a test sample is determined by the position of the maximum probability value in its soft label, which completes the robust classification of machine error data. By proposing a semi-supervised label consistent dictionary learning method, the supervised prior information is enriched and the accuracy of machine error retrieval is effectively improved.
Since the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. A robust machine error retrieval method, characterized by comprising:
estimating, in a transductive manner using a label prediction method, the class labels of the unlabeled sample data in a training set, and generating a new training set in which all sample data are labeled;
performing label consistent dictionary learning according to the machine error data and their label information in the new training set to obtain discriminative sparse codes; constructing adaptive weight coefficients from the discriminative sparse codes to obtain an adaptive reconstruction coefficient matrix; obtaining a projection classifier according to the adaptive reconstruction coefficient matrix; and updating the class information of the unlabeled sample data in the training set using the projection classifier;
obtaining, through multiple training iterations, a discriminative reconstructive dictionary, the discriminative sparse codes of the machine error data, and an optimal projection classifier;
performing class prediction and retrieval classification on the machine error data under test using the optimal projection classifier to obtain the soft class labels of the data under test in the test set; finding the position of the maximum probability value in each soft class label to determine the class of the test sample; and obtaining the robust classification of the machine error data.
2. The method according to claim 1, characterized in that, before estimating in a transductive manner using the label prediction method the class labels of the unlabeled sample data in the training set, the method further comprises:
obtaining an original sample data set and dividing the sample data set into a training set and a test set, wherein the training set includes labeled training samples and unlabeled training samples, the machine data vector set of the labeled and unlabeled training samples is $X = [x_1, x_2, \dots, x_{l+u}] \in \mathbb{R}^{n \times (l+u)}$, $n$ is the dimension of the machine data, $l$ is the number of labeled training samples, and $u$ is the number of unlabeled training samples; the vector set includes a training sample set with $c$ ($c > 2$) class labels and a training sample set without any label, wherein any vector $x_i \in \mathbb{R}^n$ is a machine data sample and $l + u = N$; and all test samples in the test set are unlabeled.
3. The method according to claim 2, characterized in that performing label consistent dictionary learning according to the machine error data and their label information in the new training set comprises: letting $D$ be the reconstructive dictionary to be learned, $S$ the discriminative sparse codes of the training set, $AS$ the adaptive weight codes, and $P$ a projection classifier;
$$\langle D, S, A, P \rangle = \arg\min_{D,S,A,P} \|X - DS\|_F^2 + \alpha \|Q - AS\|_F^2 + \beta \sum_{i=1}^{l+u} \Big\| P^T x_i - P^T \sum_{j: x_j \in N(x_i)} (AS)_{i,j} x_j \Big\|_2^2 + \sum_{i=1}^{l+u} \mu_i \| P^T x_i - y_i \|_2^2$$
$$\text{subject to } \|s_i\|_0 \le T_1, \quad i \in \{1, 2, \dots, N\},$$
where $\|X - DS\|_F^2$ is the reconstruction error, $T_1$ is the sparsity constraint, and $s_i$ is defined as follows:
$$s_i = s^*(x_i, D) \equiv \arg\min_{s} \|x_i - Ds\|_2^2 \quad \text{s.t. } \|s\|_0 \le T_1$$
$\|Q - AS\|_F^2$ is the discriminative sparse coding error, where $Q$ is the discriminative sparse code of the training dataset and $\alpha$ is the trade-off parameter of this term;
when training samples $x_i$ and $x_j$ belong to different classes, the entry of $Q$ corresponding to $x_i$ and $x_j$ is 0; otherwise the entry of $Q$ corresponding to $x_i$ and $x_j$ is $\cos(x_i, x_j)$;
$\sum_{i=1}^{l+u} \big\| P^T x_i - P^T \sum_{j: x_j \in N(x_i)} (AS)_{i,j} x_j \big\|_2^2$ is the accumulated neighborhood reconstruction error, and $\beta$ is the trade-off parameter of this term;
$\sum_{i=1}^{l+u} \mu_i \| P^T x_i - y_i \|_2^2$ is the classification error, the position of the largest element of $P^T x_i$ gives the soft label of $x_i$, and $\mu_i$ is the adjusting parameter of $x_i$: when the label of $x_i$ in the training set is known, $\mu_i = 10^{10}$; otherwise $\mu_i = 0$.
4. The method according to claim 3, characterized in that performing class prediction and retrieval classification on the machine error data under test using the optimal projection classifier to obtain the soft class labels of the data under test in the test set comprises:
given a test sample $x_{new}$, embedding it into the computed multi-class projection classifier through $P^T x_{new}$, wherein the position of the largest element of the resulting vector is the soft label of the test sample $x_{new}$, and the hard label of each test sample can be written as $\arg\max_{i \le c} (f_{new})_i$, where $(f_{new})_i$ denotes the $i$-th element of the predicted soft label vector $f_{new}$.
5. A robust machine error retrieval system, characterized by comprising:
a training preprocessing module, configured to estimate, in a transductive manner using a label prediction method, the class labels of the unlabeled sample data in a training set, and to generate a new training set in which all sample data are labeled;
a training module, configured to perform label consistent dictionary learning according to the machine error data and their label information in the new training set to obtain discriminative sparse codes, construct adaptive weight coefficients from the discriminative sparse codes to obtain an adaptive reconstruction coefficient matrix, obtain a projection classifier according to the adaptive reconstruction coefficient matrix, and update the class information of the unlabeled sample data in the training set using the projection classifier;
an iteration module, configured to obtain, through multiple training iterations, a discriminative reconstructive dictionary, the discriminative sparse codes of the machine error data, and an optimal projection classifier;
a test module, configured to perform class prediction and retrieval classification on the machine error data under test using the optimal projection classifier, obtain the soft class labels of the data under test in the test set, find the position of the maximum probability value in each soft class label to determine the class of the test sample, and obtain the robust classification of the machine error data.
CN201510408404.9A 2015-07-13 2015-07-13 Robust machine error retrieving method and system Pending CN104966105A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510408404.9A CN104966105A (en) 2015-07-13 2015-07-13 Robust machine error retrieving method and system

Publications (1)

Publication Number Publication Date
CN104966105A true CN104966105A (en) 2015-10-07

Family

ID=54220140

Country Status (1)

Country Link
CN (1) CN104966105A (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608471A (en) * 2015-12-28 2016-05-25 苏州大学 Robust transductive label estimation and data classification method and system
CN110249341A (en) * 2017-02-03 2019-09-17 皇家飞利浦有限公司 Classifier training
CN109299036B (en) * 2017-07-25 2021-01-05 北京嘀嘀无限科技发展有限公司 Label generation method, device, server and computer readable storage medium
CN109299036A (en) * 2017-07-25 2019-02-01 北京嘀嘀无限科技发展有限公司 Label generating method, device, server and computer readable storage medium
CN108038056A (en) * 2017-12-07 2018-05-15 厦门理工学院 A kind of software defect detecting system based on asymmetric classification assessment
CN108038056B (en) * 2017-12-07 2020-07-03 厦门理工学院 Software defect detection system based on asymmetric classification evaluation
CN110580488A (en) * 2018-06-08 2019-12-17 中南大学 Multi-working-condition industrial monitoring method, device, equipment and medium based on dictionary learning
CN110580488B (en) * 2018-06-08 2022-04-01 中南大学 Multi-working-condition industrial monitoring method, device, equipment and medium based on dictionary learning
CN110796153A (en) * 2018-08-01 2020-02-14 阿里巴巴集团控股有限公司 Training sample processing method and device
CN110796153B (en) * 2018-08-01 2023-06-20 阿里巴巴集团控股有限公司 Training sample processing method and device
CN112868032A (en) * 2018-10-15 2021-05-28 华为技术有限公司 Improving AI recognition learning ability
CN111461345B (en) * 2020-03-31 2023-08-11 北京百度网讯科技有限公司 Deep learning model training method and device
CN111461345A (en) * 2020-03-31 2020-07-28 北京百度网讯科技有限公司 Deep learning model training method and device
CN111832627B (en) * 2020-06-19 2022-08-05 华中科技大学 Image classification model training method, classification method and system for suppressing label noise
CN111832627A (en) * 2020-06-19 2020-10-27 华中科技大学 Image classification model training method, classification method and system for suppressing label noise
CN111931601B (en) * 2020-07-22 2023-10-20 上海交通大学 System and method for correcting error type label of gearbox
CN111931601A (en) * 2020-07-22 2020-11-13 上海交通大学 System and method for correcting error class label of gear box
CN112560920B (en) * 2020-12-10 2022-09-06 厦门大学 Machine learning classification method based on self-adaptive error correction output coding
CN112560920A (en) * 2020-12-10 2021-03-26 厦门大学 Machine learning classification method based on self-adaptive error correction output coding
CN112487231A (en) * 2020-12-17 2021-03-12 中国矿业大学(北京) Automatic image labeling method based on double-image regularization constraint and dictionary learning
CN112964962A (en) * 2021-02-05 2021-06-15 国网宁夏电力有限公司 Power transmission line fault classification method
CN113807408A (en) * 2021-08-26 2021-12-17 华南理工大学 Data-driven audio classification method, system and medium for supervised dictionary learning
CN113807408B (en) * 2021-08-26 2023-08-22 华南理工大学 Data-driven supervised dictionary learning audio classification method, system and medium

Similar Documents

Publication Publication Date Title
CN104966105A (en) Robust machine error retrieving method and system
CN107480261B (en) Fine-grained face image fast retrieval method based on deep learning
CN104217225B (en) A kind of sensation target detection and mask method
CN108664632A (en) A kind of text emotion sorting algorithm based on convolutional neural networks and attention mechanism
CN103632168B (en) Classifier integration method for machine learning
CN103258210B (en) A kind of high-definition image classification method based on dictionary learning
CN106951825A (en) A kind of quality of human face image assessment system and implementation method
CN107451278A (en) Chinese Text Categorization based on more hidden layer extreme learning machines
CN105046269B (en) A kind of more example multi-tag scene classification methods based on multi-core integration
CN109766277A (en) A kind of software fault diagnosis method based on transfer learning and DNN
CN105095494B (en) The method that a kind of pair of categorized data set is tested
CN110309868A (en) In conjunction with the hyperspectral image classification method of unsupervised learning
CN104573669A (en) Image object detection method
CN104750875A (en) Machine error data classification method and system
CN106294344A (en) Video retrieval method and device
CN105069483B (en) The method that a kind of pair of categorized data set is tested
CN108985360A (en) Hyperspectral classification method based on expanding morphology and Active Learning
CN103745233B (en) The hyperspectral image classification method migrated based on spatial information
CN109948735A (en) A kind of multi-tag classification method, system, device and storage medium
CN110232128A (en) Topic file classification method and device
CN111222318A (en) Trigger word recognition method based on two-channel bidirectional LSTM-CRF network
Tavakoli Seq2image: Sequence analysis using visualization and deep convolutional neural network
CN106021402A (en) Multi-modal multi-class Boosting frame construction method and device for cross-modal retrieval
CN104978569A (en) Sparse representation based incremental face recognition method
CN112861626B (en) Fine granularity expression classification method based on small sample learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20151007
