CN109816016A - Fault diagnosis method based on support vector machines with large-scale training sets - Google Patents
Fault diagnosis method based on support vector machines with large-scale training sets
- Publication number
- CN109816016A CN109816016A CN201910062336.3A CN201910062336A CN109816016A CN 109816016 A CN109816016 A CN 109816016A CN 201910062336 A CN201910062336 A CN 201910062336A CN 109816016 A CN109816016 A CN 109816016A
- Authority
- CN
- China
- Prior art keywords
- scale
- training
- formula
- sample
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a fault diagnosis method based on support vector machines (SVM) with large-scale training sets, in the technical field of fault diagnosis. A new SVM learning strategy is proposed: by retaining the samples near the support-vector hyperplanes of an initial SVM classifier together with the misclassified samples, the scale of the finally obtained reduced set is markedly decreased, so that the training time is markedly shortened while a high classification accuracy is maintained. At the same time, because the number of support vectors decreases, the classification time is correspondingly shortened. The parameter selection method of the sequential minimal optimization (SMO) algorithm and the solutions to several key implementation issues are also given, laying a solid foundation for the practical application of this promising algorithm to SVM fault diagnosis. The fault diagnosis method of the present invention based on large-scale training set SVMs is effective, reliable, and easy to implement, and can serve as a basis for engineering application.
Description
Technical field
The present invention relates to the technical field of fault diagnosis, and in particular to a fault diagnosis method based on support vector machines with large-scale training sets.
Background art
The fault diagnosis problem is essentially a pattern classification problem. The support vector machine (SVM, Support Vector Machine) is an emerging machine learning method based on statistical learning theory. Its outstanding advantages are mainly: it always obtains a unique, globally optimal solution; it has good generalization ability; and it cleverly sidesteps the curse of dimensionality, so that the algorithm complexity is independent of the sample dimension. Owing to these advantages, SVMs have been actively applied to fault diagnosis and have proved to be an effective method.
The training of an SVM ultimately reduces to solving a convex quadratic programming (QP, Quadratic Programming) problem under linear constraints. When the training set is large, the SVM runs into several difficulties in implementation. First, the SVM must compute and store the kernel matrix; the computational cost and storage both grow with the square of the number of training samples, so the overhead becomes very large when the number of training samples is large. Second, the SVM must perform a large number of time-consuming matrix operations in the quadratic optimization search; in most cases the optimization algorithm occupies the majority of the SVM computation time. Moreover, as the training set grows, the number of support vectors generally grows with it, and the classification speed drops correspondingly.
To overcome these difficulties for large-scale training sets, two approaches are generally taken. The first is to use new implementation algorithms that reduce the number of samples participating in the optimization, such as the chunking algorithm. Most training algorithms proposed in this field so far are based on the idea of decomposition and iteration, i.e., the original QP problem is decomposed into several smaller QP problems to be solved. Specifically, in each iteration the whole training set is decomposed into two subsets, a working set and a non-working set; only the samples in the working set are optimized, while the Lagrange multipliers corresponding to the samples in the non-working set are kept unchanged. The chunking algorithm above is a typical instance of this decomposition-iteration idea, and the sequential minimal optimization (SMO, Sequential Minimal Optimization) algorithm proposed by Platt carries the decomposition-iteration idea to its extreme. The working set of the SMO algorithm contains only two samples, and each iteration optimizes only two Lagrange multipliers; because of the linear equality constraint on the Lagrange multipliers, this is the smallest optimization problem that can be reached. Although the number of QP subproblems in SMO increases, the total computation speed increases greatly, and the algorithm requires no large-matrix operations at all, hence has no extra demand on storage space, so even very large SVM training problems can be completed on a PC. SMO is currently the most widely used and most successful method for practical SVM training on large-scale training sets. The present invention also trains the SVM with the SMO algorithm, and investigates the parameter selection of SMO and related practical issues in its implementation.
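The two-multiplier analytic step that SMO repeats can be sketched as follows. This is a minimal illustration of one update, not the patent's full training procedure; the function name and argument layout are assumptions, and the notation (κ = K11 + K22 − 2K12) follows the description below.

```python
def smo_pair_update(a1, a2, y1, y2, E1, E2, K11, K22, K12, C):
    """One SMO step: analytically optimize the pair (a1, a2) subject to
    0 <= a <= C and y1*a1 + y2*a2 = const."""
    # Feasible segment [L, H] for the new a2 on the constraint line.
    if y1 != y2:
        L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        L, H = max(0.0, a1 + a2 - C), min(C, a1 + a2)
    kappa = K11 + K22 - 2.0 * K12          # curvature of W along the line
    a2_new = a2 + y2 * (E1 - E2) / kappa   # unconstrained optimum
    a2_new = min(max(a2_new, L), H)        # clip to the box
    a1_new = a1 + y1 * y2 * (a2 - a2_new)  # preserve the equality constraint
    return a1_new, a2_new
```

Because only two multipliers move per step, the update is closed-form and needs no matrix solver, which is exactly why SMO avoids large-matrix operations.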
The second approach starts from pruning the training set. Its basic idea is: remove from the training set the samples that contribute little to learning and retain mainly the samples that strongly affect the classification surface, thereby reducing the training set scale so as to reduce the cost of SVM training while guaranteeing the classification accuracy of the classifier and shortening the classification time. Following this idea, the algorithm proposed by the present invention first trains on a small-scale sample set to obtain a preliminary classifier with a certain classification accuracy, and then uses this classifier to prune the original large-scale training set. The pruning strategy is to retain both the samples misclassified by the preliminary classifier and the samples close to the support-vector hyperplanes of the preliminary classifier, deleting the other samples, which yields a reduced set of much smaller scale; the reduced set obtained after pruning is then used to train the final classifier. This algorithm captures the essence of the SVM, namely that the classifier depends only on the support vectors and is independent of the other samples; the idea of the algorithm is clear, concise, effective, and easy to implement. On this basis, the present invention designs a fault diagnosis method based on support vector machines with large-scale training sets, to solve the difficulty of training SVMs on large-scale training sets in fault diagnosis.
Summary of the invention
The purpose of the present invention is to provide a fault diagnosis method based on support vector machines with large-scale training sets, to solve the training problem of large-scale training set SVMs in fault diagnosis raised in the background art above.
To achieve the above object, the invention provides the following technical scheme: a fault diagnosis method based on support vector machines with large-scale training sets, comprising a large-scale training set L and a support vector machine SVM, characterized in that: an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; the large-scale training set L is then pruned with the SVM classifier F0 to obtain a reduced set S of much smaller scale; and the reduced set S obtained after pruning is used to train the final fault classifier.
For training samples (xi, yi) ∈ R^l × {+1, −1}, i = 1, 2, …, n, when performing two-class fault classification, the optimization problem solved by the support vector machine SVM is the dual quadratic program
min W(α) = (1/2) Σi Σj yiyjαiαjK(xi, xj) − Σi αi, s.t. Σi yiαi = 0, 0 ≤ αi ≤ C, i = 1, …, n. (1)
The decision rule for a test sample x is
f(x) = sgn(g(x)), with g(x) = Σi yiαiK(xi, x) + b.
Since the training set is large, the overhead in computation and storage is very large and direct solution of this optimization problem is impractical, so the SMO algorithm is needed for the solution.
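The decision rule f(x) = sgn(g(x)) can be sketched directly; the kernel, support vectors, and multipliers below are illustrative placeholders rather than values from the patent:

```python
def decide(x, svs, ys, alphas, b, kernel):
    """Decision rule f(x) = sgn(g(x)), g(x) = sum_i alpha_i*y_i*K(x_i, x) + b."""
    g = sum(a * y * kernel(sv, x) for sv, y, a in zip(svs, ys, alphas)) + b
    return 1 if g >= 0 else -1
```

Only samples with nonzero αi (the support vectors) contribute to the sum, which is why reducing the number of support vectors shortens the classification time.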
Preferably, the specific steps are as follows:
(1) randomly select a small-scale sample set S0 from the large-scale sample set L, and train on the small-scale sample set S0 to obtain the initial SVM classifier F0;
(2) prune the large-scale sample set L to obtain the reduced set S, then train on the reduced set S to obtain the final classifier.
Let the separating hyperplane of the SVM classifier F0 be H. For any sample x of the large-scale sample set L, if the distance from x to H is r, then
r = g(x)/||w||,
where w is the weight vector of the support vector machine, and the hyperplanes through the two classes of support vectors lie at distances ±1/||w|| from H.
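For a linear kernel the signed distance r = g(x)/||w|| can be computed directly; the sketch below assumes a linear decision function g(x) = w·x + b, with illustrative variable names:

```python
import math

def signed_distance(w, b, x):
    """Signed distance r = g(x)/||w|| of sample x from the
    separating hyperplane g(x) = w.x + b = 0 (linear-kernel case)."""
    g = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return g / norm_w

# The support-vector hyperplanes g(x) = +/-1 lie at distance 1/||w|| from H.
w, b = [3.0, 4.0], -5.0   # ||w|| = 5
print(signed_distance(w, b, [3.0, 4.0]))  # → 4.0
```

In the kernel case g(x) is evaluated through the kernel expansion instead, but the pruning rule below only needs the value of g(x), not w itself.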
Preferably, the concrete implementation of the solution with the SMO algorithm is as follows:
α2^new = α2^old + y2(E1 − E2)/κ, clipped to the feasible interval [L, H]; α1^new = α1^old + y1y2(α2^old − α2^new);
κ = K(x1, x1) + K(x2, x2) − 2K(x1, x2);
Ei = Σj yjαjK(xj, xi) + b − yi, i = 1, 2, (3)
where the superscript "old" denotes the value from the previous step and the superscript "new" the newly obtained value.
In each update, Ei (i = 1, 2) in formula (3) and the objective function W(α) in formula (1) must be calculated. If they were calculated directly from formulas (1) and (3), all training samples would be needed, and when the number of training samples is large the computational workload would be considerable. To reduce the computational cost, their calculation formulas are rewritten so that they need not be computed from scratch each time; instead, the previous result is fully reused and the values are computed iteratively.
(1) Update of the threshold b
After α1^new and α2^new are obtained, if 0 < α1^new < C, then b is first updated according to the KKT conditions by formula (2), the initial value of b being taken as zero:
b1^new = b^old − E1 − y1(α1^new − α1^old)K(x1, x1) − y2(α2^new − α2^old)K(x1, x2). (2)
If at the same time 0 < α2^new < C, then likewise
b2^new = b^old − E2 − y1(α1^new − α1^old)K(x1, x2) − y2(α2^new − α2^old)K(x2, x2),
and b^new is taken as (b1^new + b2^new)/2.
(2) Iterative calculation of Ei
Ei is the difference between the prediction function output and the desired output, and can be calculated iteratively as
Ei^new = Ei^old + y1(α1^new − α1^old)K(x1, xi) + y2(α2^new − α2^old)K(x2, xi) + (b^new − b^old). (4)
The number of operations of formula (3) is of order O(n). In comparison, formula (4) reuses the previous result and only needs to recompute from the two changed samples, requiring 4 multiplications and 6 additions; clearly, when the training set is large, the gain is considerable.
(3) Update calculation of the objective function W(α)
Calculating the objective function directly from formula (1) requires O(n²) operations, whereas the rewritten incremental formula (5) requires only O(n) operations; clearly, for the large-scale sample set L, formula (5) requires far less computation than formula (1).
In addition, during the iteration the objective value at α2 = L or α2 = H must also be calculated; here too only the increment need be calculated and added to the previous result, the objective function increment being given by formula (6).
Preferably, the specific method of pruning the large-scale sample set L is: if x is misclassified by the SVM classifier F0, this sample is retained; if x is correctly classified by the SVM classifier F0, then x is retained when 1 − ε ≤ |g(x)| ≤ 1 + ε and deleted otherwise, where 0 < ε < 1 is an adjustable threshold.
Preferably, the scale of the small-scale sample set S0 is determined according to two conditions: training with it is not costly, and the SVM classifier F0 obtained by training with it is guaranteed to have a certain classification accuracy.
Preferably, the adjustment of the threshold ε serves two functions: controlling the scale of the reduced set S and influencing the classification accuracy of the final classifier.
Compared with the prior art, the beneficial effects of the present invention are:
(1) By retaining the samples near the support-vector hyperplanes of the initial SVM classifier together with the misclassified samples, the scale of the finally obtained reduced set is markedly decreased, so that the training time is markedly shortened while a high classification accuracy is maintained. At the same time, because the number of support vectors decreases, the classification time is correspondingly shortened. This strategy is highly effective for fault classification problems with large-scale training sets and multiple fault classes.
(2) The parameter selection of the SMO algorithm and the key issues in its implementation are investigated, laying a solid foundation for the practical application of this promising algorithm in fault diagnosis.
(3) A simulation example shows that the proposed algorithm is effective, reliable, and easy to implement, and thus has a basis for engineering application; it can provide a good method for fault diagnosis with large-scale training sets.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is the engine fault classification flowchart of the present invention.
Fig. 2 is a table comparing the present invention with the basic SVM classifier (NON and LP faults).
Fig. 3 shows the initial binary classifier training result of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
Referring to Figs. 1-3, the present invention provides a technical solution: a fault diagnosis method based on support vector machines with large-scale training sets, comprising a large-scale training set L and a support vector machine SVM, characterized in that: an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; the large-scale training set L is then pruned with the SVM classifier F0, yielding a reduced set S of much smaller scale after pruning; and the reduced set S is used to train the final fault classifier.
The basic SMO algorithm
The SVM algorithm for classifying two classes of fault data is introduced first.
For training samples (xi, yi) ∈ R^l × {+1, −1}, i = 1, 2, …, n, the optimization problem solved by the SVM is the dual quadratic program
min W(α) = (1/2) Σi Σj yiyjαiαjK(xi, xj) − Σi αi, s.t. Σi yiαi = 0, 0 ≤ αi ≤ C. (1)
Here each parameter has its usual meaning in the literature.
The decision rule for a test sample x is f(x) = sgn(g(x)), with g(x) = Σi yiαiK(xi, x) + b.
In each step SMO selects two elements to optimize; without loss of generality, assume the two selected elements are α1 and α2. Optimization problem (1) is then solved by the analytic update
α2^new = α2^old + y2(E1 − E2)/κ, clipped to the feasible interval [L, H]; α1^new = α1^old + y1y2(α2^old − α2^new);
κ = K(x1, x1) + K(x2, x2) − 2K(x1, x2);
Ei = Σj yjαjK(xj, xi) + b − yi, (3)
where the superscript "old" denotes the value from the previous step and the superscript "new" the newly obtained value.
Several key techniques in implementing the SMO algorithm
In each update, Ei (i = 1, 2) in formula (3) and the objective function W(α) in formula (1) must be calculated. If they were calculated directly from formulas (1) and (3), all training samples would be needed, and when the number of training samples is large the computational workload would be considerable. To reduce the computational cost, their calculation formulas are rewritten here so that they need not be computed from scratch each time; instead, the previous result is fully reused and the values are computed iteratively.
1. Update of the threshold b
After α1^new and α2^new are obtained, if 0 < α1^new < C, then b is first updated according to the KKT conditions by formula (2), the initial value of b being taken as zero:
b1^new = b^old − E1 − y1(α1^new − α1^old)K(x1, x1) − y2(α2^new − α2^old)K(x1, x2). (2)
If at the same time 0 < α2^new < C, then likewise
b2^new = b^old − E2 − y1(α1^new − α1^old)K(x1, x2) − y2(α2^new − α2^old)K(x2, x2),
and b^new is taken as (b1^new + b2^new)/2.
2. Iterative calculation of Ei
Ei is the difference between the prediction function output and the desired output, and can be calculated iteratively as
Ei^new = Ei^old + y1(α1^new − α1^old)K(x1, xi) + y2(α2^new − α2^old)K(x2, xi) + (b^new − b^old). (4)
The number of operations of formula (3) is of order O(n). In comparison, formula (4) reuses the previous result and only needs to recompute from the two changed samples, requiring 4 multiplications and 6 additions; clearly, when the training set is large, the gain is considerable.
3. Update calculation of the objective function W(α)
Calculating the objective function directly from formula (1) requires O(n²) operations, whereas the rewritten incremental formula (5) requires only O(n) operations; clearly, for a large-scale training set, formula (5) requires far less computation than formula (1).
In addition, during the SMO iteration the objective value at α2 = L or α2 = H must also be calculated; here too only the increment need be calculated and added to the previous result, the objective function increment being given by formula (6).
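The saving from the incremental Ei update of formula (4) can be sketched as follows, with the full O(n) recomputation of formula (3) alongside for comparison (function and variable names are illustrative):

```python
def e_full(alpha, y, K, b, i):
    """Direct computation Ei = g(xi) - yi over all n samples (formula (3) style)."""
    g = sum(y[j] * alpha[j] * K[j][i] for j in range(len(alpha))) + b
    return g - y[i]

def e_step(E_old, i, d_a1, d_a2, d_b, y1, y2, K, i1, i2):
    """Incremental update (formula (4) style): only the two changed
    multipliers and the threshold shift enter the correction."""
    return E_old + y1 * d_a1 * K[i1][i] + y2 * d_a2 * K[i2][i] + d_b
```

`e_full` touches every training sample, while `e_step` does a constant amount of work per cached Ei, which is where the per-iteration cost drops from O(n) to O(1) per entry.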
Pruning strategy for the large-scale training set
For the large-scale training set L, an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; L is then pruned with F0, which yields a reduced set S of much smaller scale; finally S is used to train the final classifier, which we call the NL-SVM classifier. The specific steps are as follows:
(1) Randomly select a small-scale sample set S0 from the large-scale sample set L, and train on S0 to obtain the initial SVM classifier F0. The scale of S0 is determined by two conditions: first, training with it is not costly; second, the classifier F0 obtained by training with it is guaranteed to have a certain classification accuracy. Experiments show that this is easy to achieve.
(2) Prune L to obtain the reduced set S, then train on S to obtain the final classifier. Let the separating hyperplane of F0 be H. For any sample x of L, if the distance from x to H is r, then
r = g(x)/||w||,
where w is the weight vector of the SVM, and the hyperplanes through the two classes of support vectors lie at distances ±1/||w|| from H. The specific pruning method is: if x is misclassified by F0, this sample is retained; if x is correctly classified by F0, then x is retained when 1 − ε ≤ |g(x)| ≤ 1 + ε and deleted otherwise, where 0 < ε < 1 is an adjustable threshold. The adjustment of the threshold ε serves two functions: controlling the scale of the reduced set and influencing the classification accuracy of the final classifier. In practice a near-optimal classifier can be obtained by adjusting ε, and the adjustment of ε is not difficult.
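The pruning rule above amounts to a simple filter over the decision values g(x); in the sketch below the samples, labels, and g-values are illustrative inputs, not the patent's own data:

```python
def prune(samples, labels, g_values, eps):
    """Keep samples that are misclassified or satisfy 1-eps <= |g(x)| <= 1+eps."""
    kept = []
    for x, y, g in zip(samples, labels, g_values):
        misclassified = (g >= 0) != (y > 0)        # sign(g) disagrees with label
        near_margin = 1 - eps <= abs(g) <= 1 + eps
        if misclassified or near_margin:
            kept.append(x)
    return kept
```

Samples far outside the margin band (|g(x)| > 1 + ε) are deleted as redundant, while misclassified samples are always retained as carriers of new classification information.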
Here, unlike other methods, the samples misclassified by F0 are also admitted into the final training set S. The main reason is: if a sample x is misclassified by F0, this shows that the preliminary classifier F0 does not correctly reflect the characteristics of x; in other words, x is a non-redundant sample that can provide new classification information for the final training set S, it does affect the classification result, and it should not be deleted. The pruning strategy for the samples correctly classified by F0 is, in short, to keep the samples closer to the support vectors of the preliminary classifier. This strategy captures the essence of the support vector machine, namely that the SVM classifier depends only on the support vectors and is independent of the other samples. With this pruning strategy, the samples kept are very helpful for classification, while the deleted samples contribute nothing to it.
A concrete application of this embodiment is the gas-path component fault diagnosis problem of a twin-spool turbojet engine. There are 8 fault feature parameters, comprising: high- and low-pressure rotor speeds NH, NL; high- and low-pressure compressor outlet pressures P2, Px; high- and low-pressure turbine outlet pressures Py, P4; high-pressure compressor outlet temperature T2; and low-pressure turbine outlet temperature T4. All of the above parameters are expressed as dimensionless relative deviations from the normal condition. The engine gas-path component faults fall into 5 classes: no fault (NON), low-pressure compressor fault (LP), high-pressure compressor fault (HP), low-pressure turbine fault (LT), and high-pressure turbine fault (HT).
There are 2000 groups of sample data for each fault class, of which 1500 groups are used for training and 500 groups for testing. Thus, when training a two-class SVM, the training set L contains 3000 groups of samples in total; 200 groups (100 from each class) are randomly selected from L as S0 to train the initial classifier F0, and there are 1000 groups of test samples (500 from each class). The engine fault classification process is shown in Fig. 1.
A Gaussian kernel function is used.
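The Gaussian kernel can be written directly. The patent's formula image is not preserved, so the normalization exp(−||x1 − x2||²/(2σ²)) below is an assumed common convention; σ² = 0.8 matches the parameter reported for the NON/LP classifier:

```python
import math

def gauss_kernel(x1, x2, sigma2=0.8):
    """Gaussian kernel K(x1, x2) = exp(-||x1 - x2||^2 / (2*sigma2)),
    with sigma2 = 0.8 as used for the NON/LP binary classifier."""
    d2 = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-d2 / (2.0 * sigma2))
```

The kernel is symmetric and equals 1 only when the two samples coincide, decaying toward 0 as they move apart.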
(1) Preliminary classifier F0
Take the NON and LP two-class fault problem as an example. The kernel parameter is σ² = 0.8 and the SVM regularization parameter is C = 100. The preliminary classifier F0 obtained at this point has a classification accuracy of 89.2%, which meets the requirement.
(2) NL-SVM classifier
With the threshold ε = 0.5, a good training result is obtained. For comparison, the calculated results of the present invention and those of the basic SVM are listed together in Table 1. In the calculations, the training of both classifiers uses the SMO algorithm; the relevant SMO parameters are as follows:
tol is the tolerance for judging whether the KKT conditions are satisfied; tol = 10^-2 is taken.
μ is the threshold for deciding whether a newly calculated α2 should be discarded; it should be a small positive number, and μ = 10^-3 is taken. If the absolute difference between the α2 values of two successive calculations is greater than μ, α2 is updated; otherwise the newly calculated α2 is discarded and the previous value is retained.
Compared with the basic SVM classifier, the NL-SVM classifier using the reduction algorithm of the present invention achieves a slightly higher classification accuracy while the number of samples finally participating in training is reduced to 939 groups, and the training time used is less than 1/15 of that of the former; the improvement in training speed is evident. At the same time, the number of support vectors obtained by NL-SVM is also reduced, to 128, only about 1/3 of that of the basic SVM, which helps shorten the classification time.
(3) Engine fault classification results
To classify the 5 fault classes, 10 binary SVM classifiers need to be constructed according to the one-against-one multi-class classification method, with the result decided by voting. Similarly to the NON-LP binary classifier, the binary NL-SVM classifiers trained from the other fault sample data all achieve satisfactory results. Finally, for the 2500 groups of test samples of the 5 engine fault classes NON, LP, HP, LT, and HT, 2419 samples are classified correctly and only 81 are misclassified, a classification accuracy of 96.8%.
Since multi-class fault classification requires training multiple binary classifiers, the shortening of the total training time achieved with the algorithm proposed by the present invention is all the more significant.
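The one-against-one scheme above trains k(k−1)/2 = 10 binary classifiers for the k = 5 classes and decides by majority vote. A minimal sketch, using the class names from the embodiment (the pairwise classifier passed in is an illustrative stand-in):

```python
from itertools import combinations

CLASSES = ["NON", "LP", "HP", "LT", "HT"]

def ovo_vote(pairwise_winner, classes=CLASSES):
    """Decide the class of one sample by one-against-one voting.
    pairwise_winner(a, b) returns whichever of a, b the trained (a, b)
    binary classifier prefers for this sample."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):   # 10 pairs for 5 classes
        votes[pairwise_winner(a, b)] += 1
    return max(classes, key=lambda c: votes[c])

# A sample for which every pairwise classifier involving "LP" picks "LP":
print(ovo_vote(lambda a, b: "LP" if "LP" in (a, b) else a))  # → LP
```

Each test sample passes through all 10 binary NL-SVM classifiers, and the class collecting the most votes is output.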
In the description of this specification, references to the terms "one embodiment", "example", "specific example", and the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the present invention disclosed above are intended only to help illustrate the present invention. The preferred embodiments do not describe all the details, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were chosen and specifically described in order to better explain the principles and practical application of the present invention, so that those skilled in the art can better understand and utilize the present invention. The present invention is limited only by the claims and their full scope and equivalents.
Claims (6)
1. A fault diagnosis method based on support vector machines with large-scale training sets, comprising a large-scale training set L and a support vector machine SVM, characterized in that: an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; the large-scale training set L is then pruned with the SVM classifier F0 to obtain a reduced set S of much smaller scale after pruning; and the reduced set S is used to train the final fault classifier;
for training samples (xi, yi) ∈ R^l × {+1, −1}, i = 1, 2, …, n, when performing two-class fault classification, the optimization problem solved by the support vector machine SVM is the dual quadratic program
min W(α) = (1/2) Σi Σj yiyjαiαjK(xi, xj) − Σi αi, s.t. Σi yiαi = 0, 0 ≤ αi ≤ C; (1)
the decision rule for a test sample x is f(x) = sgn(g(x)), with g(x) = Σi yiαiK(xi, x) + b;
since the training set is large, the overhead in computation and storage is very large and direct solution of the optimization problem is impractical, so the SMO algorithm is needed for the solution.
2. The fault diagnosis method based on support vector machines with large-scale training sets according to claim 1, characterized in that the specific steps are as follows:
(1) randomly select a small-scale sample set S0 from the large-scale sample set L, and train on the small-scale sample set S0 to obtain the initial SVM classifier F0;
(2) prune the large-scale sample set L to obtain the reduced set S, then train on the reduced set S to obtain the final classifier;
let the separating hyperplane of the SVM classifier F0 be H; for any sample x of the large-scale sample set L, if the distance from x to H is r, then
r = g(x)/||w||,
where w is the weight vector of the support vector machine, and the hyperplanes through the two classes of support vectors lie at distances ±1/||w|| from H.
3. the method for diagnosing faults according to claim 1 based on Large-Scale Training Data Set support vector machines, it is characterised in that:
The concrete methods of realizing solved using the SMO algorithm is as follows:
κ=K (x1,x2)+K(x2,x2)-2K(x1,x2)
Wherein, the calculated value of the expression previous step of " old ", the new calculated value that the expression of mark " new " obtains are marked;
It, will E in calculating formula (3) every time when being updated calculatingiObjective function W (α) in (i=1,2) and formula (1), if directly
It connects and is calculated by formula (1), (3), need to utilize all training samples, when the training sample number is big, amount of calculation
It is not small, in order to reduce computing cost, their calculation formula is rewritten, it is not necessary to it directly calculates every time, but sufficiently benefit
With last calculated result, iterative calculation;
(1) update of threshold value b
It obtainsAfterwards, if havingThen first according to the initial value of threshold value b, b in KKT condition newer (2)
It is taken as zero:
If simultaneously alsoThen
(2)EiIterative calculation
EiIt is the difference between anticipation function output and desired output, can iterates to calculate as the following formula
The operation times of formula (3) are O (n) magnitude;Compared with formula (3), last calculated result is utilized in formula (4), it is only necessary to weight
New root is calculated according to two samples, needs to do 4 multiplication, 6 sub-additions, it is clear that in the case where training sample is larger, income
It is larger;
(3) Update calculation of the objective function W(α)
Calculating the objective function by formula (1) requires O(n²) operations, whereas formula (5) requires only O(n); clearly, on the large-scale sample set L, formula (5) requires far less computation than formula (1).
In addition, during the iterative process the objective function value at α2 = L or α2 = H must also be calculated; here, too, only the increment of the objective function need be computed and added to the previous calculated result.
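The two-variable optimization step that these updates serve, including the quantity κ defined above, can be sketched as follows; this is the standard SMO pair update under the usual box constraints, not necessarily the patented variant, and all names are illustrative:

```python
def smo_pair_update(a1, a2, y1, y2, E1, E2, K11, K22, K12, C):
    """One two-variable SMO step: kappa = K11 + K22 - 2*K12 is the second
    derivative of the objective along the constraint line; the unconstrained
    optimum for a2 is then clipped to the feasible segment [L, H]."""
    kappa = K11 + K22 - 2.0 * K12
    if y1 != y2:
        L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        L, H = max(0.0, a1 + a2 - C), min(C, a1 + a2)
    a2_new = a2 + y2 * (E1 - E2) / kappa   # assumes kappa > 0
    a2_new = min(H, max(L, a2_new))        # clip to [L, H]
    a1_new = a1 + y1 * y2 * (a2 - a2_new)  # preserve the equality constraint
    return a1_new, a2_new
```

When a2_new lands exactly on L or H, the objective value at that bound is what the text above computes incrementally.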
4. The fault diagnosis method based on a large-scale training data set support vector machine according to claim 2, characterized in that: the specific method for trimming the large-scale sample set L is: if x is misclassified by the SVM classifier F0, the sample is retained; if x is correctly classified by the SVM classifier F0, x is retained when 1-ε ≤ |g(x)| ≤ 1+ε and deleted otherwise, wherein 0 < ε < 1 is an adjustable threshold.
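The trimming rule of claim 4 can be sketched as follows, where the hypothetical `g` stands for the decision function g(x) of the classifier F0 and `labels` are the true classes in {-1, +1}; a minimal sketch, not the patented implementation:

```python
def trim_sample_set(samples, labels, g, eps=0.2):
    """Keep a sample if F0 misclassifies it, or if it is correctly
    classified but lies near the margin: 1-eps <= |g(x)| <= 1+eps.
    All other correctly classified samples are deleted."""
    assert 0 < eps < 1  # the adjustable threshold of claim 4
    kept = []
    for x, y in zip(samples, labels):
        gx = g(x)
        misclassified = y * gx <= 0            # wrong side of (or on) the boundary
        near_margin = 1 - eps <= abs(gx) <= 1 + eps
        if misclassified or near_margin:
            kept.append((x, y))
    return kept
```

With `g = lambda x: x` and eps = 0.2, a correctly classified sample far from the margin (g(x) = 2.0) is deleted, while one on the margin (g(x) = 1.0) and a misclassified one (y = 1, g(x) = -0.5) are retained.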
5. The fault diagnosis method based on a large-scale training data set support vector machine according to claim 4, characterized in that: the basis for determining the scale of the small-scale sample set S0 is that the training cost is kept low while ensuring that the SVM classifier F0 obtained by training has a certain classification accuracy.
6. The fault diagnosis method based on a large-scale training data set support vector machine according to claim 2, characterized in that: adjusting the threshold ε serves two functions: controlling the scale of the reduced set S and influencing the classification accuracy of the final classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910062336.3A CN109816016A (en) | 2019-01-23 | 2019-01-23 | Method for diagnosing faults based on Large-Scale Training Data Set support vector machines |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109816016A true CN109816016A (en) | 2019-05-28 |
Family
ID=66604893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910062336.3A Pending CN109816016A (en) | 2019-01-23 | 2019-01-23 | Method for diagnosing faults based on Large-Scale Training Data Set support vector machines |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109816016A (en) |
Non-Patent Citations (1)
Title |
---|
XU Qihua et al.: "Engine fault diagnosis based on large-scale training set SVM", Journal of Aerospace Power |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377732A (en) * | 2019-06-27 | 2019-10-25 | 江苏大学 | A method of the text classification based on sample scaling |
CN110601578A (en) * | 2019-09-24 | 2019-12-20 | 西南交通大学 | Space vector modulation method with nearest level equivalence |
CN110601578B (en) * | 2019-09-24 | 2021-02-12 | 西南交通大学 | Space vector modulation method with nearest level equivalence |
CN111337263A (en) * | 2020-02-12 | 2020-06-26 | 中国民航大学 | Fault diagnosis method for engine turbine disk |
CN112085060A (en) * | 2020-08-07 | 2020-12-15 | 中国民航大学 | Dual-polarization meteorological radar precipitation particle classification method and device based on SVT-DTSVMs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190528 |