CN109816016A - Fault diagnosis method based on support vector machines with large-scale training sets - Google Patents
Fault diagnosis method based on support vector machines with large-scale training sets
- Publication number
- CN109816016A CN109816016A CN201910062336.3A CN201910062336A CN109816016A CN 109816016 A CN109816016 A CN 109816016A CN 201910062336 A CN201910062336 A CN 201910062336A CN 109816016 A CN109816016 A CN 109816016A
- Authority
- CN
- China
- Prior art keywords
- scale
- training
- formula
- sample
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a fault diagnosis method based on support vector machines (SVM) with large-scale training sets, in the technical field of fault diagnosis. A new SVM learning strategy is proposed: by retaining the samples near the support-vector hyperplanes of an initial SVM classifier together with the misclassified samples, the scale of the finally obtained reduced set is markedly decreased, so that the training time is markedly shortened while a high classification accuracy is maintained. At the same time, because the number of support vectors decreases, the classification time is correspondingly shortened. The parameter selection method of the sequential minimal optimization (SMO) algorithm and the solutions to several key implementation issues are also given, laying a solid foundation for the practical application of this promising algorithm to SVM fault diagnosis. The fault diagnosis method of the present invention based on large-scale training set SVMs is effective, reliable, and easy to implement, and can serve as a basis for engineering application.
Description
Technical field
The present invention relates to the technical field of fault diagnosis, and in particular to a fault diagnosis method based on support vector machines with large-scale training sets.
Background art
The fault diagnosis problem is essentially a pattern classification problem. The support vector machine (SVM, Support Vector Machine) is an emerging machine learning method based on statistical learning theory. Its outstanding advantages are mainly: it always obtains a unique, globally optimal solution; it has good generalization ability; and it cleverly sidesteps the curse of dimensionality, so that the algorithm complexity is independent of the sample dimension. Owing to these advantages, SVMs have been actively applied to fault diagnosis and have proved to be an effective method.
The training of an SVM ultimately reduces to solving a convex quadratic programming (QP, Quadratic Programming) problem under linear constraints. When the training set is large, the SVM runs into several difficulties in implementation. First, the SVM must compute and store the kernel matrix; the computational cost and storage both grow with the square of the number of training samples, so the overhead becomes very large when the number of training samples is large. Second, the SVM must perform a large number of time-consuming matrix operations in the quadratic optimization search; in most cases the optimization algorithm occupies the majority of the SVM computation time. Moreover, as the training set grows, the number of support vectors generally grows with it, and the classification speed drops correspondingly.
To overcome these difficulties for large-scale training sets, two approaches are generally taken. The first is to use new implementation algorithms that reduce the number of samples participating in the optimization, such as the chunking algorithm. Most training algorithms proposed in this field so far are based on the idea of decomposition and iteration, i.e., the original QP problem is decomposed into several smaller QP problems to be solved. Specifically, in each iteration the whole training set is decomposed into two subsets, a working set and a non-working set; only the samples in the working set are optimized, while the Lagrange multipliers corresponding to the samples in the non-working set are kept unchanged. The chunking algorithm above is a typical instance of this decomposition-iteration idea, and the sequential minimal optimization (SMO, Sequential Minimal Optimization) algorithm proposed by Platt carries the decomposition-iteration idea to its extreme. The working set of the SMO algorithm contains only two samples, and each iteration optimizes only two Lagrange multipliers; because of the linear equality constraint on the Lagrange multipliers, this is the smallest optimization problem that can be reached. Although the number of QP subproblems in SMO increases, the total computation speed increases greatly, and the algorithm requires no large-matrix operations at all, hence has no extra demand on storage space, so even very large SVM training problems can be completed on a PC. SMO is currently the most widely used and most successful method for practical SVM training on large-scale training sets. The present invention also trains the SVM with the SMO algorithm, and investigates the parameter selection of SMO and related practical issues in its implementation.
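The two-multiplier analytic step that SMO repeats can be sketched as follows. This is a minimal illustration of one update, not the patent's full training procedure; the function name and argument layout are assumptions, and the notation (κ = K11 + K22 − 2K12) follows the description below.

```python
def smo_pair_update(a1, a2, y1, y2, E1, E2, K11, K22, K12, C):
    """One SMO step: analytically optimize the pair (a1, a2) subject to
    0 <= a <= C and y1*a1 + y2*a2 = const."""
    # Feasible segment [L, H] for the new a2 on the constraint line.
    if y1 != y2:
        L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        L, H = max(0.0, a1 + a2 - C), min(C, a1 + a2)
    kappa = K11 + K22 - 2.0 * K12          # curvature of W along the line
    a2_new = a2 + y2 * (E1 - E2) / kappa   # unconstrained optimum
    a2_new = min(max(a2_new, L), H)        # clip to the box
    a1_new = a1 + y1 * y2 * (a2 - a2_new)  # preserve the equality constraint
    return a1_new, a2_new
```

Because only two multipliers move per step, the update is closed-form and needs no matrix solver, which is exactly why SMO avoids large-matrix operations.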
The second approach starts from pruning the training set. Its basic idea is: remove from the training set the samples that contribute little to learning and retain mainly the samples that strongly affect the classification surface, thereby reducing the training set scale so as to reduce the cost of SVM training while guaranteeing the classification accuracy of the classifier and shortening the classification time. Following this idea, the algorithm proposed by the present invention first trains on a small-scale sample set to obtain a preliminary classifier with a certain classification accuracy, and then uses this classifier to prune the original large-scale training set. The pruning strategy is to retain both the samples misclassified by the preliminary classifier and the samples close to the support-vector hyperplanes of the preliminary classifier, deleting the other samples, which yields a reduced set of much smaller scale; the reduced set obtained after pruning is then used to train the final classifier. This algorithm captures the essence of the SVM, namely that the classifier depends only on the support vectors and is independent of the other samples; the idea of the algorithm is clear, concise, effective, and easy to implement. On this basis, the present invention designs a fault diagnosis method based on support vector machines with large-scale training sets, to solve the difficulty of training SVMs on large-scale training sets in fault diagnosis.
Summary of the invention
The purpose of the present invention is to provide a fault diagnosis method based on support vector machines with large-scale training sets, to solve the training problem of large-scale training set SVMs in fault diagnosis raised in the background art above.
To achieve the above object, the invention provides the following technical scheme: a fault diagnosis method based on support vector machines with large-scale training sets, comprising a large-scale training set L and a support vector machine SVM, characterized in that: an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; the large-scale training set L is then pruned with the SVM classifier F0 to obtain a reduced set S of much smaller scale; and the reduced set S obtained after pruning is used to train the final fault classifier.
For training samples (xi, yi) ∈ R^l × {+1, −1}, i = 1, 2, …, n, when performing two-class fault classification, the optimization problem solved by the support vector machine SVM is the dual quadratic program
min W(α) = (1/2) Σi Σj yiyjαiαjK(xi, xj) − Σi αi, s.t. Σi yiαi = 0, 0 ≤ αi ≤ C, i = 1, …, n. (1)
The decision rule for a test sample x is
f(x) = sgn(g(x)), with g(x) = Σi yiαiK(xi, x) + b.
Since the training set is large, the overhead in computation and storage is very large and direct solution of this optimization problem is impractical, so the SMO algorithm is needed for the solution.
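The decision rule f(x) = sgn(g(x)) can be sketched directly; the kernel, support vectors, and multipliers below are illustrative placeholders rather than values from the patent:

```python
def decide(x, svs, ys, alphas, b, kernel):
    """Decision rule f(x) = sgn(g(x)), g(x) = sum_i alpha_i*y_i*K(x_i, x) + b."""
    g = sum(a * y * kernel(sv, x) for sv, y, a in zip(svs, ys, alphas)) + b
    return 1 if g >= 0 else -1
```

Only samples with nonzero αi (the support vectors) contribute to the sum, which is why reducing the number of support vectors shortens the classification time.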
Preferably, the specific steps are as follows:
(1) randomly select a small-scale sample set S0 from the large-scale sample set L, and train on the small-scale sample set S0 to obtain the initial SVM classifier F0;
(2) prune the large-scale sample set L to obtain the reduced set S, then train on the reduced set S to obtain the final classifier.
Let the separating hyperplane of the SVM classifier F0 be H. For any sample x of the large-scale sample set L, if the distance from x to H is r, then
r = g(x)/||w||,
where w is the weight vector of the support vector machine, and the hyperplanes through the two classes of support vectors lie at distances ±1/||w|| from H.
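For a linear kernel the signed distance r = g(x)/||w|| can be computed directly; the sketch below assumes a linear decision function g(x) = w·x + b, with illustrative variable names:

```python
import math

def signed_distance(w, b, x):
    """Signed distance r = g(x)/||w|| of sample x from the
    separating hyperplane g(x) = w.x + b = 0 (linear-kernel case)."""
    g = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return g / norm_w

# The support-vector hyperplanes g(x) = +/-1 lie at distance 1/||w|| from H.
w, b = [3.0, 4.0], -5.0   # ||w|| = 5
print(signed_distance(w, b, [3.0, 4.0]))  # → 4.0
```

In the kernel case g(x) is evaluated through the kernel expansion instead, but the pruning rule below only needs the value of g(x), not w itself.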
Preferably, the concrete implementation of the solution with the SMO algorithm is as follows:
α2^new = α2^old + y2(E1 − E2)/κ, clipped to the feasible interval [L, H]; α1^new = α1^old + y1y2(α2^old − α2^new);
κ = K(x1, x1) + K(x2, x2) − 2K(x1, x2);
Ei = Σj yjαjK(xj, xi) + b − yi, i = 1, 2, (3)
where the superscript "old" denotes the value from the previous step and the superscript "new" the newly obtained value.
In each update, Ei (i = 1, 2) in formula (3) and the objective function W(α) in formula (1) must be calculated. If they were calculated directly from formulas (1) and (3), all training samples would be needed, and when the number of training samples is large the computational workload would be considerable. To reduce the computational cost, their calculation formulas are rewritten so that they need not be computed from scratch each time; instead, the previous result is fully reused and the values are computed iteratively.
(1) Update of the threshold b
After α1^new and α2^new are obtained, if 0 < α1^new < C, then b is first updated according to the KKT conditions by formula (2), the initial value of b being taken as zero:
b1^new = b^old − E1 − y1(α1^new − α1^old)K(x1, x1) − y2(α2^new − α2^old)K(x1, x2). (2)
If at the same time 0 < α2^new < C, then likewise
b2^new = b^old − E2 − y1(α1^new − α1^old)K(x1, x2) − y2(α2^new − α2^old)K(x2, x2),
and b^new is taken as (b1^new + b2^new)/2.
(2) Iterative calculation of Ei
Ei is the difference between the prediction function output and the desired output, and can be calculated iteratively as
Ei^new = Ei^old + y1(α1^new − α1^old)K(x1, xi) + y2(α2^new − α2^old)K(x2, xi) + (b^new − b^old). (4)
The number of operations of formula (3) is of order O(n). In comparison, formula (4) reuses the previous result and only needs to recompute from the two changed samples, requiring 4 multiplications and 6 additions; clearly, when the training set is large, the gain is considerable.
(3) Update calculation of the objective function W(α)
Calculating the objective function directly from formula (1) requires O(n²) operations, whereas the rewritten incremental formula (5) requires only O(n) operations; clearly, for the large-scale sample set L, formula (5) requires far less computation than formula (1).
In addition, during the iteration the objective value at α2 = L or α2 = H must also be calculated; here too only the increment need be calculated and added to the previous result, the objective function increment being given by formula (6).
Preferably, the specific method of pruning the large-scale sample set L is: if x is misclassified by the SVM classifier F0, this sample is retained; if x is correctly classified by the SVM classifier F0, then x is retained when 1 − ε ≤ |g(x)| ≤ 1 + ε and deleted otherwise, where 0 < ε < 1 is an adjustable threshold.
Preferably, the scale of the small-scale sample set S0 is determined according to two conditions: training with it is not costly, and the SVM classifier F0 obtained by training with it is guaranteed to have a certain classification accuracy.
Preferably, the adjustment of the threshold ε serves two functions: controlling the scale of the reduced set S and influencing the classification accuracy of the final classifier.
Compared with the prior art, the beneficial effects of the present invention are:
(1) By retaining the samples near the support-vector hyperplanes of the initial SVM classifier together with the misclassified samples, the scale of the finally obtained reduced set is markedly decreased, so that the training time is markedly shortened while a high classification accuracy is maintained. At the same time, because the number of support vectors decreases, the classification time is correspondingly shortened. This strategy is highly effective for fault classification problems with large-scale training sets and multiple fault classes.
(2) The parameter selection of the SMO algorithm and the key issues in its implementation are investigated, laying a solid foundation for the practical application of this promising algorithm in fault diagnosis.
(3) A simulation example shows that the proposed algorithm is effective, reliable, and easy to implement, and thus has a basis for engineering application; it can provide a good method for fault diagnosis with large-scale training sets.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is the engine fault classification flowchart of the present invention.
Fig. 2 is a table comparing the present invention with the basic SVM classifier (NON and LP faults).
Fig. 3 shows the initial binary classifier training result of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
Referring to Figs. 1-3, the present invention provides a technical solution: a fault diagnosis method based on support vector machines with large-scale training sets, comprising a large-scale training set L and a support vector machine SVM, characterized in that: an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; the large-scale training set L is then pruned with the SVM classifier F0, yielding a reduced set S of much smaller scale after pruning; and the reduced set S is used to train the final fault classifier.
The basic SMO algorithm
The SVM algorithm for classifying two classes of fault data is introduced first.
For training samples (xi, yi) ∈ R^l × {+1, −1}, i = 1, 2, …, n, the optimization problem solved by the SVM is the dual quadratic program
min W(α) = (1/2) Σi Σj yiyjαiαjK(xi, xj) − Σi αi, s.t. Σi yiαi = 0, 0 ≤ αi ≤ C. (1)
Here each parameter has its usual meaning in the literature.
The decision rule for a test sample x is f(x) = sgn(g(x)), with g(x) = Σi yiαiK(xi, x) + b.
In each step SMO selects two elements to optimize; without loss of generality, assume the two selected elements are α1 and α2. Optimization problem (1) is then solved by the analytic update
α2^new = α2^old + y2(E1 − E2)/κ, clipped to the feasible interval [L, H]; α1^new = α1^old + y1y2(α2^old − α2^new);
κ = K(x1, x1) + K(x2, x2) − 2K(x1, x2);
Ei = Σj yjαjK(xj, xi) + b − yi, (3)
where the superscript "old" denotes the value from the previous step and the superscript "new" the newly obtained value.
Several key techniques in implementing the SMO algorithm
In each update, Ei (i = 1, 2) in formula (3) and the objective function W(α) in formula (1) must be calculated. If they were calculated directly from formulas (1) and (3), all training samples would be needed, and when the number of training samples is large the computational workload would be considerable. To reduce the computational cost, their calculation formulas are rewritten here so that they need not be computed from scratch each time; instead, the previous result is fully reused and the values are computed iteratively.
1. Update of the threshold b
After α1^new and α2^new are obtained, if 0 < α1^new < C, then b is first updated according to the KKT conditions by formula (2), the initial value of b being taken as zero:
b1^new = b^old − E1 − y1(α1^new − α1^old)K(x1, x1) − y2(α2^new − α2^old)K(x1, x2). (2)
If at the same time 0 < α2^new < C, then likewise
b2^new = b^old − E2 − y1(α1^new − α1^old)K(x1, x2) − y2(α2^new − α2^old)K(x2, x2),
and b^new is taken as (b1^new + b2^new)/2.
2. Iterative calculation of Ei
Ei is the difference between the prediction function output and the desired output, and can be calculated iteratively as
Ei^new = Ei^old + y1(α1^new − α1^old)K(x1, xi) + y2(α2^new − α2^old)K(x2, xi) + (b^new − b^old). (4)
The number of operations of formula (3) is of order O(n). In comparison, formula (4) reuses the previous result and only needs to recompute from the two changed samples, requiring 4 multiplications and 6 additions; clearly, when the training set is large, the gain is considerable.
3. Update calculation of the objective function W(α)
Calculating the objective function directly from formula (1) requires O(n²) operations, whereas the rewritten incremental formula (5) requires only O(n) operations; clearly, for a large-scale training set, formula (5) requires far less computation than formula (1).
In addition, during the SMO iteration the objective value at α2 = L or α2 = H must also be calculated; here too only the increment need be calculated and added to the previous result, the objective function increment being given by formula (6).
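The saving from the incremental Ei update of formula (4) can be sketched as follows, with the full O(n) recomputation of formula (3) alongside for comparison (function and variable names are illustrative):

```python
def e_full(alpha, y, K, b, i):
    """Direct computation Ei = g(xi) - yi over all n samples (formula (3) style)."""
    g = sum(y[j] * alpha[j] * K[j][i] for j in range(len(alpha))) + b
    return g - y[i]

def e_step(E_old, i, d_a1, d_a2, d_b, y1, y2, K, i1, i2):
    """Incremental update (formula (4) style): only the two changed
    multipliers and the threshold shift enter the correction."""
    return E_old + y1 * d_a1 * K[i1][i] + y2 * d_a2 * K[i2][i] + d_b
```

`e_full` touches every training sample, while `e_step` does a constant amount of work per cached Ei, which is where the per-iteration cost drops from O(n) to O(1) per entry.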
Pruning strategy for the large-scale training set
For the large-scale training set L, an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; L is then pruned with F0, which yields a reduced set S of much smaller scale; finally S is used to train the final classifier, which we call the NL-SVM classifier. The specific steps are as follows:
(1) Randomly select a small-scale sample set S0 from the large-scale sample set L, and train on S0 to obtain the initial SVM classifier F0. The scale of S0 is determined by two conditions: first, training with it is not costly; second, the classifier F0 obtained by training with it is guaranteed to have a certain classification accuracy. Experiments show that this is easy to achieve.
(2) Prune L to obtain the reduced set S, then train on S to obtain the final classifier. Let the separating hyperplane of F0 be H. For any sample x of L, if the distance from x to H is r, then
r = g(x)/||w||,
where w is the weight vector of the SVM, and the hyperplanes through the two classes of support vectors lie at distances ±1/||w|| from H. The specific pruning method is: if x is misclassified by F0, this sample is retained; if x is correctly classified by F0, then x is retained when 1 − ε ≤ |g(x)| ≤ 1 + ε and deleted otherwise, where 0 < ε < 1 is an adjustable threshold. The adjustment of the threshold ε serves two functions: controlling the scale of the reduced set and influencing the classification accuracy of the final classifier. In practice a near-optimal classifier can be obtained by adjusting ε, and the adjustment of ε is not difficult.
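The pruning rule above amounts to a simple filter over the decision values g(x); in the sketch below the samples, labels, and g-values are illustrative inputs, not the patent's own data:

```python
def prune(samples, labels, g_values, eps):
    """Keep samples that are misclassified or satisfy 1-eps <= |g(x)| <= 1+eps."""
    kept = []
    for x, y, g in zip(samples, labels, g_values):
        misclassified = (g >= 0) != (y > 0)        # sign(g) disagrees with label
        near_margin = 1 - eps <= abs(g) <= 1 + eps
        if misclassified or near_margin:
            kept.append(x)
    return kept
```

Samples far outside the margin band (|g(x)| > 1 + ε) are deleted as redundant, while misclassified samples are always retained as carriers of new classification information.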
Here, unlike other methods, the samples misclassified by F0 are also admitted into the final training set S. The main reason is: if a sample x is misclassified by F0, this shows that the preliminary classifier F0 does not correctly reflect the characteristics of x; in other words, x is a non-redundant sample that can provide new classification information for the final training set S, it does affect the classification result, and it should not be deleted. The pruning strategy for the samples correctly classified by F0 is, in short, to keep the samples closer to the support vectors of the preliminary classifier. This strategy captures the essence of the support vector machine, namely that the SVM classifier depends only on the support vectors and is independent of the other samples. With this pruning strategy, the samples kept are very helpful for classification, while the deleted samples contribute nothing to it.
A concrete application of this embodiment is the gas-path component fault diagnosis problem of a twin-spool turbojet engine. There are 8 fault feature parameters, comprising: high- and low-pressure rotor speeds NH, NL; high- and low-pressure compressor outlet pressures P2, Px; high- and low-pressure turbine outlet pressures Py, P4; high-pressure compressor outlet temperature T2; and low-pressure turbine outlet temperature T4. All of the above parameters are expressed as dimensionless relative deviations from the normal condition. The engine gas-path component faults fall into 5 classes: no fault (NON), low-pressure compressor fault (LP), high-pressure compressor fault (HP), low-pressure turbine fault (LT), and high-pressure turbine fault (HT).
There are 2000 groups of sample data for each fault class, of which 1500 groups are used for training and 500 groups for testing. Thus, when training a two-class SVM, the training set L contains 3000 groups of samples in total; 200 groups (100 from each class) are randomly selected from L as S0 to train the initial classifier F0, and there are 1000 groups of test samples (500 from each class). The engine fault classification process is shown in Fig. 1.
A Gaussian kernel function is used.
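The Gaussian kernel can be written directly. The patent's formula image is not preserved, so the normalization exp(−||x1 − x2||²/(2σ²)) below is an assumed common convention; σ² = 0.8 matches the parameter reported for the NON/LP classifier:

```python
import math

def gauss_kernel(x1, x2, sigma2=0.8):
    """Gaussian kernel K(x1, x2) = exp(-||x1 - x2||^2 / (2*sigma2)),
    with sigma2 = 0.8 as used for the NON/LP binary classifier."""
    d2 = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-d2 / (2.0 * sigma2))
```

The kernel is symmetric and equals 1 only when the two samples coincide, decaying toward 0 as they move apart.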
(1) Preliminary classifier F0
Take the NON and LP two-class fault problem as an example. The kernel parameter is σ² = 0.8 and the SVM regularization parameter is C = 100. The preliminary classifier F0 obtained at this point has a classification accuracy of 89.2%, which meets the requirement.
(2) NL-SVM classifier
With the threshold ε = 0.5, a good training result is obtained. For comparison, the calculated results of the present invention and those of the basic SVM are listed together in Table 1. In the calculations, the training of both classifiers uses the SMO algorithm; the relevant SMO parameters are as follows:
tol is the tolerance for judging whether the KKT conditions are satisfied; tol = 10^-2 is taken.
μ is the threshold for deciding whether a newly calculated α2 should be discarded; it should be a small positive number, and μ = 10^-3 is taken. If the absolute difference between the α2 values of two successive calculations is greater than μ, α2 is updated; otherwise the newly calculated α2 is discarded and the previous value is retained.
Compared with the basic SVM classifier, the NL-SVM classifier using the reduction algorithm of the present invention achieves a slightly higher classification accuracy while the number of samples finally participating in training is reduced to 939 groups, and the training time used is less than 1/15 of that of the former; the improvement in training speed is evident. At the same time, the number of support vectors obtained by NL-SVM is also reduced, to 128, only about 1/3 of that of the basic SVM, which helps shorten the classification time.
(3) Engine fault classification results
To classify the 5 fault classes, 10 binary SVM classifiers need to be constructed according to the one-against-one multi-class classification method, with the result decided by voting. Similarly to the NON-LP binary classifier, the binary NL-SVM classifiers trained from the other fault sample data all achieve satisfactory results. Finally, for the 2500 groups of test samples of the 5 engine fault classes NON, LP, HP, LT, and HT, 2419 samples are classified correctly and only 81 are misclassified, a classification accuracy of 96.8%.
Since multi-class fault classification requires training multiple binary classifiers, the shortening of the total training time achieved with the algorithm proposed by the present invention is all the more significant.
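The one-against-one scheme above trains k(k−1)/2 = 10 binary classifiers for the k = 5 classes and decides by majority vote. A minimal sketch, using the class names from the embodiment (the pairwise classifier passed in is an illustrative stand-in):

```python
from itertools import combinations

CLASSES = ["NON", "LP", "HP", "LT", "HT"]

def ovo_vote(pairwise_winner, classes=CLASSES):
    """Decide the class of one sample by one-against-one voting.
    pairwise_winner(a, b) returns whichever of a, b the trained (a, b)
    binary classifier prefers for this sample."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):   # 10 pairs for 5 classes
        votes[pairwise_winner(a, b)] += 1
    return max(classes, key=lambda c: votes[c])

# A sample for which every pairwise classifier involving "LP" picks "LP":
print(ovo_vote(lambda a, b: "LP" if "LP" in (a, b) else a))  # → LP
```

Each test sample passes through all 10 binary NL-SVM classifiers, and the class collecting the most votes is output.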
In the description of this specification, references to the terms "one embodiment", "example", "specific example", and the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the present invention disclosed above are intended only to help illustrate the present invention. The preferred embodiments do not describe all the details, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments were chosen and specifically described in order to better explain the principles and practical application of the present invention, so that those skilled in the art can better understand and utilize the present invention. The present invention is limited only by the claims and their full scope and equivalents.
Claims (6)
1. A fault diagnosis method based on support vector machines with large-scale training sets, comprising a large-scale training set L and a support vector machine SVM, characterized in that: an initial SVM classifier F0 is first obtained by training on a small-scale sample set S0; the large-scale training set L is then pruned with the SVM classifier F0 to obtain a reduced set S of much smaller scale after pruning; and the reduced set S is used to train the final fault classifier;
for training samples (xi, yi) ∈ R^l × {+1, −1}, i = 1, 2, …, n, when performing two-class fault classification, the optimization problem solved by the support vector machine SVM is the dual quadratic program
min W(α) = (1/2) Σi Σj yiyjαiαjK(xi, xj) − Σi αi, s.t. Σi yiαi = 0, 0 ≤ αi ≤ C; (1)
the decision rule for a test sample x is f(x) = sgn(g(x)), with g(x) = Σi yiαiK(xi, x) + b;
since the training set is large, the overhead in computation and storage is very large and direct solution of the optimization problem is impractical, so the SMO algorithm is needed for the solution.
2. The fault diagnosis method based on support vector machines with large-scale training sets according to claim 1, characterized in that the specific steps are as follows:
(1) randomly select a small-scale sample set S0 from the large-scale sample set L, and train on the small-scale sample set S0 to obtain the initial SVM classifier F0;
(2) prune the large-scale sample set L to obtain the reduced set S, then train on the reduced set S to obtain the final classifier;
let the separating hyperplane of the SVM classifier F0 be H; for any sample x of the large-scale sample set L, if the distance from x to H is r, then
r = g(x)/||w||,
where w is the weight vector of the support vector machine, and the hyperplanes through the two classes of support vectors lie at distances ±1/||w|| from H.
3. the method for diagnosing faults according to claim 1 based on Large-Scale Training Data Set support vector machines, it is characterised in that:
The concrete methods of realizing solved using the SMO algorithm is as follows:
κ=K (x1,x2)+K(x2,x2)-2K(x1,x2)
Wherein, the calculated value of the expression previous step of " old ", the new calculated value that the expression of mark " new " obtains are marked;
It, will E in calculating formula (3) every time when being updated calculatingiObjective function W (α) in (i=1,2) and formula (1), if directly
It connects and is calculated by formula (1), (3), need to utilize all training samples, when the training sample number is big, amount of calculation
It is not small, in order to reduce computing cost, their calculation formula is rewritten, it is not necessary to it directly calculates every time, but sufficiently benefit
With last calculated result, iterative calculation;
(1) update of threshold value b
It obtainsAfterwards, if havingThen first according to the initial value of threshold value b, b in KKT condition newer (2)
It is taken as zero:
If simultaneously alsoThen
(2)EiIterative calculation
EiIt is the difference between anticipation function output and desired output, can iterates to calculate as the following formula
The operation times of formula (3) are O (n) magnitude;Compared with formula (3), last calculated result is utilized in formula (4), it is only necessary to weight
New root is calculated according to two samples, needs to do 4 multiplication, 6 sub-additions, it is clear that in the case where training sample is larger, income
It is larger;
(3) Update calculation of the objective function W(α)
Calculating the objective function by formula (1) requires O(n²) operations, whereas formula (5) requires only O(n); clearly, on the large-scale sample set L, formula (5) requires far less computation than formula (1).
In addition, during the iterative process the objective function value at α2 = L or α2 = H must also be calculated; here, too, only the increment of the objective function need be computed and added to the previous calculated result.
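The two-variable optimization step that these updates serve, including the quantity κ defined above, can be sketched as follows; this is the standard SMO pair update under the usual box constraints, not necessarily the patented variant, and all names are illustrative:

```python
def smo_pair_update(a1, a2, y1, y2, E1, E2, K11, K22, K12, C):
    """One two-variable SMO step: kappa = K11 + K22 - 2*K12 is the second
    derivative of the objective along the constraint line; the unconstrained
    optimum for a2 is then clipped to the feasible segment [L, H]."""
    kappa = K11 + K22 - 2.0 * K12
    if y1 != y2:
        L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        L, H = max(0.0, a1 + a2 - C), min(C, a1 + a2)
    a2_new = a2 + y2 * (E1 - E2) / kappa   # assumes kappa > 0
    a2_new = min(H, max(L, a2_new))        # clip to [L, H]
    a1_new = a1 + y1 * y2 * (a2 - a2_new)  # preserve the equality constraint
    return a1_new, a2_new
```

When a2_new lands exactly on L or H, the objective value at that bound is what the text above computes incrementally.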
4. The fault diagnosis method based on a large-scale training data set support vector machine according to claim 2, characterized in that: the specific method for trimming the large-scale sample set L is: if x is misclassified by the SVM classifier F0, the sample is retained; if x is correctly classified by the SVM classifier F0, x is retained when 1-ε ≤ |g(x)| ≤ 1+ε and deleted otherwise, wherein 0 < ε < 1 is an adjustable threshold.
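The trimming rule of claim 4 can be sketched as follows, where the hypothetical `g` stands for the decision function g(x) of the classifier F0 and `labels` are the true classes in {-1, +1}; a minimal sketch, not the patented implementation:

```python
def trim_sample_set(samples, labels, g, eps=0.2):
    """Keep a sample if F0 misclassifies it, or if it is correctly
    classified but lies near the margin: 1-eps <= |g(x)| <= 1+eps.
    All other correctly classified samples are deleted."""
    assert 0 < eps < 1  # the adjustable threshold of claim 4
    kept = []
    for x, y in zip(samples, labels):
        gx = g(x)
        misclassified = y * gx <= 0            # wrong side of (or on) the boundary
        near_margin = 1 - eps <= abs(gx) <= 1 + eps
        if misclassified or near_margin:
            kept.append((x, y))
    return kept
```

With `g = lambda x: x` and eps = 0.2, a correctly classified sample far from the margin (g(x) = 2.0) is deleted, while one on the margin (g(x) = 1.0) and a misclassified one (y = 1, g(x) = -0.5) are retained.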
5. The fault diagnosis method based on a large-scale training data set support vector machine according to claim 4, characterized in that: the basis for determining the scale of the small-scale sample set S0 is that the training cost is kept low while ensuring that the SVM classifier F0 obtained by training has a certain classification accuracy.
6. The fault diagnosis method based on a large-scale training data set support vector machine according to claim 2, characterized in that: adjusting the threshold ε serves two functions: controlling the scale of the reduced set S and influencing the classification accuracy of the final classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910062336.3A CN109816016A (en) | 2019-01-23 | 2019-01-23 | Method for diagnosing faults based on Large-Scale Training Data Set support vector machines |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109816016A true CN109816016A (en) | 2019-05-28 |
Family
ID=66604893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910062336.3A Pending CN109816016A (en) | 2019-01-23 | 2019-01-23 | Method for diagnosing faults based on Large-Scale Training Data Set support vector machines |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109816016A (en) |
Non-Patent Citations (1)
Title |
---|
XU Qihua et al.: "Engine fault diagnosis based on large-scale training set SVM", Journal of Aerospace Power |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377732A (en) * | 2019-06-27 | 2019-10-25 | 江苏大学 | A method of the text classification based on sample scaling |
CN110601578A (en) * | 2019-09-24 | 2019-12-20 | 西南交通大学 | Space vector modulation method with nearest level equivalence |
CN110601578B (en) * | 2019-09-24 | 2021-02-12 | 西南交通大学 | Space vector modulation method with nearest level equivalence |
CN111337263A (en) * | 2020-02-12 | 2020-06-26 | 中国民航大学 | Fault diagnosis method for engine turbine disk |
CN112085060A (en) * | 2020-08-07 | 2020-12-15 | 中国民航大学 | Dual-polarization meteorological radar precipitation particle classification method and device based on SVT-DTSVMs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190528 |