CN106482967A - Cost-sensitive support vector machine locomotive wheel detection system and method

Info

Publication number
CN106482967A
Authority
CN
China
Prior art keywords
module
support vector
sensitive support
cost sensitive
vector machines
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610880518.8A
Other languages
Chinese (zh)
Other versions
CN106482967B (en)
Inventor
何静
刘林凡
张昌凡
谭海湖
赵凯辉
孙健
豆兵兵
刘光伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University of Technology
Original Assignee
Hunan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University of Technology
Priority to CN201610880518.8A
Publication of CN106482967A
Application granted
Publication of CN106482967B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M17/00 Testing of vehicles
    • G01M17/007 Wheeled or endless-tracked vehicles
    • G01M17/013 Wheels
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cost-sensitive support vector machine (CS-SVM) system and method for detecting locomotive wheel condition. The system comprises a data preprocessing module, a cost-sensitive support vector machine training module, a parameter optimization module, an optimal cost-sensitive support vector machine classification module, a discrimination module and a wheel condition output module. The detection method comprises eight steps. The parameter optimization step uses an adaptive mutation particle swarm optimization (PSO) algorithm, a mutation-based variant of particle swarm optimization with strong robustness and search capability: it re-expands the search space that shrinks over the iterations so that the search continues in a larger space, maintains population diversity, and raises the likelihood that the algorithm finds the optimal value.

Description

Cost-sensitive support vector machine locomotive wheel detection system and method
Technical field
The invention belongs to the technical field of measurement engineering, and more particularly relates to a cost-sensitive support vector machine locomotive wheel detection system and method.
Background
The support vector machine (SVM) is a practical method developed from statistical learning theory. It addresses machine learning under small-sample conditions, is based on structural risk minimization, and holds an important position in pattern recognition and machine learning. Because the SVM effectively overcomes shortcomings of other machine learning methods such as over-fitting, under-fitting, poor generalization and local minima, it is now widely used in state recognition, fault detection and similar fields. The prior art includes an SVM-based method for predicting wheel slip (idling) of high-speed trains, but that method modifies the SVM classification threshold manually and requires a large amount of experimentation; the fuzzy support vector machine proposed for imbalanced data generalizes poorly.
To ensure the safe operation of heavy-haul locomotives, it is essential to monitor wheel condition. A locomotive wheel condition detection system must detect the wheel state accurately and in real time as rail surface conditions change. When the rail surface changes substantially, the locomotive may slide or its wheels may slip (idle); without timely detection and handling, this can cause serious safety problems.
In practice, samples of normal wheel operation far outnumber samples of fault states, and class-imbalanced data sets usually carry unequal misclassification costs. The cost-sensitive support vector machine, an emerging learning machine, can classify such data samples but still leaves room for improvement. Parameter selection is one of the problems most in need of a solution: the values of the penalty factors, the kernel function and the kernel parameter directly affect the recognition accuracy and efficiency of the classifier.
Content of the invention
In order to overcome data sample misclassification that traditional Cost Sensitive Support Vector Machines are present and cost problem not etc., And then be a kind of intelligent optimizing algorithm based on colony in view of TSP question particle cluster algorithm, with very strong robustness, kind The advantages of group's diversity and searching characteristic.Small sample is had the characteristics that according to heavy loading locomotive wheel condition gathered data, and The relation of locomotive adhesion coefficient and creep speed assumes the feature of nonlinearity, can be using Cost Sensitive Support Vector Machines come right Wheel condition sets up disaggregated model.Thus, the present invention proposes the cost-sensitive of TSP question particle cluster algorithm Optimal Parameters The method classified to heavy loading locomotive wheel condition by SVMs.For heavy loading locomotive, in actual motion, normal condition is remote The unbalanced problem of the data that formed more than malfunction, sets up locomotive wheel state classification using Cost Sensitive Support Vector Machines Model, and relevant parameter is optimized using TSP question particle cluster algorithm, realize heavy loading locomotive wheel condition classification and Detection.
Specifically, the invention provides the optimal cost sensitivity that a kind of utilization TSP question particle cluster algorithm is searched out is propped up Hold vector machine model, it is to avoid blindness and inaccuracy that parameter is selected, while detection classification rate is improve, cost-sensitive is propped up Hold vector machine to classify the unbalanced sample two of classification with remarkable classification performance,.
Technical scheme is as follows:
A cost-sensitive support vector machine locomotive wheel condition detection system comprises a data preprocessing module, a cost-sensitive support vector machine training module, a parameter optimization module, an optimal cost-sensitive support vector machine classification module, a discrimination module and a wheel condition output module.
The output of the data preprocessing module is connected to the input of the training module; the output of the cost-sensitive support vector machine training module is connected to the input of the parameter optimization module; the output of the parameter optimization module is connected to the input of the optimal cost-sensitive support vector machine classification module; the output of the optimal cost-sensitive support vector machine classification module is connected to the input of the discrimination module; and the output of the discrimination module is connected to the input of the parameter optimization module and to the input of the wheel condition output module.
Further, the preprocessing module divides locomotive wheel states into two types, safe zone and fault zone. A data source is selected and features are extracted from it; the extracted feature variables form the real-time sample data set, which is divided into a training set and a test set, and the training set and test set samples are normalized.
Further, the parameter optimization module uses the adaptive mutation PSO algorithm to find the two penalty factors and the kernel function parameter of the cost-sensitive support vector machine.
Further, the adaptive mutation PSO algorithm is an effective method for solving such problems. Its implementation is very simple, it has few parameters to tune, it needs no gradient information, and it is widely used in function optimization, combinatorial optimization and many engineering fields.
Particle swarm optimization is a heuristic algorithm that simulates the movement of bird flocks. Its update formulas are:
$v_{id}^{k+1} = \omega v_{id}^{k} + c_1 r_1 (p_{id} - x_{id}^{k}) + c_2 r_2 (p_{gd} - x_{id}^{k})$ (1)
$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}$ (2)
In the formulas, the position of the i-th particle is the vector $x_i = (x_{i1}, x_{i2}, \ldots, x_{id}, \ldots, x_{iD})$ with $x_{id} \in [x_{\min,d}, x_{\max,d}]$, where $d = 1, 2, \ldots, D$, D is the dimension of the solution space, and $x_{\min,d}$ and $x_{\max,d}$ are the lower and upper bounds of the d-th dimension. The velocity of the particle is $v_i = (v_{i1}, v_{i2}, \ldots, v_{id}, \ldots, v_{iD})$, and its maximum velocity is limited to $v_{\max} = (v_{\max,1}, v_{\max,2}, \ldots, v_{\max,d}, \ldots, v_{\max,D})$. The best position visited by the i-th particle is recorded as the local optimum $p_i = (p_{i1}, p_{i2}, \ldots, p_{id}, \ldots, p_{iD})$, and the best position found in the whole population is recorded as the global optimum $p_g$. $\omega$ is the inertia weight and k is the current iteration number; $c_1$ and $c_2$ are the acceleration factors, and $r_1$ and $r_2$ are random numbers distributed in [0, 1].
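The update in equations (1) and (2) can be sketched for a single particle as follows. This is a minimal illustration rather than the patent's implementation: the function name `pso_step`, the default values of `w`, `c1`, `c2`, and the clamping of velocity and position to their limits are assumptions chosen for the example.

```python
import random


def pso_step(x, v, p_best, g_best, w=0.7, c1=1.5, c2=1.5,
             x_bounds=(-5.0, 5.0), v_max=1.0, rng=random):
    """Apply one velocity and position update, equations (1) and (2).

    x, v, p_best and g_best are lists of floats of equal length.
    The default parameter values are illustrative assumptions.
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()  # uniform in [0, 1)
        # Equation (1): inertia + attraction to own best + attraction to global best.
        vd = (w * v[d]
              + c1 * r1 * (p_best[d] - x[d])
              + c2 * r2 * (g_best[d] - x[d]))
        vd = max(-v_max, min(v_max, vd))  # enforce the velocity limit
        # Equation (2): move the particle, then keep it inside the search bounds.
        xd = max(x_bounds[0], min(x_bounds[1], x[d] + vd))
        new_v.append(vd)
        new_x.append(xd)
    return new_x, new_v
```

A particle that already sits on both its own best and the global best with zero velocity does not move, which is a quick sanity check of the update.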
Particle swarm optimization converges quickly and is highly general, but it is prone to premature convergence, has relatively low search precision and iterates inefficiently in later stages. To improve its performance, an adaptive mutation operation is introduced to counter premature convergence and the gradual decline of search capability as the number of generations grows. The operation is described as:
IfThen
where v is the re-initialized velocity and $r_3$ is a random number uniformly distributed in the interval [0, 1]. If $r_3 > 0.5$, then $k = -1$; if $r_3 \le 0.5$, then $k = 1$.
Without changing the properties of equations (1) and (2), they are written for convenience as:
$v_{k+1} = k_0 v_k + c_1 r_1 (p_1 - x_k) + c_2 r_2 (p_2 - x_k)$ (5)
$x_{k+1} = x_k + v_{k+1}$ (6)
Choose
Take following simplification measure:
Equations (5) and (6) then reduce to:
$v_{k+1} = k_0 v_k + k_1 (p - x_k)$ (10)
$x_{k+1} = x_k + v_{k+1}$ (11)
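One substitution that carries equation (5) into equation (10) term by term is shown below. It is offered as an assumption, since the intermediate definitions are not legible in this text; it lumps the two attraction terms into a single gain $k_1$ and a single attractor $p$.

```latex
\[
k_1 = c_1 r_1 + c_2 r_2, \qquad
p = \frac{c_1 r_1\, p_1 + c_2 r_2\, p_2}{c_1 r_1 + c_2 r_2}
\]
\[
k_1 (p - x_k)
  = c_1 r_1 p_1 + c_2 r_2 p_2 - (c_1 r_1 + c_2 r_2)\, x_k
  = c_1 r_1 (p_1 - x_k) + c_2 r_2 (p_2 - x_k)
\]
```

With this choice the right-hand side of (10) equals that of (5), so the position update (11) is unchanged from (6).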
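Equations (10) and (11) can be written in the form $y_{k+1} = M y_k + N p$ with $y_k = [x_k \; v_k]^T$, where
$M = \begin{bmatrix} 1 - k_1 & k_0 \\ -k_1 & k_0 \end{bmatrix}, \quad N = \begin{bmatrix} k_1 \\ k_1 \end{bmatrix}$ (12)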
During the adaptive mutation particle swarm search, the particle converges to:
$y_i = [x_i \; v_i]^T$ (13)
where $x_i = p$ and $v_i = 0$.
The parameter search time of a particle depends on the eigenvalues $\lambda_1, \lambda_2$ of matrix M, whose characteristic equation is:
$\lambda^2 - (k_0 - k_1 + 1)\lambda + k_0 = 0$ (14)
The necessary and sufficient condition for convergence to (13) is that the eigenvalues $\lambda_1, \lambda_2$ of M both have magnitude less than 1. Solving formula (14) then gives the following conditions:
$k_0 < 1, \quad k_1 > 0, \quad 2k_0 - k_1 + 2 > 0$ (15)
Because the random numbers are uniformly distributed in the interval [0, 1], the adaptive mutation particle swarm optimization algorithm proposed by the invention satisfies the constraints in (15).
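The link between the characteristic equation (14) and the conditions in (15) can be checked numerically. The sketch below solves (14) with the quadratic formula and compares the largest eigenvalue magnitude with the conditions; the function names and the sample values of $k_0$ and $k_1$ are assumptions chosen for illustration.

```python
import cmath


def spectral_radius(k0, k1):
    """Largest magnitude among the roots of equation (14):
    lambda^2 - (k0 - k1 + 1) * lambda + k0 = 0."""
    b = k0 - k1 + 1.0
    disc = cmath.sqrt(b * b - 4.0 * k0)  # complex sqrt handles oscillatory cases
    return max(abs((b + disc) / 2.0), abs((b - disc) / 2.0))


def satisfies_condition_15(k0, k1):
    """Conditions (15): k0 < 1, k1 > 0 and 2*k0 - k1 + 2 > 0."""
    return k0 < 1 and k1 > 0 and 2 * k0 - k1 + 2 > 0


# Illustrative parameter pairs (assumed values, not taken from the patent).
for k0, k1 in [(0.7, 1.5), (0.7, 4.0), (1.2, 1.0)]:
    print(k0, k1, satisfies_condition_15(k0, k1), round(spectral_radius(k0, k1), 4))
```

For each sample pair, the conditions hold exactly when the spectral radius is below 1, which is the convergence requirement stated for equation (13).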
Further, the training module of the cost-sensitive support vector machine processes the training set samples; the cost-sensitive support vector machine classification module stores the support vector library obtained after the cost-sensitive support vector machine is trained on the samples; the discrimination module evaluates the performance of the trained cost-sensitive support vector machine classifier and, if a qualifying classification accuracy is reached, passes the result to the wheel condition output module.
Further, a detection method using the parameter-optimized cost-sensitive support vector machine locomotive wheel condition detection system comprises the following steps:
Step 1: locomotive wheel states are divided into two types, safe zone and fault zone;
Step 2: a data source is selected and features are extracted from it; the extracted feature variables form the real-time sample data set, which is divided into two subsets, a training set and a test set;
Step 3: the data preprocessing module preprocesses the real-time sample data set, normalizing the training set and test set samples so that their values lie between 0 and 1; the normalized training set is then passed to the cost-sensitive support vector machine training module as input vectors;
Step 4: the training module processes the normalized training set; training the cost-sensitive support vector machine yields a set of support vectors, which are stored in a database;
Step 5: the parameter optimization module performs a global search over the kernel parameter and the two penalty factors of the cost-sensitive support vector machine using the adaptive mutation PSO algorithm, yielding the optimal cost-sensitive support vector machine;
Step 6: the classification module classifies the real-time sample data in the test set, and stores the support vector library obtained after the cost-sensitive support vector machine is trained on the real-time sample data set;
Step 7: accuracy discrimination; the feature variables in the test set are input to the optimal cost-sensitive support vector machine model for testing, and the model outputs a recognition result, either safe state or fault state; the output is compared with the actual locomotive wheel state and the recognition accuracy is calculated;
Step 8: wheel condition output; if the recognition accuracy meets the requirement, the wheel condition is output; otherwise the procedure is repeated.
Further, because the variables in the collected heavy-haul locomotive data samples differ greatly in magnitude, the training set and prediction set samples are normalized to the interval [0, 1] before the cost-sensitive support vector machine classification model is built. The normalization is:
$x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
where x and x' are the values before and after normalization, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the variable over the samples.
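A minimal sketch of normalizing one feature column to [0, 1] is given below, assuming ordinary min-max scaling. The function name `min_max_normalize` and the choice to take the minimum and maximum from the training set only (so that test values can fall slightly outside [0, 1]) are assumptions, not details stated in the patent.

```python
def min_max_normalize(train_col, test_col):
    """Scale one feature column with min-max normalization.

    The minimum and maximum are taken from the training column and reused
    for the test column (an assumed convention).
    """
    lo, hi = min(train_col), max(train_col)
    span = hi - lo
    if span == 0:
        # Constant feature: map everything to 0.0 to avoid division by zero.
        return [0.0] * len(train_col), [0.0] * len(test_col)
    scale = lambda x: (x - lo) / span
    return [scale(x) for x in train_col], [scale(x) for x in test_col]


train, test = min_max_normalize([2.0, 4.0, 6.0], [3.0])
print(train, test)  # [0.0, 0.5, 1.0] [0.25]
```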
Further, in step 6 the sample data are classified using a Gaussian radial basis kernel, a linear kernel, a polynomial kernel or a two-layer perceptron kernel function.
Further, the parameter optimization in step 5 comprises the following steps:
1. Initialize the parameters of the adaptive mutation particle swarm;
2. Calculate the fitness;
3. Find the individual extremum and the population extremum;
4. Update the particle velocity and position;
5. After each particle update, re-initialize the particle with a certain probability;
6. Recalculate the particle fitness;
7. For each particle, compare its fitness with that of the best position it has visited; if it is better, take it as the current best position, completing the update of the individual extremum and the population extremum;
8. Check the termination condition: if the termination criterion is not met, go to step 4; otherwise go to the next step. The termination criterion is a predefined number of iterations or a preset precision of the adaptive mutation PSO algorithm;
9. Output the two penalty factors and the kernel function parameter carried by the particle when the iteration terminates.
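The nine steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the patent's code: the fitness used here is a toy sphere function standing in for the cost-sensitive SVM's classification performance, the mutation in step 5 is implemented as a uniform re-initialization of the particle's position (the patent's own mutation formula is not reproduced in this text), and all names and default values are hypothetical.

```python
import random


def sphere(x):
    # Stand-in fitness (assumption); lower is better.
    return sum(xi * xi for xi in x)


def mutation_pso(fitness, dim=3, n_particles=20, max_iter=100,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5,
                 p_mut=0.1, tol=1e-8, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    v_max = 0.2 * (hi - lo)
    # Step 1: initialize positions and velocities.
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    # Steps 2-3: initial fitness, individual bests and population best.
    p_best = [x[:] for x in xs]
    p_val = [fitness(x) for x in xs]
    g = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g][:], p_val[g]
    for _ in range(max_iter):
        for i in range(n_particles):
            # Step 4: velocity and position update.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vs[i][d]
                     + c1 * r1 * (p_best[i][d] - xs[i][d])
                     + c2 * r2 * (g_best[d] - xs[i][d]))
                vs[i][d] = max(-v_max, min(v_max, v))
                xs[i][d] = max(lo, min(hi, xs[i][d] + vs[i][d]))
            # Step 5: re-initialize the particle with probability p_mut.
            if rng.random() < p_mut:
                xs[i] = [rng.uniform(lo, hi) for _ in range(dim)]
            # Steps 6-7: recompute fitness, update individual and population bests.
            val = fitness(xs[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = xs[i][:], val
                if val < g_val:
                    g_best, g_val = xs[i][:], val
        # Step 8: stop early once the preset precision is reached.
        if g_val < tol:
            break
    # Step 9: return the best parameters found.
    return g_best, g_val


best, value = mutation_pso(sphere)
print(best, value)
```

In the patent's setting each particle would hold the two penalty factors and the kernel parameter, and the fitness would come from training and evaluating the cost-sensitive SVM; the mutation keeps some particles exploring the full search range even after the rest of the swarm has clustered.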
Further, the optimal cost-sensitive support vector machine parameters are found with the adaptive mutation PSO algorithm. After preprocessing of the training set and test set and fitness evaluation, the particle velocities are updated and the particles are re-initialized by the adaptive mutation operation. The algorithm then checks whether the termination precision is met or the current iteration count equals the maximum number of iterations; if so, it outputs the optimal misclassification cost parameters $C_1, C_2$ and the optimal kernel width σ; otherwise the adaptive mutation PSO iteration is repeated.
Further, the update formula of the adaptive mutation PSO algorithm is:
IfThen
where v is the re-initialized velocity and $r_3$ is a random number uniformly distributed in the interval [0, 1]. If $r_3 > 0.5$, then $k = -1$; if $r_3 \le 0.5$, then $k = 1$.
Description of the drawings
Fig. 1 is a schematic diagram of the cost-sensitive support vector machine locomotive wheel condition detection system;
Fig. 2 is a schematic diagram of the steps of the cost-sensitive support vector machine locomotive wheel condition detection method;
Fig. 3 is a schematic diagram of the steps of the adaptive mutation PSO algorithm;
Fig. 4 is a schematic diagram of imbalanced data classification;
Fig. 5 is a visualization of SVM classification with the adaptive mutation PSO algorithm;
Fig. 6 shows the classification results on the test set;
Fig. 7 is the fitness curve of the adaptive mutation PSO parameter search;
Fig. 8 is a visualization of CS-SVM classification with the adaptive mutation PSO algorithm;
Fig. 9 shows the classification results on the test set;
Fig. 10 is the fitness curve of the genetic algorithm's search for the optimal parameters;
Fig. 11 is a visualization of CS-SVM classification with the genetic algorithm;
Fig. 12 shows the classification results on the test set;
Fig. 13 compares the ROC curves of SVM and CS-SVM.
Detailed description of the embodiments
The invention is further described below with reference to specific embodiments. The drawings are for illustration only; they are schematic rather than physical drawings and shall not be construed as limiting this patent. To better illustrate the embodiments, some parts in the drawings may be omitted, enlarged or reduced, and do not represent the size of the actual product. Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.
Embodiment 1
As shown in Fig. 1, a cost-sensitive support vector machine locomotive wheel condition detection system comprises a data preprocessing module 1, a cost-sensitive support vector machine training module 2, a parameter optimization module 3, an optimal cost-sensitive support vector machine classification module 4, a discrimination module 5 and a wheel condition output module 6.
The output of data preprocessing module 1 is connected to the input of training module 2; the output of cost-sensitive support vector machine training module 2 is connected to the input of parameter optimization module 3; the output of parameter optimization module 3 is connected to the input of optimal cost-sensitive support vector machine classification module 4; the output of classification module 4 is connected to the input of discrimination module 5; and the output of discrimination module 5 is connected to the input of parameter optimization module 3 and to the input of wheel condition output module 6.
As shown in Fig. 2, a detection method using the cost-sensitive support vector machine locomotive wheel condition detection system comprises the following steps:
Step 1: locomotive wheel states are divided into two types, safe zone and fault zone;
Step 2: a data source is selected and features are extracted from it; the extracted feature variables form the real-time sample data set, which is divided into two subsets, a training set and a test set;
Step 3: the data preprocessing module preprocesses the real-time sample data set, normalizing the training set and test set samples so that their values lie between 0 and 1; the normalized training set is then passed to the cost-sensitive support vector machine training module as input vectors;
Step 4: the training module processes the normalized training set; training the cost-sensitive support vector machine yields a set of support vectors, which are stored in a database;
Step 5: the parameter optimization module performs a global search over the kernel parameter and the two penalty factors of the cost-sensitive support vector machine using the adaptive mutation PSO algorithm, yielding the optimal cost-sensitive support vector machine;
Step 6: the classification module classifies the real-time sample data in the test set, and stores the support vector library obtained after the cost-sensitive support vector machine is trained on the real-time sample data set;
Step 7: accuracy discrimination; the feature variables in the test set are input to the optimal cost-sensitive support vector machine model for testing, and the model outputs a recognition result, either safe state or fault state; the output is compared with the actual locomotive wheel state and the recognition accuracy is calculated;
Step 8: wheel condition output; if the recognition accuracy meets the requirement, the wheel condition is output; otherwise the procedure is repeated.
Embodiment 2
Data imbalance means that the two classes taking part in classification differ greatly in sample count, which skews the separating hyperplane.
In Fig. 4, circular points represent the positive class and square points represent the negative class. H, H1 and H2 are the classification surfaces computed from the given sample set. Because there are few negative samples, some points that really belong to the negative class were not provided, such as the two square points on H4 in Fig. 4. If these two points had been provided, the computed classification surfaces would be H1, H3 and H4, which clearly differ greatly from the earlier result. Because of this skew, the more numerous positive class "pushes" the classification surface toward the negative class, which harms the accuracy of the result. When a standard support vector machine is applied to imbalanced data samples, the separating hyperplane tends to shift. A cost-sensitive support vector machine with different penalty parameters is therefore introduced: the class with more samples receives a smaller penalty parameter, and the class with fewer samples receives a larger one.
The adhesion coefficient μ and the creep speed $v_s$ form the input $x_i = [\mu, v_s]_i$ of the cost-sensitive support vector machine model, and the locomotive wheel state label is the model output $y_i \in \{+1, -1\}$, so the training sample set is $\{(x_i, y_i)\}_{i=1}^{n}$. The cost-sensitive support vector machine uses a nonlinear mapping $\phi(\cdot)$ to map the input-space samples x into a high-dimensional feature space H, and builds the classification function in H by structural risk minimization:
$y' = \operatorname{sgn}(w^T \phi(x) + b)$ (1)
where w is the weight vector, $w \in H$; b is the bias, $b \in R$; and y' is the predicted value.
For the cost-sensitive support vector machine, misclassified samples of different classes are given different misclassification costs $C_1, C_2$. Using structural risk minimization, the cost-sensitive support vector machine optimization problem is:
$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C_1 \sum_{i \in I_+} \xi_i + C_2 \sum_{i \in I_-} \xi_i$ (2)
s.t. $y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i, \quad i = 1, 2, \ldots, n$
$\xi_i \ge 0, \quad i = 1, 2, \ldots, n$
$C_i \ge 0, \quad i = 1, 2$
where $C_1, C_2$ are the misclassification cost parameters, w characterizes the model complexity, $\xi_i$ is the error (slack) term, $y_i \in \{+1, -1\}$, $I_+ = \{i : y_i = +1\}$ and $I_- = \{i : y_i = -1\}$.
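To make the role of the two cost parameters concrete, the sketch below evaluates the objective of problem (2) for a given linear model, taking each slack as the hinge value max(0, 1 - y(w·x + b)). The linear feature map (φ taken as the identity), the function name and the sample numbers are assumptions for illustration only.

```python
def cs_svm_objective(w, b, X, y, c_pos, c_neg):
    """Cost-sensitive SVM primal objective for a linear model.

    c_pos weights slack on samples with label +1 (C1 in the text),
    c_neg weights slack on samples with label -1 (C2 in the text).
    """
    regulariser = 0.5 * sum(wi * wi for wi in w)  # (1/2) * ||w||^2
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        slack = max(0.0, 1.0 - margin)  # smallest slack satisfying the constraint
        loss += (c_pos if yi == 1 else c_neg) * slack
    return regulariser + loss


# Two positive samples and one negative sample; the negative class costs 10x more.
X = [[2.0, 0.0], [0.5, 0.0], [-0.5, 0.0]]
y = [1, 1, -1]
print(cs_svm_objective([1.0, 0.0], 0.0, X, y, c_pos=1.0, c_neg=10.0))  # 6.0
```

In this example the same margin violation of 0.5 contributes 0.5 for the positive sample but 5.0 for the negative one, which is how the unequal costs pull the decision surface away from the class whose errors are more expensive.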
To solve formula (2), non-negative Lagrange multipliers are introduced to construct the Lagrangian, so that the problem becomes finding the saddle point of the Lagrangian. Taking the partial derivative with respect to each variable and setting it to zero, and then applying the duality principle, converts the problem into the dual problem:
$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$ (3)
s.t. $\sum_{i=1}^{n} \alpha_i y_i = 0$
$0 \le \alpha_i \le C_1, \quad i \in I_+$
$0 \le \alpha_i \le C_2, \quad i \in I_-$
where $\alpha_i$ are the Lagrange multipliers, $C_1, C_2$ are the misclassification cost parameters, $K(x_i, x_j)$ is the kernel function, $y_i \in \{+1, -1\}$, $I_+ = \{i : y_i = +1\}$ and $I_- = \{i : y_i = -1\}$.
The dual problem can thus be solved by choosing suitable parameters $C_1, C_2$ and a suitable kernel function $K(x_i, x_j)$.
In the invention, the parameters that the cost-sensitive support vector machine classification model needs to determine are the misclassification cost parameters $C_1, C_2$ and the kernel parameter δ; their selection and values strongly affect the classification accuracy of the model. To give the cost-sensitive support vector machine optimal performance, the invention uses the adaptive mutation PSO algorithm to optimize $C_1, C_2$ and δ, avoiding the blindness of manual parameter selection. The adaptive mutation PSO algorithm is an effective method for solving such problems: its implementation is very simple, it has few parameters to tune, it needs no gradient information, and it is widely used in function optimization, combinatorial optimization and many engineering fields. Particle swarm optimization is a heuristic algorithm that simulates the movement of bird flocks; its update formulas are:
$v_{id}^{k+1} = \omega v_{id}^{k} + c_1 r_1 (p_{id} - x_{id}^{k}) + c_2 r_2 (p_{gd} - x_{id}^{k})$ (4)
$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}$ (5)
In the formulas, the position of the i-th particle is the vector $x_i = (x_{i1}, x_{i2}, \ldots, x_{id}, \ldots, x_{iD})$ with $x_{id} \in [x_{\min,d}, x_{\max,d}]$, where $d = 1, 2, \ldots, D$, D is the dimension of the solution space, and $x_{\min,d}$ and $x_{\max,d}$ are the lower and upper bounds of the d-th dimension. The velocity of the particle is $v_i = (v_{i1}, v_{i2}, \ldots, v_{id}, \ldots, v_{iD})$, and its maximum velocity is limited to $v_{\max} = (v_{\max,1}, v_{\max,2}, \ldots, v_{\max,d}, \ldots, v_{\max,D})$. The best position visited by the i-th particle is recorded as the local optimum $p_i = (p_{i1}, p_{i2}, \ldots, p_{id}, \ldots, p_{iD})$, and the best position found in the whole population is recorded as the global optimum $p_g$. $\omega$ is the inertia weight and k is the current iteration number; $c_1$ and $c_2$ are the acceleration factors, and $r_1$ and $r_2$ are random numbers distributed in [0, 1].
Particle swarm optimization converges quickly and is highly general, but it is prone to premature convergence, has relatively low search precision and iterates inefficiently in later stages. To improve the performance of the PSO algorithm, a mutation operation is introduced to counter premature convergence and the gradual decline of search capability as the number of generations grows. The operation is described as:
IfThen
where v is the re-initialized velocity and $r_3$ is a random number uniformly distributed in the interval [0, 1]. If $r_3 > 0.5$, then $k = -1$; if $r_3 \le 0.5$, then $k = 1$.
Without changing the properties of equations (4) and (5), they are written for convenience as:
$v_{k+1} = k_0 v_k + c_1 r_1 (p_1 - x_k) + c_2 r_2 (p_2 - x_k)$ (8)
$x_{k+1} = x_k + v_{k+1}$ (9)
Choose
Following measures are taken by equation simplification:
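Equations (8) and (9) then reduce to:
$v_{k+1} = k_0 v_k + k_1 (p - x_k)$ (13)
$x_{k+1} = x_k + v_{k+1}$ (14)
Equations (13) and (14) can be written in the form $y_{k+1} = M y_k + N p$ with $y_k = [x_k \; v_k]^T$, where
$M = \begin{bmatrix} 1 - k_1 & k_0 \\ -k_1 & k_0 \end{bmatrix}, \quad N = \begin{bmatrix} k_1 \\ k_1 \end{bmatrix}$ (15)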
During the adaptive mutation particle swarm search, the particle converges to:
$y_i = [x_i \; v_i]^T$ (16)
where $x_i = p$ and $v_i = 0$.
The parameter search time of a particle depends on the eigenvalues $\lambda_1, \lambda_2$ of matrix M, whose characteristic equation is:
$\lambda^2 - (k_0 - k_1 + 1)\lambda + k_0 = 0$ (17)
The necessary and sufficient condition for convergence to (16) is that the eigenvalues $\lambda_1, \lambda_2$ of M both have magnitude less than 1. Solving formula (17) then gives the following conditions:
$k_0 < 1, \quad k_1 > 0, \quad 2k_0 - k_1 + 2 > 0$ (18)
Because the random numbers are uniformly distributed in the interval [0, 1], the adaptive mutation particle swarm optimization algorithm proposed here satisfies the constraints in (18).
In summary, the steps for detecting heavy-haul locomotive wheel condition by combining the cost-sensitive support vector machine classification method with the adaptive mutation PSO algorithm are as follows:
(1) Data sample normalization. Because the variables in the collected heavy-haul locomotive data samples differ greatly in magnitude, the training set and prediction set samples are normalized to the interval [0, 1] before the CS-SVM classification model is built, using
$x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
where x and x' are the values before and after normalization, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the variable over the samples.
(2) Adaptive mutation PSO optimization of the cost-sensitive support vector machine model parameters. The training set and test set are preprocessed, fitness is evaluated and selection is performed; after the particle velocity update, particles are re-initialized by the adaptive mutation operation. The algorithm then checks whether the termination precision is met or the current iteration count equals the maximum number of iterations; if so, it outputs the optimal misclassification cost parameters $C_1, C_2$ and the optimal kernel width σ; otherwise the adaptive mutation PSO iteration is repeated.
(3) Building the cost-sensitive support vector machine model and classifying. With the optimal parameters obtained in step (2), a cost-sensitive support vector machine classification model is trained with the Gaussian radial basis kernel; the model obtained from the training set classifies the test set samples, and the data are de-normalized.
(4) Evaluating the classification model's performance indices, namely classification accuracy and elapsed time. If they do not meet the requirement, return to step (2) and reset the adaptive mutation PSO parameters.
(5) Comparing actual and predicted values to obtain the classification accuracy of the model.
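Step (5) amounts to comparing predicted and actual labels. A minimal sketch is shown below; the function names are hypothetical, and the per-class recall is an added assumption, included because overall accuracy alone can look high on imbalanced data even when the rare fault class is mostly missed.

```python
def accuracy(actual, predicted):
    """Fraction of samples whose predicted label equals the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def recall_for(label, actual, predicted):
    """Fraction of samples of the given class that were predicted correctly."""
    hits = [p == label for a, p in zip(actual, predicted) if a == label]
    return sum(hits) / len(hits)


# Labels: +1 and -1 as in the model output; the values are made up for illustration.
actual = [1, 1, 1, 1, 1, 1, -1, -1]
predicted = [1, 1, 1, 1, 1, 1, -1, 1]
print(accuracy(actual, predicted))        # 0.875
print(recall_for(-1, actual, predicted))  # 0.5
```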
The specific parameter optimization steps are shown in Figure 3:
1. Initialize the parameters of the adaptive mutation particle swarm;
2. Calculate the fitness;
3. Find the individual extremum and the global extremum;
4. Update the particle velocity and position;
5. After each particle update, reinitialize the particle with a certain probability;
6. Recalculate the particle fitness;
7. For each particle, compare its fitness with that of the best position it has visited; if the new fitness is better, take the new position as the current best position, completing the update of the individual and global extrema;
8. Check the termination condition: if the termination criterion is not met, go to step 4; otherwise go to the next step. The termination criterion is a predefined number of iterations or the preset precision of the adaptive mutation particle swarm algorithm;
9. Output the two penalty factors and the kernel function parameter held by the particle at the final iteration.
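The nine steps above can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation: the inertia and acceleration constants, the mutation probability, the search bounds and the toy fitness function are all assumptions, and the mutation step simply redraws a particle's position uniformly within the bounds because the exact mutation formula is not recoverable from this text. In practice the fitness would be the classification accuracy of a cost-sensitive SVM trained with the candidate (C1, C2, σ).

```python
import random


def adaptive_mutation_pso(fitness, bounds, n_particles=20, max_iter=100,
                          target=None, mutation_prob=0.1, seed=0):
    """Maximize `fitness` over the box `bounds` using PSO with random re-initialization."""
    rng = random.Random(seed)
    dim = len(bounds)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration constants (assumed values)

    def random_position():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    # Step 1: initialize positions and velocities.
    pos = [random_position() for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Steps 2-3: evaluate fitness, record individual and global bests.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(max_iter):
        for i in range(n_particles):
            # Step 4: velocity and position update, clipped to the bounds.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            # Step 5: with some probability, re-initialize the particle.
            if rng.random() < mutation_prob:
                pos[i] = random_position()
            # Steps 6-7: re-evaluate fitness and update the bests.
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        # Step 8: stop early once the preset precision is reached.
        if target is not None and gbest_fit >= target:
            break
    # Step 9: return the best (C1, C2, sigma) found and its fitness.
    return gbest, gbest_fit


def toy_fitness(p):
    """Hypothetical stand-in for SVM accuracy; peaks at C1=10, C2=50, sigma=2."""
    c1, c2, sigma = p
    return -((c1 - 10.0) ** 2 + (c2 - 50.0) ** 2 + (sigma - 2.0) ** 2)


search_bounds = [(0.1, 100.0), (0.1, 100.0), (0.01, 10.0)]
best, best_fit = adaptive_mutation_pso(toy_fitness, search_bounds)
```

Because the personal and global bests are kept when a particle is re-initialized, the mutation step adds exploration without discarding the best solution found so far.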
Embodiment 3
To verify the effectiveness of the method, the present invention tests it on a heavy-haul locomotive wheel state data set, demonstrating the validity and reliability of the method. The experiment first requires selecting a kernel function and confirming that it is effective on the imbalanced two-class problem, so that the adaptive mutation particle swarm algorithm can better optimize the parameters of the cost-sensitive support vector machine classification model.
1. Selection of the kernel function
In current engineering practice, four kinds of kernel function are commonly used with SVMs:
1) Linear kernel function:
$K(x, x_i) = x^T x_i$;
2) Polynomial kernel function:
$K(x, x_i) = (\delta x^T x_i + r)^d,\ \delta > 0$;
3) Gaussian radial basis kernel function:
$K(x, x_i) = \exp\left(-\|x - x_i\|^2 / (2\delta^2)\right),\ \delta > 0$;
4) Two-layer perceptron kernel function:
$K(x, x_i) = \tanh(\delta x^T x_i + r)$.
In the formulas, δ, r and d are kernel parameters and K is the kernel function.
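As a concrete reading of the four formulas, the kernels can be written in plain Python as below. The function names and default parameter values are illustrative assumptions; only the formulas themselves come from the text.

```python
import math


def dot(x, y):
    """Inner product x^T y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, xi):
    return dot(x, xi)


def polynomial_kernel(x, xi, delta=1.0, r=0.0, d=2):
    return (delta * dot(x, xi) + r) ** d


def rbf_kernel(x, xi, delta=1.0):
    # Squared Euclidean distance ||x - xi||^2.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * delta ** 2))


def perceptron_kernel(x, xi, delta=1.0, r=0.0):
    return math.tanh(delta * dot(x, xi) + r)
```

For example, `rbf_kernel(x, x)` is 1 for any x, and the value decays toward 0 as the two points move apart, at a rate set by the width δ.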
The kernel functions were compared by their classification accuracy on the test set (with [0,1] normalization applied uniformly), as shown in Table 1:
Table 1: Accuracy obtained by training with different kernel functions
In the table, svmtrain is the native training function of the LIBSVM toolbox.
The analysis of Table 1 shows that, for the heavy-haul locomotive wheel state detection classification problem, a support vector machine with the Gaussian radial basis function achieves higher classification accuracy. The Gaussian radial basis function is therefore also adopted as the kernel function of the cost-sensitive support vector machine when addressing the data imbalance problem.
2. Comparison of standard support vector machine classification results
To assess the performance of the CS-SVM model optimized by the adaptive mutation particle swarm algorithm proposed by the present invention, the adaptive mutation particle swarm algorithm, a genetic algorithm and grid search are each used to optimize the parameters of a standard support vector machine. The SVM classification visualization and the classification results for the parameters optimized by the adaptive mutation particle swarm algorithm are shown in Figures 5 and 6.
Figure 5 shows that when a parameter-optimized standard support vector machine is applied to the heavy-haul locomotive wheel state detection classification problem, the creep zone and fault zone classes are imbalanced, which shifts the separating hyperplane toward the fault zone. This leads to the result shown in Figure 6: the fault zone classification accuracy is far lower than that of the creep zone.
The three algorithms above are used to optimize the standard support vector machine parameters and perform classification; the results are shown in Table 2:
Table 2: Classification results of the parameter-optimized support vector machines
Table 2 shows that the classification accuracy obtained with the adaptive mutation particle swarm algorithm is higher than that of the genetic algorithm and grid search, with the shortest run time. Comparing the creep zone and fault zone accuracies shows that the creep zone has the higher classification accuracy, while the fault zone accuracy is low. During locomotive operation, creep zone samples far outnumber fault zone samples. When a standard support vector machine is trained, the fault zone data are too few and the machine is undertrained on that class, so misjudgments increase.
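The gap between overall accuracy and fault zone accuracy can be illustrated with a short sketch. The labels below are made up purely to show the effect of class imbalance and are not data from the patent; the function name is hypothetical.

```python
def per_class_accuracy(y_true, y_pred):
    """Return {class: fraction of that class's samples predicted correctly}."""
    totals, correct = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    return {c: correct[c] / totals[c] for c in totals}


# Made-up imbalanced example: 8 creep-zone samples, 2 fault-zone samples.
y_true = ["creep"] * 8 + ["fault"] * 2
y_pred = ["creep"] * 8 + ["creep", "fault"]  # one fault sample is missed

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
by_class = per_class_accuracy(y_true, y_pred)
```

Here the overall accuracy is 0.9 even though half of the fault zone samples are misclassified, which is why the two zones are reported separately.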
3. Comparison of cost-sensitive support vector machine classification results
To address this problem, the RBF kernel is chosen as the support vector machine kernel function, and a cost-sensitive support vector machine is used to handle class imbalance and unequal misclassification costs. The cost-sensitive support vector machine requires optimizing the penalty parameters C1, C2 and the width σ of the Gaussian radial basis function.
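For orientation, the role of the two penalty parameters can be seen in the commonly used two-cost soft-margin formulation below. This is the standard textbook form, given here as an assumption about the model rather than as the patent's exact objective; in particular, which class is assigned C1 and which C2 is not stated in this passage, and φ denotes the feature map induced by the kernel.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\|w\|^2
  + C_1 \sum_{i:\,y_i=+1} \xi_i
  + C_2 \sum_{i:\,y_i=-1} \xi_i \\
\text{s.t.}\quad & y_i\left(w^T \phi(x_i) + b\right) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0,\quad i = 1,\dots,n
\end{aligned}
```

Setting the penalty for the minority class higher than that for the majority class makes errors on the scarce class more expensive, which counteracts the hyperplane shift described for the standard SVM.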
Figure 7 shows that evolution terminates when the particle swarm reaches generation 100, at which point the average fitness is 97.80%, close to the ideal optimum.
Figure 10 shows that the average fitness of the genetic algorithm stabilizes at around generation 20; evolution terminates at generation 100, at which point the average fitness is 99.58%.
Figures 8 and 11 show that the cost-sensitive support vector machine effectively solves the problem of unequal misclassification costs caused by data imbalance.
The relevant parameters and classification accuracy of the cost-sensitive support vector machines obtained with the different algorithms are shown in Tables 3 and 4:
Table 3: Parameters of the parameter-optimized cost-sensitive support vector machines
Table 4: Classification accuracy of the parameter-optimized cost-sensitive support vector machines
The results in Tables 3 and 4 show that the cost-sensitive support vector machine with parameters optimized by the adaptive mutation particle swarm algorithm achieves high classification accuracy for locomotive wheel state detection, with comparable accuracy in the two imbalanced zones, and takes less time than the standard genetic algorithm. Grid search takes the longest and has the lowest accuracy; with imbalanced data, the accuracies it obtains for the two zones differ widely. The CS-SVM with parameters optimized by the adaptive mutation particle swarm algorithm is based on the structural risk minimization principle and VC dimension theory, which improves the generalization ability and precision of the model and reduces reliance on experience.
Figure 13 shows that the ROC curve of the CS-SVM with parameters optimized by the adaptive mutation particle swarm algorithm lies closer to the upper-left corner, indicating better classification performance than the standard support vector machine with parameters optimized by the same algorithm.
Table 5: Comparison of accuracy and AUC for SVM and CS-SVM
AUC (Area Under Curve) is defined as the area under the ROC curve; the larger the value, the better the classifier performs. Table 5 shows that the AUC of the CS-SVM with parameters optimized by the adaptive mutation particle swarm algorithm is 0.9947, greater than that of the SVM with parameters optimized by the same algorithm. The classification accuracy of the CS-SVM is also clearly better than that of the SVM, demonstrating the superiority of the CS-SVM model with parameters optimized by the adaptive mutation particle swarm algorithm.
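As an illustration of the AUC definition, the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the labels and scores are made up and the function name is hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 2 positive (fault) and 3 negative samples.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc = auc_from_scores(labels, scores)  # 5 of 6 pairs are ranked correctly
```

The pairwise form is quadratic in the number of samples, which is fine for small test sets; a rank-based computation is the usual choice for larger ones.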
Obviously, the above embodiments are merely examples given to clearly illustrate the technical solution of the present invention and do not limit its embodiments. Those of ordinary skill in the art may make other changes in different forms on the basis of the above description. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (9)

1. A cost-sensitive support vector machine locomotive wheel state detection system, characterized in that: the system comprises a data preprocessing module, a cost-sensitive support vector machine training module, a parameter optimization module, an optimal cost-sensitive support vector machine classification module, a discrimination module and a wheel state output module; the output of the data preprocessing module is connected to the input of the training module; the output of the cost-sensitive support vector machine training module is connected to the input of the parameter optimization module; the output of the parameter optimization module is connected to the output of the optimal cost-sensitive support vector machine classification module; the output of the optimal cost-sensitive support vector machine classification module is connected to the input of the discrimination module; and the output of the discrimination module is connected to the input of the parameter optimization module and the input of the wheel state output module.
2. The cost-sensitive support vector machine locomotive wheel state detection system according to claim 1, characterized in that the preprocessing module divides the locomotive wheel state into two types, a safe zone and a fault zone; selects a data source and performs feature extraction on it, the extracted feature variables serving as the sample data set; divides the sample data set into a training set and a test set; and normalizes the training set samples and the test set samples.
3. The cost-sensitive support vector machine locomotive wheel state detection system according to claim 1, characterized in that: the parameter optimization module uses the adaptive mutation particle swarm algorithm to find the two penalty factors and the kernel function of the cost-sensitive support vector machine.
4. The cost-sensitive support vector machine locomotive wheel state detection system according to claim 1, characterized in that: the training module of the cost-sensitive support vector machine processes the training set samples; the cost-sensitive support vector machine classification module stores the support vector library obtained after the cost-sensitive support vector machine is trained on the samples; and the discrimination module evaluates the performance of the cost-sensitive support vector machine trainer and, if a qualified classification accuracy is reached, outputs to the wheel state output module.
5. A detection method for the cost-sensitive support vector machine locomotive wheel state detection system according to any one of claims 1 to 4, characterized by comprising the following steps:
S1. The locomotive wheel state is divided into two types, a safe zone and a fault zone;
S2. A data source is selected and feature extraction is performed on it; the extracted feature variables serve as the real-time sample data set, which is divided into two subsets, a training set and a test set;
S3. The data preprocessing module preprocesses the real-time sample data set, normalizing the training set and test set samples so that their values lie between 0 and 1; the normalized training set is then used as the input vector of the cost-sensitive support vector machine training module;
S4. The training module processes the normalized training set; training the cost-sensitive support vector machine yields a set of support vectors, which are then stored in a database;
S5. The parameter optimization module performs global optimization of the kernel parameter and the two penalty factors of the cost-sensitive support vector machine using the adaptive mutation particle swarm algorithm, obtaining the optimal cost-sensitive support vector machine;
S6. The classification module classifies the real-time sample data in the test set, and stores the support vector library obtained after the cost-sensitive support vector machine is trained on the real-time sample data set;
S7. Data accuracy discrimination: the feature variables in the test set are input to the optimal cost-sensitive support vector machine model for testing; the model outputs a recognition result, which is either the safe state or the fault state; the output recognition result is compared with the actual locomotive wheel state, and the recognition accuracy is calculated;
S8. Wheel state output: the wheel state is output if the recognition accuracy meets the requirement; otherwise the process is repeated.
6. The locomotive wheel detection method according to claim 5, characterized in that the normalization is computed as:
where x and x' are the values before and after normalization, respectively.
7. The locomotive wheel detection method according to claim 5, characterized in that step S6 classifies the sample data using a Gaussian radial basis kernel function, a linear kernel function, a polynomial kernel function or a two-layer perceptron kernel function.
8. The locomotive wheel detection method according to claim 5, characterized in that the parameter optimization of step S5 uses the adaptive mutation particle swarm algorithm and comprises the following steps:
S81. Initialize the parameters of the adaptive mutation particle swarm;
S82. Calculate the fitness;
S83. Find the individual extremum and the global extremum;
S84. Update the particle velocity and position;
S85. After each particle update, reinitialize the particle with a certain probability;
S86. Recalculate the particle fitness;
S87. For each particle, compare its fitness with that of the best position it has visited; if the new fitness is better, take the new position as the current best position, completing the update of the individual and global extrema;
S88. Check the termination condition: if the termination criterion is not met, go to step S84; otherwise go to the next step. The termination criterion is a predefined number of iterations or the preset precision of the adaptive mutation particle swarm algorithm;
S89. Output the two penalty factors and the kernel function parameter held by the particle at the final iteration.
9. The locomotive wheel detection method according to claim 8, characterized in that the update formula is:
IfThen
where V is the reinitialized velocity and r3 is a uniformly distributed random number in the interval [0,1]; if r3 > 0.5, then k = -1; if r3 ≤ 0.5, then k = 1.
CN201610880518.8A 2016-10-09 2016-10-09 A kind of Cost Sensitive Support Vector Machines locomotive wheel detection system and method Active CN106482967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610880518.8A CN106482967B (en) 2016-10-09 2016-10-09 A kind of Cost Sensitive Support Vector Machines locomotive wheel detection system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610880518.8A CN106482967B (en) 2016-10-09 2016-10-09 A kind of Cost Sensitive Support Vector Machines locomotive wheel detection system and method

Publications (2)

Publication Number Publication Date
CN106482967A true CN106482967A (en) 2017-03-08
CN106482967B CN106482967B (en) 2019-10-29

Family

ID=58269413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610880518.8A Active CN106482967B (en) 2016-10-09 2016-10-09 A kind of Cost Sensitive Support Vector Machines locomotive wheel detection system and method

Country Status (1)

Country Link
CN (1) CN106482967B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030158830A1 (en) * 2000-04-11 2003-08-21 Adam Kowalczyk Gradient based training method for a support vector machine
US20050049990A1 (en) * 2003-08-29 2005-03-03 Milenova Boriana L. Support vector machines processing system
CN101464964A (en) * 2007-12-18 2009-06-24 同济大学 Pattern recognition method capable of holding vectorial machine for equipment fault diagnosis
CN102750551A (en) * 2012-06-18 2012-10-24 杭州电子科技大学 Hyperspectral remote sensing classification method based on support vector machine under particle optimization
CN103020434A (en) * 2012-11-30 2013-04-03 南京航空航天大学 Particle swarm optimization-based least square support vector machine combined predicting method
CN103218625A (en) * 2013-05-10 2013-07-24 陆嘉恒 Automatic remote sensing image interpretation method based on cost-sensitive support vector machine

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
任强: "重载机车粘着控制方法的研究与设计", 《中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑》 *
唐明珠: "类别不平衡和误分类代价不等的数据集分类方法及应用", 《中国博士学位论文全文数据库 信息科技辑》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107240097B (en) * 2017-06-27 2020-12-29 长春工业大学 Pulmonary nodule image processing method based on MKL-SVM-PSO algorithm
CN107240097A (en) * 2017-06-27 2017-10-10 长春工业大学 Lung neoplasm image processing method based on MKL SVM PSO algorithms
CN107862763A (en) * 2017-11-06 2018-03-30 中国人民解放军国防科技大学 train safety early warning evaluation model training method, module and monitoring evaluation system
CN107833311A (en) * 2017-11-15 2018-03-23 中国联合网络通信集团有限公司 A kind of fault detection method and platform of shared bicycle
CN107918379A (en) * 2017-11-29 2018-04-17 东北大学 Based on the industrial big data incipient fault detection method for scheming semi-supervised cost-sensitive
CN107918379B (en) * 2017-11-29 2020-03-31 东北大学 Industrial big data early fault detection method based on graph semi-supervision cost sensitivity
CN111353515A (en) * 2018-12-21 2020-06-30 湖南工业大学 Multi-scale grading-based classification and identification method for damage of train wheel set tread
CN111353515B (en) * 2018-12-21 2024-01-26 湖南工业大学 Multi-scale classification-based train wheel set tread damage classification and identification method
CN110766175A (en) * 2019-10-25 2020-02-07 长沙理工大学 Pitch system fault detection method and device based on optimal interval distribution machine
CN111582510A (en) * 2020-05-13 2020-08-25 中国民用航空飞行学院 Intelligent identification method and system based on support vector machine and civil aircraft engine
CN111654874A (en) * 2020-06-03 2020-09-11 枣庄学院 Wireless sensor network anomaly detection method
CN113866684A (en) * 2021-11-14 2021-12-31 广东电网有限责任公司江门供电局 Distribution transformer fault diagnosis method based on hybrid sampling and cost sensitivity
CN113866684B (en) * 2021-11-14 2024-05-31 广东电网有限责任公司江门供电局 Mixed sampling and cost sensitivity-based distribution transformer fault diagnosis method
CN116295620A (en) * 2023-02-17 2023-06-23 南通科瑞环境科技有限公司 Environment monitoring, collecting and detecting method

Also Published As

Publication number Publication date
CN106482967B (en) 2019-10-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant