CN108846437A - Method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm - Google Patents

Method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm

Info

Publication number
CN108846437A
Authority
CN
China
Prior art keywords
matrix
data
twsvm
capped
norm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810622213.6A
Other languages
Chinese (zh)
Inventor
业巧林 (Ye Qiaolin)
王春燕 (Wang Chunyan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Forestry University
Original Assignee
Nanjing Forestry University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Forestry University
Priority to CN201810622213.6A
Publication of CN108846437A
Legal status: Withdrawn (current)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/16 - Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of data processing and discloses a method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm, comprising: inputting a raw data matrix M and parameters C1, C2, ε1 and ε2; splitting M into a positive data matrix H and a negative data matrix G; initializing two diagonal matrices F and D as identity matrices; computing the classification-plane parameters w and b from H, G, C1, C2, ε1, ε2, F and D; computing, from w and b, the distances of all data points in H and G to the corresponding classification planes; if a data point in H or G lies farther from its classification plane than ε1 or ε2 respectively, the point is judged to be an outlier and the corresponding diagonal entry of F or D is set to smallval, a value close to zero; the diagonal matrices F and D are updated in this way and the objective value is computed. The present invention greatly improves the robustness of the TWSVM algorithm while retaining good accuracy on the original data set.

Description

Method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm
Technical field
The present invention relates to the fields of algorithm improvement and data processing, and in particular to a method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm.
Background art
As an effective tool, the support vector machine (SVM) has been widely applied to data classification and regression problems in fields such as bioinformatics, text classification and image processing. In recent years, Mangasarian and Wild proposed the multi-plane proximal support vector machine based on generalized eigenvalues (Proximal SVM based on Generalized Eigenvalues, GEPSVM), which obtains two non-parallel hyperplanes by solving two generalized eigenvalue problems. Compared with SVM, GEPSVM guarantees better computational efficiency while also achieving good classification performance. Inspired by GEPSVM, Jayadeva et al. proposed the twin support vector machine (Twin SVM, TSVM) in 2007. TSVM seeks two non-parallel optimal classification planes such that each plane is close to the samples of one class and far from the samples of the other class. TSVM is well suited to the classification of cross-plane data sets and solves two relatively small quadratic programming problems (Quadratic Programming Problem, QPP), which makes TSVM significantly faster than the standard SVM. Since then, many researchers have proposed algorithms built on TWSVM.
However, many existing TWSVM models do not take noisy data into account when training the classification planes. In practice, many data sets contain noise; if these noisy points are ignored, the trained classification planes are easily biased, the classification accuracy drops, and the robustness of the algorithm suffers. The reason noisy data affects the decision of the classification plane is that many TWSVM-based models use the L2 norm, whose squaring operation amplifies the influence of noisy points. If the data contain no noise, these algorithms perform very well; in real life, however, noise-free data are practically non-existent, so the noise problem must be considered when designing an algorithm. Clearly, when the data contain many noisy outliers, the L2 norm is ill-suited, and so are the TWSVM models based on it. To improve the robustness of the algorithm against outliers, we propose a capped-l1-norm TWSVM, which abandons the drawback of the L2 norm and greatly improves the robustness of the algorithm.
Summary of the invention
Object of the invention: Aiming at the problems existing in the prior art, the present invention provides a method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm, which can greatly improve the robustness of the TWSVM algorithm while retaining good accuracy on the original data set.
Technical solution: The present invention provides a method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm, comprising the following steps. Step 1: input the raw data matrix M and the parameters C1, C2, ε1 and ε2; split the data matrix M into a positive data matrix H and a negative data matrix G, where H ∈ R^(m1×(n+1)) and G ∈ R^(m2×(n+1)), m1 and m2 are the numbers of data points of the two classes in the data matrix M, and n is the data dimension of M. Step 2: initialize two diagonal matrices F and D as identity matrices. Step 3: from the positive data matrix H, the negative data matrix G, the parameters C1, C2, ε1, ε2 and the diagonal matrices F and D, compute the classification-plane parameters w and b. Step 4: from w and b, compute the distance from every data point in the positive data matrix H to the classification plane; if a data point's distance is greater than ε1, the point is judged to be an outlier and the corresponding diagonal entry of F is set to smallval, a value close to zero. Likewise, from w and b, compute the distance from every data point in the negative data matrix G to the classification plane; if a data point's distance is greater than ε2, the point is judged to be an outlier and the corresponding diagonal entry of D is set to smallval. The diagonal matrices F and D are updated in this way. Step 5: compute the objective value obj.
Preferably, in step 4, the diagonal entry of F corresponding to an outlier is set to smallval as follows: and the diagonal entry of D corresponding to an outlier is set to smallval as follows:
Preferably, in step 5, the objective value obj is given by the objective function, in which z = (w, b)^T and e is an m1 × 1 column vector whose elements are all 1.
Further, the following steps are included after step 5. Step 6: iterate steps 3 to 5 until the objective value obj converges. Step 7: determine the parameters w and b of the optimal classification plane. This iterative updating allows the present invention to obtain the optimal classification-plane parameters w and b, further improving the robustness and accuracy of the TWSVM algorithm.
Preferably, smallval = 1e-5.
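For illustration, the following is a minimal sketch of the step-4 re-weighting described above, under the assumption that H is the bias-augmented positive data matrix (so that H z = Hw + eb with z = (w, b)^T) and that the point-to-plane distance is computed as |x·w + b| / ||w||; the function name update_weight_diag and this exact distance formula are illustrative assumptions, not the patent's own update formulas, which were published as images.

```python
import numpy as np

def update_weight_diag(H_aug, w, b, eps, smallval=1e-5):
    """Step-4 sketch: rows of H_aug (each row ends with a 1, so H_aug @ z = Hw + eb)
    whose distance to the classification plane exceeds eps are treated as outliers,
    and their diagonal weights are set to smallval (a value close to zero)."""
    z = np.append(w, b)                              # z = (w, b)^T
    dist = np.abs(H_aug @ z) / np.linalg.norm(w)     # point-to-plane distances
    diag = np.where(dist > eps, smallval, 1.0)       # suppress detected outliers
    return np.diag(diag)
```

The diagonal matrix D for the negative data matrix G would be updated in the same fashion, using the threshold ε2.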
Beneficial effect:
Unlike the traditional TWSVM algorithm, this algorithm applies the capped-l1 norm in the objective function, which greatly improves the robustness of the TWSVM algorithm. The principle is to remove the influence of noisy outliers on the decision of the classification plane and to make the decision again after these outliers have been removed. In addition, because the loss part of the objective function uses the capped-l1 norm, the loss contributed by a misclassified point does not grow much no matter how badly it is misclassified, which further improves the robustness of the algorithm. Finally, the present invention also provides a simple and effective iterative solution method that finds a local optimum.
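To make the capping effect concrete, the snippet below contrasts the squared (L2) loss with the generic capped-l1 loss min(|r|, ε) used in the capped-norm literature; it is only an illustration of the principle described above, not the patent's exact loss term, whose formula is published as an image.

```python
import numpy as np

def squared_loss(r):
    return r ** 2                          # unbounded: large residuals dominate

def capped_l1_loss(r, eps):
    return np.minimum(np.abs(r), eps)      # capped at eps: outliers cannot dominate

residuals = np.array([0.1, 0.5, 1.0, 10.0, 100.0])   # the last two act like outliers
print(squared_loss(residuals))             # 0.01, 0.25, 1.0, 100.0, 10000.0
print(capped_l1_loss(residuals, eps=1.0))  # 0.1, 0.5, 1.0, 1.0, 1.0
```

However badly a point is misclassified, its contribution to the capped loss never exceeds ε, which is exactly why outliers no longer dominate the decision of the classification plane.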
Compared with the traditional TWSVM algorithm, the TWSVM method of the present invention based on the capped-l1-norm distance metric has better robustness while retaining good accuracy on the original data set. The present invention selected 12 UCI data sets and compared the accuracy of this method with the five algorithms TWSVM, WLTSVM, L1-GEPSVM, L1-NPSVM and pTWSVM on the same data sets. The comparison shows that the method of the invention outperforms the other algorithms on 8 of the 12 data sets, and when the same noise is added it still outperforms the other algorithms on 8 data sets.
Description of the drawings
Fig. 1 is a schematic diagram of the iterative convergence of the objective value of the capped-l1-norm-based TWSVM algorithm on the Haberman data set;
Fig. 2 is a schematic diagram of the iterative convergence of the objective value of the capped-l1-norm-based TWSVM algorithm on the Sonar data set;
Fig. 3 is a comparison of the accuracy of the capped-l1-norm-based TWSVM algorithm of the present invention and five other algorithms when noise points are artificially added to the cross-plane data;
Fig. 4 is a comparison of the accuracy of the capped-l1-norm-based TWSVM algorithm of the present invention and five other algorithms under different noise factors.
Specific embodiment
The present invention is described in detail below with reference to the accompanying drawings.
This embodiment provides a method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm, comprising the following steps:
Step 1: input the raw data matrix M and the parameters C1, C2, ε1 and ε2; split the data matrix M into a positive data matrix H and a negative data matrix G, where H ∈ R^(m1×(n+1)) and G ∈ R^(m2×(n+1)), m1 and m2 are the numbers of data points of the two classes in M, and n is the data dimension of M;
Step 2: initialize two diagonal matrices F and D as identity matrices;
Step 3: from the positive data matrix H, the negative data matrix G, the parameters C1, C2, ε1, ε2 and the diagonal matrices F and D, compute the classification-plane parameters w and b;
Step 4: from w and b, compute the distance from every data point in the positive data matrix H to the classification plane; if a data point's distance is greater than ε1, the point is judged to be an outlier and, through the update formula, the corresponding diagonal entry of F is set to 1e-5 (a code sketch of this procedure is given after step 7 below);
From w and b, compute the distance from every data point in the negative data matrix G to the classification plane; if a data point's distance is greater than ε2, the point is judged to be an outlier and, through the update formula, the corresponding diagonal entry of D is set to 1e-5;
The diagonal matrices F and D are updated in this way;
Step 5: compute the objective value obj, where z = (w, b)^T and e is an m1 × 1 column vector whose elements are all 1;
Step 6: iterate steps 3 to 5 until the objective value obj converges;
Step 7: determine the parameters w and b of the optimal classification plane.
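As referenced in step 4 above, here is a minimal end-to-end sketch of steps 1 to 7 for one of the two non-parallel planes. The solver solve_weighted_plane below is a weighted least-squares stand-in, not the patent's quadratic programming formulation; the objective evaluated is that of the stand-in rather than the patent's image-based formula; and all helper names are illustrative assumptions rather than part of the original disclosure.

```python
import numpy as np

def solve_weighted_plane(H, G, C1, F, D, reg=1e-8):
    """Stand-in for step 3 (NOT the patent's QPP): a weighted least-squares
    TWSVM-style plane that stays close to the F-weighted positive points while
    pushing the D-weighted negative points to the other side."""
    n1 = H.shape[1]
    A = H.T @ F @ H + C1 * (G.T @ D @ G) + reg * np.eye(n1)
    rhs = -C1 * (G.T @ D @ np.ones(G.shape[0]))
    z = np.linalg.solve(A, rhs)
    return z[:-1], z[-1]                     # w, b

def objective_value(H, G, z, C1, F, D):
    """Objective of the stand-in solver above (the patent defines its own obj)."""
    e = np.ones(G.shape[0])
    return 0.5 * (H @ z) @ F @ (H @ z) + 0.5 * C1 * (G @ z + e) @ D @ (G @ z + e)

def robust_twsvm_plane(M_pos, M_neg, C1, eps1, eps2,
                       smallval=1e-5, max_iter=50, tol=1e-6):
    """Sketch of steps 1-7: alternate between solving the plane (step 3) and
    re-weighting detected outliers via F and D (step 4) until obj converges."""
    # Step 1: augment each class matrix with a column of ones (H, G in R^{m x (n+1)}).
    H = np.hstack([M_pos, np.ones((M_pos.shape[0], 1))])
    G = np.hstack([M_neg, np.ones((M_neg.shape[0], 1))])
    # Step 2: initialize the diagonal weight matrices F and D as identity matrices.
    F, D = np.eye(H.shape[0]), np.eye(G.shape[0])
    prev_obj = np.inf
    for _ in range(max_iter):
        # Step 3: compute the classification-plane parameters w and b.
        w, b = solve_weighted_plane(H, G, C1, F, D)
        z = np.append(w, b)
        # Step 4: points farther from the plane than eps1 / eps2 are outliers.
        dist_H = np.abs(H @ z) / np.linalg.norm(w)
        dist_G = np.abs(G @ z) / np.linalg.norm(w)
        F = np.diag(np.where(dist_H > eps1, smallval, 1.0))
        D = np.diag(np.where(dist_G > eps2, smallval, 1.0))
        # Step 5: evaluate the objective for the current plane and weights.
        obj = objective_value(H, G, z, C1, F, D)
        # Step 6: stop once the objective value has converged.
        if abs(prev_obj - obj) < tol:
            break
        prev_obj = obj
    # Step 7: return the parameters of the (locally) optimal classification plane.
    return w, b
```

The second non-parallel plane would be obtained symmetrically by swapping the roles of H and G, F and D, C1 and C2, and ε1 and ε2.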
To show intuitively that the method converges quickly, this embodiment reports the convergence behaviour of the capped-l1-norm-based TWSVM algorithm on two UCI data sets. As can be seen from Figs. 1 and 2, the method converges within a few iterations to a stable value, which shows that the capped-l1-norm-based TWSVM algorithm is feasible in terms of both computation and time complexity.
In addition, in this embodiment obvious noise points were artificially added to the cross-plane data, as shown in Fig. 3.
On these data, the accuracies of TWSVM, WLTSVM, L1-GEPSVM, L1-NPSVM, pTWSVM and the capped-l1-norm-based TWSVM algorithm were compared and are 55.26%, 95.12%, 97.60%, 54.64%, 55.36% and 98.07%, respectively. Clearly, the accuracy of this method (the capped-l1-norm-based TWSVM algorithm) is higher than that of the other five algorithms, and its robustness is better.
This embodiment also compared, under different noise proportions, the accuracy of the capped-l1-norm-based TWSVM algorithm with that of the other five algorithms. As can be seen from Fig. 4, under different noise conditions the capped-l1-norm-based TWSVM algorithm is not only more accurate but also more stable than the other algorithms. Although at a noise factor of 0.25 the accuracy of TWSVM is comparable to that of the capped-l1-norm-based TWSVM algorithm, TWSVM behaves very unstably. Moreover, although WLTSVM, L1-GEPSVM, L1-NPSVM and pTWSVM remain stable under different noise levels, their mean accuracies are 78.90%, 83.66%, 80.23% and 77.67%, while that of the capped-l1-norm-based TWSVM algorithm is 86.00%. Overall, this method (the capped-l1-norm-based TWSVM algorithm) is more advantageous.
The above embodiment only illustrates the technical concept and features of the present invention; its purpose is to enable those skilled in the art to understand the content of the present invention and implement it accordingly, and it is not intended to limit the scope of protection of the present invention. Any equivalent transformation or modification made according to the spirit and essence of the present invention shall fall within the scope of protection of the present invention.

Claims (5)

1. A method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm, characterized by comprising the following steps:
Step 1: input the raw data matrix M and the parameters C1, C2, ε1 and ε2; split the data matrix M into a positive data matrix H and a negative data matrix G, where H ∈ R^(m1×(n+1)) and G ∈ R^(m2×(n+1)), m1 and m2 are the numbers of data points of the two classes in the data matrix M, and n is the data dimension of the data matrix M;
Step 2: initialize two diagonal matrices F and D as identity matrices;
Step 3: from the positive data matrix H, the negative data matrix G, the parameters C1, C2, ε1, ε2 and the diagonal matrices F and D, compute the classification-plane parameters w and b;
Step 4: from the w and b, compute the distance from every data point in the positive data matrix H to the classification plane; if a data point's distance is greater than ε1, the point is judged to be an outlier and the corresponding diagonal entry of F is set to smallval, smallval being a value close to zero;
From the w and b, compute the distance from every data point in the negative data matrix G to the classification plane; if a data point's distance is greater than ε2, the point is judged to be an outlier and the corresponding diagonal entry of D is set to said smallval;
Update the diagonal matrices F and D in this way;
Step 5: compute the objective value obj.
2. The method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm according to claim 1, characterized in that, in step 4, the diagonal entry of F corresponding to an outlier is set to smallval as follows:
and the diagonal entry of D corresponding to an outlier is set to smallval as follows:
3. The method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm according to claim 1, characterized in that, in step 5,
the objective value obj is defined by the objective function,
wherein z = (w, b)^T and e is an m1 × 1 column vector whose elements are all 1.
4. The method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm according to any one of claims 1 to 3, characterized in that the following steps are further included after step 5:
Step 6: iterate steps 3 to 5 until the objective value obj converges;
Step 7: determine the parameters w and b of the optimal classification plane.
5. The method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm according to any one of claims 1 to 3, characterized in that smallval = 1e-5.
CN201810622213.6A 2018-06-15 2018-06-15 Method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm Withdrawn CN108846437A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810622213.6A CN108846437A (en) 2018-06-15 2018-06-15 Method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810622213.6A CN108846437A (en) 2018-06-15 2018-06-15 Method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm

Publications (1)

Publication Number Publication Date
CN108846437A 2018-11-20

Family

ID=64202083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810622213.6A Withdrawn CN108846437A (en) 2018-06-15 2018-06-15 The method of raising TWSVM algorithm robustness based on capped-l1 norm

Country Status (1)

Country Link
CN (1) CN108846437A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335615A (en) * 2015-10-31 2016-02-17 电子科技大学 Low-complexity two-dimensional angle and polarization parameter joint estimation method
CN106847248A (en) * 2017-01-05 2017-06-13 天津大学 Chord recognition methods based on robustness scale contour feature and vector machine

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335615A (en) * 2015-10-31 2016-02-17 电子科技大学 Low-complexity two-dimensional angle and polarization parameter joint estimation method
CN106847248A (en) * 2017-01-05 2017-06-13 天津大学 Chord recognition methods based on robustness scale contour feature and vector machine

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wenhao Jiang: "Robust Dictionary Learning with Capped l1-Norm", Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence *
Zhouyuan Huo: "Video Recovery via Learning Variation and Consistency of Images", Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence *

Similar Documents

Publication Publication Date Title
JP6922387B2 (en) Recognition devices, training devices and methods based on deep neural networks
JP2021077377A (en) Method and apparatus for learning object recognition model
CN105210115B (en) Performing gesture recognition using 2D image data
CN107292902B (en) Two-dimensional Otsu image segmentation method combined with drosophila optimization algorithm
CN105551015A (en) Scattered-point cloud image registering method
CN105046694A (en) Quick point cloud registration method based on curved surface fitting coefficient features
JP2010266983A (en) Information processing apparatus and method, learning device and method, program, and information processing system
JP2005242808A (en) Reference data optimization learning method and pattern recognition system
Riaz et al. Fouriernet: Compact mask representation for instance segmentation using differentiable shape decoders
WO2015025472A1 (en) Feature conversion learning device, feature conversion learning method, and program storage medium
CN112633413B (en) Underwater target identification method based on improved PSO-TSNE feature selection
CN114936518A (en) Method for solving design parameters of tension/compression spring
JP6942203B2 (en) Data processing system and data processing method
CN108846437A (en) Method for improving the robustness of the TWSVM algorithm based on the capped-l1 norm
CN106022212A (en) Gyroscope temperature drift modeling method
LI et al. Training restricted boltzmann machine using gradient fixing based algorithm
JP6121187B2 (en) Acoustic model correction parameter estimation apparatus, method and program thereof
CN114818203A (en) Reducer design method based on SWA algorithm
JP7364047B2 (en) Learning devices, learning methods, and programs
CN112070127A (en) Intelligent analysis-based mass data sample increment analysis method
JP5130934B2 (en) Recognition system, information processing apparatus, design apparatus, and program
WO2020087254A1 (en) Optimization method for convolutional neural network, and related product
Lu et al. Improved SVM classifier incorporating adaptive condensed instances based on hybrid continuous-discrete particle swarm optimization
CN116152316B (en) Image registration method based on self-adaptive parameter particle swarm algorithm
Wang et al. A novel visual tracking system with adaptive incremental extreme learning machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20181120