CN103472013A - Visible-near infrared spectrum PLS-DA modeling method combining Adaboost algorithm - Google Patents

Info

Publication number
CN103472013A
Authority
CN
China
Prior art keywords
pls
visible
infrared spectrum
models
adaboost algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201310232419
Other languages
Chinese (zh)
Inventor
赵海挺
谢剑
彭纪奔
黄光造
吴司熠
叶冬梅
陈孝敬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wenzhou University
Original Assignee
Wenzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wenzhou University
Priority to CN 201310232419
Publication of CN103472013A
Current legal status: Pending

Landscapes

  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention belongs to the field of visible-near infrared spectroscopic analysis and relates specifically to a visible-near infrared spectrum PLS-DA modeling method combined with the Adaboost algorithm. The method is characterized in that a high-performance classifier is obtained by integrating a plurality of PLS-DA models through the Adaboost algorithm. PLS-DA models are commonly used identification models in infrared spectroscopy. Because they cannot effectively reflect the nonlinear relationship between the visible-near infrared spectrum and the classes of the samples to be analyzed, their accuracy drops on highly nonlinear data. The Adaboost algorithm provides a framework in which sub-classifiers can be constructed by a variety of methods. Using PLS-DA models as the sub-classifiers within Adaboost yields a high-performance classifier that can accurately classify highly nonlinear data, which extends the application of PLS-DA models to the identification of such data.

Description

A visible-near infrared spectrum PLS-DA modeling method combined with the Adaboost algorithm
Technical field
The invention belongs to the field of visible-near infrared spectrum identification. Specifically, it is a data processing method that can improve the modeling performance of partial least squares discriminant analysis (PLS-DA) on visible-near infrared spectra.
Technical background
For multivariable visible-near infrared spectral data with small sample sizes, the PLS-DA model copes well with the variable collinearity and the curse of dimensionality that other modeling methods run into, and it is therefore widely used in infrared spectrum identification. However, PLS-DA is a linear model and cannot effectively reflect a nonlinear relationship between the near infrared spectrum and the class of the analyzed sample, so its accuracy drops on strongly nonlinear data.
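The limitation described above can be illustrated with a small sketch. It is not taken from the patent: the one-variable toy data are hypothetical, and ordinary least squares (NumPy `lstsq`) stands in for the PLS regression step, on the assumption that the same "regress on the class label, then round" idea applies.

```python
import numpy as np

# Hypothetical 1-D toy data: class 1 lies on both sides of class 2,
# so the class label is not a linear function of x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1, 1, 2, 2, 1, 1])

# Linear "regress on the label, then round" classifier
# (ordinary least squares used here as a stand-in for PLS-DA).
A = np.column_stack([x, np.ones_like(x)])      # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares fit of the labels
pred = np.rint(A @ coef).astype(int)           # round to the nearest integer label

linear_accuracy = float(np.mean(pred == y))    # fraction of samples classified correctly
```

On this toy data the fitted line is flat, so every sample is assigned to class 1 and the two class-2 samples are missed.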
The adaptive boosting algorithm (Adaboost) is suitable for a wide range of classification scenarios. Its core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier (strong classifier). Adaboost has several advantages: it provides a framework in which sub-classifiers can be constructed by a variety of methods; its construction is simple; and it is not prone to over-fitting.
Summary of the invention
Because the PLS-DA model cannot effectively reflect the nonlinear relationship between visible-near infrared spectral data and sample class labels, its accuracy is limited by the degree of nonlinearity of the data.
To address this problem, the present invention proposes a visible-near infrared spectrum PLS-DA modeling method combined with the Adaboost algorithm, which extends the PLS-DA model to the identification of strongly nonlinear visible-near infrared spectral data.
The present invention is realized by the following technical scheme. A visible-near infrared spectrum PLS-DA modeling method combined with the Adaboost algorithm comprises the following steps:
Step 1, given the training samples $S = \{(x_1, y_1), \ldots, (x_m, y_m)\}$, where $x_i \in X$ and the label $y_i \in Y = \{1, 2, 3, \ldots, N\}$; $m$ is the number of training samples and $N$ is the number of classes of the training samples;
Step 2, initialize the weight coefficient of each sample, $\omega_i^1 = 1/m$, $i = 1, \ldots, m$;
Step 3, for each iteration $t = 1, \ldots, T$, do the following:
Step 3.1, use the partial least squares method to model the training samples under the current weight distribution, obtaining a PLS-DA model $h_t$;
Step 3.2, calculate the training error of $h_t$
$$\varepsilon_t = \sum_{i=1}^{m} \omega_i^t \, [h_t(x_i) \neq y_i]$$
where the symbol "[ ]" is defined as follows: for a logical expression $e$, $[e] = 1$ if $e$ is true, and $[e] = 0$ otherwise;
Step 3.3, if $\varepsilon_t > 1/2$, set $T = t - 1$ and skip to step 4;
Step 3.4, let $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$ and $\alpha_t = \ln(1/\beta_t)$;
Step 3.5, update the sample weight coefficients
$$\omega_i^{t+1} = \frac{\omega_i^t}{Z_t} \times \beta_t^{\,1 - [h_t(x_i) \neq y_i]}$$
where $Z_t$ is a normalization coefficient chosen so that
$$\sum_{i=1}^{m} \omega_i^{t+1} = 1$$
Step 4, output the strong classifier: $H(x) = \arg\max_{y \in Y} \left( \sum_{t=1}^{T} \alpha_t \, [h_t(x) = y] \right)$;
Step 5, test the classification accuracy of $H(x)$.
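As a worked illustration of steps 3.2 to 3.5, consider a hypothetical iteration (the numbers are not from the patent) with $m = 5$ samples, uniform weights $\omega_i^t = 0.2$, and a sub-classifier $h_t$ that misclassifies only sample 5:

```latex
\begin{aligned}
\varepsilon_t &= \sum_{i=1}^{5} \omega_i^t\,[h_t(x_i) \neq y_i] = 0.2, \\
\beta_t &= \frac{\varepsilon_t}{1-\varepsilon_t} = \frac{0.2}{0.8} = 0.25,
\qquad \alpha_t = \ln\frac{1}{\beta_t} = \ln 4 \approx 1.386, \\
\omega_i^t\,\beta_t^{\,1-[h_t(x_i)\neq y_i]} &=
\begin{cases}
0.2 \times 0.25 = 0.05, & i = 1,\ldots,4 \ \text{(correctly classified)} \\
0.2 \times 0.25^{0} = 0.2, & i = 5 \ \text{(misclassified)}
\end{cases} \\
Z_t &= 4 \times 0.05 + 0.2 = 0.4, \\
\omega_i^{t+1} &=
\begin{cases}
0.05 / 0.4 = 0.125, & i = 1,\ldots,4 \\
0.2 / 0.4 = 0.5, & i = 5
\end{cases}
\end{aligned}
```

After normalization the misclassified sample carries half of the total weight, so the next sub-classifier is pushed to concentrate on it.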
The present invention uses PLS-DA as the sub-classifier within the Adaboost algorithm and obtains a strong classifier that can accurately classify even strongly nonlinear data, so that the PLS-DA model can be extended to the identification of such data.
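Steps 2 to 4 above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the names `adaboost_m1`, `predict` and `fit_stump` are hypothetical, the weak learner is a simple threshold rule standing in for the PLS-DA model, the toy data are made up, and the small floor on the error is an added safeguard that the source does not mention.

```python
import numpy as np

def adaboost_m1(X, y, fit_weak, T):
    """Steps 2-4: return a list of (alpha_t, h_t) pairs."""
    m = len(y)
    w = np.full(m, 1.0 / m)                    # step 2: uniform initial weights
    ensemble = []
    for _ in range(T):                         # step 3
        h = fit_weak(X, y, w)                  # step 3.1: weak model on weighted data
        miss = h(X) != y                       # indicator [h_t(x_i) != y_i]
        eps = float(np.sum(w[miss]))           # step 3.2: weighted training error
        if eps > 0.5:                          # step 3.3: stop and discard h_t
            break
        eps = max(eps, 1e-10)                  # assumption: avoid division by zero in ln(1/beta)
        beta = eps / (1.0 - eps)               # step 3.4
        alpha = np.log(1.0 / beta)
        ensemble.append((alpha, h))
        w = w * np.where(miss, 1.0, beta)      # step 3.5: shrink correctly classified samples
        w = w / w.sum()                        # divide by the normalization coefficient Z_t
    return ensemble

def predict(ensemble, X, classes):
    """Step 4: H(x) = argmax_y sum_t alpha_t [h_t(x) = y]."""
    votes = np.zeros((len(X), len(classes)))
    for alpha, h in ensemble:
        pred = h(X)
        for k, c in enumerate(classes):
            votes[:, k] += alpha * (pred == c)
    return np.asarray(classes)[votes.argmax(axis=1)]

def fit_stump(X, y, w):
    """Hypothetical weak learner: best weighted threshold rule on feature 0."""
    x = X[:, 0]
    classes = np.unique(y)
    best = None
    for thr in np.unique(x):
        for left in classes:
            for right in classes:
                pred = np.where(x <= thr, left, right)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, thr, left, right)
    _, thr, left, right = best
    return lambda Z: np.where(Z[:, 0] <= thr, left, right)

# Toy data that no single threshold rule can classify without error.
X = np.arange(6, dtype=float).reshape(-1, 1)
y = np.array([1, 1, 2, 2, 1, 1])
model = adaboost_m1(X, y, fit_stump, T=3)
pred = predict(model, X, classes=[1, 2])
```

On this toy data three boosting rounds are enough for the weighted vote to classify all six samples correctly, although every individual threshold rule misclassifies at least two of them. To follow the method described here, `fit_weak` would be a routine that fits a PLS-DA model on the weighted training samples.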
Brief description of the drawings
Fig. 1 is a flow chart of the present invention.
Detailed description
The technical scheme of the present invention is further described below with an example of visible-near infrared spectrum identification of laver.
Embodiment: a visible-near infrared spectrum PLS-DA modeling method combined with the Adaboost algorithm, applied to the identification of laver, comprises the following steps:
Step 1, divide the samples. There are four classes of laver: laver a, laver b, laver c and laver d. The infrared spectrum data of each laver sample form a 1 × 1735 row matrix, and each class contains 30 samples. The samples of each class are randomly divided into two parts: 20 for building the model and 10 for testing it. The matrix formed by the first part is the training set X, and the matrix formed by the second part is the validation set V. Next, set the class label values. Because there are four classes, the class labels of laver a, laver b, laver c and laver d are set to 1, 2, 3 and 4 respectively, and these label values form the matrix Y. The training sample set is thus $S = \{(x_1, y_1), \ldots, (x_{80}, y_{80})\}$, where $x_i \in X$ and $y_i \in Y = \{1, 2, 3, 4\}$.
Step 2, initialize the weight coefficient of each sample in the training set, $\omega_i^1 = 1/80$, $i = 1, \ldots, 80$. Set the maximum number of iterations $T$; in this example $T$ is set to 300.
Step 3, for each iteration $t = 1, \ldots, T$, do the following:
Step 3.1, the weight coefficient of each training sample represents the probability that the sample is selected. This example uses roulette-wheel selection to draw 80 training samples and builds a PLS-DA model on the drawn samples: the visible-near infrared spectra of the 80 drawn samples form the independent-variable matrix $X_t$, and the corresponding class label values form the dependent-variable matrix $Y_t$. Set up the regression equation $X_t b_t = Y_t$ and use the partial least squares method to obtain the regression coefficient $b_t$. This gives a sub-classifier $h_t$:
$$h_t(x) = \operatorname{round}(x \, b_t)$$
that is, the value $x \, b_t$ rounded to the nearest integer.
Step 3.2, calculate the training error of $h_t$
$$\varepsilon_t = \sum_{i=1}^{80} \omega_i^t \, [h_t(x_i) \neq y_i]$$
Step 3.3, check the value of $\varepsilon_t$. If $\varepsilon_t > 1/2$, set $T = t - 1$ and skip to step 4.
Step 3.4, let $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$ and $\alpha_t = \ln(1/\beta_t)$.
Step 3.5, update the sample weight coefficients
$$\omega_i^{t+1} = \frac{\omega_i^t}{Z_t} \times \beta_t^{\,1 - [h_t(x_i) \neq y_i]}$$
where $Z_t$ is a normalization coefficient chosen so that
$$\sum_{i=1}^{80} \omega_i^{t+1} = 1$$
Step 4, output the strong classifier: $H(x) = \arg\max_{y \in Y} \left( \sum_{t=1}^{T} \alpha_t \, [h_t(x) = y] \right)$.
Step 5, test the classification accuracy of $H(x)$ on the samples of the validation set V. If it is higher than the accuracy of a single PLS-DA model, accept the $H(x)$ model; otherwise reject it. In this example the classification accuracy of $H(x)$ is 95%, while that of a single PLS-DA model is 65%. Because the accuracy of $H(x)$ is higher than that of the single PLS-DA model, the $H(x)$ model is accepted.
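Steps 3.1 and 3.2 of the embodiment (roulette-wheel selection, a PLS regression on the class labels, rounding, and the weighted training error) might look like the sketch below. Everything in it is an assumption for illustration: the laver spectra are not available, so synthetic data with 50 variables replace the 1 × 1735 spectra; the helper names `roulette_sample`, `pls1_fit` and `make_subclassifier` are hypothetical; the NIPALS form of PLS1, the mean-centering, and the choice of 5 latent components are not specified in the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the training set: 4 classes x 20 samples,
# 50 variables instead of 1735 (all values hypothetical).
n_per_class, n_vars = 20, 50
y = np.repeat([1, 2, 3, 4], n_per_class)
X = rng.normal(size=(y.size, n_vars)) + y[:, None] * np.linspace(0, 1, n_vars)

def roulette_sample(w, size, rng):
    """Draw indices with replacement, with probability proportional to w."""
    cdf = np.cumsum(w) / np.sum(w)
    idx = np.searchsorted(cdf, rng.random(size), side="right")
    return np.minimum(idx, len(w) - 1)          # guard against rounding at the top end

def pls1_fit(X, y, n_components):
    """PLS1 regression by NIPALS; returns (coefficients b, x mean, y mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y.astype(float) - y_mean   # assumption: mean-centering
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)                  # weight vector
        t = Xr @ w                              # scores
        tt = t @ t
        p = Xr.T @ t / tt                       # X loadings
        c = yr @ t / tt                         # y loading
        Xr = Xr - np.outer(t, p)                # deflate X
        yr = yr - c * t                         # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)         # regression coefficients
    return b, x_mean, y_mean

def make_subclassifier(X, y, w, rng, n_components=5):
    idx = roulette_sample(w, len(y), rng)       # step 3.1: weighted resampling
    b, x_mean, y_mean = pls1_fit(X[idx], y[idx], n_components)
    def h(Z):
        score = (Z - x_mean) @ b + y_mean       # regression estimate of the label
        return np.rint(score).astype(int)       # round to the nearest integer
    return h

w = np.full(y.size, 1.0 / y.size)               # step 2: initial weights 1/80
h1 = make_subclassifier(X, y, w, rng)
eps1 = float(np.sum(w[h1(X) != y]))             # step 3.2: weighted training error
```

Resampling by weight is one way to pass the weight distribution to a learner that does not accept sample weights directly, which appears to be the role of the roulette-wheel step here. The number of latent components would normally be chosen by cross-validation.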

Claims (1)

1. A visible-near infrared spectrum PLS-DA modeling method combined with the Adaboost algorithm, characterized by comprising the following steps:
Step 1, given the training samples $S = \{(x_1, y_1), \ldots, (x_m, y_m)\}$, where $x_i \in X$ and the label $y_i \in Y = \{1, 2, 3, \ldots, N\}$; $m$ is the number of training samples and $N$ is the number of classes of the training samples;
Step 2, initialize the weight coefficient of each sample, $\omega_i^1 = 1/m$, $i = 1, \ldots, m$;
Step 3, for each iteration $t = 1, \ldots, T$, do the following:
Step 3.1, use the partial least squares method to model the training samples under the current weight distribution, obtaining a PLS-DA model $h_t$;
Step 3.2, calculate the training error of $h_t$
$$\varepsilon_t = \sum_{i=1}^{m} \omega_i^t \, [h_t(x_i) \neq y_i]$$
where the symbol "[ ]" is defined as follows: for a logical expression $e$, $[e] = 1$ if $e$ is true, and $[e] = 0$ otherwise;
Step 3.3, if $\varepsilon_t > 1/2$, set $T = t - 1$ and skip to step 4;
Step 3.4, let $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$ and $\alpha_t = \ln(1/\beta_t)$;
Step 3.5, update the sample weight coefficients
$$\omega_i^{t+1} = \frac{\omega_i^t}{Z_t} \times \beta_t^{\,1 - [h_t(x_i) \neq y_i]}$$
where $Z_t$ is a normalization coefficient chosen so that
$$\sum_{i=1}^{m} \omega_i^{t+1} = 1$$
Step 4, output the strong classifier: $H(x) = \arg\max_{y \in Y} \left( \sum_{t=1}^{T} \alpha_t \, [h_t(x) = y] \right)$;
Step 5, test the classification accuracy of $H(x)$.
CN 201310232419 2013-06-09 2013-06-09 Visible-near infrared spectrum PLS-DA modeling method combining Adaboost algorithm Pending CN103472013A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201310232419 CN103472013A (en) 2013-06-09 2013-06-09 Visible-near infrared spectrum PLS-DA modeling method combining Adaboost algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201310232419 CN103472013A (en) 2013-06-09 2013-06-09 Visible-near infrared spectrum PLS-DA modeling method combining Adaboost algorithm

Publications (1)

Publication Number Publication Date
CN103472013A true CN103472013A (en) 2013-12-25

Family

ID=49796953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201310232419 Pending CN103472013A (en) 2013-06-09 2013-06-09 Visible-near infrared spectrum PLS-DA modeling method combining Adaboost algorithm

Country Status (1)

Country Link
CN (1) CN103472013A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108681697B (en) * 2018-04-28 2021-03-23 北京农业质量标准与检测技术研究中心 Feature selection method and device
CN109060771A (en) * 2018-07-26 2018-12-21 温州大学 A kind of common recognition model building method based on spectrum different characteristic collection
CN109060771B (en) * 2018-07-26 2020-12-29 温州大学 Consensus model construction method based on different characteristic sets of spectrum

Similar Documents

Publication Publication Date Title
CN107247989B (en) Real-time computer vision processing method and device
CN102819745B (en) Hyper-spectral remote sensing image classifying method based on AdaBoost
CN110533631B (en) SAR image change detection method based on pyramid pooling twin network
US10539613B2 (en) Analog circuit fault diagnosis method using single testable node
CN105243398B (en) The method of improvement convolutional neural networks performance based on linear discriminant analysis criterion
CN103632168B (en) Classifier integration method for machine learning
CN108921285B (en) Bidirectional gate control cyclic neural network-based classification method for power quality disturbance
CN100595780C (en) Handwriting digital automatic identification method based on module neural network SN9701 rectangular array
CN108171209A (en) A kind of face age estimation method that metric learning is carried out based on convolutional neural networks
CN110298085A (en) Analog-circuit fault diagnosis method based on XGBoost and random forests algorithm
CN104598920B (en) Scene classification method based on Gist feature and extreme learning machine
CN105095863A (en) Similarity-weight-semi-supervised-dictionary-learning-based human behavior identification method
CN106651574A (en) Personal credit assessment method and apparatus
CN105334504B (en) The radar target identification method of nonlinear discriminant projection model based on big border
CN105606914A (en) IWO-ELM-based Aviation power converter fault diagnosis method
CN105116397A (en) Radar high-resolution range profile target recognition method based on MMFA model
CN109543720A (en) A kind of wafer figure defect mode recognition methods generating network based on confrontation
CN106485289A (en) A kind of sorting technique of the grade of magnesite ore and equipment
CN113420795B (en) Mineral spectrum classification method based on cavity convolutional neural network
Murata Detecting communities from bipartite networks based on bipartite modularities
CN104318515A (en) Hyper-spectral image wave band dimension descending method based on NNIA evolutionary algorithm
CN105184314A (en) wrapper-type hyperspectral waveband selection method based on pixel clustering
CN105550712A (en) Optimized convolution automatic encoding network-based auroral image sorting method
CN104504391B (en) A kind of hyperspectral image classification method based on sparse features and markov random file
CN104966075A (en) Face recognition method and system based on two-dimensional discriminant features

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20131225