CN107392241A - An image object classification method based on weighted column-sampling XGBoost - Google Patents

An image object classification method based on weighted column-sampling XGBoost

Info

Publication number
CN107392241A
CN107392241A (application CN201710580163.5A)
Authority
CN
China
Prior art keywords
attribute
sampling
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710580163.5A
Other languages
Chinese (zh)
Other versions
CN107392241B (en)
Inventor
高欣
范少华
李新鹏
张浩
戚岳
曹良晶
贾庆轩
彭岳星
刁新平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
State Grid Jibei Electric Power Co Ltd
Original Assignee
Beijing University of Posts and Telecommunications
State Grid Jibei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications, State Grid Jibei Electric Power Co Ltd filed Critical Beijing University of Posts and Telecommunications
Priority to CN201710580163.5A priority Critical patent/CN107392241B/en
Publication of CN107392241A publication Critical patent/CN107392241A/en
Application granted granted Critical
Publication of CN107392241B publication Critical patent/CN107392241B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The embodiment of the present invention proposes an image object classification method based on weighted column-sampling XGBoost, including: extracting target image features with a convolutional neural network pre-trained on the large-scale ILSVRC dataset and fine-tuned on the PASCAL VOC 2012 dataset; concatenating the features learned by multiple layers to obtain content information that more reliably determines the image category; and classifying the image features with an XGBoost method based on weighted column sampling, in which the attributes are column-sampled according to attribute importance before each decision tree is built, the extracted attributes carrying more information are used to build the current decision tree, and iteration continues until convergence, yielding the image classification model with the best performance. According to the technical scheme provided by the embodiment of the present invention, when the attribute dimensionality of the data is high and the redundancy is large, the method can also be extended to other classification methods that use column sampling, and it improves the average precision of image object classification.

Description

An image object classification method based on weighted column-sampling XGBoost
【Technical field】
The present invention relates to image object classification methods, and in particular to an image object classification method based on weighted column-sampling XGBoost.
【Background technology】
In the image object classification task, an image is classified according to its visual content — for example, whether a car is present in the picture. One important application is image retrieval: retrieving pictures with a certain content from an image dataset. As pictures have become the main information carrier on the internet, a problem has arisen alongside: when information is recorded as text, we can easily find the required content through keyword search and edit it freely, but when information is recorded as pictures, improving its retrieval efficiency is a hot topic of current research. Pictures give us an efficient way to record and share information, yet they reduce our information retrieval efficiency; in this environment, computer image recognition technology becomes particularly important. Image feature extraction plays a key role in identifying image categories, and traditional hand-crafted extraction methods must be adjusted for each specific problem, requiring a large amount of expert knowledge and time. In recent years, extracting image features with deep convolutional neural networks has proved highly effective; however, the extracted image features have high dimensionality and redundancy, current classifiers perform poorly on such data, and this affects the identification of the image target category.
【The content of the invention】
In view of this, the embodiment of the present invention proposes an image object classification method based on weighted column-sampling XGBoost, to improve the classification accuracy of the model for image features with highly redundant information.
The embodiment of the present invention proposes an image object classification method based on weighted column-sampling XGBoost, including:

extracting the features of the target image with a convolutional neural network pre-trained on the large-scale ILSVRC dataset and fine-tuned on the PASCAL VOC 2012 dataset;

concatenating the features learned by multiple layers to obtain content information that more reliably determines the image category;

classifying the image features with the XGBoost method based on weighted column sampling: according to attribute importance, the attributes are column-sampled before each decision tree is built, the extracted attributes carrying more information are used to build the current decision tree, and iteration is repeated until convergence, yielding the image classification model with the best performance.
In the above method, the procedure for extracting the features of the target image with a convolutional neural network pre-trained on the large-scale ILSVRC dataset and fine-tuned on the PASCAL VOC 2012 dataset is: modify the input layer of CaffeNet, adjusting the input image size to 227 × 227; replace the last-layer Softmax function of CaffeNet with the SigmoidCrossEntropyLoss loss function; fine-tune the CaffeNet model on the PASCAL VOC 2012 train/validation set for 600 epochs; the resulting convolutional neural network model is a model that can effectively extract image features.

The expression of the SigmoidCrossEntropyLoss loss function used to evaluate model quality is:

l = -\frac{1}{N} \sum_{n=1}^{N} \sum_{j=1}^{m} \left[ y_j^{(n)} \ln p_j^{(n)} + \left(1 - y_j^{(n)}\right) \ln\left(1 - p_j^{(n)}\right) \right]

where l is the logistic cross-entropy loss; n indexes the samples of a batch, ranging from 1 to N, and N is the batch size; j is a category the sample may belong to, and m is the total number of possible categories; y_j^{(n)} is the correct label of the n-th sample for the j-th class; p_j^{(n)} is the output of the logistic function, i.e. the probability that the n-th sample belongs to the j-th class, used to judge whether a certain object is present in the image.
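As an illustration, the multi-label sigmoid cross-entropy loss described above can be sketched in NumPy (a minimal sketch, not the CaffeNet implementation; the batch-by-classes array shapes are an assumption for the example):

```python
import numpy as np

def sigmoid_cross_entropy_loss(logits, labels):
    """Multi-label logistic cross-entropy, averaged over the batch.

    logits: (N, m) raw scores, one per class and sample.
    labels: (N, m) binary ground truth (1 if the object is present).
    """
    p = 1.0 / (1.0 + np.exp(-logits))           # per-class probabilities
    eps = 1e-12                                  # numerical safety for log
    per_sample = -(labels * np.log(p + eps)
                   + (1 - labels) * np.log(1 - p + eps)).sum(axis=1)
    return per_sample.mean()                     # average over the N samples

# Tiny usage example: confident correct predictions give a near-zero loss.
logits = np.array([[10.0, -10.0], [-10.0, 10.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = sigmoid_cross_entropy_loss(logits, labels)
```

Flipping the sign of the logits turns every prediction wrong and drives the loss up, which is the behaviour the fine-tuning objective penalizes.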
In the above method, concatenating the features learned by multiple layers to obtain content information that more reliably determines the image category proceeds as follows: the image whose features are to be extracted is used as the input of the convolutional neural network and passes through five convolutional layers (CONV1–CONV5) and three fully connected layers (FC6–FC8); the activation output of each layer is a feature representation of the image. The low layers of a deep convolutional neural network output more general features, while its high layers output more task-specific features; the activation values of the FC6 and FC7 layers are extracted as the image feature representation, and the output features of these two layers are concatenated to obtain features carrying more image feature information.
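The concatenation step can be sketched as follows (a minimal NumPy sketch; the 4096-dimensional FC6/FC7 activations of CaffeNet are assumed and stood in here by random matrices rather than real network outputs):

```python
import numpy as np

FC_DIM = 4096  # width of CaffeNet's FC6 and FC7 layers

def concat_features(fc6, fc7):
    """Concatenate two per-image activation matrices along the feature axis."""
    assert fc6.shape == fc7.shape
    return np.concatenate([fc6, fc7], axis=1)

rng = np.random.default_rng(0)
fc6 = rng.random((8, FC_DIM))        # stand-in for FC6 activations of 8 images
fc7 = rng.random((8, FC_DIM))        # stand-in for FC7 activations
features = concat_features(fc6, fc7) # one 8192-dimensional vector per image
```

Each image thus contributes a single 8192-dimensional row, which is exactly the high-dimensional, partly redundant representation that the weighted column sampling later has to cope with.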
In the above method, the procedure for classifying image features with the weighted column-sampling XGBoost method is: the importance of each attribute is computed from the attribute's ability to separate the target classes; during column sampling, the probability that an attribute is drawn is proportional to its importance. Before each decision tree is built, unequal-probability sampling without replacement based on attribute importance is applied to the image features, the extracted attributes carrying more information are used to build the current decision tree, and iteration continues until convergence, at which point the image classification model with the best performance is obtained.

The steps of the XGBoost method with weighted column sampling based on attribute importance are as follows:
1) Compute the attribute importance of each feature column from the attribute's degree of discrimination of the target.

The attribute importance f(k) is determined by the following formula:

f(k) = \frac{S_B^{(k)}}{S_W^{(k)}}

where S_W^{(k)} denotes the within-class variance of attribute k over all samples, determined by:

S_W^{(k)} = \sum_{i=1}^{m} \frac{n_i}{n} \left(\sigma_i^{(k)}\right)^2, \qquad \left(\sigma_i^{(k)}\right)^2 = \frac{1}{n_i} \sum_{x \in D_i} \left(x^{(k)} - \mu_i^{(k)}\right)^2

where μ_i^{(k)} denotes the mean of attribute k over the samples belonging to class i; D_i is the set of samples belonging to class i; n_i is the number of samples of class i; x is a sample; x^{(k)} is the value of the k-th dimension of the sample; (σ_i^{(k)})² is the within-class variance of class i in the k-th dimension; and m is the number of possible classes.

S_B^{(k)} denotes the between-class difference of attribute k, determined by:

S_B^{(k)} = \sum_{i=1}^{m} \frac{n_i}{n} \left(\mu_i^{(k)} - \mu^{(k)}\right)^2

where μ^{(k)} is the mean of the k-th dimension over all samples, n is the total number of samples of all classes, and D is the set of all samples.

2) Compute from the attribute importance the auxiliary variable of unequal-probability sampling without replacement, i.e. the probability that each attribute is drawn.

The attribute sampling probability Z_i is determined by:

Z_i = \frac{f(i)}{\sum_{k=1}^{N} f(k)}

where N is the total number of attributes of the sample space.
3) Initialize the sampling parameters: set the per-tree sampling proportion to p_1 and the per-level sampling proportion to p_2.

4) Draw the samples used to generate one decision tree: generate a random integer i between 1 and N (the total number of attributes), and generate a random number z; if P_i > z and a Bernoulli random variable with parameter p equals 1, attribute i is drawn and placed in K; otherwise repeat this step.

Here K is the attribute sample space drawn for generating a single decision tree, and {P_i} are the adjusted attribute sampling probabilities after a drawn attribute is removed from the sample space; the adjusted probability P_i is determined by:

P_i = Z_i / (1 − Σ_{j∈K} Z_j)

i.e. Z_i divided by one minus the sum of the sampling probabilities Z of all attributes already in K.

The above drawing step is repeated until the configured sampling proportion p_1 is reached.
5) For each level of the generated decision tree, draw, with the method of step 4 and K as the sample space, a proportion p_2 of samples for building that level of the decision tree.

6) Repeat the sampling of steps 4 and 5 when building each decision tree, until the overall objective function of the algorithm converges; the optimal target image classification XGBoost model is then obtained.

The described image object classification method improves the average precision with which each class of image is identified.
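Steps 1)–4) above can be sketched in NumPy as follows (a minimal illustration under stated assumptions: `fisher_importance` uses the between-class/within-class variance ratio, and the without-replacement draw renormalizes the remaining probabilities after each pick rather than reproducing the patent's exact rejection loop):

```python
import numpy as np

def fisher_importance(X, y):
    """Per-column importance: between-class over within-class variance."""
    classes = np.unique(y)
    n = len(y)
    mu = X.mean(axis=0)
    s_b = np.zeros(X.shape[1])
    s_w = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        w = len(Xc) / n
        s_b += w * (Xc.mean(axis=0) - mu) ** 2   # between-class term
        s_w += w * Xc.var(axis=0)                # within-class term
    return s_b / (s_w + 1e-12)                   # guard against zero variance

def weighted_column_sample(f, proportion, rng):
    """Unequal-probability sampling without replacement, proportional to f."""
    z = f / f.sum()                              # step 2: probabilities Z_i
    n_pick = max(1, int(round(proportion * len(f))))
    remaining = list(range(len(f)))
    picked = []
    for _ in range(n_pick):                      # step 4: draw, renormalize
        p = z[remaining] / z[remaining].sum()
        idx = rng.choice(len(remaining), p=p)
        picked.append(remaining.pop(idx))
    return sorted(picked)

rng = np.random.default_rng(42)
# Toy data: column 0 separates the classes, column 1 is pure noise.
y = np.array([0] * 50 + [1] * 50)
X = np.column_stack([y + 0.1 * rng.standard_normal(100),
                     rng.standard_normal(100)])
f = fisher_importance(X, y)
cols = weighted_column_sample(f, proportion=0.5, rng=rng)
```

On this toy data the discriminative column receives a far higher importance score, so it dominates the draw — the informative attribute is almost always the one kept for the current tree.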
As can be seen from the above technical solution, the embodiment of the present invention has the following beneficial effects:

In the technical scheme implemented by the present invention, target image features are extracted with a convolutional neural network pre-trained on the large-scale ILSVRC dataset and fine-tuned on the PASCAL VOC 2012 dataset; the features learned by multiple layers are concatenated to obtain content information that more reliably determines the image category; the image features are classified with the XGBoost method based on weighted column sampling, in which, according to attribute importance, the attributes are column-sampled before each decision tree is built, the extracted attributes carrying more information are used to build the current decision tree, and iteration continues until convergence, yielding the image classification model with the best performance. According to the technical scheme provided by the embodiment of the present invention, when the attribute dimensionality of the data is high and the redundancy is large, the method can be extended to other classification methods that use column sampling, and it improves the average precision of image object classification.
【Brief description of the drawings】
In order to explain the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required by the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative labor.
Fig. 1 is a flow diagram of the image object classification method based on weighted column-sampling XGBoost proposed by the embodiment of the present invention;

Fig. 2 is a flow chart of the training-stage and test-stage algorithm framework of the image object classification method based on weighted column-sampling XGBoost proposed by the embodiment of the present invention;

Fig. 3 is a schematic diagram of image feature extraction results obtained with CaffeNet;

Fig. 4 plots the number of trees generated by the XGBoost algorithm against the CV score;

Fig. 5 is a schematic precision–recall (PR) curve of the object classification method proposed in the embodiment of the present invention on the bird class;

Fig. 6 shows the PR curves and AP values of the image object classification algorithm proposed in the embodiment of the present invention on 20 categories.
【Embodiment】
In order to better understand the technical solution of the present invention, the embodiments of the present invention are described in detail below with reference to the accompanying drawings.

It should be understood that the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work belong to the protection scope of the present invention.
The embodiment of the present invention provides an image object classification method based on weighted column-sampling XGBoost. Referring to Fig. 1, a flow diagram of the image object classification method based on weighted column-sampling XGBoost proposed by the embodiment of the present invention, the method comprises the following steps.

Fig. 2 is a flow chart of the training-stage and test-stage algorithm framework of the image object classification method based on weighted column-sampling XGBoost proposed by the embodiment of the present invention; the proposed method mainly comprises a training stage and a test stage. In the training stage, first, the CaffeNet model trained on the large-scale ILSVRC2012 dataset is used to extract image features. Then the extracted feature data serve as the input of the XGBoost algorithm; in each round of column sampling, attribute importance is used to extract attributes of higher importance as the data for training that round's decision-tree model, so that each decision tree can concentrate on training with data having more important attributes, thereby reducing the interference of redundant attributes and improving recognition accuracy and computing-resource efficiency. In the test stage, the same pre-trained deep convolutional neural network is used to extract the features of the test image, and the extracted feature data are then classified with the trained XGBoost model.
Step 101: extract the features of the target image with a convolutional neural network pre-trained on the large-scale ILSVRC dataset and fine-tuned on the PASCAL VOC 2012 dataset.

Specifically, modify the input layer of CaffeNet, adjusting the input image size to 227 × 227; replace the last-layer Softmax function of CaffeNet with the SigmoidCrossEntropyLoss loss function; fine-tune the CaffeNet model on the PASCAL VOC 2012 train/validation set for 600 epochs; the resulting convolutional neural network model is a model that can effectively extract image features.

The expression of the SigmoidCrossEntropyLoss loss function used to evaluate model quality is:

l = -\frac{1}{N} \sum_{n=1}^{N} \sum_{j=1}^{m} \left[ y_j^{(n)} \ln p_j^{(n)} + \left(1 - y_j^{(n)}\right) \ln\left(1 - p_j^{(n)}\right) \right]

where l is the logistic cross-entropy loss; n indexes the samples of a batch, ranging from 1 to N, and N is the batch size; j is a category the sample may belong to, and m is the total number of possible categories; y_j^{(n)} is the correct label of the n-th sample for the j-th class; p_j^{(n)} is the output of the logistic function, i.e. the probability that the n-th sample belongs to the j-th class, used to judge whether a certain object is present in the image.
Step 102: concatenate the features learned by multiple layers to obtain content information that more reliably determines the image category.

Specifically, the image whose features are to be extracted is used as the input of the convolutional neural network and passes through five convolutional layers (CONV1–CONV5) and three fully connected layers (FC6–FC8); the activation output of each layer is a feature representation of the image. The low layers of a deep convolutional neural network output more general features, while its high layers output more task-specific features; the activation values of the FC6 and FC7 layers are extracted as the image feature representation, and the output features of these two layers are concatenated to obtain features carrying more image feature information.
Fig. 3 is a schematic diagram of image feature extraction results obtained with CaffeNet; it shows the feature extraction of an image of an ox. The top-left of Fig. 3 is the original input image, which serves as the input of the CaffeNet input layer; the top-right of Fig. 3 shows the first 36 filter outputs of the CONV1 layer — because this layer sits at the lower levels of the convolutional neural network, obvious image content can still be seen. The two figures at the bottom of Fig. 3 are, respectively, histograms of the positive output values of the FC6 and FC7 layers. From the histogram distributions, the output value distributions of FC6 and FC7 differ, so it is necessary to concatenate the multi-layer feature values to obtain image features with more information; at the same time, it can be seen that the feature values of both layers are mainly distributed in the low-value region, so the image features inevitably have high correlation and redundancy.
Step 103: classify the image features with the XGBoost method based on weighted column sampling. According to attribute importance, perform column sampling on the attributes before each decision tree is built; the extracted attributes carrying more information are used to build the current decision tree, and iteration continues until convergence, yielding the image classification model with the best performance.

Specifically, the steps of the weighted column-sampling XGBoost method proposed by the embodiment of the present invention are as follows:
1) Compute the attribute importance of each feature column from the attribute's degree of discrimination of the target.

The attribute importance f(k) is determined by the following formula:

f(k) = \frac{S_B^{(k)}}{S_W^{(k)}}

where S_W^{(k)} denotes the within-class variance of attribute k over all samples, determined by:

S_W^{(k)} = \sum_{i=1}^{m} \frac{n_i}{n} \left(\sigma_i^{(k)}\right)^2, \qquad \left(\sigma_i^{(k)}\right)^2 = \frac{1}{n_i} \sum_{x \in D_i} \left(x^{(k)} - \mu_i^{(k)}\right)^2

where μ_i^{(k)} denotes the mean of attribute k over the samples belonging to class i; D_i is the set of samples belonging to class i; n_i is the number of samples of class i; x is a sample; x^{(k)} is the value of the k-th dimension of the sample; (σ_i^{(k)})² is the within-class variance of class i in the k-th dimension; and m is the number of possible classes.

S_B^{(k)} denotes the between-class difference of attribute k, determined by:

S_B^{(k)} = \sum_{i=1}^{m} \frac{n_i}{n} \left(\mu_i^{(k)} - \mu^{(k)}\right)^2

where μ^{(k)} is the mean of the k-th dimension over all samples, n is the total number of samples of all classes, and D is the set of all samples.

2) Compute from the attribute importance the auxiliary variable of unequal-probability sampling without replacement, i.e. the probability that each attribute is drawn.

The attribute sampling probability Z_i is determined by:

Z_i = \frac{f(i)}{\sum_{k=1}^{N} f(k)}

where N is the total number of attributes of the sample space.
3) Initialize the sampling parameters: set the per-tree sampling proportion to p_1 and the per-level sampling proportion to p_2.

4) Draw the samples used to generate one decision tree: generate a random integer i between 1 and N (the total number of attributes), and generate a random number z; if P_i > z and a Bernoulli random variable with parameter p equals 1, attribute i is drawn and placed in K; otherwise repeat this step.

Here K is the attribute sample space drawn for generating a single decision tree, and {P_i} are the adjusted attribute sampling probabilities after a drawn attribute is removed from the sample space; the adjusted probability P_i is determined by:

P_i = Z_i / (1 − Σ_{j∈K} Z_j)

i.e. Z_i divided by one minus the sum of the sampling probabilities Z of all attributes already in K.

The above drawing step is repeated until the configured sampling proportion p_1 is reached.
5) For each level of the generated decision tree, draw, with the method of step 4 and K as the sample space, a proportion p_2 of samples for building that level of the decision tree.

6) Repeat the sampling of steps 4 and 5 when building each decision tree, until the overall objective function of the algorithm converges; the optimal target image classification XGBoost model is then obtained.
Algorithm 1 is the pseudocode of the weighted column-sampling method proposed by the embodiment of the present invention.

The weighted column-sampling method proposed by the embodiment of the present invention has a certain generality: when the feature data have high dimensionality and large redundancy, the method can readily be extended to other methods that require column sampling. In order to classify the image feature data effectively, the weighted column-sampling method proposed by the embodiment of the present invention is applied to the per-tree and per-level sampling stages of the XGBoost algorithm.
Algorithm 2 is the pseudocode of the XGBoost algorithm based on weighted column sampling proposed by the embodiment of the present invention, i.e. the XGBoost algorithm optimized with weighted column sampling.
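How the weighted sampler of Algorithm 1 slots into a boosting loop can be sketched as follows (a minimal gradient-boosting sketch with one-split regression stumps and squared loss — an illustration of the idea, not the patent's XGBoost implementation; `sample_columns` is a hypothetical stand-in for the weighted per-tree column sampler):

```python
import numpy as np

def fit_stump(X, residual, cols):
    """Best single-split regression stump restricted to the given columns."""
    best = None
    for j in cols:
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            pred = np.where(left, residual[left].mean(), residual[~left].mean())
            err = ((residual - pred) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, residual[left].mean(), residual[~left].mean())
    _, j, t, lv, rv = best
    return lambda X: np.where(X[:, j] <= t, lv, rv)

def boost(X, y, sample_columns, n_trees=20, eta=0.3):
    """Gradient boosting: each round fits a stump on a fresh column subset."""
    pred = np.full(len(y), y.mean())
    for _ in range(n_trees):
        cols = sample_columns()              # weighted column sampling per tree
        stump = fit_stump(X, y - pred, cols) # fit to the current residuals
        pred = pred + eta * stump(X)
    return pred

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 5))
y = (X[:, 0] > 0).astype(float)              # only column 0 is informative
# Hypothetical sampler: keeps the informative column plus one random other.
sampler = lambda: [0, int(rng.integers(1, 5))]
pred = boost(X, y, sampler)
mse = ((y - pred) ** 2).mean()
```

Because every per-tree subset retains the informative column, the residuals shrink geometrically over the rounds, mirroring the patent's claim that importance-weighted sampling keeps each tree focused on informative attributes.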
In a specific embodiment, the algorithm model is trained with the PASCAL VOC 2012 dataset. Two parameters of the XGBoost algorithm are particularly critical: the learning rate η and the number K of generated decision trees. During tree ensembling, the learning rate η scales the weight of each newly added decision tree. Reducing the learning rate may increase the number of decision trees generated, so the balance between the parameters η and K plays a key role in algorithm performance. Normally η ≤ 0.1 is taken; in the embodiment of the present invention, the learning rate is fixed at 0.1, and then, during training, one decision tree is added per round and the algorithm performance is assessed with 5-fold cross-validation. When the algorithm converges on the assessment criterion, the optimal number of decision trees is obtained.
Fig. 4 plots the number of trees generated by the XGBoost algorithm against the CV score; in the embodiment of the present invention, with a learning rate of 0.1, the optimal number of generated decision trees is 110.
After the two parameters learning rate and number of decision trees are determined, the parameters based on tree structure are tuned first, since these parameters influence the model output the most. In the embodiment of the present invention, RandomizedSearchCV is used to search for the optimal parameters randomly over the parameter space; each parameter set it tries is drawn from a distribution over possible parameter values. Compared with traversing every possible parameter value, this has two advantages: 1. the computation budget for finding the optimal parameters is independent of the number of parameters and possible values; 2. adding parameters does not reduce efficiency or affect performance. Table 1 shows the CV results of the weighted column-sampling XGBoost algorithm proposed by the embodiment of the present invention on the bird class; the optimal parameters found are used to train the proposed method.
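The random-search idea can be sketched without scikit-learn as follows (a minimal stdlib sketch; `mock_cv_score` is a hypothetical stand-in for the 5-fold CV score of an XGBoost model, and the parameter ranges loosely echo the parameters tuned above):

```python
import random

PARAM_SPACE = {
    "max_depth":        lambda rng: rng.randint(3, 10),
    "colsample_bytree": lambda rng: rng.uniform(0.05, 1.0),
    "gamma":            lambda rng: rng.uniform(0.0, 5.0),
    "min_child_weight": lambda rng: rng.randint(1, 10),
}

def random_search(evaluate, n_iter=20, seed=0):
    """Draw n_iter parameter sets from the space; return the best-scoring one."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_iter):
        params = {name: draw(rng) for name, draw in PARAM_SPACE.items()}
        score = evaluate(params)            # e.g. mean 5-fold CV score
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

# Hypothetical evaluator peaking near max_depth 8 and colsample 0.1.
def mock_cv_score(p):
    return (0.83 - 0.01 * abs(p["max_depth"] - 8)
                 - 0.10 * abs(p["colsample_bytree"] - 0.1))

score, params = random_search(mock_cv_score)
```

The key property is the one named in the text: the cost is fixed by `n_iter`, not by the size of the grid, so adding a parameter to `PARAM_SPACE` does not multiply the search cost.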
Table 1
From Table 1, the optimal test score obtained is 0.829, with parameters Max_depth = 8, Colsample_bytree = 0.099, Gamma = 3.523 and Min_child_weight = 7. It should be emphasized that, in order to improve the computational efficiency of the algorithm, only the parameter Colsample_bytree is used here, i.e. the column-sampling method is applied when growing each tree; column sampling could of course also be applied when growing each level of a tree, but in practice, to balance algorithm efficiency and accuracy, the parameter Colsample_bylevel is not used here. The embodiment of the present invention trains the model on the PASCAL VOC 2007 test dataset; for bird recognition, the AP value of the test result is 81.7%, a 16.9% improvement over three typical algorithms. Fig. 5 shows the PR curve of the weighted column-sampling XGBoost algorithm proposed by the embodiment of the present invention when recognizing birds.
Table 2 shows the test results of the weighted column-sampling XGBoost method proposed by the embodiment of the present invention on the image object classification task; the comparison algorithms use data outside the standard PASCAL VOC 2007 dataset. Among the three typical algorithms, the GHM method combines visual bag-of-words and contextual-information features to identify image categories from the VOC dataset; the AGS method obtains a second-layer feature representation by clustering the VOC data into sub-categories; and the NUS method trains a codebook with methods such as SIFT, HOG and LBP to extract features and identify image categories from the VOC dataset.
Table 2
Fig. 6 shows the PR curves and AP values of the image object classification algorithm proposed in the embodiment of the present invention on 20 categories. Compared with the three typical algorithms, the weighted column-sampling XGBoost algorithm proposed by the embodiment of the present invention increases the average precision on 10 of the 20 classes. In particular, the recognition AP value of the proposed method on birds is 81.7, a 16.9% improvement over the best of the three typical algorithms, and on the boat class it is 79.0, an improvement of 2.9%. The method proposed by the embodiment of the present invention achieves a certain breakthrough in the recognition rate of some categories.
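The average precision (AP) metric used throughout these comparisons can be illustrated as follows (a minimal sketch of AP computed from a ranked list of detection scores — the standard ranked-precision definition, not necessarily the exact VOC 2007 11-point protocol):

```python
import numpy as np

def average_precision(scores, labels):
    """AP: mean of the precision values at the ranks of the positive examples."""
    order = np.argsort(-scores)            # rank by descending confidence
    labels = np.asarray(labels)[order]
    hits = np.cumsum(labels)               # positives found up to each rank
    ranks = np.arange(1, len(labels) + 1)
    precision_at_hits = (hits / ranks)[labels == 1]
    return precision_at_hits.mean()

scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
labels = np.array([1, 0, 1, 1, 0])         # ground truth for one class
ap = average_precision(scores, labels)
```

A perfect ranking (all positives scored above all negatives) gives AP = 1.0; the per-class AP values in Fig. 6 summarize how close each classifier's ranking comes to that ideal.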
In summary, the embodiment of the present invention has the following beneficial effects:

In the technical scheme implemented by the present invention, target image features are extracted with a convolutional neural network pre-trained on the large-scale ILSVRC dataset and fine-tuned on the PASCAL VOC 2012 dataset; the features learned by multiple layers are concatenated to obtain content information that more reliably determines the image category; the image features are classified with the XGBoost method based on weighted column sampling, in which, according to attribute importance, the attributes are column-sampled before each decision tree is built, the extracted attributes carrying more information are used to build the current decision tree, and iteration continues until convergence, yielding the image classification model with the best performance. According to the technical scheme provided by the embodiment of the present invention, when the attribute dimensionality of the data is high and the redundancy is large, the method can be extended to other classification methods that use column sampling, and it improves the average precision of image object classification.
The foregoing are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (4)

1. An image object classification method based on weighted column-sampling XGBoost, characterized in that the method comprises the steps of:

    (1) extracting the features of the target image with a convolutional neural network pre-trained on the large-scale ILSVRC dataset and fine-tuned on the PASCAL VOC 2012 dataset;

    (2) concatenating the features learned by multiple layers to obtain content information that more reliably determines the image category;

    (3) classifying the image features with the XGBoost method based on weighted column sampling: according to attribute importance, the attributes are column-sampled before each decision tree is built, the extracted attributes carrying more information are used to build the current decision tree, and iteration is repeated until convergence, yielding the image classification model with the best performance.
  2. 2. according to the method for claim 1, it is characterised in that using large data collection ILSVRC pre-training cross and The feature of the convolutional neural networks extraction target image of fine setting was carried out on the data sets of PASCAL VOC 2012, including:Modification CaffeNet input layer, the size for adjusting input picture are 227 × 227;Lost using SigmoidCrossEntropyLoss Function replaces last layer of Softmax function of CaffeNet;Checking collection fine setting CaffeNet is trained using PASCAL VOC 2012 Model, trains 600 bouts, and obtained convolutional neural networks model is the model that can effectively extract characteristics of image;
    The expression formula of SigmoidCrossEntropyLoss loss functions for evaluation model performance quality is:
    <mrow> <mi>l</mi> <mo>=</mo> <mo>-</mo> <mfrac> <mn>1</mn> <mi>N</mi> </mfrac> <munderover> <mo>&amp;Sigma;</mo> <mrow> <mi>n</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>N</mi> </munderover> <munderover> <mo>&amp;Sigma;</mo> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <mi>m</mi> </munderover> <mo>&amp;lsqb;</mo> <msubsup> <mi>y</mi> <mi>j</mi> <mrow> <mo>(</mo> <mi>n</mi> <mo>)</mo> </mrow> </msubsup> <mi>ln</mi> <mi> </mi> <msubsup> <mi>p</mi> <mi>j</mi> <mrow> <mo>(</mo> <mi>n</mi> <mo>)</mo> </mrow> </msubsup> <mo>+</mo> <mrow> <mo>(</mo> <mn>1</mn> <mo>-</mo> <msubsup> <mi>y</mi> <mi>j</mi> <mrow> <mo>(</mo> <mi>n</mi> <mo>)</mo> </mrow> </msubsup> <mo>)</mo> </mrow> <mi>l</mi> <mi>n</mi> <mrow> <mo>(</mo> <mn>1</mn> <mo>-</mo> <msubsup> <mi>p</mi> <mi>j</mi> <mrow> <mo>(</mo> <mi>n</mi> <mo>)</mo> </mrow> </msubsup> <mo>)</mo> </mrow> <mo>&amp;rsqb;</mo> </mrow>
    Wherein l is the logistic cross-entropy loss; n indexes the samples in the batch, ranging from 1 to N, where N is the batch size; j indexes the classes a sample may belong to, and m is the total number of possible classes; y_j^(n) is the ground-truth label of the n-th sample for class j; p_j^(n) is the output of the logistic function, i.e. the probability that the n-th sample belongs to class j, used to judge whether a given object is present in the image.
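As a sketch of the loss above in NumPy (the function name, array shapes and the clipping constant are illustrative assumptions, not part of the claim):

```python
import numpy as np

def sigmoid_cross_entropy_loss(y_true, p_pred, eps=1e-12):
    """l = -(1/N) * sum_n sum_j [y ln p + (1 - y) ln(1 - p)].

    y_true: (N, m) binary labels y_j^(n); p_pred: (N, m) sigmoid
    outputs p_j^(n), the probability that sample n contains class j.
    """
    p = np.clip(p_pred, eps, 1.0 - eps)  # guard against log(0)
    N = y_true.shape[0]                  # batch size
    return -np.sum(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)) / N
```

For a single sample with labels (1, 0) and predictions (0.5, 0.5), this evaluates to 2 ln 2 ≈ 1.386; note the sum is divided by the batch size N only, as in the formula.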
  3. The method according to claim 1, characterised in that the features learned by multiple layers are concatenated to obtain content information that better determines the image category, as follows: the image whose features are to be extracted is fed to the convolutional neural network as input and passed through five convolutional layers (CONV1-CONV5) and three fully connected layers (FC6-FC8), the activation output of each layer being a feature representation of the image; the lower layers of a deep convolutional neural network output more generic features, while its higher layers output more task-specific features, so the FC6 and FC7 activations are extracted as the image feature representation; the output features of these two layers are concatenated to obtain a feature carrying more of the image's characteristic information.
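A minimal sketch of the concatenation step, assuming the standard CaffeNet layout in which fc6 and fc7 each output 4096-dimensional activations (the random arrays here are stand-ins for real network outputs):

```python
import numpy as np

rng = np.random.default_rng(0)
fc6 = rng.random((10, 4096))  # stand-in for per-image FC6 activations
fc7 = rng.random((10, 4096))  # stand-in for per-image FC7 activations

# Concatenating the more generic FC6 features with the more task-specific
# FC7 features yields one 8192-dimensional representation per image.
features = np.concatenate([fc6, fc7], axis=1)
```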
  4. The method according to claim 1, characterised in that classifying the image features with the XGBoost method based on weighted column sampling, in which the attributes are sampled according to their importance before each decision tree is built, the drawn attributes carrying more information are used to build the current decision tree, and the iteration is repeated until convergence to obtain the best-performing image classification model, comprises: computing the importance of each attribute from its ability to separate the target classes; making the attribute importance proportional to the probability that the attribute is drawn during column sampling; before each decision tree is built, performing unequal-probability sampling without replacement of the image features weighted by attribute importance; using the drawn attributes carrying more information to build the current decision tree; and repeating the iteration until convergence, thereby obtaining the best-performing image classification model;
    The steps of the XGBoost method with importance-weighted column sampling are as follows:
    1) compute the importance of each feature column from the attribute's degree of discrimination of the target;
    The attribute importance f(k) is determined by:
    f(k) = S_B^(k) / S_W^(k)
    Wherein S_W^(k) denotes the within-class variance of all samples on attribute k, determined by the following formulas:
    S_W^(k) = Σ_{i=1}^{m} S_i^(k),
    S_i^(k) = Σ_{x ∈ D_i} (x^(k) - μ_i^(k))²,
    μ_i^(k) = (1/n_i) Σ_{x ∈ D_i} x^(k),
    Wherein μ_i^(k) denotes the mean of attribute k over the samples belonging to class i; D_i is the set of samples of class i; n_i is the number of samples of class i; x is a sample; x^(k) is the value of the k-th dimension of a sample; S_i^(k) is the within-class variance of class i in the k-th dimension; and m is the number of possible classes;
    and S_B^(k) denotes the between-class variance of attribute k, determined by the following formulas:
    S_B^(k) = Σ_{i=1}^{m} n_i (μ_i^(k) - μ^(k))²,
    μ^(k) = (1/n) Σ_{x ∈ D} x^(k) = (1/n) Σ_{i=1}^{m} n_i μ_i^(k)
    Wherein μ^(k) is the mean of the k-th dimension over all samples, n is the total number of samples of all classes, and D is the set of all samples;
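A minimal NumPy sketch of the importance computation above (the function name is illustrative; it computes f(k) = S_B^(k) / S_W^(k) per column from the class means, exactly as defined):

```python
import numpy as np

def attribute_importance(X, y):
    """f(k) = S_B^(k) / S_W^(k) for every column k of X.

    X: (n, N_attr) feature matrix; y: (n,) integer class labels.
    """
    mu = X.mean(axis=0)                        # overall mean mu^(k)
    s_b = np.zeros(X.shape[1])                 # between-class scatter S_B^(k)
    s_w = np.zeros(X.shape[1])                 # within-class scatter S_W^(k)
    for c in np.unique(y):
        Xc = X[y == c]                         # samples D_i of class c
        mu_c = Xc.mean(axis=0)                 # class mean mu_i^(k)
        s_w += ((Xc - mu_c) ** 2).sum(axis=0)
        s_b += len(Xc) * (mu_c - mu) ** 2
    return s_b / s_w
```

For a toy matrix whose first column cleanly separates two classes while the second does not, f(0) is several orders of magnitude larger than f(1).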
    2) from the attribute importances, compute the auxiliary variable for unequal-probability sampling without replacement, i.e. the probability that each attribute is drawn;
    The attribute sampling probability Z_k is determined by:
    Z_k = f(k) / Σ_{j=1}^{N} f(j)
    Wherein N is the total number of attributes in the sample space;
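The normalisation into drawing probabilities is then a one-liner (sketch; the function name is illustrative):

```python
import numpy as np

def sampling_probabilities(f):
    """Z_k = f(k) / sum_j f(j): importances normalised into probabilities."""
    f = np.asarray(f, dtype=float)
    return f / f.sum()

# An attribute twice as important as the others is twice as likely to be drawn.
Z = sampling_probabilities([2.0, 1.0, 1.0])  # Z sums to 1; Z[0] == 0.5
```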
    3) initialise the sampling parameters: set the per-tree sampling proportion to p_1 and the per-level sampling proportion to p_2;
    4) draw the attribute sample used to generate one decision tree: generate a random integer i between 1 and N (the total number of attributes) and a random number z between 0 and 1; if P_i > z and a Bernoulli random variable with parameter p is 1, draw the i-th attribute and put it into K; otherwise repeat this step;
    Wherein K is the attribute sample space drawn for generating a single decision tree, and {P_i} are the sampling probabilities of the attributes after adjustment, i.e. after a drawn attribute has been removed from the sample space; the adjusted attribute sampling probability P_i is determined by:
    P_i = Z_i / (1 - sum of the sampling probabilities Z of all attributes already in K)
    The drawing step above is repeated until the set sampling proportion p_1 is reached;
    5) at each level of the decision tree being generated, draw a proportion p_2 of attributes with the method of step 4), taking K as the sample space, for building that level of the tree;
    6) when building each decision tree, repeat the sampling process of steps 4) and 5) until the algorithm makes the overall objective function converge, at which point the optimal XGBoost model for target image classification is obtained.
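Steps 4)-6) amount to drawing a fixed proportion of distinct columns with probability proportional to Z. A compact sketch, using NumPy's weighted choice without replacement instead of the claim's rejection loop (removing each drawn item and renormalising realises the same adjusted probabilities P_i; the function name is illustrative):

```python
import numpy as np

def weighted_column_sample(Z, ratio, rng=None):
    """Draw round(ratio * N) distinct attribute indices, each with
    probability proportional to its sampling probability Z_k; sampling
    without replacement implicitly renormalises the remaining Z as P_i."""
    if rng is None:
        rng = np.random.default_rng()
    n_draw = max(1, round(ratio * len(Z)))
    return rng.choice(len(Z), size=n_draw, replace=False, p=np.asarray(Z))
```

Per tree this would be called with ratio p_1 over all attributes, and then, for each tree level, with ratio p_2 over the columns already drawn for that tree.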
CN201710580163.5A 2017-07-17 2017-07-17 Image target classification method based on weighted column sampling XGboost Expired - Fee Related CN107392241B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710580163.5A CN107392241B (en) 2017-07-17 2017-07-17 Image target classification method based on weighted column sampling XGboost


Publications (2)

Publication Number Publication Date
CN107392241A true CN107392241A (en) 2017-11-24
CN107392241B CN107392241B (en) 2020-12-25

Family

ID=60340738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710580163.5A Expired - Fee Related CN107392241B (en) 2017-07-17 2017-07-17 Image target classification method based on weighted column sampling XGboost

Country Status (1)

Country Link
CN (1) CN107392241B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105956597A * 2016-05-04 2016-09-21 浙江大学 Binocular stereo matching method based on a convolutional neural network
US20160307072A1 * 2015-04-17 2016-10-20 Nec Laboratories America, Inc. Fine-grained Image Classification by Exploring Bipartite-Graph Labels
CN106096623A * 2016-05-25 2016-11-09 中山大学 Crime identification and forecasting method
CN106250931A * 2016-08-03 2016-12-21 武汉大学 High-resolution image scene classification method based on random convolutional neural networks
CN106355248A * 2016-08-26 2017-01-25 深圳先进技术研究院 Deep convolutional neural network training method and device
CN106778810A * 2016-11-23 2017-05-31 北京联合大学 Original image layer fusion method and system based on RGB features and depth features
CN106897738A * 2017-01-22 2017-06-27 华南理工大学 Pedestrian detection method based on semi-supervised learning


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHUANG BAI: "Growing random forest on deep convolutional neural networks for scene categorization", Expert Systems with Applications *
叶倩怡: "Research on sales forecasting for brick-and-mortar retail based on the Xgboost method" (in Chinese), China Masters' Theses Full-text Database, Economics and Management Sciences *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992888A * 2017-11-29 2018-05-04 深圳市智物联网络有限公司 Industrial equipment operation recognition method and server
CN108510001A * 2018-04-04 2018-09-07 北京交通大学 Wind turbine blade defect classification method and classification system
CN109241418A * 2018-08-22 2019-01-18 中国平安人寿保险股份有限公司 Random-forest-based abnormal user identification method and device, equipment and medium
CN109241418B (en) * 2018-08-22 2024-04-09 中国平安人寿保险股份有限公司 Abnormal user identification method and device based on random forest, equipment and medium
CN110111310B (en) * 2019-04-17 2021-03-05 广州思德医疗科技有限公司 Method and device for evaluating tag picture
CN110111310A * 2019-04-17 2019-08-09 广州思德医疗科技有限公司 Method and device for evaluating tag pictures
CN110190909A * 2019-06-06 2019-08-30 北京邮电大学 Signal equalization method and device for optical communication
CN110580493A (en) * 2019-06-10 2019-12-17 长安大学 Aggregate shape classification method based on machine learning
CN110555989A (en) * 2019-08-16 2019-12-10 华南理工大学 Xgboost algorithm-based traffic prediction method
CN110689093A (en) * 2019-12-10 2020-01-14 北京同方软件有限公司 Image target fine classification method under complex scene
CN110689093B (en) * 2019-12-10 2020-04-21 北京同方软件有限公司 Image target fine classification method under complex scene
CN111310820A (en) * 2020-02-11 2020-06-19 山西大学 Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration
CN111476713A (en) * 2020-03-26 2020-07-31 中南大学 Intelligent weather image identification method and system based on multi-depth convolution neural network fusion
CN111476713B (en) * 2020-03-26 2022-07-22 中南大学 Intelligent weather image identification method and system based on multi-depth convolution neural network fusion
CN112022153A (en) * 2020-09-27 2020-12-04 西安电子科技大学 Electroencephalogram signal detection method based on convolutional neural network
CN112022153B (en) * 2020-09-27 2021-07-06 西安电子科技大学 Electroencephalogram signal detection method based on convolutional neural network

Also Published As

Publication number Publication date
CN107392241B (en) 2020-12-25

Similar Documents

Publication Publication Date Title
CN107392241A (en) A kind of image object sorting technique that sampling XGBoost is arranged based on weighting
CN108830188B (en) Vehicle detection method based on deep learning
CN112308158B Multi-source domain adaptation model and method based on partial feature alignment
US7362892B2 (en) Self-optimizing classifier
CN110210555A Rail fish-scale damage detection method based on deep learning
CN110222634B (en) Human body posture recognition method based on convolutional neural network
CN106960214A Image-based object recognition method
CN107766418A Credit evaluation method based on a fusion model, electronic device and storage medium
CN107563280A Face recognition method and device based on multiple models
CN106709528A Vehicle re-identification method and device based on multi-objective-function deep learning
CN107506793A Clothing recognition method and system based on weakly labelled images
CN106649849A Text information base construction method and device, and search method, device and system
CN107644057A Absolutely imbalanced text classification method based on transfer learning
CN109344856B (en) Offline signature identification method based on multilayer discriminant feature learning
CN106570521A Multilingual scene text recognition method and recognition system
CN109886295A Neural-network-based butterfly recognition method and related device
CN102156871A (en) Image classification method based on category correlated codebook and classifier voting strategy
CN113408605A (en) Hyperspectral image semi-supervised classification method based on small sample learning
CN103886030B Cost-sensitive decision-tree-based data classification method for cyber-physical fusion systems
CN110929746A (en) Electronic file title positioning, extracting and classifying method based on deep neural network
CN107977670A Accident classification and grading method, device and system using decision trees and the Bayesian algorithm
CN105930792A Human action classification method based on a local video feature dictionary
WO2021128704A1 (en) Open set classification method based on classification utility
CN110414587A Deep convolutional neural network training method and system based on progressive learning
CN110532950A Video feature extraction method and micro-expression recognition method based on micro-expression videos

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201225

Termination date: 20210717
