CN109376331A - City bus emission rate estimation method based on gradient boosting regression trees - Google Patents

Publication number
CN109376331A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810958885.4A
Other languages
Chinese (zh)
Inventor
陈淑燕
潘应久
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Application filed by Southeast University
Priority to CN201810958885.4A
Publication of CN109376331A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/15 - Correlation function computation including computation of convolution operations


Abstract

The invention discloses a method for estimating city bus emission rates based on gradient boosting regression trees. First, measured bus emission data are regularized by Lagrange interpolation to obtain second-by-second emission data. Second, vehicle specific power (VSP) is used to characterize the current operating condition of the bus, the influence of the previous driving state on emissions is also taken into account, and a quantitative model of the emission rate is established. Finally, a gradient boosting regression tree is trained on the data and its parameters are tuned, giving the bus emission rate estimation model. The invention considers the joint effect of the current operating condition and the previous driving state on the current emission rate. By using the nonparametric gradient boosting regression tree model, it overcomes the difficulty that existing emission rate estimation methods have in describing the complex nonlinear relationship between the bus emission rate and its influencing factors, improves the accuracy of bus emission rate estimation, and is of practical significance for controlling traffic emissions and improving the road environment.

Description

City bus emission rate estimation method based on gradient boosting regression trees
Technical field
The invention belongs to the fields of intelligent transportation technology and traffic environment, and in particular relates to a method for estimating city bus emission rates based on gradient boosting regression trees.
Background
Pollution caused by urban traffic has drawn attention worldwide. Buses are heavy-duty vehicles that travel through cities every day, so estimating and assessing their emission characteristics is of practical significance for managing and controlling urban traffic pollution. Several vehicle emission models already exist, such as the MOVES model of the U.S. EPA, the COPERT model developed by the European Commission, and the CMEM model developed by the University of California, Riverside. Most of these models were built from traffic emission data collected abroad and are not fully applicable to China's complex road traffic environment. In addition, some fuel types are not available in some current models; for example, MOVES does not currently support emission estimation for liquefied natural gas (LNG) buses.
For city bus emission rate estimation, vehicles are affected by a complex road traffic environment during actual driving, so the relationship between the bus emission rate and road traffic parameters is complex and nonlinear, and a simple linear regression method cannot accurately quantify the relationship between the explanatory variables and the emission rate. A single regression tree model, in turn, cannot accurately extract the information in the explanatory variables and easily overfits the emission estimation model. If the explanatory variables are instead partitioned by their characteristics, a regression tree is built, the residual is computed from a loss function using gradient boosting, a further regression tree is fitted to that residual, and the resulting regression trees are finally added together, the information in the explanatory variables can be exploited more thoroughly, improving both the estimation accuracy and the generalization ability of the model.
Summary of the invention
Purpose of the invention: In view of the above problems in the prior art, the invention proposes a method that estimates bus emission rates by iterating over multiple regression trees with gradient boosting. The method fully accounts for how the current operating condition and the previous driving state of the bus affect its current emission characteristics, thereby improving the accuracy of bus emission rate estimation.
Technical solution: To achieve this purpose, the invention adopts the following technical solution: a city bus emission rate estimation method based on gradient boosting regression trees, comprising the following steps:
(1) Regularize the collected bus emission and driving data to obtain second-by-second emission rates and driving state characteristic parameters;
(2) Compute the real-time vehicle specific power from the driving state characteristic parameters, characterize the driving state of the previous second by its speed and acceleration, and determine the training set from the emission data obtained in step (1) as the model input;
(3) Using the input obtained in step (2), determine the loss function L of the model, set the number of regression trees M, initialize the weak learner, and construct each new regression tree using the negative gradient of the loss function, evaluated at the current model, as an approximation of the residual;
(4) In each iteration, update the learner function with the regression tree determined in step (3), until M iterations are complete, i.e. M regression trees have been built, which gives the final strong learner model;
(5) Use the resulting model to estimate emission rates on the test set.
In step (1), the measured data are regularized as follows:
Given n+1 point pairs $(x_0,y_0),(x_1,y_1),\ldots,(x_n,y_n)$, an interpolating function is sought that takes the value $y_i$ at each $x_i$. $l_i(x)$ is the Lagrange basis polynomial, i.e. the interpolation basis function, given by:
$$l_i(x)=\prod_{j=0,\,j\neq i}^{n}\frac{x-x_j}{x_i-x_j}$$
where n+1 is the number of point pairs in the data set; $x_n$ is the time of the (n+1)-th point pair; and $y_n$ is the emission or driving state characteristic value of the (n+1)-th point pair;
Assuming that the $x_i$ are pairwise distinct, the Lagrange polynomial is:
$$L(x)=\sum_{i=0}^{n}y_i\,l_i(x)$$
In step (1), the second-by-second bus emission rates comprise the CO, CO2, HC and NOx emission rates of the bus during actual driving; the real-time driving state characteristic parameters comprise speed, acceleration, road grade and dynamic passenger load.
In step (2), the real-time vehicle specific power is computed from the driving state characteristic parameters, and the driving state of the previous second is characterized by its speed and acceleration, as follows:
(2.1) Using the bus driving parameter data obtained in step (1), compute the vehicle specific power of the bus, i.e. the tractive power per unit mass, $\mathrm{VSP}=F_t v/m$ with $F_t=F_f+F_w+F_i+F_j$:
In the formula, VSP is the vehicle specific power of the bus; $F_t$ is the tractive force (N); v is the driving speed (m/s); m is the total mass of the bus (kg), including the curb mass and the passenger load; $F_f$, $F_w$, $F_i$ and $F_j$ are the rolling resistance, air resistance, grade resistance and acceleration resistance (N), respectively; a is the bus acceleration (m/s^2); g is the gravitational acceleration (9.8 m/s^2); f is the rolling resistance coefficient, a dimensionless parameter; $\varepsilon_i$ is the mass factor; $\alpha$ is the road grade; $\rho_a$ is the air density; $C_D$ is the drag coefficient; A is the frontal area of the bus;
(2.2) From the second-by-second driving state data of the bus, obtain the speed and acceleration of the previous driving state, i.e. the speed and acceleration of the previous second.
In step (2), the training set determined from the emission data obtained in step (1) and used as the model input is defined as follows:
$$D=\{(x_1,y_1),(x_2,y_2),\ldots,(x_i,y_i),\ldots,(x_N,y_N)\},\quad i=1,2,\ldots,N$$
where D is the training set that forms the input layer of the emission estimation model; $(x_i,y_i)$ is the pair of independent and dependent variables of the i-th record in the training set; $x_i$ is the vector of independent variables, i.e. the variables that influence emissions, comprising three variables, $VSP_t$, $v_{t-1}$ and $a_{t-1}$, where $VSP_t$ is the instantaneous vehicle specific power at the current time t, and $v_{t-1}$ and $a_{t-1}$ are the instantaneous speed and acceleration at time t-1; $y_i$ is the bus emission rate, covering the four pollutants CO, CO2, HC and NOx; N is the number of samples in the input data.
In step (3), the number of regression trees M is preset, and the loss function is the negative binomial log-likelihood function:
$$L(y,f(x))=\log\bigl(1+\exp(-2yf(x))\bigr)$$
where y is the dependent variable, i.e. the measured emission rate, and f(x) is the estimated emission rate;
The weak learner is initialized as:
$$f_0(x)=\arg\min_{c}\sum_{i=1}^{N}L(y_i,c)$$
where N is the number of samples in the input data; c is the initial leaf node output parameter; and $L(y_i,c)$ is the loss evaluated on the i-th sample.
In step (3), the negative gradient of the loss function, evaluated at the current model, is used as an approximation of the residual that the newly constructed regression tree is fitted to. For each sample $(x_i,y_i)$, the residual is computed by gradient descent:
$$r_{m,i}=-\left[\frac{\partial L(y_i,f(x_i))}{\partial f(x_i)}\right]_{f(x)=f_{m-1}(x)}$$
where $r_{m,i}$ is the residual of the i-th sample for the m-th regression tree; $f_{m-1}(x_i)$ is the learner obtained after training the (m-1)-th regression tree, i.e. the emission rate estimate given by the first m-1 trees when the independent variable is $x_i$; when computing the residual of the i-th sample for the m-th regression tree, $f_{m-1}(x_i)$ is substituted for $f(x_i)$;
Since the loss function is the negative binomial log-likelihood function, the residual can be further expressed as:
$$r_{m,i}=\frac{2y_i}{1+\exp\bigl(2y_i f_{m-1}(x_i)\bigr)}$$
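The closed-form residual follows from differentiating the stated loss with respect to f(x). As a worked step, added here for clarity and not spelled out in the source:

$$\begin{aligned}
\frac{\partial L(y,f)}{\partial f} &= \frac{-2y\,\exp(-2yf)}{1+\exp(-2yf)} = -\frac{2y}{1+\exp(2yf)},\\
r_{m,i} &= -\left.\frac{\partial L(y_i,f)}{\partial f}\right|_{f=f_{m-1}(x_i)} = \frac{2y_i}{1+\exp\bigl(2y_i f_{m-1}(x_i)\bigr)}.
\end{aligned}$$

The second equality in the first line multiplies numerator and denominator by $\exp(2yf)$.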
Wherein, the method for one regression tree of fitting is in step (3): utilizing i-th of sample in calculated the m regression tree This residual error rm,i, gathered { (xi,rm,i)}I=1,2 ..., N, to train the m regression tree Tm, stroke of leaf node Subregion is denoted as Rm,j, j=1,2 ..., J.
Wherein, in step (3), the method for solving leaf node output valve is, for regression tree TmEach leaf node:
Wherein, cm,jIndicate the leaf node output valve in j-th of feature unit of the m regression tree, the i.e. estimation of emission index Value.
In step (4), the learner is updated as follows: after all leaf node output values of the regression tree $T_m$ have been obtained, the learner is updated:
$$f_m(x)=f_{m-1}(x)+\sum_{j=1}^{J}c_{m,j}\,I(x\in R_{m,j})$$
where the regression tree partitions the feature space into J units $\{R_1,R_2,\ldots,R_J\}$; the feature space refers to the way the leaf nodes of each regression tree are generated, i.e. the leaf nodes are split according to the number and value ranges of the independent variables; $R_{m,j}$ is the j-th feature unit of the m-th regression tree, and each feature unit represents one partition class; $I(x\in R_{m,j})$ is the indicator function; x is the independent variable, i.e. the emission-influencing variables $VSP_t$, $v_{t-1}$ and $a_{t-1}$; when the regression tree $T_m$ determines that $x\in R_{m,j}$, i.e. the independent variable falls in unit $R_{m,j}$, I takes the value 1, and otherwise 0; $c_{m,j}$ is the output value of the leaf node that the m-th regression tree obtains in the j-th feature unit, i.e. the emission rate in that feature unit;
The gradient boosting method introduces a shrinkage parameter $\nu$, so the learner update becomes:
$$f_m(x)=f_{m-1}(x)+\nu\sum_{j=1}^{J}c_{m,j}\,I(x\in R_{m,j})$$
where the shrinkage parameter $\nu$ is called the learning rate;
The procedure for training the m-th regression tree is iterated; after M iterations, the M regression trees are summed to give the final gradient boosting regression tree model:
$$\hat{f}(x)=f_M(x)=f_0(x)+\nu\sum_{m=1}^{M}\sum_{j=1}^{J}c_{m,j}\,I(x\in R_{m,j})$$
where $\hat{f}(x)$ denotes the gradient boosting regression tree.
Beneficial effects: Compared with the prior art, the technical solution of the invention has the following advantages:
(1) It considers not only the influence of the current operating condition on emission characteristics but also that of the previous driving state, which improves the accuracy of emission estimation;
(2) It uses vehicle specific power to characterize the current operating condition of the bus as one of the model inputs; this captures the operating characteristics of the bus, such as speed and acceleration, the road characteristics, such as road grade, and the influence of the dynamically changing passenger load on emissions;
(3) Compared with a single classification and regression tree, gradient boosting regression trees learn the implicit features of the data better, overcoming the difficulty that existing emission rate estimation methods have in describing the complex nonlinear relationship between the bus emission rate and its influencing factors;
(4) Multiple regression trees are built iteratively and decide jointly; all regression trees are added together to give the final emission estimation model, which exploits the information in the explanatory variables more effectively and gives the whole model higher estimation accuracy.
Brief description of the drawings
Fig. 1 is the overall flow chart of the invention;
Fig. 2 shows the emission rate estimation results obtained with the invention using measured data from liquefied natural gas buses.
Detailed description of the embodiments
The technical solution of the invention is further described below with reference to the drawings and an embodiment.
As shown in Fig. 1, the invention proposes a city bus emission rate estimation method based on gradient boosting regression trees, comprising the following steps:
(1) Measure bus emission and driving state data on urban roads with PEMS, and regularize the data by Lagrange interpolation.
The test and training data sets for the gradient boosting regression tree bus emission estimation algorithm of the invention all come from measured data for city bus routes 1, 51 and 206. PEMS equipment was used to collect the real-time emission rates of the buses in operation, covering the four pollutants CO, CO2, HC and NOx, while a handheld GPS device recorded the vehicle trajectory. Vehicle driving state data, including speed and acceleration, are derived from the trajectory data, as are the road grade data that describe the road section. In addition, investigators recorded the numbers of passengers boarding and alighting at each bus stop to obtain dynamic passenger load data. The sampling interval of the PEMS and GPS devices is 1-2 seconds; to obtain second-by-second emission and driving state data, the data set is regularized by Lagrange interpolation. Taking the speed data as an example, the speed data for the first 10 seconds are shown in Table 1 below.
Table 1. Speed data for the first 10 seconds
Time (s)        1       2       3       4       5
Speed (km/h)    11.0    13.5    null    16.5    null
Time (s)        6       7       8       9       10
Speed (km/h)    19.5    18.0    null    25.0    26.0
Each missing value is modeled using the two nearest non-missing data points on each side of it, as follows:
Let x denote the time (s) and y = f(x) the speed (km/h) at time x. In Table 1, f(3), f(5) and f(8) need to be interpolated. Taking f(3) as an example, the two points on each side of it are f(1), f(2), f(4) and f(5); since f(5) is itself missing, only f(1), f(2) and f(4) are used to interpolate f(3). The interpolation basis functions $l_i(x)$ are:
$$l_0(x)=\frac{(x-2)(x-4)}{(1-2)(1-4)},\quad l_1(x)=\frac{(x-1)(x-4)}{(2-1)(2-4)},\quad l_2(x)=\frac{(x-1)(x-2)}{(4-1)(4-2)}$$
From these basis functions, the Lagrange polynomial is:
$$L(x)=11.0\,l_0(x)+13.5\,l_1(x)+16.5\,l_2(x)$$
From this polynomial, the value of f(3) is:
f(3) = L(3) ≈ 15.33
f(5) and f(8) can be interpolated in the same way.
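The worked interpolation of f(3) can be reproduced with a few lines of Python. This is a minimal sketch, not the inventors' implementation; the function name `lagrange_interpolate` is chosen here for illustration, and the three known points are the Table 1 values at 1 s, 2 s and 4 s.

```python
def lagrange_interpolate(points, x):
    """Evaluate the Lagrange polynomial through `points` at `x`.

    points: list of (x_i, y_i) pairs with pairwise distinct x_i.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        basis = 1.0  # l_i(x), built as a product over all j != i
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


# Non-missing speeds (km/h) around the gap at t = 3 s, taken from Table 1.
known = [(1, 11.0), (2, 13.5), (4, 16.5)]
f3 = lagrange_interpolate(known, 3)
print(round(f3, 2))  # 15.33
```

Evaluating the same function at one of the known times returns the measured value, which is a quick check that the basis polynomials are built correctly.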
Lagrange interpolation as described above is applied to the emission data, comprising the CO, CO2, NOx and HC emission rates, and to the driving state data, comprising vehicle specific power (VSP), instantaneous speed, acceleration, the road characteristic data (road grade) and the dynamic passenger load data, to obtain second-by-second data. Using the second-by-second recording time as the reference, the second-by-second emission data, driving state data, road characteristic data and dynamic passenger load data are merged into the final data set. The data set contains 12 attributes in total: acquisition time, dynamic passenger load, instantaneous speed, acceleration, road grade, travel distance, vehicle specific power, longitude and latitude, CO emission rate, CO2 emission rate, NOx emission rate and HC emission rate.
(2) Extract the characteristic factors that influence emission characteristics and determine the input layer of the emission rate estimation model.
At any given moment, the pollutant emission rate of a bus is influenced by many factors; road grade, vehicle driving state, passenger load and so on all affect the emission rate to some degree. To estimate the emission rate accurately, the widely used vehicle specific power (VSP) is adopted to quantify the operating condition of the bus. It is the power per unit mass, $\mathrm{VSP}=Power/Mass=F_t v/m$ with $F_t=F_f+F_w+F_i+F_j$:
In the formula, Power is the total vehicle power (kW); Mass is the gross mass of the bus (kg), the sum of the curb mass and the passenger load; $F_t$ is the tractive force (N); v is the driving speed (m/s); m is the total mass of the bus (kg), including the curb mass and the passenger load; $F_f$, $F_w$, $F_i$ and $F_j$ are the rolling resistance, air resistance, grade resistance and acceleration resistance (N), respectively; a is the bus acceleration (m/s^2); g is the gravitational acceleration (9.8 m/s^2); f is the rolling resistance coefficient, a dimensionless parameter; $\varepsilon_i$ is the mass factor; $\alpha$ is the road grade; $\rho_a$ is the air density, taken as 1.207 kg/m^3 at 20 °C; $C_D$ is the drag coefficient; A is the frontal area of the bus (m^2).
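As an illustration of how these quantities combine, the sketch below computes VSP as tractive power per unit mass. The individual resistance expressions are the textbook road-load forms and are an assumption here, since the exact formula is not reproduced in this text; the default values of `f`, `eps`, `c_d` and `area` are hypothetical placeholders, and `alpha` is assumed to be the grade angle in radians.

```python
import math


def vehicle_specific_power(v, a, alpha, mass,
                           f=0.01, eps=0.1, rho_a=1.207,
                           c_d=0.6, area=7.0, g=9.8):
    """Return VSP in W/kg (numerically equal to kW/t).

    v: speed (m/s); a: acceleration (m/s^2); alpha: grade angle (rad, assumed);
    mass: curb mass plus passenger load (kg).
    f, eps, c_d, area: hypothetical defaults, not values from the source.
    """
    f_roll = mass * g * f * math.cos(alpha)      # rolling resistance F_f
    f_air = 0.5 * rho_a * c_d * area * v ** 2    # air resistance F_w
    f_grade = mass * g * math.sin(alpha)         # grade resistance F_i
    f_acc = mass * a * (1.0 + eps)               # acceleration resistance F_j
    f_t = f_roll + f_air + f_grade + f_acc       # tractive force F_t
    return f_t * v / mass


# Level road, constant 10 m/s, 15 t bus (illustrative numbers only).
vsp_cruise = vehicle_specific_power(v=10.0, a=0.0, alpha=0.0, mass=15000.0)
vsp_accel = vehicle_specific_power(v=10.0, a=0.5, alpha=0.0, mass=15000.0)
```

Because the passenger load enters through `mass`, recomputing VSP each second with the dynamic load is how the changing number of passengers would reach the model input under this sketch.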
Besides the current operating condition, the previous driving state also affects, to some degree, the emission characteristics of the bus at the current moment. The invention therefore characterizes the previous driving state of the bus by two parameters, the instantaneous speed and the acceleration of the previous second, and uses them together with the vehicle specific power at the current moment as the main input parameters of the model.
The model input layer is expressed as:
$$D=\{(x_1,y_1),(x_2,y_2),\ldots,(x_i,y_i),\ldots,(x_N,y_N)\},\quad i=1,2,\ldots,N$$
where D is the training set that forms the input layer of the emission estimation model; $(x_i,y_i)$ is the pair of independent and dependent variables of the i-th record in the training set; $x_i$ is the vector of independent variables, i.e. the variables that influence emissions, which according to the analysis above comprises three variables, $VSP_t$, $v_{t-1}$ and $a_{t-1}$, where $VSP_t$ is the instantaneous vehicle specific power at the current time t, and $v_{t-1}$ and $a_{t-1}$ are the instantaneous speed and acceleration at time t-1; $y_i$ is the bus emission rate, covering the four pollutants CO, CO2, HC and NOx; N is the number of samples in the input data.
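A minimal sketch of how such pairs could be assembled from second-by-second records is shown below. The record layout (dictionary keys `vsp`, `v`, `a`, `y`) and the function name are hypothetical, and consecutive records are assumed to be exactly one second apart.

```python
def build_training_pairs(records):
    """Pair each second's VSP with the previous second's speed and acceleration.

    records: list of dicts with keys 'vsp', 'v', 'a', 'y', ordered by time,
    one record per second (assumed layout, not from the source).
    Returns a list of ((VSP_t, v_{t-1}, a_{t-1}), y_t) tuples.
    """
    pairs = []
    for prev, cur in zip(records, records[1:]):
        x = (cur["vsp"], prev["v"], prev["a"])
        pairs.append((x, cur["y"]))
    return pairs


# Three illustrative records; the numbers are made up for the example.
records = [
    {"vsp": 1.2, "v": 3.0, "a": 0.4, "y": 0.010},
    {"vsp": 2.5, "v": 3.4, "a": 0.6, "y": 0.014},
    {"vsp": 3.1, "v": 4.0, "a": 0.2, "y": 0.016},
]
pairs = build_training_pairs(records)
```

The first record has no predecessor, so N records yield N-1 training pairs.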
(3) Determine the type of loss function and the method for fitting the residual, and build the gradient boosting regression tree model.
To improve the accuracy of emission rate estimation and the generalization of the model, the gradient boosting regression tree model obtains the final emission estimate by iterating over multiple regression trees; the total number of regression trees is denoted M.
The invention uses the negative binomial log-likelihood function as the loss function:
$$L(y,f(x))=\log\bigl(1+\exp(-2yf(x))\bigr)$$
where y is the dependent variable, i.e. the measured emission rate, and f(x) is the estimated emission rate. Using gradient descent, the negative gradient of the loss function, evaluated at the current model, serves as an approximation of the residual that the newly constructed regression tree is fitted to, and the loss function is optimized in this way.
The construction of the model is illustrated below with the training of the m-th regression tree.
The weak learner is initialized as:
$$f_0(x)=\arg\min_{c}\sum_{i=1}^{N}L(y_i,c)$$
where N is the number of samples in the input data; c is the initial leaf node output parameter, a variable that can be user-defined as needed; and $L(y_i,c)$ is the loss evaluated on the i-th sample.
For each sample $(x_i,y_i)$, the residual is determined by gradient descent:
$$r_{m,i}=-\left[\frac{\partial L(y_i,f(x_i))}{\partial f(x_i)}\right]_{f(x)=f_{m-1}(x)}$$
where $r_{m,i}$ is the residual of the i-th sample for the m-th regression tree; $f_{m-1}(x_i)$ is the learner obtained after training the (m-1)-th regression tree, i.e. the emission rate estimate given by the first m-1 trees when the independent variable is $x_i$; when computing the residual of the i-th sample for the m-th regression tree, $f_{m-1}(x_i)$ is substituted for $f(x_i)$. Computing the residual by gradient descent moves the model along the negative gradient direction at each iteration, so that the loss becomes smaller and smaller and the model increasingly accurate.
Since the loss function is the negative binomial log-likelihood function, the residual can be further expressed as:
$$r_{m,i}=\frac{2y_i}{1+\exp\bigl(2y_i f_{m-1}(x_i)\bigr)}$$
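The residual for this loss can be checked numerically. The sketch below evaluates the closed form r = 2y / (1 + exp(2y f)), obtained by differentiating the stated loss, and compares it with a central finite difference of the loss; the function names and the sample values are illustrative only.

```python
import math


def loss(y, f):
    """Negative binomial log-likelihood loss L(y, f) = log(1 + exp(-2 y f))."""
    return math.log(1.0 + math.exp(-2.0 * y * f))


def residual(y, f_prev):
    """Closed-form negative gradient of the loss at the current estimate f_prev."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f_prev))


def numeric_negative_gradient(y, f_prev, h=1e-6):
    """Central-difference approximation of -dL/df, used only as a check."""
    return -(loss(y, f_prev + h) - loss(y, f_prev - h)) / (2.0 * h)


# Illustrative (y, f) values; they are not measurements from the source.
checks = [(0.8, 0.3), (1.0, 0.0), (0.2, -0.5)]
max_gap = max(abs(residual(y, f) - numeric_negative_gradient(y, f))
              for y, f in checks)
```

Agreement between the two values confirms that the pseudo-residual passed to each new tree is the negative gradient of the chosen loss and not a plain difference y - f.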
After the residuals are determined, the m-th regression tree $T_m$ is trained on $\{(x_i,r_{m,i})\}$, i = 1, 2, ..., N, as follows:
For each leaf node of the regression tree $T_m$, the output value is computed as:
$$c_{m,j}=\arg\min_{c}\sum_{x_i\in R_{m,j}}L\bigl(y_i,f_{m-1}(x_i)+c\bigr)$$
After all leaf node output values of the regression tree $T_m$ have been obtained, the learner is updated:
$$f_m(x)=f_{m-1}(x)+\sum_{j=1}^{J}c_{m,j}\,I(x\in R_{m,j})$$
where the regression tree partitions the feature space into J units $\{R_1,R_2,\ldots,R_J\}$. The feature space refers to the way the leaf nodes of each regression tree are generated; specifically, the leaf nodes are split according to the number and value ranges of the independent variables. In the invention, when the independent variables $VSP_t$, $v_{t-1}$ and $a_{t-1}$ take values within certain ranges, for example $2\le VSP_t\le 5$, $10\le v_{t-1}\le 20$ and $-2.5\le a_{t-1}\le 0$, they can be assigned to one leaf node, and these ranges then constitute one partition unit; the ranges can be set according to actual needs. $R_{m,j}$ is the j-th unit of the m-th regression tree, and each unit represents one partition class. $I(x\in R_{m,j})$ is the indicator function, and x is the independent variable, i.e. the emission-influencing variables $VSP_t$, $v_{t-1}$ and $a_{t-1}$. When the regression tree $T_m$ determines that $x\in R_{m,j}$, i.e. the independent variable falls in unit $R_{m,j}$, I takes the value 1, and otherwise 0. $c_{m,j}$ is the output value of the leaf node that the m-th regression tree obtains in the j-th feature unit, i.e. the emission rate in that feature unit.
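The role of the indicator function can be sketched as a lookup over rectangular units. The first region below uses the example ranges from the text; the second region and both leaf values are hypothetical and only serve to make the example runnable.

```python
def in_region(x, region):
    """Indicator I(x in R): True when every variable lies within its range."""
    return all(lo <= xi <= hi for xi, (lo, hi) in zip(x, region))


def tree_output(x, regions, leaf_values):
    """Sum of c_j * I(x in R_j); at most one term is non-zero for disjoint units."""
    return sum((c for r, c in zip(regions, leaf_values) if in_region(x, r)), 0.0)


# Each region lists (low, high) for VSP_t, v_{t-1}, a_{t-1}.
regions = [
    [(2.0, 5.0), (10.0, 20.0), (-2.5, 0.0)],   # example unit from the text
    [(5.5, 9.0), (10.0, 20.0), (0.5, 2.0)],    # hypothetical second unit
]
leaf_values = [0.12, 0.30]                      # hypothetical leaf outputs c_j

inside = tree_output((3.0, 15.0, -1.0), regions, leaf_values)
outside = tree_output((1.0, 5.0, 1.0), regions, leaf_values)
```

A fitted tree covers the whole feature space with its units, so every input falls in exactly one of them; the two regions here cover only part of the space to keep the example short.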
To improve the generalization ability of the model and avoid overfitting, the gradient boosting algorithm introduces a shrinkage parameter $\nu$, so the learner update becomes:
$$f_m(x)=f_{m-1}(x)+\nu\sum_{j=1}^{J}c_{m,j}\,I(x\in R_{m,j})$$
The shrinkage parameter $\nu$ is called the learning rate, and $\nu=1$ means no shrinkage. A smaller learning rate effectively improves the generalization ability of the model and helps avoid overfitting, but it generally requires more regression trees and therefore more computation, so the learning rate should be chosen to balance model performance against computation time.
The procedure for training the m-th regression tree is iterated; after M iterations, the M regression trees are summed to give the final gradient boosting regression tree model:
$$\hat{f}(x)=f_M(x)=f_0(x)+\nu\sum_{m=1}^{M}\sum_{j=1}^{J}c_{m,j}\,I(x\in R_{m,j})$$
where $\hat{f}(x)$ denotes the gradient boosting regression tree.
The gradient boosting regression tree performs M iterations on the training data, and each iteration produces one regression tree model. By gradient descent, each iteration moves along the negative gradient direction of the loss function, so that the loss becomes smaller and smaller and the model increasingly accurate. In each iteration, the leaf node output values are computed for all feature units of the feature space, and accumulating them over the iterations gives the set of all leaf node output values in the feature spaces of the M regression trees. When emission rates are estimated on the test set, the independent variables of the test set are the model input; the trained regression tree model assigns each input to one of the partitioned feature units and then predicts the emission rate of the test sample from the mean predicted value of all training samples in that feature unit.
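The iteration described above (initialize, compute residuals, fit a tree to the residuals, update with shrinkage) can be sketched in plain Python. To keep the example short and checkable, two simplifications are swapped in and should not be read as the patented method: the loss is squared error instead of the negative binomial log-likelihood, so the residual is simply y - f, and each tree is a depth-1 stump on a single input variable. All names and the toy data are hypothetical.

```python
def fit_stump(xs, rs):
    """Fit a two-leaf regression tree to residuals rs by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:       # keep both leaves non-empty
        left = [r for x, r in zip(xs, rs) if x <= threshold]
        right = [r for x, r in zip(xs, rs) if x > threshold]
        c_left = sum(left) / len(left)           # leaf output = mean residual
        c_right = sum(right) / len(right)
        sse = (sum((r - c_left) ** 2 for r in left)
               + sum((r - c_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, c_left, c_right)
    _, threshold, c_left, c_right = best
    return lambda x: c_left if x <= threshold else c_right


def fit_gbrt(xs, ys, n_trees=50, learning_rate=0.1):
    """Gradient boosting with stumps under squared-error loss (simplified)."""
    f0 = sum(ys) / len(ys)                       # argmin_c of sum (y_i - c)^2
    trees = []
    preds = [f0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]   # negative gradient
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + learning_rate * sum(t(x) for t in trees)


# Toy one-dimensional data standing in for (VSP, emission rate) pairs.
xs = [float(i) for i in range(20)]
ys = [0.05 * x * x for x in xs]
model = fit_gbrt(xs, ys)
```

With the patent's loss, only the residual line and the leaf-value computation would change; the loop structure and the shrinkage update stay the same.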
(4) Tune the parameters of the gradient boosting regression tree model.
The parameters of the gradient boosting regression tree fall into two classes: the first class tunes the gradient boosting method, and the second class controls the structure of the regression trees.
There are two important gradient boosting parameters, the learning rate and the number of regression trees, denoted learning_rate and n_estimators in Python; they set the shrinkage parameter $\nu$ and the total number of regression trees M in step (3). The default value of learning_rate is 0.1.
There are four important regression tree structure parameters: (1) the maximum depth of a tree; (2) the minimum number of samples a node needs in order to be split further; (3) the minimum number of samples required to form a leaf node; (4) the number of features of the feature space, i.e. the value of J in step (3). In Python, max_depth denotes the depth of a tree, min_samples_split the minimum number of samples a node needs in order to be split further, min_samples_leaf the minimum number of samples required to form a leaf node, and max_features the number of features of the feature space.
First, tune the gradient boosting parameters. To do so, initial values of the regression tree structure parameters are set first. Based on the total number of samples and the number of variables in the training set D, the tree depth max_depth is generally chosen between 5 and 8; to keep the initial model from overfitting, the smaller value 5 is chosen here. The minimum number of samples a node needs in order to be split further, min_samples_split, generally lies between 0.5% and 1% of the total number of samples in the training set D; this experiment has about 30,000 samples, so the initial value of min_samples_split can be set to 150. To avoid overfitting, the minimum number of samples required to form a leaf node, min_samples_leaf, is initially set to 20. The number of features of the feature space, max_features, is initially set to 5. With the initial values set and learning_rate at its default of 0.1, the number of regression trees n_estimators is tuned. Using grid search, n_estimators is increased from 20 to 80 in steps of 10, and the optimal n_estimators is determined from the mean cross-validation score (cross_val_score). If the optimal value is too large or too small, learning_rate is readjusted and the grid search for the optimal n_estimators is repeated.
Second, determine the regression tree structure. When tuning the regression tree structure parameters, the parameters with the greatest influence on the result are tuned first. The tree depth max_depth and the minimum number of samples a node needs in order to be split further, min_samples_split, directly determine the structure of the tree, so max_depth and min_samples_split are tuned first. max_depth is varied from 5 to 10 and min_samples_split from 150 to 300 in steps of 10, and the optimal values are chosen by grid search. Once the optimal max_depth and min_samples_split are determined, min_samples_leaf is tuned, varying from 10 to 100 in steps of 10, and the optimal value is chosen. Finally, the maximum number of features of the feature space, max_features, is tuned over the range 2 to 10, and the optimal value is chosen. When parameter tuning is complete, the final emission estimation model is obtained.
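The staged search described above can be written down as a sequence of parameter grids. The parameter names match those of scikit-learn's GradientBoostingRegressor, and each dictionary has the shape that scikit-learn's GridSearchCV accepts as `param_grid`; the sketch itself stays in plain Python and only enumerates the candidate values. A step of 1 for max_depth and max_features is an assumption, since the text gives only their ranges.

```python
# Initial values used before any tuning (taken from the text).
initial_params = {
    "learning_rate": 0.1,
    "max_depth": 5,
    "min_samples_split": 150,
    "min_samples_leaf": 20,
    "max_features": 5,
}

# One grid per tuning stage, searched in this order.
stage_grids = [
    {"n_estimators": list(range(20, 81, 10))},              # 20, 30, ..., 80
    {"max_depth": list(range(5, 11)),                       # step 1 assumed
     "min_samples_split": list(range(150, 301, 10))},
    {"min_samples_leaf": list(range(10, 101, 10))},
    {"max_features": list(range(2, 11))},                   # step 1 assumed
]


def grid_size(grid):
    """Number of parameter combinations a grid search would evaluate."""
    size = 1
    for values in grid.values():
        size *= len(values)
    return size


sizes = [grid_size(g) for g in stage_grids]
print(sizes)  # [7, 96, 10, 9]
```

Searching stage by stage evaluates 122 combinations in total, against 60,480 for a single joint grid over all five parameters, which is the practical reason for tuning the most influential parameters first and fixing them before moving on.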
A kind of city bus emission index estimation method promoting regression tree based on gradient proposed by the present invention, herein with reality Industry's city bus emissions data of survey carries out example.
(1) data explanation
Using PEMS and GPS device on April 10th, 2016 to during April 20 to 1 tunnel of industry, 51 tunnels and No. 206 buses carry out emissions data acquisition, and the attribute for including is as shown in the table:
Table 2. Attributes contained in the measured emission data
1 - Time; 2 - Passenger count; 3 - Instantaneous speed (m/s); 4 - Acceleration (m/s²)
5 - Elevation (m); 6 - Longitude and latitude; 7 - Time interval (s); 8 - Travel distance (m)
9 - CO emission rate (g/s); 10 - CO2 emission rate (g/s); 11 - HC emission rate (g/s); 12 - NOx emission rate (g/s)
From the recorded passenger counts, the dynamically changing passenger load at each moment can be estimated; from the elevation data measured by GPS, the road grade can be estimated; and from the time-interval data, the sampling intervals of the PEMS and GPS devices can be obtained. The data are then standardized by Lagrange interpolation, and the emission data measured by PEMS are merged with the driving-state and geographic data measured by GPS, yielding more than 30,000 second-by-second emission records. In addition, the vehicle specific power (VSP) can be calculated from speed, acceleration, road grade, and passenger load to characterize the current operating condition of the bus. Furthermore, because the driving state at time t-1 can affect the emission characteristics at time t to some extent, the speed and acceleration at time t-1, denoted v_{t-1} and a_{t-1} respectively, are added to the record for time t. The resulting emission data attributes are listed in the following table:
Table 3. Attributes contained in the emission data after standardization
1 - Time; 2 - Dynamic passenger load; 3 - Instantaneous speed (m/s); 4 - Acceleration (m/s²)
5 - Road grade; 6 - VSP; 7 - v_{t-1}; 8 - a_{t-1}
9 - CO emission rate (g/s); 10 - CO2 emission rate (g/s); 11 - HC emission rate (g/s); 12 - NOx emission rate (g/s)
13 - Travel distance (m)
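Attributes 7 and 8 of Table 3 are the previous-second speed and acceleration. A minimal sketch of how they could be attached to second-by-second records follows; the function name and the dictionary keys "v", "a", "v_prev", and "a_prev" are hypothetical, and the sketch assumes the records of one trip are already ordered at one-second spacing with no gaps.

```python
def add_previous_second(records):
    """Attach v_{t-1} and a_{t-1} to each record of a 1 Hz trip.

    `records` is a list of dicts with keys "v" (m/s) and "a" (m/s^2),
    ordered by time. The first record has no predecessor and is dropped.
    """
    out = []
    for prev, cur in zip(records, records[1:]):
        row = dict(cur)            # copy so the input is left untouched
        row["v_prev"] = prev["v"]  # speed at time t-1
        row["a_prev"] = prev["a"]  # acceleration at time t-1
        out.append(row)
    return out
```

If the data contain several trips or gaps, the function would need to be applied to each continuous segment separately so that a record is never paired with one from a different trip.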
To assess the performance of the proposed model, the measured emission data are randomly divided into a training set and a test set at a ratio of 7:3. The training set is used for supervised training; the emission rates in the test set are then estimated with the trained model to evaluate its performance.
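A random 7:3 split of this kind can be sketched with the standard library alone. The function name and the seed parameter are illustrative assumptions; the source does not say how the random division was performed.

```python
import random


def train_test_split_7_3(rows, seed=0):
    """Randomly split `rows` into a 70% training set and a 30% test set."""
    rng = random.Random(seed)       # fixed seed makes the split repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 7 // 10   # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]
```

With about 30,000 records this gives roughly 21,000 training rows and 9,000 test rows.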
(2) Model building
Because the emission characteristics of buses differ significantly by fuel type, the data used for model training and testing in the present invention are the measured emission data of liquefied natural gas buses. The emission estimation model for liquefied natural gas buses is built following the steps in the specification and its parameters are tuned; the results are shown in Table 4:
Table 4. Parameter values
learning_rate n_estimators max_depth min_samples_leaf min_samples_split max_features
0.05 50 7 28 175 6
Fig. 2 compares the estimated and measured emission rates of CO, CO2, HC, and NOx in the test set. The p-values between the measured and estimated values are all less than 0.01, indicating that the estimates are significantly correlated with the measurements. In addition, the coefficients of determination R² are all greater than 0.6, indicating that the model estimates all four emissions well.
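For reference, the coefficient of determination can be computed as below. This sketch assumes the common definition R² = 1 - SS_res / SS_tot; the source does not state which definition was used, and the function name is hypothetical.

```python
def r_squared(measured, estimated):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

Under this definition a model that always predicts the mean of the measurements scores 0, and a perfect model scores 1.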
(3) Performance analysis
To further verify the estimation performance of the model, its results are compared with those of the U.S. EPA's MOtor Vehicle Emission Simulator (MOVES), and three common validation metrics are used to evaluate the proposed model: mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). Because MOVES does not cover the study region and does not include a liquefied natural gas bus type, the MOVES emission estimates were made for the U.S. state of Indiana, whose terrain is similar to that of Jiangsu Province, and compressed natural gas buses were selected to approximate liquefied natural gas buses. Table 5 compares the emission rate estimation performance of the two models; the results show that the proposed model performs well in emission rate estimation.
Table 5. Performance comparison of the models
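The three metrics named above have standard definitions, sketched here in plain Python. The function names are illustrative, and MAPE is returned in percent; it is undefined when a measured value is zero, in which case this sketch raises ZeroDivisionError rather than guessing a convention.

```python
import math


def mae(measured, estimated):
    """Mean absolute error."""
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)


def mape(measured, estimated):
    """Mean absolute percentage error, in percent (measured values must be nonzero)."""
    return 100.0 * sum(abs((m - e) / m)
                       for m, e in zip(measured, estimated)) / len(measured)


def rmse(measured, estimated):
    """Root mean square error."""
    return math.sqrt(sum((m - e) ** 2
                         for m, e in zip(measured, estimated)) / len(measured))
```

MAE and RMSE carry the unit of the emission rate (g/s), while MAPE is unit-free, which makes it easier to compare across the four pollutants.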

Claims (10)

1. A city bus emission rate estimation method based on gradient boosting regression trees, characterized by comprising the following steps:
(1) standardizing the collected bus emission and driving data to obtain second-by-second emission rates and driving-state characteristic parameter data;
(2) calculating the real-time vehicle specific power from the bus driving-state characteristic parameters, characterizing the driving state of the previous second by speed and acceleration, and determining a training set from the emission data obtained in step (1) as the model input parameters;
(3) determining the loss function L of the model from the input parameters obtained in step (2), setting the number of regression trees M, initializing the weak learner, and constructing a new regression tree using the value of the negative gradient of the loss function at the current regression tree model as an approximation of the residual;
(4) updating the learner function with the regression tree determined in each iteration according to step (3), until M iterations have been completed, i.e., M regression trees have been built, to obtain the final strong learner model;
(5) estimating the emission rates of the test set with the established model.
2. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 1, characterized in that, in step (1), the collected data are standardized according to the following formulas:
For n+1 point pairs (x_0, y_0), (x_1, y_1), ..., (x_n, y_n), a function is sought that takes the corresponding value y_i at each x_i; l_i(x) is the Lagrange basis polynomial, i.e., the interpolation basis function, whose expression is:
$$l_i(x) = \prod_{j=0,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j}$$
where n+1 is the number of point pairs in the data set; x_n is the time corresponding to the (n+1)-th point pair; and y_n is the value of the emission and driving-state characteristic variables of the (n+1)-th point pair;
Assuming that the x_i are pairwise distinct, the Lagrange polynomial is obtained:
$$L_n(x) = \sum_{i=0}^{n} y_i\, l_i(x)$$
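The Lagrange polynomial of claim 2 can be evaluated directly from its definition. The following is a minimal sketch (the function name is hypothetical); it shows the formula only and says nothing about how many neighboring samples the inventors used per interpolated point.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through (xs[i], ys[i]) at x."""
    if len(set(xs)) != len(xs):
        raise ValueError("the x values must be pairwise distinct")
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0                          # l_i(x), built factor by factor
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis                  # y_i * l_i(x)
    return total
```

For example, with samples at times 0, 1, and 2 s the function returns an estimate at 1.5 s, which is how readings from two devices with different sampling instants could be brought onto a common one-second grid. High-degree Lagrange polynomials can oscillate, so using only a few neighboring points per estimate is a common precaution.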
3. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 1, characterized in that, in step (1), the second-by-second emission rates of the bus comprise the second-by-second emission rates of CO, CO2, HC, and NOx emitted by the bus during actual driving; and the real-time driving-state characteristic parameters comprise speed, acceleration, road grade, and dynamic passenger load.
4. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 1, characterized in that, in step (2), the method of calculating the real-time vehicle specific power from the bus driving-state characteristic parameters and characterizing the driving state of the previous second by speed and acceleration is as follows:
(2.1) using the bus driving parameter data obtained in step (1), the vehicle specific power of the bus is calculated as follows:
In the formula, VSP is the vehicle specific power of the bus; F_t is the tractive force (N); v is the travel speed (m/s); m is the gross mass of the bus, including the net mass of the vehicle body and the load mass (kg); F_f, F_w, F_i, and F_j are the rolling resistance, air resistance, grade resistance, and acceleration resistance, respectively (N); a is the bus acceleration (m/s²); g is the gravitational acceleration (9.8 m/s²); f is the rolling resistance coefficient, a dimensionless parameter; ε_i is the mass factor; α is the road grade; ρ_a is the air density; C_D is the drag coefficient; and A is the frontal area of the bus;
(2.2) the speed and acceleration of the previous driving state, i.e., the speed and acceleration of the previous second, are obtained from the second-by-second driving-state parameter data of the bus.
5. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 4, characterized in that, in step (2), the training set is determined from the emission data obtained in step (1) as the model input parameters, the training set being determined as follows:
$$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_i, y_i), \ldots, (x_N, y_N)\}, \quad i = 1, 2, \ldots, N$$
where D is the training set serving as the input layer of the emission estimation model; (x_i, y_i) is the pair of independent and dependent variables of the i-th record in the training set; x_i is the independent variable data, i.e., the factors influencing emissions, comprising three variables: VSP_t, v_{t-1}, and a_{t-1}, where VSP_t is the instantaneous vehicle specific power at time t, i.e., the current time, and v_{t-1} and a_{t-1} are the instantaneous speed and acceleration at time t-1, respectively; y_i is the bus emission rate, comprising the emission rates of the four emissions CO, CO2, HC, and NOx; and N is the number of samples in the input data.
6. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 1, characterized in that, in step (3), the number of regression trees M is preset and the loss function is the negative binomial log-likelihood function, whose expression is:
$$L(y, f(x)) = \log\left(1 + \exp(-2 y f(x))\right)$$
where y is the dependent variable value, i.e., the measured emission rate, and f(x) is the estimated emission rate;
The weak learner is initialized in the form:
$$f_0(x) = \arg\min_{c} \sum_{i=1}^{N} L(y_i, c)$$
where N is the number of samples in the input data; c is the initial leaf node output parameter; and L(y_i, c) is the loss evaluated for the i-th sample.
7. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 6, characterized in that, in step (3), the value of the negative gradient of the loss function at the current model is used as the approximation of the residual to be fitted by the newly constructed regression tree; for each sample (x_i, y_i), the residual is calculated by the gradient descent method:
$$r_{m,i} = -\left[\frac{\partial L(y_i, f(x_i))}{\partial f(x_i)}\right]_{f(x) = f_{m-1}(x)}$$
where r_{m,i} is the residual of the i-th sample for the m-th regression tree; f_{m-1}(x_i) is the learner obtained after training the (m-1)-th regression tree, i.e., the emission rate estimate obtained with the (m-1)-th regression tree when the independent variable is x_i; when the residual of the i-th sample for the m-th regression tree is calculated, f_{m-1}(x_i) is used in place of f(x_i);
Since the loss function is the negative binomial log-likelihood function, the residual can be further expressed as:
$$r_{m,i} = \frac{2 y_i}{1 + \exp\left(2 y_i f_{m-1}(x_i)\right)}$$
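The closed-form residual follows from differentiating the loss of claim 6 with respect to f: dL/df = -2y / (1 + exp(2yf)). A short sketch (function names are illustrative) makes it possible to check the expression numerically against a finite difference.

```python
import math


def loss(y, f):
    """Negative binomial log-likelihood loss L(y, f) = log(1 + exp(-2 y f))."""
    return math.log1p(math.exp(-2.0 * y * f))


def negative_gradient(y, f):
    """Pseudo-residual r = -dL/df = 2 y / (1 + exp(2 y f))."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f))


def numeric_negative_gradient(y, f, h=1e-6):
    """Central finite-difference approximation of -dL/df, for checking."""
    return -(loss(y, f + h) - loss(y, f - h)) / (2.0 * h)
```

Evaluating negative_gradient and numeric_negative_gradient at the same (y, f) should give values that agree to several decimal places, which confirms the sign and the factor of 2 in the residual.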
8. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 7, characterized in that the method of fitting a regression tree in step (3) is: using the calculated residuals r_{m,i} of the i-th sample for the m-th regression tree, the set {(x_i, r_{m,i})}, i = 1, 2, ..., N, is obtained and used to train the m-th regression tree T_m, the regions into which its leaf nodes partition the feature space being denoted R_{m,j}, j = 1, 2, ..., J.
9. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 8, characterized in that, in step (3), the leaf node output values are solved as follows: for each leaf node of regression tree T_m:
$$c_{m,j} = \arg\min_{c} \sum_{x_i \in R_{m,j}} L\left(y_i, f_{m-1}(x_i) + c\right)$$
where c_{m,j} is the leaf node output value in the j-th feature unit of the m-th regression tree, i.e., the estimated emission rate.
10. The city bus emission rate estimation method based on gradient boosting regression trees according to claim 9, characterized in that, in step (4), the learner is updated as follows: after all leaf node output values of regression tree T_m have been obtained, the learner is updated:
$$f_m(x) = f_{m-1}(x) + \sum_{j=1}^{J} c_{m,j}\, I(x \in R_{m,j})$$
where the regression tree partitions the feature space into J units {R_1, R_2, ..., R_J}; the feature space refers to the way the leaf nodes of each regression tree are generated, i.e., the method of partitioning the leaf nodes is determined by the number and value ranges of the independent variables; R_{m,j} is the j-th feature unit of the m-th regression tree, each feature unit representing one partition class; I(x ∈ R_{m,j}) is the indicator function; x is the independent variable, i.e., the emission-influencing factors VSP_t, v_{t-1}, and a_{t-1}; when regression tree T_m determines that x ∈ R_{m,j}, i.e., that the independent variable belongs to unit R_{m,j}, I takes the value 1, and otherwise 0; c_{m,j} is the output value of the leaf node obtained by the m-th regression tree in the j-th feature unit, i.e., the emission rate in that feature unit;
The gradient boosting method introduces a shrinkage parameter v, and the learner update becomes:
$$f_m(x) = f_{m-1}(x) + v \sum_{j=1}^{J} c_{m,j}\, I(x \in R_{m,j})$$
where the shrinkage parameter v is called the learning rate;
The process of training the m-th regression tree is iterated until, after M iterations, the final gradient boosting regression tree model formed by superposing M regression trees is obtained, expressed as follows:
$$\hat{f}(x) = f_M(x) = f_0(x) + v \sum_{m=1}^{M} \sum_{j=1}^{J} c_{m,j}\, I(x \in R_{m,j})$$
where \hat{f}(x) denotes the gradient boosting regression tree.
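The iteration in claims 6 to 10 (initialize, compute residuals, fit a tree, set leaf values, update with shrinkage) can be sketched end to end in plain Python. To keep the sketch short it makes two simplifications that are not the claimed method: it uses squared-error loss instead of the negative binomial log-likelihood, so the residual is simply y - f and the optimal leaf value is the mean residual in the leaf, and each tree is a one-split stump (J = 2) on a single feature. All function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (two leaves) to the residuals."""
    mean_all = sum(residuals) / len(residuals)
    best = (float("inf"), None, mean_all, mean_all)
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        c_left = sum(left) / len(left)       # leaf value c_{m,1}
        c_right = sum(right) / len(right)    # leaf value c_{m,2}
        sse = (sum((r - c_left) ** 2 for r in left)
               + sum((r - c_right) ** 2 for r in right))
        if sse < best[0]:
            best = (sse, threshold, c_left, c_right)
    return best[1:]                          # (threshold, c_left, c_right)


def predict_stump(stump, x):
    """Return the leaf value of the region that x falls into."""
    threshold, c_left, c_right = stump
    if threshold is None or x <= threshold:
        return c_left
    return c_right


def fit_gbrt(xs, ys, n_trees=50, learning_rate=0.1):
    """Gradient boosting with stumps under squared-error loss."""
    f0 = sum(ys) / len(ys)                   # f_0 = argmin_c sum (y_i - c)^2
    preds = [f0] * len(xs)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of 0.5 * (y - f)^2 is the plain residual y - f.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # f_m = f_{m-1} + learning_rate * (leaf value of the region of x)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return f0, stumps, learning_rate


def predict_gbrt(model, x):
    """Evaluate f_M(x) = f_0 + learning_rate * sum of tree outputs."""
    f0, stumps, learning_rate = model
    return f0 + learning_rate * sum(predict_stump(s, x) for s in stumps)
```

A smaller learning rate makes each tree contribute less, so more trees are needed for the same fit; this is the trade-off behind readjusting learning_rate when the optimal n_estimators turns out too large or too small.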
CN201810958885.4A 2018-08-22 2018-08-22 A kind of city bus emission index estimation method promoting regression tree based on gradient Pending CN109376331A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810958885.4A CN109376331A (en) 2018-08-22 2018-08-22 A kind of city bus emission index estimation method promoting regression tree based on gradient

Publications (1)

Publication Number Publication Date
CN109376331A true CN109376331A (en) 2019-02-22

Family

ID=65404364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810958885.4A Pending CN109376331A (en) 2018-08-22 2018-08-22 A kind of city bus emission index estimation method promoting regression tree based on gradient

Country Status (1)

Country Link
CN (1) CN109376331A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105957348A (en) * 2016-07-01 2016-09-21 东南大学 Urban bus route node emission estimating method based on GIS and PEMS
CN107886188A (en) * 2017-10-18 2018-04-06 东南大学 Liquefied natural gas public transport exhaust emissions Forecasting Methodology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Zifa et al.: "A CA mortar void detection method based on the GBRT algorithm", Journal of Railway Science and Engineering *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147835A (en) * 2019-05-10 2019-08-20 东南大学 Resisting shear strength of reinforced concrete beam-column joints prediction technique based on grad enhancement regression algorithm
CN111125862A (en) * 2019-09-27 2020-05-08 长安大学 Following model emission measurement and calculation method based on genetic algorithm and specific power
CN111125862B (en) * 2019-09-27 2023-12-26 长安大学 Following model emission measuring and calculating method based on genetic algorithm and specific power
CN113464418A (en) * 2021-09-01 2021-10-01 蘑菇物联技术(深圳)有限公司 Method for determining performance state of air compressor, computing equipment and computer medium
WO2024065954A1 (en) * 2022-09-28 2024-04-04 电子科技大学长三角研究院(湖州) Short-time prediction method and system for occupancy rate of parking spaces in parking lot, device and terminal
CN117235679A (en) * 2023-11-15 2023-12-15 长沙金码测控科技股份有限公司 LUCC-based tensile load and compressive load evaluation method and system for foundation pit monitoring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190222