CN110450771A - An intelligent vehicle stability control method based on deep reinforcement learning

Info

Publication number: CN110450771A
Application number: CN201910809910.7A
Authority: CN
Inventors: 黄鹤; 郭伟锋; 张炳力; 张润; 王博文; 吴润晨; 程进
Original assignee: Hefei Polytechnic University
Current assignee: Hefei University of Technology; Hefei Polytechnic University
Priority date: 2019-08-29
Filing date: 2019-08-29
Publication date: 2019-11-15
Anticipated expiration: 2039-08-29
Also published as: CN110450771B

Abstract

The invention discloses an intelligent vehicle stability control method based on deep reinforcement learning. The steps are: (1) obtain the decision output of the vehicle's lateral controller together with the vehicle structure parameters and driving parameters; (2) define the state parameters, action parameters and reward function of the deep reinforcement learning method; (3) construct and train the network model of the deep reinforcement learning method to obtain an optimal action network model; (4) obtain the current vehicle state parameter s_t and use the optimal action network model to output the current additional yaw moment ▽M_t and the correction angle ▽δ_t; (5) judge whether the vehicle is in a stable state; (6) determine the direction of the current correction angle ▽δ_t and the wheel on which the current additional yaw moment ▽M_t acts, according to the vehicle's steering characteristic and the steering wheel angle direction. The invention achieves an optimal coordinated control law between direct yaw moment control and steering control under both steady and limit conditions, thereby realizing vehicle stability control and ensuring the safety and comfort of the occupants.

Description

An intelligent vehicle stability control method based on deep reinforcement learning

Technical field

The invention relates to the field of vehicle dynamics control, and more specifically to an intelligent vehicle stability control method based on deep reinforcement learning.

Background technique

When a vehicle steers, the tire slip angles increase and the lateral forces grow, which lets the vehicle follow the driver's intended path. On low-adhesion surfaces or during sharp steering, however, the lateral force easily reaches the adhesion limit, and the vehicle can enter dangerous conditions such as sideslip, spin, or rollover. At present, the main ways to intervene in these conditions are active steering control and direct yaw moment control. Active steering control changes the vehicle's yaw moment by introducing a correction angle at the steering wheel; direct yaw moment control mainly adjusts the wheel braking forces to create a braking force difference, generating an additional yaw moment that corrects understeer or oversteer.

Active steering and direct yaw moment control each have advantages and disadvantages in their effect on vehicle performance. Active steering control alone has little effect on vehicle speed and preserves occupant comfort, but it is ineffective under limit conditions: it cannot keep the vehicle stable and so cannot meet occupant safety requirements. Direct yaw moment control alone can ensure occupant safety under limit conditions, but it strongly affects the vehicle's longitudinal acceleration and cannot meet occupant comfort requirements. Moreover, the vehicle is a complex nonlinear system with many couplings between its subsystems. For each vehicle state there is a relatively optimal control output that stabilizes the vehicle, and these optimal control outputs are not related in a simple linear way, so a linear coordination controller cannot adequately guarantee occupant safety and comfort.

Summary of the invention

To overcome the above shortcomings of the prior art, the invention proposes an intelligent vehicle stability control method based on deep reinforcement learning that achieves an optimal coordinated control law between direct yaw moment control and steering control under both steady and limit conditions, thereby realizing vehicle stability control and ensuring the safety and comfort of the occupants.

To solve the technical problem, the invention adopts the following technical solution:

The intelligent vehicle stability control method based on deep reinforcement learning of the invention is characterized in that it is carried out in the following steps:

Step 1: Obtain the front wheel angle δ_f output by the vehicle's lateral controller and the vehicle structure parameters, comprising: the track width L, the distances L_f and L_r from the center of mass to the front and rear axles, the front and rear cornering stiffnesses C₁ and C₂, and the vehicle mass m;

Obtain the vehicle driving parameters, comprising: the steering wheel angle sw, the vehicle speed u and the road friction coefficient μ;

Step 2: Calculate the ideal yaw rate w_d using formula (1):

In formula (1), g is the gravitational acceleration and w is the yaw rate, with:

Step 3: Calculate the ideal sideslip angle β_d using formula (3):

β_d = -min{|β|, |β_max|}·sign(δ_f) (3)

In formula (3), β is the sideslip angle at the vehicle's center of mass and β_max is the maximum sideslip angle, with:

Step 4: Define the vehicle state parameter s of the deep reinforcement learning method using formula (6):

s = {w, β, sw, w_d, β_d} (6)
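Formula (3) and the state definition of formula (6) can be illustrated with a minimal Python sketch. The function names and the numeric values are hypothetical; β and β_max are passed in directly because their defining formulas are not reproduced in this text, and treating sign(0) as 0 is an assumption.

```python
def ideal_sideslip_angle(beta, beta_max, delta_f):
    """Formula (3): beta_d = -min{|beta|, |beta_max|} * sign(delta_f)."""
    # Assumption: sign(0) is taken as 0; the text does not define that case.
    sign = (delta_f > 0) - (delta_f < 0)
    return -min(abs(beta), abs(beta_max)) * sign


def build_state(w, beta, sw, w_d, beta_d):
    """Formula (6): s = {w, beta, sw, w_d, beta_d}, returned as a tuple."""
    return (w, beta, sw, w_d, beta_d)


# Illustrative values only (not taken from the patent).
beta_d = ideal_sideslip_angle(beta=0.08, beta_max=0.05, delta_f=0.1)
state = build_state(w=0.30, beta=0.08, sw=1.2, w_d=0.25, beta_d=beta_d)
print(beta_d)
print(state)
```

The ideal sideslip angle is saturated at β_max and takes the sign opposite to the front wheel angle, as formula (3) states.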

Step 5: Define the action parameter a of the deep reinforcement learning method using formula (7):

a = {▽δ, ▽M} (7)

In formula (7), ▽δ is the steering wheel correction angle and ▽M is the additional yaw moment;

Step 6: Establish the reward function r of the deep reinforcement learning method using formula (8):

r = r_e + r_ps + r_v + r_m + r_sw + r_st (8)

In formula (8), r_e is the error reward function, with:

r_e = -▽w² - ▽β² + 50 (9)

In formula (9), ▽w is the yaw rate error and ▽β is the sideslip angle error, with:

▽w = w - w_d (10)

▽β = β - β_d (11)

In formula (8), r_ps is the fixed reward value function, with:

In formula (8), r_v is the speed difference reward function, with:

In formula (8), r_m is the additional yaw moment reward function, with:

In formula (8), r_sw is the correction angle reward function, with:

r_sw = -|▽δ| + 10 (15)

In formula (8), r_st is the stable region reward function, with:

r_st = -(|▽δ| + |▽M|)/10 (16)

Step 7: Construct the network model of the deep reinforcement learning method:

Step 7.1: Construct the action network model, comprising: one input layer containing one neuron, m₁ hidden layers each containing n₁ neurons, and one output layer containing 2 neurons; initialize the action network parameters as θ^μ;

Step 7.2: Construct the evaluation network model, comprising: two input layers each containing 1 neuron, m₂ hidden layers each containing n₂ neurons, the m₂ hidden layers being fully connected layers, and one output layer containing 1 neuron; initialize the evaluation network parameters as θ^Q;

Step 7.3: Construct a target action network model with the same structure as the action network model and set the target action network parameters θ^μ′ = θ^μ; construct a target evaluation network model with the same structure as the evaluation network model and set the target evaluation network parameters θ^Q′ = θ^Q;
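The four networks of step 7 can be sketched in plain Python as small fully connected networks. This is only an illustration under stated assumptions: the layer sizes (32 neurons, 2 hidden layers), the ReLU activation, the uniform initialization and the helper names `init_mlp` and `forward` are all hypothetical. The state is fed in as a 5-element vector, and the state and action inputs of the evaluation network are concatenated rather than kept as two separate input layers.

```python
import copy
import random


def init_mlp(layer_sizes, rng):
    """Return a list of (weights, biases) pairs for a fully connected network."""
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Forward pass: ReLU on hidden layers (an assumption), linear output layer."""
    for idx, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if idx < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x


STATE_DIM, ACTION_DIM = 5, 2   # s has 5 entries (formula 6), a has 2 (formula 7)
N1, M1 = 32, 2                 # assumed n1 neurons in each of m1 hidden layers
N2, M2 = 32, 2                 # assumed n2 neurons in each of m2 hidden layers

rng = random.Random(0)
actor = init_mlp([STATE_DIM] + [N1] * M1 + [ACTION_DIM], rng)        # theta^mu
critic = init_mlp([STATE_DIM + ACTION_DIM] + [N2] * M2 + [1], rng)   # theta^Q
target_actor = copy.deepcopy(actor)     # step 7.3: theta^mu' = theta^mu
target_critic = copy.deepcopy(critic)   # step 7.3: theta^Q' = theta^Q

s = [0.30, 0.08, 1.2, 0.25, -0.05]      # illustrative state values
a = forward(actor, s)                   # two outputs: correction angle and yaw moment
q = forward(critic, s + a)              # one output: the evaluation value
print(len(a), len(q))
```

Copying the parameters with `copy.deepcopy` gives the target networks identical initial weights while keeping them independent of later updates to the originals.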

Step 8: Form N samples, where the i-th sample is obtained as follows:

Initialize the i-th vehicle state parameter s_i and use it as the input of the action network model, which outputs μ(s_i|θ^μ);

Obtain the i-th vehicle action parameter a_i using formula (17):

a_i = μ(s_i|θ^μ) + N_i (17)

In formula (17), N_i is the i-th random noise;

Obtain the i-th vehicle reward value r_i according to formula (8), and obtain the updated i-th vehicle state parameter s′_i; this gives the i-th sample, denoted (s_i, a_i, r_i, s′_i), and in turn the N samples;
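The sample collection of step 8 can be sketched as follows. The text only says that N_i is random noise, so the Gaussian noise used here is an assumption; clipping to the range 0 to 20 follows the value range given for the actions in the embodiment. The policy, the state initialization and the vehicle response (`toy_policy`, `toy_reset`, `toy_step`) are hypothetical stand-ins for the action network and a vehicle simulation.

```python
import random


def select_action(policy, state, rng, noise_std=1.0, low=0.0, high=20.0):
    """Formula (17): a_i = mu(s_i | theta^mu) + N_i, then clipped to the action range."""
    mu = policy(state)
    noisy = [m + rng.gauss(0.0, noise_std) for m in mu]   # assumed Gaussian N_i
    return [min(max(v, low), high) for v in noisy]


def collect_samples(policy, env_reset, env_step, n_samples, rng):
    """Build N tuples (s_i, a_i, r_i, s'_i) as described in step 8."""
    samples = []
    for _ in range(n_samples):
        s_i = env_reset()                      # initialize the i-th state
        a_i = select_action(policy, s_i, rng)
        r_i, s_next = env_step(s_i, a_i)       # reward and updated state
        samples.append((s_i, a_i, r_i, s_next))
    return samples


# Hypothetical stand-ins, for illustration only.
def toy_policy(state):
    return [5.0, 5.0]


def toy_reset():
    return [0.30, 0.08, 1.2, 0.25, -0.05]


def toy_step(state, action):
    reward = -sum(abs(v) for v in action)
    return reward, list(state)


rng = random.Random(0)
samples = collect_samples(toy_policy, toy_reset, toy_step, n_samples=10, rng=rng)
print(len(samples))
```

In practice `env_step` would evaluate formula (8) for the reward and advance a vehicle model to obtain s′_i.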

Step 9: Train the network model of the deep reinforcement learning method with the N samples to obtain the optimal action network model and the optimal evaluation network model;

Step 10: Judge whether formula (18) and formula (19) both hold. If they do, the vehicle is in a stable state; otherwise the vehicle is in an unstable state and step 11 is executed:

In formula (18), k₁ is the first boundary coefficient of the stable region, k₂ is the second boundary coefficient of the stable region, and the sideslip angle rate also enters the formula;

|▽w| ≤ |ε·w_d| (19)

In formula (19), ε is an adjustable parameter;
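The stability judgement of step 10 can be sketched as below. Formula (19) is taken as it appears in claim 1, |▽w| ≤ |ε·w_d| with ▽w = w - w_d. Formula (18) is not reproduced in this text, so its result is passed in as a boolean; the function names and numeric values are hypothetical.

```python
def yaw_rate_within_tolerance(w, w_d, eps):
    """Formula (19): |w - w_d| <= |eps * w_d|."""
    return abs(w - w_d) <= abs(eps * w_d)


def is_stable(formula_18_holds, w, w_d, eps):
    """Step 10: the vehicle is stable only if formulas (18) and (19) both hold."""
    return formula_18_holds and yaw_rate_within_tolerance(w, w_d, eps)


# Illustrative values only.
print(is_stable(True, w=0.26, w_d=0.25, eps=0.1))   # small yaw rate error
print(is_stable(True, w=0.40, w_d=0.25, eps=0.1))   # large yaw rate error
```

When `is_stable` returns False, the method proceeds to step 11 and queries the optimal action network.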

Step 11: Obtain the current vehicle state parameter s_t and use it as the input of the optimal action network model, which then outputs the current additional yaw moment ▽M_t and the correction angle ▽δ_t;

Step 12: Judge whether formula (20) holds. If it does, the vehicle's steering characteristic is understeer: set the acting wheel to the inner rear wheel and execute step 13. Otherwise the steering characteristic is oversteer: set the acting wheel to the outer front wheel and execute step 14;

w_d×(w-w_d) > 0 (20)

Step 13: If δ_f > 0, set the direction of the correction angle ▽δ_t to the left; if δ_f < 0, set the direction of the correction angle ▽δ_t to the right;

Step 14: If δ_f > 0, set the direction of the correction angle ▽δ_t to the right; if δ_f < 0, set the direction of the correction angle ▽δ_t to the left.
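The decision logic of steps 12 to 14 can be written as a small function. This sketch follows the text literally, including which branch of formula (20) is labelled understeer; the function name, the string labels and the handling of δ_f = 0 (no direction returned) are assumptions.

```python
def coordinate_action(w, w_d, delta_f):
    """Return (steering characteristic, acting wheel, correction direction)."""
    if w_d * (w - w_d) > 0:                       # formula (20) holds
        characteristic, wheel = "understeer", "inner rear"
        # Step 13
        direction = "left" if delta_f > 0 else "right" if delta_f < 0 else None
    else:
        characteristic, wheel = "oversteer", "outer front"
        # Step 14
        direction = "right" if delta_f > 0 else "left" if delta_f < 0 else None
    return characteristic, wheel, direction


# Illustrative values only.
print(coordinate_action(w=0.30, w_d=0.25, delta_f=0.1))
print(coordinate_action(w=0.20, w_d=0.25, delta_f=-0.1))
```

The returned wheel is where the additional yaw moment ▽M_t is applied, and the direction is applied to the correction angle ▽δ_t.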

The intelligent vehicle stability control method of the invention is further characterized in that step 9 is carried out as follows:

Step 9.1: Initialize the learning rate parameter α and the return rate parameter γ; initialize i = 1;

Step 9.2: Use the i-th vehicle state parameter s_i as the input of the current i-th action network model, which outputs the i-th output value μ(s_i|θ^μ);

Use the i-th vehicle state parameter s_i, the i-th vehicle action parameter a_i and the i-th output value μ(s_i|θ^μ) of the action network as inputs of the current i-th evaluation network model: from s_i and a_i, the current i-th evaluation network model outputs the i-th output value Q_i(a_i); from the i-th output value μ(s_i|θ^μ) of the action network model, it outputs the i-th output value Q_i(μ(s_i|θ^μ));

Use the updated i-th vehicle state parameter s′_i as the input of the current i-th target action network model, which outputs the i-th output value μ(s′_i|θ^μ′);

Use the updated i-th vehicle state parameter s′_i and the i-th output value μ(s′_i|θ^μ′) of the target action network model as inputs of the current i-th target evaluation network model, which outputs the i-th output value Q′_i(a′_i);

Based on the i-th output value Q_i(μ(s_i|θ^μ)) of the current i-th evaluation network model, update the current i-th action network model with the policy gradient method, obtaining the i-th updated action network model, which serves as the (i+1)-th action network model;

Based on the output Q_i(a_i) of the current i-th evaluation network model and the output Q′_i(a′_i) of the current i-th target evaluation network model, update the current i-th evaluation network model by minimizing the loss function, obtaining the i-th updated evaluation network model, which serves as the (i+1)-th evaluation network model;

Step 9.3: Assign i+1 to i and judge whether i > N holds. If it does, the optimal action network model and the optimal evaluation network model have been obtained; otherwise, return to step 9.2.
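The quantities that drive the two updates in step 9.2 can be sketched as follows. The text does not give the loss function explicitly, so the target y_i = r_i + γ·Q′_i(a′_i) and the squared-error loss used here are assumptions borrowed from the usual deterministic policy gradient formulation. The networks are replaced by hypothetical toy callables, and the gradient steps themselves are omitted.

```python
def td_target(r_i, q_target_next, gamma):
    """Assumed target: y_i = r_i + gamma * Q'_i(a'_i)."""
    return r_i + gamma * q_target_next


def critic_loss(q_value, target):
    """Assumed squared error between Q_i(a_i) and the target."""
    return (target - q_value) ** 2


def training_signals(sample, actor, critic, target_actor, target_critic, gamma):
    """Compute the evaluation network loss and the action network objective for one sample."""
    s_i, a_i, r_i, s_next = sample
    a_next = target_actor(s_next)               # mu(s'_i | theta^mu')
    q_next = target_critic(s_next, a_next)      # Q'_i(a'_i)
    y_i = td_target(r_i, q_next, gamma)
    loss = critic_loss(critic(s_i, a_i), y_i)   # minimized to update theta^Q
    actor_objective = critic(s_i, actor(s_i))   # Q_i(mu(s_i)); the policy gradient raises it
    return loss, actor_objective


# Hypothetical toy networks, for illustration only.
def toy_actor(state):
    return [1.0, 2.0]


def toy_critic(state, action):
    return -sum(action)


sample = ([0.0] * 5, [2.0, 3.0], 1.0, [0.0] * 5)
loss, objective = training_signals(
    sample, toy_actor, toy_critic, toy_actor, toy_critic, gamma=0.9
)
print(loss, objective)
```

Looping this computation over i = 1..N, with a gradient step on each network per sample, mirrors the iteration of steps 9.2 and 9.3.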

Compared with the prior art, the invention has the following beneficial effects:

1. The invention exploits the model-free nature and the generalization capability of deep reinforcement learning. It determines the input states and output actions relevant to vehicle stability control, designs a reward function suited to coordinated control, and constructs and trains an optimal action network model. Under both steady and limit conditions, this model can decide the optimal stable coordinated control strategy, thereby realizing vehicle stability control and ensuring the safety and comfort of the occupants;

2. The deep reinforcement learning algorithm on which the invention is based does not require the algorithm model to be designed from a vehicle model. The deep neural network it uses has strong nonlinear representation capability and can express the nonlinear relationship between the vehicle state and active steering and differential braking control, which matches reality better than a linear controller designed from a simplified vehicle model;

3. Compared with no control, active steering control, direct yaw moment control and linearly allocated coordinated control, the control method of the invention achieves better control performance and better robustness under different conditions, and better comfort under limit conditions.

Brief description of the drawings

Fig. 1 is a diagram of the intelligent vehicle stability control of the invention based on deep reinforcement learning;

Fig. 2 is a diagram of the training process of the deep reinforcement learning method of the invention.

Detailed description of the embodiments

In this embodiment, an intelligent vehicle stability control method based on deep reinforcement learning decides the current correction angle and additional yaw moment from the vehicle's current state parameters, thereby achieving coordinated vehicle stability control. Specifically, as shown in Fig. 1, the method is carried out in the following steps:

Step 2: Calculate the ideal yaw rate w_d using formula (1):

Step 3: Calculate the ideal sideslip angle β_d using formula (3):

β_d = -min{|β|, |β_max|}·sign(δ_f) (3)

s = {w, β, sw, w_d, β_d} (6)

In formula (7), ▽δ is the steering wheel correction angle, with value range (0, 20) in degrees (°), and ▽M is the additional yaw moment, with value range (0, 20) in N·m;

r = r_e + r_ps + r_v + r_m + r_sw + r_st (8)

The reward function is the core of the entire deep reinforcement learning algorithm and guides the direction in which the deep neural network parameters are adjusted. The design principles should be stated first, and the specific reward function is then designed according to them.

In this example the reward function is designed with 4 priority levels; the higher the priority, the more important the principle. The design principles are:

Level 1: The aim of the invention is vehicle stability control, so ensuring the stability of the vehicle has top priority;

Level 2: Steering control has advantages over braking control, so steering control must take precedence over braking control;

Level 3: Stabilize the vehicle with as small an active steering angle or braking pressure as possible;

Level 4: When the vehicle is in the stable region, make the action output as close to 0 as possible;

In formula (8), r_e is the error reward function, corresponding to the level 1 design principle: the smaller the error, the larger the reward value. To emphasize the importance of the level 1 design principle, the error reward function should have the largest rate of change, so a quadratic function is used as the level 1 reward function, with:

r_e = -▽w² - ▽β² + 50 (9)

In formula (9), ▽w is the yaw rate error and ▽β is the sideslip angle error, with:

▽w = w - w_d (10)

▽β = β - β_d (11)

In formula (8), r_ps is the fixed reward value function, corresponding to the level 2 design principle: giving priority to steering control earns a larger reward value, with:

In formula (8), r_v is the speed difference reward function, corresponding to the level 2 design principle: steering affects vehicle speed less than braking and therefore earns a larger reward value, with:

In formula (8), r_m is the additional yaw moment reward function, corresponding to the level 3 design principle, with:

In formula (8), r_sw is the correction angle reward function, corresponding to the level 3 design principle, with:

r_sw = -|▽δ| + 10 (15)

In formula (8), r_st is the stable region reward function, corresponding to the level 4 design principle: within the stable region, the smaller the action, the larger the reward, with:

r_st = -(|▽δ| + |▽M|)/10 (16)
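The reward terms whose formulas appear in claim 1, namely (9) to (11), (15) and (16), together with the sum of formula (8), can be sketched in Python as follows. The terms r_ps, r_v and r_m are passed in as plain numbers because their formulas (12) to (14) are not reproduced in this text; the function names and sample values are hypothetical.

```python
def error_reward(w, w_d, beta, beta_d):
    """Formulas (9)-(11): r_e = -(w - w_d)^2 - (beta - beta_d)^2 + 50."""
    d_w = w - w_d              # formula (10), yaw rate error
    d_beta = beta - beta_d     # formula (11), sideslip angle error
    return -d_w ** 2 - d_beta ** 2 + 50


def correction_angle_reward(d_delta):
    """Formula (15): r_sw = -|d_delta| + 10."""
    return -abs(d_delta) + 10


def stable_region_reward(d_delta, d_moment):
    """Formula (16): r_st = -(|d_delta| + |d_moment|) / 10."""
    return -(abs(d_delta) + abs(d_moment)) / 10


def total_reward(r_e, r_ps, r_v, r_m, r_sw, r_st):
    """Formula (8): sum of the six reward terms."""
    return r_e + r_ps + r_v + r_m + r_sw + r_st


# Illustrative values only; r_ps, r_v and r_m are set to 0 as placeholders.
r_e = error_reward(w=0.25, w_d=0.25, beta=0.0, beta_d=0.0)
r_sw = correction_angle_reward(4.0)
r_st = stable_region_reward(4.0, 6.0)
print(total_reward(r_e, 0.0, 0.0, 0.0, r_sw, r_st))
```

The quadratic error term changes fastest as the tracking error grows, which is how the level 1 principle is given the highest priority, while the linear terms penalize large actions more gently.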

Step 7: Construct the network model of the deep reinforcement learning method:

Step 7.3: Construct a target action network model with the same structure as the action network model and set the target action network parameters θ^μ′ = θ^μ; construct a target evaluation network model with the same structure as the evaluation network model and set the target evaluation network parameters θ^Q′ = θ^Q;

Step 8: Form N samples, where the i-th sample is obtained as follows:

Initialize the i-th vehicle state parameter s_i and use it as the input of the action network model, which outputs μ(s_i|θ^μ);

Obtain the i-th vehicle action parameter a_i using formula (17):

a_i = μ(s_i|θ^μ) + N_i (17)

In formula (17), N_i is the i-th random noise;

Step 9: As shown in Fig. 2, train the network model of the deep reinforcement learning method with the N samples:

Step 9.2: Use the i-th vehicle state parameter s_i as the input of the current i-th action network model, which outputs the i-th output value μ(s_i|θ^μ);

Use the i-th vehicle state parameter s_i, the i-th vehicle action parameter a_i and the i-th output value μ(s_i|θ^μ) of the action network as inputs of the current i-th evaluation network model: from s_i and a_i, the current i-th evaluation network model outputs the i-th output value Q_i(a_i); from the i-th output value μ(s_i|θ^μ) of the action network model, it outputs the i-th output value Q_i(μ(s_i|θ^μ));

Use the updated i-th vehicle state parameter s′_i as the input of the current i-th target action network model, which outputs the i-th output value μ(s′_i|θ^μ′);

Use the updated i-th vehicle state parameter s′_i and the i-th output value μ(s′_i|θ^μ′) of the target action network model as inputs of the current i-th target evaluation network model, which outputs the i-th output value Q′_i(a′_i);

Based on the i-th output value Q_i(μ(s_i|θ^μ)) of the current i-th evaluation network model, update the current i-th action network model with the policy gradient method, obtaining the i-th updated action network model, which serves as the (i+1)-th action network model;

Based on the output Q_i(a_i) of the current i-th evaluation network model and the output Q′_i(a′_i) of the current i-th target evaluation network model, update the current i-th evaluation network model by minimizing the loss function, obtaining the i-th updated evaluation network model, which serves as the (i+1)-th evaluation network model;

Step 9.3: Assign i+1 to i and judge whether i > N holds. If it does, the optimal action network model and the optimal evaluation network model have been obtained; otherwise, return to step 9.2;

|▽w| ≤ |ε·w_d| (19)

In formula (19), ε is an adjustable parameter;

Step 11: Obtain the current vehicle state parameter s_t and use it as the input of the optimal action network model, which then outputs the current additional yaw moment ▽M_t and the correction angle ▽δ_t;

w_d×(w-w_d) > 0 (20)

Claims

1. An intelligent vehicle stability control method based on deep reinforcement learning, characterized in that it is carried out in the following steps:

Step 1: Obtain the front wheel angle δ_f output by the vehicle's lateral controller and the vehicle structure parameters, comprising: the track width L, the distances L_f and L_r from the center of mass to the front and rear axles, the front and rear cornering stiffnesses C₁ and C₂, and the vehicle mass m;

Step 2: Calculate the ideal yaw rate w_d using formula (1):

Step 3: Calculate the ideal sideslip angle β_d using formula (3):

β_d = -min{|β|, |β_max|}·sign(δ_f) (3)

s = {w, β, sw, w_d, β_d} (6)

a = {▽δ, ▽M} (7)

In formula (7), ▽δ is the steering wheel correction angle and ▽M is the additional yaw moment;

r = r_e + r_ps + r_v + r_m + r_sw + r_st (8)

In formula (8), r_e is the error reward function, with:

r_e = -▽w² - ▽β² + 50 (9)

In formula (9), ▽w is the yaw rate error and ▽β is the sideslip angle error, with:

▽w = w - w_d (10)

▽β = β - β_d (11)

In formula (8), r_ps is the fixed reward value function, with:

In formula (8), r_v is the speed difference reward function, with:

In formula (8), r_m is the additional yaw moment reward function, with:

In formula (8), r_sw is the correction angle reward function, with:

r_sw = -|▽δ| + 10 (15)

In formula (8), r_st is the stable region reward function, with:

r_st = -(|▽δ| + |▽M|)/10 (16)

Step 7: Construct the network model of the deep reinforcement learning method:

Step 7.1: Construct the action network model, comprising: one input layer containing one neuron, m₁ hidden layers each containing n₁ neurons, and one output layer containing 2 neurons; initialize the action network parameters as θ^μ;

Step 7.2: Construct the evaluation network model, comprising: two input layers each containing 1 neuron, m₂ hidden layers each containing n₂ neurons, the m₂ hidden layers being fully connected layers, and one output layer containing 1 neuron; initialize the evaluation network parameters as θ^Q;

Step 8: Form N samples, where the i-th sample is obtained as follows:

Initialize the i-th vehicle state parameter s_i and use it as the input of the action network model, which outputs μ(s_i|θ^μ);

Obtain the i-th vehicle action parameter a_i using formula (17):

a_i = μ(s_i|θ^μ) + N_i (17)

In formula (17), N_i is the i-th random noise;

Obtain the i-th vehicle reward value r_i according to formula (8), and obtain the updated i-th vehicle state parameter s′_i; this gives the i-th sample, denoted (s_i, a_i, r_i, s′_i), and in turn the N samples;

Step 9: Train the network model of the deep reinforcement learning method with the N samples to obtain the optimal action network model and the optimal evaluation network model;

|▽w| ≤ |ε·w_d| (19)

In formula (19), ε is an adjustable parameter;

Step 11: Obtain the current vehicle state parameter s_t and use it as the input of the optimal action network model, which then outputs the current additional yaw moment ▽M_t and the correction angle ▽δ_t;

Step 12: Judge whether formula (20) holds. If it does, the vehicle's steering characteristic is understeer: set the acting wheel to the inner rear wheel and execute step 13. Otherwise the steering characteristic is oversteer: set the acting wheel to the outer front wheel and execute step 14;

w_d×(w-w_d) > 0 (20)

Step 13: If δ_f > 0, set the direction of the correction angle ▽δ_t to the left; if δ_f < 0, set the direction of the correction angle ▽δ_t to the right;

Step 14: If δ_f > 0, set the direction of the correction angle ▽δ_t to the right; if δ_f < 0, set the direction of the correction angle ▽δ_t to the left.

2. The intelligent vehicle stability control method according to claim 1, characterized in that step 9 is carried out as follows:

Step 9.2: Use the i-th vehicle state parameter s_i as the input of the current i-th action network model, which outputs the i-th output value μ(s_i|θ^μ);

Use the i-th vehicle state parameter s_i, the i-th vehicle action parameter a_i and the i-th output value μ(s_i|θ^μ) of the action network as inputs of the current i-th evaluation network model: from s_i and a_i, the current i-th evaluation network model outputs the i-th output value Q_i(a_i); from the i-th output value μ(s_i|θ^μ) of the action network model, it outputs the i-th output value Q_i(μ(s_i|θ^μ));

Use the updated i-th vehicle state parameter s′_i as the input of the current i-th target action network model, which outputs the i-th output value μ(s′_i|θ^μ′);

Use the updated i-th vehicle state parameter s′_i and the i-th output value μ(s′_i|θ^μ′) of the target action network model as inputs of the current i-th target evaluation network model, which outputs the i-th output value Q′_i(a′_i);

Based on the i-th output value Q_i(μ(s_i|θ^μ)) of the current i-th evaluation network model, update the current i-th action network model with the policy gradient method, obtaining the i-th updated action network model, which serves as the (i+1)-th action network model;

Based on the output Q_i(a_i) of the current i-th evaluation network model and the output Q′_i(a′_i) of the current i-th target evaluation network model, update the current i-th evaluation network model by minimizing the loss function, obtaining the i-th updated evaluation network model, which serves as the (i+1)-th evaluation network model;

Step 9.3: Assign i+1 to i and judge whether i > N holds. If it does, the optimal action network model and the optimal evaluation network model have been obtained; otherwise, return to step 9.2.