CN103496368A - Automobile cooperative type self-adaptive cruise control system and method with learning ability - Google Patents

Automobile cooperative type self-adaptive cruise control system and method with learning ability

Info

Publication number
CN103496368A
Authority
CN
China
Prior art keywords
module
automobile
weights
training
training set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310439454.4A
Other languages
Chinese (zh)
Other versions
CN103496368B (en)
Inventor
张晋东
高振海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University
Priority to CN201310439454.4A
Publication of CN103496368A
Application granted
Publication of CN103496368B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units, or advanced driver assistance systems for ensuring comfort, stability and safety or drive control systems for propelling or retarding the vehicle
    • B60W30/14Adaptive cruise control
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2520/00Input parameters relating to overall vehicle dynamics
    • B60W2520/10Longitudinal speed
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2554/00Input parameters relating to objects
    • B60W2554/80Spatial relation or speed relative to objects
    • B60W2554/801Lateral distance
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2554/00Input parameters relating to objects
    • B60W2554/80Spatial relation or speed relative to objects
    • B60W2554/804Relative longitudinal speed

Landscapes

  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Control Of Driving Devices And Active Controlling Of Vehicle (AREA)

Abstract

The invention discloses an automobile cooperative adaptive cruise control system with learning ability. The system comprises a training set and test set generating module, a training set supplementing module, a training module, a learning algorithm module, an automobile driving state set module, an automobile control action set module, an inter-vehicle safe distance module, an evaluation module, a feedback value R calculating module, a Q weight updating module, an automobile control action selecting module and an automobile control action executing module. The system and method can be used in single-lane, one-way traffic environments. Each automobile is an intelligent agent that senses the driving conditions of the other automobiles in the environment through communication; because of limits on communication range, data volume, communication channels and the like, the sensed environment is only a part of the overall driving environment.

Description

Automobile cooperative adaptive cruise control system and method with learning ability
Technical field
The present invention relates to an automobile cooperative adaptive cruise control system and method with learning ability.
Background
As the number of automobiles used for daily personal travel increases, most cities face traffic problems, especially in densely populated areas. For most people living in these areas, traffic congestion during the commute has become part of daily life. Frequent switching between acceleration and braking while driving impedes traffic flow. Human drivers also react slowly and can see only a limited stretch of traffic, both of which aggravate congestion. From an environmental standpoint, unnecessary acceleration and braking increase pollution on the road. In addition, driver error is a principal cause of today's traffic problems.
With the development of science and technology, automobile cooperative adaptive cruise control uses vehicle-to-vehicle communication to share typical cooperative information among vehicles. This allows several vehicles travelling as a platoon in the same lane and direction to keep shorter inter-vehicle distances. When there is no vehicle ahead, the vehicle cruises at a set speed; when there is a preceding vehicle, it uses radar to follow that vehicle adaptively.
Automobile cooperative adaptive cruise control is very important for making driving intelligent and automated, and can greatly improve driving safety, travel efficiency and passenger comfort.
Unsal, Kachroo, Bay et al. applied multiple stochastic learning to the longitudinal and lateral control of automobiles, but did not extend the method to multi-agent problems.
Pendrith used the distributed DQL learning method, in which the relative speeds of surrounding automobiles are represented by a local perspective view. DQL considers neither the control actions of surrounding automobiles nor the updating of Q weights; instead it averages over all agents in each time period. The drawback of this method is weak learning stability.
Emery-Montemerlo et al. applied Bayesian methods to cruise control, but their method is still based on a model of the environment.
Fulda and Ventura et al. proposed the DJAP algorithm (dynamic perception of adjacent automobiles' control actions) for cruise control, in which each agent can dynamically select useful adjacent control actions during learning. However, it considers only the adjacent control actions while the automobile is driving.
Summary of the invention
The technical problem to be solved by the present invention is to provide, in view of the deficiencies of the prior art, an automobile cooperative adaptive cruise control system and method with learning ability.
The technical solution of the present invention is as follows:
An automobile cooperative adaptive cruise control system with learning ability comprises a training set and test set generating module, a training set supplementing module, a training module, a learning algorithm module, an automobile driving state set module, an automobile control action set module, an inter-vehicle safe distance module, an evaluation module, a feedback value R calculating module, a Q weight updating module, an automobile control action selecting module and an automobile control action executing module;
The training set and test set generating module generates a suitable training set and test set according to the acquired driving state information of the preceding vehicle and the host vehicle;
The training set supplementing module supplements the newly generated training set to obtain a more complete supplementary training set;
The training module carries out training using the supplementary training set data;
The learning algorithm module calculates the n Q weights for all situations that may occur;
The automobile driving state set module and the inter-vehicle safe distance module select data that accurately express the current vehicle state and inter-vehicle distance; a table-lookup-like operation is performed with these data to find the corresponding Q weights, and a control action is selected from the automobile control action set module;
The evaluation module applies the evaluation conditions as constraints and passes to the feedback value R calculating module to obtain the feedback value R, while the Q weight updating module updates the Q weights; the updated Q weights replace those calculated previously and are used to initialize the generation of a new training set and test set; repeating this process yields better Q weights, and the test set can be used to check whether the Q weights are effective.
In the control method of the automobile cooperative adaptive cruise control system with learning ability, after the system starts, the driver chooses whether to enter the learning mode. (1) If the learning mode is entered, the training set and test set generating module generates a suitable training set and test set according to the acquired driving state information of the preceding vehicle and the host vehicle; the training set supplementing module then supplements the newly generated training set to obtain a more complete supplementary training set; the training module then carries out training using the supplementary training set data, and the learning algorithm module calculates the n Q weights for all situations that may occur. These Q weights can be understood as a two-dimensional table indexed by vehicle state and inter-vehicle distance, with the Q weights as the table entries. Next, the automobile driving state set module and the inter-vehicle safe distance module select data that accurately express the current vehicle state and inter-vehicle distance; a table-lookup-like operation is performed with these data to find the corresponding Q weights, and a control action is selected from the automobile control action set module. The evaluation module then applies the evaluation conditions as constraints and passes to the feedback value R calculating module to obtain the feedback value R, while the Q weight updating module updates the Q weights. The updated Q weights replace those calculated previously and are used to initialize the generation of a new training set and test set. Repeating this process yields better Q weights, and the test set can be used to check whether the Q weights are effective. (2) If the learning mode is not entered, the optimal control action is selected in the existing automobile control action selecting module according to the acquired driving state information of the preceding vehicle and the host vehicle; the automobile control action executing module then executes the selected action, thereby completing the cooperative adaptive cruise control of the automobile.
The present invention provides an automobile cooperative adaptive cruise control method with learning ability that can be used in single-lane, one-way traffic environments. In the present invention each automobile is an intelligent agent that perceives the driving conditions of the other automobiles in the environment through communication; because of limits on communication range, data volume, communication channels and the like, the perceived environment is necessarily only a part of the driving environment.
Description of the drawings
Fig. 1 is the functional block diagram of the system;
Fig. 2 is the main flowchart of the learning algorithm;
Fig. 3 shows the training set and test set generating module;
Fig. 4 shows the training set supplementing module;
Fig. 5 shows the training module;
Fig. 6 shows the Q weight updating module;
Fig. 7 is the flowchart of the feedback value R calculating module;
Fig. 8 shows the automobile control action executing module.
Detailed description of the embodiments
The present invention is described in detail below with reference to specific embodiments.
The present invention provides an automobile cooperative adaptive cruise control method with learning ability that can be used in single-lane, one-way traffic environments. In the present invention each automobile is an intelligent agent that perceives the driving conditions of the other automobiles in the environment through communication; because of limits on communication range, data volume, communication channels and the like, the perceived environment is necessarily only a part of the driving environment.
With reference to Fig. 1, the automobile cooperative adaptive cruise control system with learning ability of the present invention comprises a training set and test set generating module, a training set supplementing module, a training module, a learning algorithm module, an automobile driving state set module, an automobile control action set module, an inter-vehicle safe distance module, an evaluation module, a feedback value R calculating module, a Q weight updating module, an automobile control action selecting module and an automobile control action executing module.
As shown in Fig. 1, after the system starts, the driver chooses whether to enter the learning mode. (1) If the learning mode is entered, the training set and test set generating module generates a suitable training set and test set according to the acquired driving state information of the preceding vehicle and the host vehicle (preceding vehicle speed, host vehicle speed, distance between the two vehicles); the training set supplementing module then supplements the newly generated training set to obtain a more complete supplementary training set; the training module then carries out training using the supplementary training set data, and the learning algorithm module calculates the n Q weights for all situations that may occur. These Q weights can be understood as a two-dimensional table indexed by vehicle state and inter-vehicle distance, with the Q weights as the table entries. Next, the automobile driving state set module and the inter-vehicle safe distance module select data that accurately express the current vehicle state and inter-vehicle distance; a table-lookup-like operation is performed with these data to find the corresponding Q weights, and a control action is selected from the automobile control action set module. The evaluation module then applies the evaluation conditions as constraints and passes to the feedback value R calculating module to obtain the feedback value R, while the Q weight updating module updates the Q weights. The updated Q weights replace those calculated previously and are used to initialize the generation of a new training set and test set. Repeating this process yields better Q weights, and the test set can be used to check whether the Q weights are effective. (2) If the learning mode is not entered, the optimal control action is selected in the existing automobile control action selecting module according to the acquired driving state information of the preceding vehicle and the host vehicle (preceding vehicle speed, host vehicle speed, distance between the two vehicles); the automobile control action executing module then executes the selected action, thereby completing the cooperative adaptive cruise control of the automobile.
"Learning algorithm module": using the learning algorithm, the Q weight corresponding to each driving state of the automobile can be calculated, and the automobile control action module selects the control action that best achieves the goal of cooperative adaptive cruise control. Each time the automobile performs a control action in its driving environment, reward or punishment information is provided to indicate how correct the action was, judged by its effect on the driving state.
The learning algorithm is as follows:
1) Initialize the Q values with an arbitrary Q weight function;
2) Initialize the driving state s;
3) Loop: each agent i outputs a control action a, where a is one specific action and A denotes the set of actions in general, so a can be regarded as an instance of A; observe the feedback r and the next driving state s', and update with the formula $Q(s,a) = (1-\alpha)\,Q(s,a) + \alpha\,[\,r + \gamma \max_{a' \in A} Q(s',a')\,]$; then set the driving state s to the next driving state s'; repeat until the time slices of the fragment are used up;
4) The learning process ends.
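The four steps above can be read as an ordinary tabular Q-learning loop. The following Python sketch is illustrative only: the `step` callback standing in for the driving environment, the uniformly random action choice and the fixed learning rate `alpha` are assumptions made for brevity, not details taken from this description.

```python
import random
from collections import defaultdict


def run_fragment(step, actions, s0, n_steps, alpha=0.5, gamma=0.6, rng=None):
    """Run one fragment of the loop in step 3 and return the Q table.

    `step(s, a)` is a hypothetical environment callback returning (r, s_next).
    """
    rng = rng or random.Random(0)
    q = defaultdict(float)           # step 1: initial Q function (here all 0)
    s = s0                           # step 2: initial driving state
    for _ in range(n_steps):         # step 3: loop until the fragment ends
        a = rng.choice(actions)      # placeholder for the action choice
        r, s_next = step(s, a)       # observe r and s'
        best_next = max(q[(s_next, b)] for b in actions)
        q[(s, a)] = (1 - alpha) * q[(s, a)] + alpha * (r + gamma * best_next)
        s = s_next                   # s <- s'
    return q                         # step 4: learning ends


# Toy usage: a single state that always pays a feedback of 1.
q_table = run_fragment(lambda s, a: (1.0, s), actions=[0, 1], s0=0, n_steps=200)
```

In this toy case the Q weights approach 1 / (1 - 0.6) = 2.5, the fixed point of the update for a constant feedback of 1.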
Some variables in the learning algorithm for automobile cooperative adaptive cruise control are defined as follows:
"Automobile control action set": A = {a1, a2}. For each automobile, A is a finite set of control actions, and an automobile can take any control action in the set; a1 is the acceleration and a2 is the deceleration. For example, the acceleration range is 0 to 2 m/s² (about 10 s to accelerate to 100 km/h), where 0 means driving at constant speed; the deceleration range is 0 to 10 m/s² (about 3 s to decelerate from 100 km/h to 0), where 0 likewise means driving at constant speed. In the actual calculation, at most one of the acceleration and the deceleration can be nonzero; that is, at any moment a vehicle can select only one of the acceleration and deceleration control actions.
"Automobile driving state set": S = {s1, s2}. For each automobile, S is a finite set of driving states, and the automobile may be in any driving state in the set; s1 is the speed and s2 is the distance between the host vehicle and the preceding vehicle. For example, the speed range is 0 to 50 m/s (0 to 180 km/h), where 0 means the automobile is stopped; the distance to the preceding vehicle ranges from 0 to 100 m, where 0 means a collision has occurred and 100 means the vehicles are 100 m apart or the preceding vehicle has pulled away beyond 100 m. As the Q weight updating module shows, each (s, a) pair corresponds to one Q weight. From the driving state set and control action set defined above, there are 51*101*3*11 = 169983 Q weights in total; because only one of the acceleration and deceleration control actions can be selected, the number of meaningful Q weights is 51*101*13 = 66963.
P: S × A → Sn. For each automobile, P is the driving state transition: after control action A is executed in the current driving state S, the automobile's driving state becomes Sn.
"Feedback value R calculation": S × A1 × A2 → R. For each automobile, R is a feedback function whose value is determined by the driving state S and the control action A; it can equally be regarded as determined by S and the next driving state Sn. A good Sn yields a larger feedback value; a bad Sn yields a smaller feedback value or 0.
This gives the Q weight update function of the Q-learning algorithm:
$Q(s,a) = (1-\alpha)\,Q(s,a) + \alpha\,[\,r + \gamma \max_{a' \in A} Q(s',a')\,]$
Here Q(s, a) is the Q weight; its magnitude indicates whether a control action has led to a good result, that is, a good next driving state Sn. The α in the formula is a probability; in the present invention it reflects how often the pair (s, a) has been visited, and its value is the reciprocal of the number of visits to (s, a). The more often a pair has been visited, the more the update relies on the previously obtained Q value; the less often it has been visited, the more the update relies on the estimated Q value. r is the value of the feedback function R, and γ is a proportionality coefficient, set to 0.6.
In the algorithm, the MaxQ() function returns an estimate determined by the next driving state Sn: among all possible pairs (Sn, a) formed by Sn and an action a, it selects the pair with the largest Q weight and returns that value for use in calculating the Q weight.
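The update rule and the MaxQ() estimate might be sketched as follows, assuming the Q weights are kept in a dictionary keyed by (state, action) pairs. The function and variable names are hypothetical. The visit count starts at 1, as the initialization in this description specifies; incrementing it after each update is an assumption.

```python
from collections import defaultdict

GAMMA = 0.6  # proportionality coefficient given in the text


def max_q(q, next_state, actions):
    """Largest Q weight over all pairs (Sn, a) for the next state Sn."""
    return max(q[(next_state, a)] for a in actions)


def update_q(q, visits, state, action, reward, next_state, actions):
    """Apply Q(s,a) = (1-alpha) Q(s,a) + alpha [r + gamma max Q(s',a')]."""
    alpha = 1.0 / visits[(state, action)]   # reciprocal of the visit count
    target = reward + GAMMA * max_q(q, next_state, actions)
    q[(state, action)] = (1.0 - alpha) * q[(state, action)] + alpha * target
    visits[(state, action)] += 1            # assumed bookkeeping
    return q[(state, action)]


q = defaultdict(float)             # Q weights, initial value 0
visits = defaultdict(lambda: 1)    # visit counts, initial value 1
first = update_q(q, visits, "s", 0, 1.0, "s2", [0, 1])   # alpha = 1
second = update_q(q, visits, "s", 0, 0.0, "s", [0, 1])   # alpha = 1/2
```

With a visit count of 1 the first update replaces the old value entirely; on the second visit the old value and the new estimate are weighted equally, which matches the statement that frequently visited pairs rely more on the Q value already obtained.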
In the algorithm, the current driving state is recorded and a control action is executed, which changes the current driving state into the next driving state; that state is then recorded as the current driving state and another control action is executed, and so on in a continuing loop that constitutes the learning process. During learning, control actions are selected randomly subject to conditions. If, after a control action is executed, the following vehicle has been left behind, it selects only acceleration actions, and the range from which the random acceleration is drawn grows linearly for as long as it continues to be left behind. If, after a control action is executed, the two vehicles collide, the following vehicle selects only deceleration actions, and the range from which the random deceleration is drawn grows linearly for as long as the collisions continue. If, after a control action is executed, the distance between the two vehicles is greater than the optimal distance, the following vehicle selects only acceleration actions, and the range from which the random acceleration is drawn grows linearly for as long as this situation continues.
If, after a control action is executed, the distance between the two vehicles is less than the optimal distance, the following vehicle selects only deceleration actions, and the range from which the random deceleration is drawn grows linearly for as long as this situation continues. If, after a control action is executed, the distance between the two vehicles equals the optimal distance, the following vehicle selects a constant-speed action, with acceleration and deceleration both 0. This allows the automobile to learn quickly and reach a good result quickly.
"Inter-vehicle safe distance": while driving, the automobile should keep at least a safe distance from the preceding vehicle. The inter-vehicle safe distance model is as follows:
$L = \frac{V_r (2 V_c - V_r)}{2 a_{max}} + D$
L is the distance that must be kept between the host vehicle and the preceding vehicle so that the two do not collide; the relative speed is $V_r = V_c - V_p$, where $V_c$ and $V_p$ are the current speeds of the host vehicle and the preceding vehicle respectively; D = 2 m is a constant; $a_{max}$ is the maximum braking deceleration of the host vehicle.
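The safe distance model can be written as a short function. This is a sketch under the definitions above; the function name and the example value of `a_max` (10 m/s², the top of the deceleration range quoted earlier) are assumptions. Since $V_r(2V_c - V_r) = (V_c - V_p)(V_c + V_p) = V_c^2 - V_p^2$, the first term equals the difference between the two vehicles' stopping distances if both brake at $a_{max}$, and D adds a fixed margin.

```python
def safe_distance(v_c, v_p, a_max, d=2.0):
    """Inter-vehicle safe distance L in metres.

    v_c: host vehicle speed (m/s); v_p: preceding vehicle speed (m/s);
    a_max: maximum braking deceleration of the host vehicle (m/s^2);
    d: constant margin D, 2 m in the text.
    """
    v_r = v_c - v_p                       # relative speed V_r
    return v_r * (2.0 * v_c - v_r) / (2.0 * a_max) + d


# Host at 30 m/s closing on a preceding vehicle at 20 m/s:
# 10 * (60 - 10) / 20 + 2 = 27 m.
example_gap = safe_distance(30.0, 20.0, a_max=10.0)
```

When the two speeds are equal, the relative speed is 0 and the function returns just the margin D.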
This dynamic optimal distance L is incorporated into the feedback function R: at every learning step, the optimal inter-vehicle distance is calculated from the driving states of the preceding and following vehicles. Since the relative speed of the two vehicles is taken into account in L, it should also be taken into account when calculating the feedback value R. Let x be the inter-vehicle distance in state S and y the inter-vehicle distance in state Sn. When x and y are both greater than the optimal distance, the feedback value is x/y + Vr/L. When either or both of x and y are less than the optimal distance, the feedback value is y/x - Vr/L.
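A minimal sketch of this two-case feedback rule follows. The function name is hypothetical, and two points the text leaves open are filled with labeled assumptions: a single value of L is passed in for both states, and a distance exactly equal to the optimal distance falls into the second branch.

```python
def feedback_value(x, y, v_r, optimal_gap):
    """Feedback value R for a transition from gap x (state S) to gap y (state Sn).

    v_r is the relative speed V_c - V_p and optimal_gap is the distance L.
    Assumes x > 0 and y > 0; collisions (gap 0) are handled separately.
    """
    if x > optimal_gap and y > optimal_gap:
        # Both gaps too large: closing the gap (y < x) raises the value.
        return x / y + v_r / optimal_gap
    # At least one gap too small (assumption: equality also lands here):
    # opening the gap (y > x) raises the value.
    return y / x - v_r / optimal_gap


far = feedback_value(40.0, 30.0, 5.0, 20.0)     # 40/30 + 5/20
near = feedback_value(10.0, 15.0, -5.0, 20.0)   # 15/10 + 5/20 = 1.75
```

In both branches the value grows when the gap moves toward the optimal distance, and the Vr/L term rewards a relative speed that pushes the gap in the same direction.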
If the relative speed of the two vehicles were added to the driving state set module, the Q weights would be more accurate, but the number and dimensionality of the Q weights would increase. The relative speed of the two vehicles would range from -50 to 50 m/s, so the number of Q weights would be 51*101*101*3*11 = 17168283, and even after removing the meaningless Q weights there would still be 51*101*101*13 = 6763263.
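The Q weight counts quoted in this description can be reproduced by enumerating the discretized ranges. The sketch below assumes integer steps of 1 in every quantity, which is what the factors 51, 101, 3 and 11 imply; the variable and function names are hypothetical.

```python
from itertools import product

speeds = range(0, 51)         # 0..50 m/s     -> 51 values
gaps = range(0, 101)          # 0..100 m      -> 101 values
rel_speeds = range(-50, 51)   # -50..50 m/s   -> 101 values
accels = range(0, 3)          # 0..2 m/s^2    -> 3 values
decels = range(0, 11)         # 0..10 m/s^2   -> 11 values

# Every (acceleration, deceleration) combination: 3 * 11 = 33.
actions_all = list(product(accels, decels))
# At most one of the two may be nonzero; (0, 0) is constant speed: 13 actions.
actions_valid = [(a, d) for a, d in actions_all if a == 0 or d == 0]


def table_size(n_actions, with_relative_speed):
    """Number of (state, action) pairs, optionally with relative speed in the state."""
    n_states = len(speeds) * len(gaps)
    if with_relative_speed:
        n_states *= len(rel_speeds)
    return n_states * n_actions


print(table_size(len(actions_all), False))    # 169983
print(table_size(len(actions_valid), False))  # 66963
print(table_size(len(actions_all), True))     # 17168283
print(table_size(len(actions_valid), True))   # 6763263
```

Adding relative speed multiplies the table by 101, which is the growth in size and dimensionality that the paragraph above warns about.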
Each pair (s, a) has a corresponding Q weight. In the present invention the initial value of every Q weight is 0, and the initial driving state S is variable. The Q weights are updated continually during learning, finally yielding an array of Q weights. During testing, these Q weights are used to infer which control action the automobile will select in a given driving state.
1.1 "Evaluation module": in the present invention there are several criteria for judging whether the learning algorithm has obtained a good result:
whether the automobile collides while driving, and whether it can maintain a good distance. When the two vehicles are far apart, selecting a larger acceleration so as to approach the vehicle ahead quickly is a good control action; when the automobile is close to the preceding vehicle and in danger of colliding, selecting a smaller acceleration, or braking, is a good control action.
The purpose of the learning algorithm is to keep optimizing the Q weights so that a good control action is obtained in every driving state.
The overall flow of the learning algorithm module is shown in Fig. 2. Main flow of the learning algorithm module:
A1. Initialization: use the respective initialization functions to initialize the state set S and the action set A, and initialize the Q weights to 0.
A2. Generate the training set:
A21. If the mode is training, carry out training;
A22. If the mode is testing, carry out testing;
A3. Repeat A21 and A22 until the time slices are used up.
A4. End.
Initialization is carried out first when the algorithm starts. All Q weights are initialized to 0. The initial values of the driving state set are: speed = 0 m/s, distance = 10 m, relative speed = 0 m/s. The initial values of the control action set are: acceleration = 0 m/s², deceleration = 0 m/s². The visit count of every Q(s, a) is initialized to 1, because the visit count is used as a denominator in the calculation; initializing it to 0 would cause an error, and an initial value of 1 does not affect the final training and test results.
While the algorithm runs, every step checks whether the time slices are used up; the number of training time slices is the global constant 240000 and the number of test time slices is the global constant 50000. The algorithm also checks whether it is training or testing: if training, it enters the training module; if testing, it enters the test module.
1.2 "Training set and test set generating module". As shown in Fig. 3, the flow of the training set and test set generating module is:
B1. Input the training set generation parameters and execute the supplementary part.
Loop over all input parameters:
B11. Run the slow deceleration process, calculate the acceleration or deceleration value and record the output;
B12. Run the fast deceleration process, calculate the acceleration or deceleration value and record the output;
B13. Run the constant-speed driving process, calculate the acceleration or deceleration value and record the output;
B14. Run the fast acceleration process, calculate the acceleration or deceleration value and record the output;
B15. Run the slow acceleration process, calculate the acceleration or deceleration value and record the output;
B2. Repeat step B1 until all input parameters have been processed and the training set reaches the required size.
B3. End.
The training set and test set generating module divides the speed profile into five segments. Acceleration and deceleration are cubic functions of the time slice; cubic functions are used so that the curvature at the junctions between segments matches as closely as possible. Each acceleration or deceleration segment is allotted 10000 time slices, and the constant-speed segment 5000. The preceding vehicle first decelerates slowly from the high speed of 50 m/s to 45 m/s, then decelerates quickly to 0 m/s, then accelerates quickly to 45 m/s, and finally accelerates slowly back to 50 m/s. This training set is clearly not complete, since it covers only a small fraction of driving situations. The training set is therefore supplemented by the training set supplementing module, so that more Q weights are obtained and more driving situations are covered.
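The five-segment speed profile of the preceding vehicle might be generated as follows. This is a sketch under assumptions: the description only says that the ramps are cubic functions of the time slice, so the specific cubic used here (3u² - 2u³, which has zero slope at both ends of a segment) and the function names are illustrative choices, and the constant-speed segment is placed at 0 m/s following the order of steps B11 to B15.

```python
def cubic_ramp(v_start, v_end, n):
    """Return n speeds moving from v_start to v_end along a cubic in time."""
    speeds = []
    for k in range(1, n + 1):
        u = k / n                                   # normalized time in (0, 1]
        blend = 3 * u**2 - 2 * u**3                 # assumed cubic, flat at both ends
        speeds.append(v_start + (v_end - v_start) * blend)
    return speeds


def lead_vehicle_profile(ramp_slices=10000, hold_slices=5000):
    """One pass through the five segments described in the text (speeds in m/s)."""
    profile = []
    profile += cubic_ramp(50.0, 45.0, ramp_slices)  # slow deceleration
    profile += cubic_ramp(45.0, 0.0, ramp_slices)   # fast deceleration
    profile += [0.0] * hold_slices                  # constant-speed segment
    profile += cubic_ramp(0.0, 45.0, ramp_slices)   # fast acceleration
    profile += cubic_ramp(45.0, 50.0, ramp_slices)  # slow acceleration
    return profile


profile = lead_vehicle_profile()
print(len(profile))   # 45000
```

One pass uses 4 * 10000 + 5000 = 45000 time slices, so ten passes give 450000 data points, which is consistent with the training set size stated in this description for a scale of 10.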
The required scale of the training set is set to 10, which produces 450000 data points in total. Because the generating function is fixed, an excessively large scale is pointless: during training the same data would simply be visited repeatedly without increasing the coverage of the Q weights.
The test set is generated in the same way as the training set and is extracted from it; the extracted part is not used for training. That is, one part of the generated set is used for training and another part for testing: 240000 data points for training and 50000 for testing.
1.3 "Training set supplementing module". As shown in Fig. 4, the flow of the training set supplementing module is:
C1. Input parameter x and, for every parameter x, execute:
C11. the acceleration function of x;
C12. the constant-speed driving function;
C13. the deceleration function of x;
C14. the constant-speed driving function;
C2. If x has not reached the acceleration limit, increment x by 1 and repeat step C1.
C3. If x has reached the acceleration limit, execute for x:
C31. the random acceleration function of x;
C32. the constant-speed driving function;
C33. the deceleration function of x;
C34. the constant-speed driving function;
C4. If x has not reached the deceleration limit, increment x by 1 and repeat step C3.
C5. If x has reached the deceleration limit, choose one out of every 6 time slices to execute the random deceleration function and execute random acceleration in the remaining 5.
C6. End.
In the training set supplementing module, the number of time slices spent in each acceleration or deceleration segment depends on the specific acceleration or deceleration value, and each constant-speed segment lasts 15 time slices. The initial speed of the random control action part that follows the supplementary part is half the maximum speed. Because the ratio of the acceleration limit to the deceleration limit is 1:5, one deceleration step is executed in every 6 time slices, so that the speed varies randomly over the whole speed range without drifting toward either extreme.
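The 1-in-6 rule can be illustrated with a small random walk on the speed. This sketch assumes that each time slice lasts 1 s, that accelerations and decelerations are drawn uniformly from their ranges, and that the speed is clipped to 0 to 50 m/s; none of these details is given in the text, and the function name is hypothetical.

```python
import random


def random_speed_walk(v0, n_blocks, rng, v_max=50.0):
    """Return the speed after each time slice, using blocks of 6 slices.

    In every block one randomly chosen slice decelerates (0..10 m/s^2) and
    the other five accelerate (0..2 m/s^2).
    """
    v = v0
    speeds = []
    for _ in range(n_blocks):
        decel_slot = rng.randrange(6)          # which of the 6 slices decelerates
        for slot in range(6):
            if slot == decel_slot:
                v -= rng.uniform(0.0, 10.0)    # assumed 1 s per time slice
            else:
                v += rng.uniform(0.0, 2.0)
            v = min(max(v, 0.0), v_max)        # keep within the speed range
            speeds.append(v)
    return speeds


# Start at half the maximum speed, as the text describes.
walk = random_speed_walk(25.0, n_blocks=1000, rng=random.Random(1))
```

Under these assumptions the five acceleration steps add about 5 m/s per block on average and the single deceleration step removes about 5 m/s, which is one way to read the claim that the speed does not drift toward an extreme.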
1.4 The "training module" is shown in Figure 5. Training module flow:
D1. Input the parameters; according to the value of the flag bit flag, select the corresponding acceleration, deceleration, or constant-speed action.
D2. Execute the function for this action.
D3. Check whether a collision has occurred:
D31. If a collision has occurred, the module ends;
D32. If no collision has occurred, update the current Q weight through the Q weight update module.
D4. Loop until all time slices have been executed.
The parameters of the training module are the automobile operation behavior aggregate A (the set of available actions, here those of the rear car), the current driving state S of the rear car, and the driving state Sn of the rear car in the next time slice. The actual value of Sn is computed by the automobile operation action executing module; until then, S and Sn are identical. The initial value of flag is 1. flag = 1 means a random acceleration action is selected, flag = -1 means a random deceleration action is selected, and flag = 0 means a constant-speed action is selected, with the speed being that of the current driving state S.
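Steps D1 to D4 and the meaning of flag can be sketched as below. This is a minimal sketch, not the patent's implementation: the action values, the function names, and the signatures of the injected `execute` and `update_q` callables are assumptions made for illustration.

```python
import random

# Hypothetical action set A: signed speed changes per time slice.
ACCEL_ACTIONS = [1, 2, 3]
DECEL_ACTIONS = [-1, -2, -3]

def select_training_action(flag, rng):
    """D1: map the flag bit to a random acceleration, deceleration or 0."""
    if flag == 1:
        return rng.choice(ACCEL_ACTIONS)
    if flag == -1:
        return rng.choice(DECEL_ACTIONS)
    if flag == 0:
        return 0                       # constant speed: keep the speed of S
    raise ValueError("flag must be 1, -1 or 0")

def run_training(state, n_slices, execute, update_q, rng):
    """Run D1-D4; return False if a collision ended the run early.

    execute(state, action) is assumed to return (next_state, flag, collided);
    update_q(state, action, next_state) is assumed to update the Q weight.
    """
    flag = 1                           # initial value stated in the text
    for _ in range(n_slices):
        action = select_training_action(flag, rng)
        next_state, flag, collided = execute(state, action)   # D2
        if collided:                   # D31: stop on collision
            return False
        update_q(state, action, next_state)                   # D32
        state = next_state
    return True                        # D4: all time slices executed
```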
1.5 The "Q weight update module" is shown in Figure 6. Q weight update module flow:
1. Input the parameters.
2. For each input parameter, execute:
(1) compute the A value from the number of visits to (s, a);
(2) compute the R value through the backoff value R computing module;
(3) over all actions a, compute the corresponding MAX_Q value with the MAX_Q function;
(4) update the Q weight.
3. Loop until all time slices have been executed.
4. End.
In the Q weight update module, the input parameters are the current driving state S of the rear car, the selected automobile operation action a, and the next driving state Sn. Comparing the value of the resulting next driving state Sn with that of the current driving state S determines whether the next action selected is acceleration, deceleration, or constant speed. Once an action has been selected, it is executed. After the action is executed, the module checks whether the two cars have collided. If they have, the Q weight is not updated; because the Q weight is an incremental value, executing this action then earns no return. If they have not collided, the Q weight for this (s, a) is updated.
The A value is the reciprocal of the number of visits to (s, a); each time the Q weight is updated, the visit count of the corresponding (s, a) is incremented by 1.
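The update can be sketched as follows. The text gives the A value (reciprocal of the visit count), the R value, and MAX_Q, but this section does not state the update formula itself, so the sketch assumes the standard tabular Q-learning rule. The discount factor `gamma`, the class name `QTable`, and the choice to increment the visit count before computing A are all assumptions.

```python
class QTable:
    """Tabular Q weights with a visit-count learning rate A = 1 / visits."""

    def __init__(self, actions, gamma=0.9):
        self.actions = list(actions)
        self.gamma = gamma             # hypothetical discount factor
        self.q = {}                    # (state, action) -> Q weight
        self.visits = {}               # (state, action) -> visit count

    def max_q(self, state):
        """MAX_Q: largest Q weight over all actions in the given state."""
        return max(self.q.get((state, a), 0.0) for a in self.actions)

    def update(self, s, a, s_next, reward, collided=False):
        key = (s, a)
        if collided:                   # no update, hence no return, on collision
            return self.q.get(key, 0.0)
        self.visits[key] = self.visits.get(key, 0) + 1
        alpha = 1.0 / self.visits[key]             # the A value
        old = self.q.get(key, 0.0)
        target = reward + self.gamma * self.max_q(s_next)
        self.q[key] = old + alpha * (target - old)  # assumed Q-learning rule
        return self.q[key]
```

With a learning rate of 1/visits, the first visit to (s, a) adopts the target outright and later visits average it in with decreasing weight.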
1.6 The flow of the "backoff value R computing module" is shown in Figure 7.
Backoff value R computing module flow:
1. Input the parameters.
2. Check whether the distance equals the maximum distance (the rear car has been left behind) or equals 0 (a collision has occurred):
(1) if distance = maximum distance or distance = 0, the return value is 0;
(2) if the distance lies between the two, continue downward.
3. Check whether a continuous collision has occurred; if a collision has occurred:
(1) if the speed is 0, return the larger backoff value 3;
(2) if the speed is not 0, the return value is 0.
4. If no continuous collision has occurred, check whether the distance is continuously greater than the optimum distance:
(1) if it is, return x/y;
(2) if it is not, continue.
5. Check whether the distance is 0:
(1) if the distance is not 0, return y/x;
(2) if the distance is 0, return 0.
6. Loop over the above process until the time slices have been executed.
7. End.
Calculating the backoff value R (the return for an action) requires the current driving state S and the next driving state Sn. Here x is the distance in the current driving state S and y is the distance in the next driving state Sn. x = 0 and y = 0 indicates a continuous collision; if the speed is 0 at that point, the action is a good one, because a vehicle cannot keep moving forward once a collision has occurred, so the larger backoff value 3 is returned. In the algorithm, when the distance is greater than the optimum distance the return value is x/y, and when the distance is not 0 the return value is y/x. The last value that needs to be calculated in the Q weight update module is MAX_Q.
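The rules above can be sketched as one function. The text leaves several points open, so the following are assumptions rather than the patent's definition: the continuous-collision test is evaluated before the boundary test (otherwise the value 3 could never be returned), the boundary test is applied to the next distance y, "continuously greater" means both x and y exceed the optimum distance, and the final zero check guards the division by x. All names are hypothetical.

```python
def backoff_value(x, y, speed, max_dist, best_dist):
    """Return R for a transition from distance x (in S) to distance y (in Sn).

    speed is the rear-car speed after the action; max_dist is the distance at
    which the rear car counts as left behind; best_dist is the optimum distance.
    """
    # Continuous collision: stopping is rewarded, moving on is not.
    if x == 0 and y == 0:
        return 3.0 if speed == 0 else 0.0
    # Left behind or newly collided: no return.
    if y == max_dist or y == 0:
        return 0.0
    # Too far in both states: closing the gap (y < x) gives R > 1.
    if x > best_dist and y > best_dist:
        return x / y
    # Otherwise opening the gap (y > x) gives R > 1; guard the division.
    if x == 0:
        return 0.0
    return y / x
```

For example, with an optimum distance of 20, moving from 40 to 30 returns 40/30, and moving from 10 to 15 returns 1.5.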
1.7 The "automobile operation action executing module" is shown in Figure 8. Automobile operation action executing module flow:
1. Input the parameters.
2. Compute the travel distance d of the rear car and correct the value of d:
(1) if d > the left-behind distance, set d to the left-behind distance;
(2) if d < 0, set d to 0.
3. Compute the new speed newspeed from the action and correct it:
(1) if newspeed > the maximum speed, set newspeed to the maximum speed;
(2) if newspeed < 0, set newspeed to 0.
4. Compute the new distance dis between the two cars.
5. If the rear car has been left behind, set dis to the left-behind distance and set flag to 1.
6. Check whether a collision has occurred; if so, set dis = 0 and set flag to -1.
7. Compare dis with the optimum distance:
(1) if dis is greater than the optimum distance, set flag to 1;
(2) if dis is less than the optimum distance, set flag to -1;
(3) if dis equals the optimum distance, set flag to 0.
8. Assign the computed dis and newspeed to the next state Sn.
9. Loop over the above process until the time slices have been executed.
10. End.
In the automobile operation action executing module, the input parameters are the current driving state S and the selected automobile operation action a. In practice, data output is added in the middle of the computation, and out-of-bounds detection and an acceleration adjustment factor are added so that the training result converges quickly. With the acceleration (deceleration) adjustment factor, the acceleration changes linearly, steadily increasing or decreasing. The action for the next time slice is then no longer selected purely at random, but is selected on the basis of the previous driving state.
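The numbered steps of the executing module can be sketched as follows. The patent does not give the kinematic formulas in this section, so the sketch assumes a unit time slice with constant acceleration; the `State` type, the `front_travel` parameter (distance covered by the front car in the slice), and all other names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class State:
    dis: float     # distance between the two cars
    speed: float   # rear-car speed

def execute_action(s, accel, front_travel, max_dist, max_speed, best_dist):
    """Apply one action for one time slice; return (next state Sn, flag)."""
    # Step 2: rear-car travel distance, clamped to [0, max_dist].
    d = s.speed + 0.5 * accel          # assumed: unit time slice
    d = min(max(d, 0.0), max_dist)
    # Step 3: new speed, clamped to [0, max_speed].
    newspeed = min(max(s.speed + accel, 0.0), max_speed)
    # Step 4: new distance between the two cars.
    dis = s.dis + front_travel - d
    # Steps 5-7: boundary handling and flag for the next selection.
    if dis >= max_dist:                # left behind
        dis, flag = max_dist, 1
    elif dis <= 0:                     # collision: distance is set to 0
        dis, flag = 0.0, -1
    elif dis > best_dist:
        flag = 1
    elif dis < best_dist:
        flag = -1
    else:
        flag = 0
    # Step 8: the results become the next state Sn.
    return State(dis, newspeed), flag
```

The clamps correspond to the out-of-bounds detection mentioned above; the returned flag tells the next time slice whether to accelerate, decelerate, or hold speed.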
Because the whole process concerns two cars travelling on a single-lane, one-way road, the distance is set to 0 when a collision occurs. The computed distance may in fact be negative at that point, but the premise is that the rear car cannot overtake the front car. flag is a further automobile operation action selection factor introduced here.
1.8 "Automobile operation action selection module"
The input parameter of the automobile operation action selection module is the driving state S. The module consists of two main parts.
The first part computes, for driving state S, the maximum Q weight over all acceleration actions and the maximum Q weight over all deceleration actions. The action corresponding to each maximum Q weight must also be recorded. Several actions may share the maximum Q weight; each of them is recorded. This yields the maximum Q weight of the acceleration actions and the maximum Q weight of the deceleration actions.
The second part selects the action. The maximum Q weights for acceleration and deceleration are compared first. If the maximum Q weight of the acceleration actions is greater than that of the deceleration actions, Flag = 1 is set, and one acceleration action is selected with equal probability from the actions corresponding to that maximum Q weight and passed to the executing module. If the maximum Q weight of the acceleration actions is less than that of the deceleration actions, Flag = -1 is set, and one deceleration action is selected with equal probability from the actions corresponding to that maximum Q weight and passed to the executing module. If the two maximum Q weights are equal, Flag = 0 is set and the executing module performs a constant-speed action, keeping the same action as in the previous time slice.
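The two parts of the selection module can be sketched as below. The representation is an assumption: actions are signed numbers (positive for acceleration, negative for deceleration), `q_row` holds the Q weights of one driving state S, and 0 stands for the constant-speed action.

```python
import random

def select_action(q_row, rng):
    """Return (flag, action) for one driving state.

    q_row maps each signed action to its Q weight in state S.
    """
    accel = {a: v for a, v in q_row.items() if a > 0}
    decel = {a: v for a, v in q_row.items() if a < 0}
    # Part 1: maximum Q weight per group, keeping every action that attains it.
    best_acc = max(accel.values())
    best_dec = max(decel.values())
    acc_ties = [a for a, v in accel.items() if v == best_acc]
    dec_ties = [a for a, v in decel.items() if v == best_dec]
    # Part 2: compare the maxima and pick uniformly among the tied actions.
    if best_acc > best_dec:
        return 1, rng.choice(acc_ties)
    if best_acc < best_dec:
        return -1, rng.choice(dec_ties)
    return 0, 0                        # equal maxima: constant-speed action
```

For instance, `select_action({1: 0.5, 2: 0.9, -1: 0.3, -2: 0.1}, random.Random(0))` selects the acceleration action 2 with flag 1, because 0.9 is the largest weight and no other action ties with it.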
It should be understood that those of ordinary skill in the art may make improvements or modifications based on the above description, and all such improvements and modifications shall fall within the protection scope of the appended claims of the present invention.

Claims (2)

1. An automobile cooperative adaptive cruise control system with learning ability, characterized in that it comprises a training set and test set generation module, a training set complementary module, a training module, a learning algorithm module, a motoring condition collection module, an automobile operation behavior aggregate module, an inter-vehicle safety distance module, an evaluation module, a backoff value R computing module, a Q weight update module, an automobile operation action selection module, and an automobile operation action executing module;
the training set and test set generation module is configured to generate a suitable training set and test set from the acquired driving state information of the front car and the host vehicle;
the training set complementary module is configured to supplement the newly generated training set to obtain a better-optimized supplemented training set;
the training module is configured to perform training using the supplemented training set data;
the learning algorithm module calculates the n Q weights for all situations that may occur;
the motoring condition collection module and the inter-vehicle safety distance module select data that accurately express the current vehicle state and the inter-vehicle distance; a lookup-like operation on these data finds the corresponding Q weight, and a control action is selected from the automobile operation behavior aggregate module;
the evaluation module applies constraints according to the evaluation conditions and enters the backoff value R computing module to obtain the backoff value R, while the Q weight is updated in the Q weight update module; the updated Q weights replace the previously calculated Q weights and are used to initialize the generation of a new training set and test set; repeating this process yields increasingly better Q weights, and the test set can be used to check whether the Q weights are effective.
2. A control method of the automobile cooperative adaptive cruise control system with learning ability according to claim 1, characterized in that, after the system starts, the driver selects whether to enter learning mode:
(1) if learning mode is entered, a suitable training set and test set are generated by the training set and test set generation module from the acquired driving state information of the front car and the host vehicle; the newly generated training set is then supplemented by the training set complementary module to obtain a better-optimized supplemented training set; training is then performed by the training module using the supplemented training set data, and the learning algorithm module calculates the n Q weights for all situations that may occur; these Q weights can be understood as a two-dimensional table indexed by vehicle state and inter-vehicle distance, with the Q weights as the entries of the table; the motoring condition collection module and the inter-vehicle safety distance module then select data that accurately express the current vehicle state and the inter-vehicle distance, a lookup-like operation on these data finds the corresponding Q weight, and a control action is selected from the automobile operation behavior aggregate module; the evaluation module then applies constraints according to the evaluation conditions and enters the backoff value R computing module to obtain the backoff value R, while the Q weight is updated in the Q weight update module; the updated Q weights replace the previously calculated Q weights and are used to initialize the generation of a new training set and test set; repeating this process yields increasingly better Q weights, and the test set can be used to check whether the Q weights are effective;
(2) if learning mode is not entered, the existing automobile operation action selection module selects the optimum control action according to the acquired driving state information of the front car and the host vehicle; the automobile operation action executing module then executes the selected action, thereby completing the cooperative adaptive cruise control of the automobile.
Application CN201310439454.4A, priority date 2013-09-25, filing date 2013-09-25: Automobile cooperative adaptive cruise control system and method with learning ability. Status: Expired - Fee Related. Granted as CN103496368B (en).

Publications (2)

Publication Number Publication Date
CN103496368A true CN103496368A (en) 2014-01-08
CN103496368B CN103496368B (en) 2016-04-20

Family

ID=49861667

Country Status (1)

Country Link
CN (1) CN103496368B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100138100A1 (en) * 2008-12-03 2010-06-03 Electronics And Telecommunications Research Institute Cruise control system and method thereof
CN103153745A (en) * 2010-09-03 2013-06-12 丰田自动车株式会社 Drive control device of vehicle
CN103269926A (en) * 2010-12-06 2013-08-28 依维柯公司 Method for actuating the cruise control function in a vehicle equipped with hybrid driving, especially an industrial or commercial vehicle
CN103121449A (en) * 2011-11-18 2013-05-29 北汽福田汽车股份有限公司 Method and system for cruise control of electric automobile
WO2013095232A1 (en) * 2011-12-22 2013-06-27 Scania Cv Ab Method and module for controlling a vehicle's speed based on rules and/or costs
GB2498223A (en) * 2012-01-09 2013-07-10 Jaguar Cars Method and apparatus for determining the hand of traffic
CN103241241A (en) * 2012-02-13 2013-08-14 株式会社电装 Cruise control apparatus
CN103010213A (en) * 2012-12-28 2013-04-03 长城汽车股份有限公司 Control method and control system for vehicle cruise

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105109488A (en) * 2015-08-11 2015-12-02 奇瑞汽车股份有限公司 Intelligent car following system and method
CN106853827B (en) * 2015-12-08 2020-11-10 现代自动车株式会社 Method for joining a driving queue of a vehicle
CN106853827A (en) * 2015-12-08 2017-06-16 现代自动车株式会社 Method for adding the ride queues of vehicle
US10417358B2 (en) 2016-06-22 2019-09-17 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus of obtaining feature information of simulated agents
CN106202606A (en) * 2016-06-22 2016-12-07 百度在线网络技术(北京)有限公司 A kind of characteristic information acquisition methods simulating intelligent body and device
WO2018014697A1 (en) * 2016-07-19 2018-01-25 Huawei Technologies Co., Ltd. Adaptive passenger comfort enhancement in autonomous vehicles
CN109415062A (en) * 2016-07-19 2019-03-01 华为技术有限公司 Adaptive comfort of passenger enhancing in automatic driving vehicle
US10029698B2 (en) 2016-07-19 2018-07-24 Futurewei Technologies, Inc. Adaptive passenger comfort enhancement in autonomous vehicles
CN109415062B (en) * 2016-07-19 2020-08-14 华为技术有限公司 Adaptive passenger comfort enhancement in autonomous vehicles
CN106476806B (en) * 2016-10-26 2019-01-15 上海理工大学 Cooperating type self-adaption cruise system algorithm based on traffic information
CN106476806A (en) * 2016-10-26 2017-03-08 上海理工大学 Cooperating type self-adaption cruise system algorithm based on transport information
US10392001B2 (en) 2017-08-11 2019-08-27 Toyota Motor Engineering & Manufacturing North America, Inc. Efficient acceleration semi-autonomous feature
CN110386139A (en) * 2018-04-17 2019-10-29 上海汽车集团股份有限公司 Self-adapting cruise control method, processor and system
CN109523029A (en) * 2018-09-28 2019-03-26 清华大学深圳研究生院 For the adaptive double from driving depth deterministic policy Gradient Reinforcement Learning method of training smart body
CN109523029B (en) * 2018-09-28 2020-11-03 清华大学深圳研究生院 Self-adaptive double-self-driven depth certainty strategy gradient reinforcement learning method
CN109686086A (en) * 2018-12-24 2019-04-26 东软集团(北京)有限公司 The training of fuzzy control network generates the method and device that speed is suggested at crossing
CN110001654A (en) * 2019-05-06 2019-07-12 吉林大学 A kind of the intelligent vehicle longitudinal velocity tracking control system and control method of adaptive driver type
CN110001654B (en) * 2019-05-06 2023-07-28 吉林大学 Intelligent vehicle longitudinal speed tracking control system and control method for self-adaptive driver type
CN110696828A (en) * 2019-11-14 2020-01-17 驭势科技(北京)有限公司 Forward target selection method and device and vehicle-mounted equipment
CN110696828B (en) * 2019-11-14 2022-01-14 驭势科技(北京)有限公司 Forward target selection method and device and vehicle-mounted equipment
CN111038504A (en) * 2019-11-27 2020-04-21 苏州智加科技有限公司 Vehicle adaptive cruise control method, device, vehicle and storage medium
CN111038504B (en) * 2019-11-27 2021-11-02 苏州智加科技有限公司 Vehicle adaptive cruise control method, device, vehicle and storage medium
CN111038503A (en) * 2019-11-27 2020-04-21 苏州智加科技有限公司 Vehicle adaptive cruise control method, device, vehicle and storage medium
CN112255996A (en) * 2020-10-21 2021-01-22 长安大学 CACC stability test system and method based on whole vehicle in-loop
CN112255996B (en) * 2020-10-21 2021-12-28 长安大学 CACC stability test system and method based on whole vehicle in-loop
CN112109708A (en) * 2020-10-26 2020-12-22 吉林大学 Adaptive cruise control system considering driving behaviors and control method thereof
CN112109708B (en) * 2020-10-26 2023-07-14 吉林大学 Self-adaptive cruise control system considering driving behavior and control method thereof
CN115077936A (en) * 2022-06-16 2022-09-20 中国汽车工程研究院股份有限公司 Method for evaluating driving performance of vehicle adaptive cruise system

Also Published As

Publication number Publication date
CN103496368B (en) 2016-04-20


Legal Events

Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information
Inventor after: Zhang Jindong; Gao Zhenhai; Shen Muxi; Xue Yang; Liu Liu; Wu Xingchen
Inventor before: Zhang Jindong; Gao Zhenhai
COR Change of bibliographic data
Free format text: CORRECT: INVENTOR; FROM: ZHANG JINDONG GAO ZHENHAI TO: ZHANG JINDONG GAO ZHENHAI SHEN MUXI XUE YANG LIU LIU WU XINGCHEN
C14 Grant of patent or utility model
GR01 Patent grant
DD01 Delivery of document by public notice
Addressee: Jilin University
Document name: Notification to Pay the Fees
CF01 Termination of patent right due to non-payment of annual fee
Granted publication date: 20160420
Termination date: 20180925