CN103496368B - Cooperative adaptive cruise control system and method for automobiles with learning ability - Google Patents

Cooperative adaptive cruise control system and method for automobiles with learning ability

Info

Publication number
CN103496368B
CN103496368B (application CN201310439454.4A)
Authority
CN
China
Prior art keywords
module
weights
automobile
training
automobile operation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310439454.4A
Other languages
Chinese (zh)
Other versions
CN103496368A (en)
Inventor
张晋东
高振海
沈牧溪
薛杨
刘骝
吴星辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University
Priority to CN201310439454.4A
Publication of CN103496368A
Application granted
Publication of CN103496368B
Expired - Fee Related
Anticipated expiration

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00: Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
    • B60W30/14: Adaptive cruise control
    • B60W2520/00: Input parameters relating to overall vehicle dynamics
    • B60W2520/10: Longitudinal speed
    • B60W2554/00: Input parameters relating to objects
    • B60W2554/80: Spatial relation or speed relative to objects
    • B60W2554/801: Lateral distance
    • B60W2554/804: Relative longitudinal speed

Landscapes

  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Control Of Driving Devices And Active Controlling Of Vehicle (AREA)

Abstract

The invention discloses a cooperative adaptive cruise control system for automobiles with learning ability, comprising a training set and test set generation module, a training set supplement module, a training module, a learning algorithm module, a driving state set module, a vehicle operation action set module, an inter-vehicle safe distance module, an evaluation module, a reward value R computing module, a Q weight update module, a vehicle operation action selection module, and a vehicle operation action execution module. The invention also constructs a corresponding control method, applicable to single-lane, one-way traffic. In the present invention each vehicle is an intelligent agent that perceives the driving states of the other vehicles through inter-vehicle communication; because communication range, data volume, and channel capacity are limited, the environment each agent perceives is only part of the driving environment.

Description

Cooperative adaptive cruise control system and method for automobiles with learning ability
Technical field
The present invention relates to a cooperative adaptive cruise control system and method for automobiles with learning ability.
Background technology
As the number of automobiles used for daily personal travel grows, most cities face traffic problems, especially in densely populated areas. For most people living in these areas, encountering congestion during the commute has become part of daily life. Frequent switching between acceleration and braking while driving can cause traffic to jam, and slow drivers with a limited field of view aggravate congestion further. From an environmental perspective, unnecessary acceleration and braking increase pollution on the road. In addition, human driving errors are a principal cause of today's traffic problems.
With the development of science and technology, cooperative adaptive cruise control can share typical cooperative information between vehicles over inter-vehicle communication, allowing a platoon of vehicles travelling in the same direction to keep a shorter following distance. When no matching vehicle is on the road, it falls back to conventional cruise control; when a preceding vehicle is present, it uses radar to follow it adaptively.
Cooperative adaptive cruise control is of great significance for making driving intelligent and automated: driving safety, travel efficiency, and passenger comfort can all be greatly improved.
Unsal, Kachroo and Bay used multiple stochastic learning automata for longitudinal and lateral control of an automobile, but did not extend their method to the multi-agent setting.
Pendrith used a distributed Q-learning (DQL) method in which the relative velocities of surrounding vehicles are represented by a partial local view. DQL neither considers the control actions of surrounding vehicles nor updates the Q weights per agent; instead it averages over all agents within each time period, and the drawback of the method is its weak learning stability.
Emery-Montemerlo et al. studied cruise control with a Bayesian method, but their approach remains model-based.
Fulda and Ventura proposed the DJAP (dynamic joint action perception) algorithm for cruise control, in which each agent dynamically selects useful adjacent control actions during learning; however, it considers only the adjacent control actions while driving.
Summary of the invention
The technical problem to be solved by this invention is to provide a cooperative adaptive cruise control system and method with learning ability that addresses the deficiencies of the prior art.
The technical scheme of the present invention is as follows:
A cooperative adaptive cruise control system with learning ability comprises a training set and test set generation module, a training set supplement module, a training module, a learning algorithm module, a driving state set module, a vehicle operation action set module, an inter-vehicle safe distance module, an evaluation module, a reward value R computing module, a Q weight update module, a vehicle operation action selection module, and a vehicle operation action execution module.
The training set and test set generation module generates a suitable training set and test set from the acquired driving states of the preceding vehicle and the ego vehicle.
The training set supplement module supplements the newly generated training set to obtain a better-optimized supplemented training set.
The training module performs training using the supplemented training set data.
The learning algorithm module calculates the n Q weights for every situation that may arise.
The driving state set module and the inter-vehicle safe distance module select data that accurately express the current vehicle state and inter-vehicle distance; these data are used in a table-lookup-like operation to find the corresponding Q weights, and a control action is selected from the vehicle operation action set module.
The evaluation module applies its evaluation criteria and passes the result to the reward value R computing module to obtain the reward value R, while the Q weights are updated in the Q weight update module. The updated Q weights replace the previously calculated ones and are used to initialize and produce a new training set and test set. Repeating this process yields better-optimized Q weights, whose validity is checked against the test set.
In the control method of the cooperative adaptive cruise control system with learning ability, after the system starts the driver chooses whether to enter learning mode. (1) If learning mode is entered, the training set and test set generation module generates a suitable training set and test set from the acquired driving states of the preceding vehicle and the ego vehicle; the training set supplement module then supplements the newly generated training set to obtain a better-optimized supplemented training set. Training is then performed by the training module on the supplemented training set data, and the learning algorithm module calculates the n Q weights for every situation that may arise; these Q weights can be read as a two-dimensional table indexed by vehicle state and inter-vehicle distance, the Q weights being the entries of the table. The driving state set module and the inter-vehicle safe distance module then select data that accurately express the current vehicle state and inter-vehicle distance, these data are used in a table-lookup-like operation to find the corresponding Q weights, and a control action is selected from the vehicle operation action set module. The evaluation module then applies its evaluation criteria and passes the result to the reward value R computing module to obtain the reward value R, while the Q weights are updated in the Q weight update module. The updated Q weights replace the previously calculated ones and are used to initialize and produce a new training set and test set. Repeating this process yields better-optimized Q weights, whose validity is checked against the test set. (2) If learning mode is not entered, the vehicle operation action selection module selects the optimal control action from the existing Q weights according to the acquired driving states of the preceding vehicle and the ego vehicle; the vehicle operation action execution module then executes the selected action, completing the cooperative adaptive cruise control of the automobile.
The present invention constructs a cooperative adaptive cruise control method with learning ability that is applicable to single-lane, one-way traffic. In the present invention each vehicle is an intelligent agent that perceives the driving states of the other vehicles through inter-vehicle communication; because communication range, data volume, and channel capacity are limited, the environment each agent perceives is only part of the driving environment.
Brief description of the drawings
Fig. 1 Functional block diagram of the system;
Fig. 2 Main flowchart of the learning algorithm;
Fig. 3 Training set and test set generation module;
Fig. 4 Training set supplement module;
Fig. 5 Training module;
Fig. 6 Q weight update module;
Fig. 7 Flowchart of the reward value R computing module;
Fig. 8 Vehicle operation action execution module.
Detailed description of the invention
The present invention is described in detail below in conjunction with specific embodiments.
The present invention constructs a cooperative adaptive cruise control method with learning ability that is applicable to single-lane, one-way traffic. In the present invention each vehicle is an intelligent agent that perceives the driving states of the other vehicles through inter-vehicle communication; because communication range, data volume, and channel capacity are limited, the environment each agent perceives is only part of the driving environment.
Referring to Fig. 1, the cooperative adaptive cruise control system with learning ability of the present invention comprises a training set and test set generation module, a training set supplement module, a training module, a learning algorithm module, a driving state set module, a vehicle operation action set module, an inter-vehicle safe distance module, an evaluation module, a reward value R computing module, a Q weight update module, a vehicle operation action selection module, and a vehicle operation action execution module.
As shown in Fig. 1, after the system starts the driver chooses whether to enter learning mode. (1) If learning mode is entered, the training set and test set generation module generates a suitable training set and test set from the acquired driving states of the preceding vehicle and the ego vehicle (preceding-vehicle speed, ego-vehicle speed, inter-vehicle distance); the training set supplement module then supplements the newly generated training set to obtain a better-optimized supplemented training set. Training is then performed by the training module on the supplemented training set data, and the learning algorithm module calculates the n Q weights for every situation that may arise; these Q weights can be read as a two-dimensional table indexed by vehicle state and inter-vehicle distance, the Q weights being the entries of the table. The driving state set module and the inter-vehicle safe distance module then select data that accurately express the current vehicle state and inter-vehicle distance, these data are used in a table-lookup-like operation to find the corresponding Q weights, and a control action is selected from the vehicle operation action set module. The evaluation module then applies its evaluation criteria and passes the result to the reward value R computing module to obtain the reward value R, while the Q weights are updated in the Q weight update module. The updated Q weights replace the previously calculated ones and are used to initialize and produce a new training set and test set. Repeating this process yields better-optimized Q weights, whose validity is checked against the test set. (2) If learning mode is not entered, the vehicle operation action selection module selects the optimal control action from the existing Q weights according to the acquired driving states of the preceding vehicle and the ego vehicle (preceding-vehicle speed, ego-vehicle speed, inter-vehicle distance); the vehicle operation action execution module then executes the selected action, completing the cooperative adaptive cruise control of the automobile.
Through the learning algorithm, the "learning algorithm module" calculates the Q weight corresponding to each vehicle state and selects, via the vehicle operation action model, the optimal vehicle operation action for achieving the goal of cooperative adaptive cruise control. Each time the vehicle performs an operation action in its driving environment, reward or punishment information is provided to indicate the correctness of that action with respect to the resulting driving state.
The learning algorithm is as follows:
1) Initialize the Q value to an arbitrary Q weight function;
2) Initialize the driving state s;
3) Loop: each agent i outputs a vehicle operation action a (a is a specific action; A denotes the whole action set, so a can be regarded as an instance of A); apply the update Q(s, a) = (1 - α) Q(s, a) + α [r + γ max_{a' ∈ A} Q(s', a')]; observe to obtain r and s'; set the driving state s to the next driving state s'; repeat until the fragment ends;
4) The learning process ends.
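The loop in step 3 can be sketched in Python as ordinary tabular Q-learning. This is a minimal sketch, not the patent's implementation: the `step` and `reward` callables, the fixed-length fragments, and the fully random action choice are illustrative assumptions.

```python
# Hypothetical sketch of the learning loop described in steps 1-3 above.
# The environment (`step`, `reward`) and episode structure are assumptions.
import random

def q_learning(states, actions, step, reward, episodes=100, gamma=0.6):
    """Tabular Q-learning with a visit-count-based learning rate."""
    Q = {(s, a): 0.0 for s in states for a in actions}    # step 1: init Q
    visits = {(s, a): 1 for s in states for a in actions}
    for _ in range(episodes):
        s = random.choice(states)                         # step 2: init state s
        for _ in range(50):                               # step 3: loop over a fragment
            a = random.choice(actions)                    # agent outputs an action a
            s_next = step(s, a)
            r = reward(s, a, s_next)
            alpha = 1.0 / visits[(s, a)]                  # alpha = 1 / visit count
            best = max(Q[(s_next, b)] for b in actions)
            Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + gamma * best)
            visits[(s, a)] += 1
            s = s_next                                    # s <- s'
    return Q
```

With γ = 0.6 and a reward of 1 for reaching a "good" state, the learned Q values approach the discounted return r / (1 - γ).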
Some variables in the learning algorithm for cooperative adaptive cruise control are defined as follows:
"Vehicle operation action set": A = {a1, a2}. For each automobile, A is a finite set of vehicle operation actions, i.e. the automobile can take any control action in the set; a1 is an acceleration and a2 a deceleration. For example, the acceleration range is 0-2 m/s² (about 10 s to accelerate to 100 km/h; 0 means driving at constant speed), and the deceleration range is 0-10 m/s² (about 3 s to decelerate from 100 km/h to 0; 0 again means constant speed). In an actual computation only one of the acceleration and deceleration may be nonzero, i.e. at any moment the car can choose only one of the acceleration and deceleration control actions.
"Driving state set": S = {s1, s2}. For each automobile, S is a finite set of driving states, i.e. an automobile may be in any state in the set; s1 is the speed and s2 the distance to the preceding vehicle. For example, the speed range is 0-50 m/s (0-180 km/h), 0 meaning the automobile has stopped; the range of the distance to the preceding vehicle is 0-100 m, 0 meaning a collision and 100 meaning the vehicles are 100 m apart or the follower has been left behind beyond 100 m. As seen in the Q weight update module, each (s, a) pair corresponds to a Q weight that can be calculated from the driving state set and the vehicle operation action set; there are 51*101*3*11 = 169983 Q weights in total, and since only one of the acceleration and deceleration actions can be selected at a time, only 51*101*13 = 66963 of them are meaningful.
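The table sizes quoted above can be checked with a few lines of arithmetic; the discretization into 1 m/s speed steps, 1 m distance steps, and 1 m/s² acceleration and deceleration steps is inferred from the stated ranges.

```python
# Arithmetic check of the Q-table sizes quoted above, assuming 1-unit
# discretization of each stated range (an inference, not stated explicitly).
speeds = 51          # 0..50 m/s
distances = 101      # 0..100 m
accels = 3           # 0, 1, 2 m/s^2
decels = 11          # 0, 1, ..., 10 m/s^2

all_pairs = speeds * distances * accels * decels
# Only one of acceleration/deceleration may be nonzero at a time, so the
# meaningful action set has (accels - 1) + (decels - 1) + 1 = 13 entries.
meaningful_actions = (accels - 1) + (decels - 1) + 1
meaningful_pairs = speeds * distances * meaningful_actions
print(all_pairs, meaningful_pairs)  # 169983 66963
```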
P: S*A → Sn. For each automobile, P is the driving-state transition process: the current driving state S, after the vehicle operation action A is performed, becomes the state Sn.
"Reward value R calculation": S*A1*A2 → R. For each automobile, R is a feedback function: the reward value is determined by S and the vehicle operation action A, or equivalently by S and the next driving state Sn. A good Sn yields a larger reward value; a bad Sn yields a smaller reward value or 0.
The Q weight update function of the Q-learning algorithm is therefore:
Q(s, a) = (1 - α) Q(s, a) + α [r + γ max_{a' ∈ A} Q(s', a')]
Here Q(s, a) is the Q weight: its magnitude determines whether a vehicle operation action led to a good result, i.e. a good next driving state Sn. α is a probability; in the present invention it is tied to the record of visits to the pair (s, a), its value being the reciprocal of the number of times (s, a) has been visited. Thus the more visits, the more the previously obtained Q value is trusted, and the fewer visits, the more the update depends on the estimated Q value. r is the value of the feedback function R, and γ is a proportionality coefficient equal to 0.6.
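A minimal sketch of this update rule, with α taken as the reciprocal of the (s, a) visit count and γ = 0.6 as stated; the function name and dictionary layout are assumptions.

```python
# Hypothetical single-step implementation of the Q weight update above.
def update_q(Q, visits, s, a, r, s_next, actions, gamma=0.6):
    """Update Q[(s, a)] in place and return the new value."""
    alpha = 1.0 / visits[(s, a)]            # more visits -> trust old Q more
    max_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + gamma * max_next)
    visits[(s, a)] += 1
    return Q[(s, a)]
```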
In the algorithm, the MaxQ() function is an estimate determined by the next driving state Sn: among all possible pairs (Sn, a) formed by Sn and an action a, the pair with the largest Q weight is chosen and its value returned for use in the Q weight calculation.
In the algorithm, the current driving state is recorded and a vehicle operation action is executed, whereupon the current driving state changes to the next driving state; that state is then taken as the current state and recorded, another vehicle operation action is executed, and so on in a continuous loop, so that learning proceeds. During learning, the vehicle operation action is chosen by conditional random selection. If after executing an action the follower falls behind, it selects only acceleration actions, and the range of this random acceleration selection grows linearly as long as the follower keeps falling behind. If after executing an action the two cars collide, the follower selects only deceleration actions, and the range of this random deceleration selection grows linearly as long as the cars keep colliding. If after executing an action the distance between the two cars is greater than the optimal distance, the follower selects only acceleration actions, and the range of this random acceleration selection grows linearly as long as this situation persists. If after executing an action the distance is less than the optimal distance, the follower selects only deceleration actions, and the range of this random deceleration selection grows linearly as long as this situation persists. If after executing an action the distance equals the optimal distance, the follower selects a constant-speed operation action, i.e. acceleration and deceleration are 0. This lets the automobile learn faster and reach a good result sooner.
"Inter-vehicle safe distance": while driving, the ego vehicle should keep at least a safe distance from the preceding vehicle; the inter-vehicle safe distance model is:
L = V_r (2 V_c - V_r) / (2 a_max) + D
L is the distance the ego vehicle must keep from the preceding vehicle so that the two cars do not collide; the relative velocity is V_r = V_c - V_p, where V_c and V_p are the current speeds of the ego vehicle and the preceding vehicle; D = 2 m is a constant; a_max is the maximum braking deceleration of the ego vehicle.
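The safe-distance model can be written out directly; units are metres and metres per second, and the function name is an assumption.

```python
# Sketch of the inter-vehicle safe distance model above.
def safe_distance(v_ego, v_front, a_max, d_margin=2.0):
    """L = Vr * (2*Vc - Vr) / (2*a_max) + D, with Vr = Vc - Vp."""
    v_rel = v_ego - v_front
    return v_rel * (2 * v_ego - v_rel) / (2 * a_max) + d_margin
```

For example, at v_ego = 20 m/s, v_front = 10 m/s and a_max = 10 m/s², the model gives L = 10 * 30 / 20 + 2 = 17 m; at equal speeds it reduces to the 2 m margin.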
This dynamic optimal distance L enters the feedback function R: at every learning step the optimal inter-vehicle distance is calculated from the driving states of the preceding and following vehicles. Since the relative velocity of the two cars is taken into account, it must also be considered when calculating the reward value R. Let x be the inter-vehicle distance in state S and y the distance in state Sn. When x and y are both greater than the optimal distance, the reward value is x/y + Vr/L; when either or both are less than the optimal distance, the reward value is y/x - Vr/L.
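The reward rule above admits a direct sketch; the signature and argument names are assumptions.

```python
# Sketch of the reward value R described above: x is the gap in state S,
# y the gap in the next state Sn, L the optimal distance, v_rel = Vr.
def reward(x, y, L, v_rel):
    if x > L and y > L:          # both gaps above the optimal distance
        return x / y + v_rel / L
    return y / x - v_rel / L     # at least one gap below the optimal distance
```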
If the relative velocity of the two cars were added to the driving state set, the Q weights would become more accurate, but the number and dimensionality of the Q weights would grow. The relative velocity range would be -50 to 50 m/s, giving 51*101*101*3*11 = 17168283 Q weights; after removing the meaningless ones, 51*101*101*13 = 6763263 Q weights would still remain.
Each pair (s, a) has a corresponding Q weight; in the present invention the initial Q weight is 0 and the initial driving state S is variable. The Q weights are updated continuously during learning, finally yielding a Q weight array. During testing, these Q weights are used to infer which vehicle operation action the automobile would select in a given driving state.
"1.1 Evaluation module": whether the learning algorithm achieves a good result is judged in the present invention by several criteria: whether the automobile collides while driving, and whether it keeps a good distance. That is, when the gap between the two cars is large, selecting a larger acceleration to close quickly on the vehicle ahead is a good vehicle operation action; when the automobile is close to the preceding vehicle and in danger of collision, selecting a smaller acceleration, or braking, is a good vehicle operation action.
The learning algorithm continuously optimizes the Q weights so that a good vehicle operation action is obtained in every driving state.
The overall flow of the learning algorithm module is shown in Fig. 2:
A1, initialization: the state set S and the action set A are initialized with their respective initialization functions, and the Q weights are initialized to 0.
A2, generate the training set:
A21 if the mode is learning/training, train;
A22 if the mode is test, test;
A3, repeat A21 and A22 until the time slices are exhausted.
A4, end.
When the algorithm starts, initialization is performed: all Q weights are initialized to 0. The driving state set is initialized to speed = 0 m/s, distance = 10 m, relative velocity = 0 m/s; the vehicle operation action set is initialized to acceleration = 0 m/s², deceleration = 0 m/s². The visit count of every Q(s, a) is initialized to 1: since the visit count is used as a denominator in the computation, initializing it to 0 would cause an error, while initializing it to 1 does not affect the final training and test results.
While the algorithm runs, every step checks whether the time slices are exhausted; the training time slice count is the global constant 240000, and the test time slice count is the global constant 50000. The algorithm also checks whether it is training or testing: if training, it enters the training module; if testing, the test module.
"1.2 Training set and test set generation module", as shown in Fig. 3; module flow:
B1 input the training set generation parameters and execute the supplement part. For all input parameters, loop:
B11 execute the slow deceleration process, calculate the deceleration value and output the record;
B12 execute the fast deceleration process, calculate the deceleration value and output the record;
B13 execute the constant-speed process and output the record;
B14 execute the fast acceleration process, calculate the acceleration value and output the record;
B15 execute the slow acceleration process, calculate the acceleration value and output the record;
B2 repeat step B1 until all input parameters have been processed and the training set reaches the required size.
B3 end.
The training set and test set generation is divided into five segments; the acceleration and deceleration here are cubic functions of the time slice, cubic functions being used so that the curvatures at the junctions between segments are as equal as possible. Each acceleration and braking segment is allotted 10000 time slices and the constant-speed segment 5000. The preceding vehicle first decelerates slowly from the fast driving state of 50 m/s to 45 m/s, then decelerates quickly to 0 m/s, then accelerates quickly to 45 m/s, and finally accelerates slowly back to 50 m/s. Clearly this training set is not complete: it covers only a small fraction of driving situations. The training set is therefore supplemented by the training set supplement module, to obtain more Q weights and cover more driving conditions.
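The five-segment lead-vehicle speed profile described above can be sketched with a cubic easing function inside each segment so that the speed curve stays smooth at the joints. This is a sketch under stated assumptions: the exact cubic, the placement of the constant-speed segment, and all names are illustrative, not taken from the patent.

```python
# Hypothetical sketch of the five-segment lead-vehicle profile above:
# 10000 time slices per acceleration/braking segment, 5000 at constant speed.
def cubic_ramp(v0, v1, n):
    """Speed samples easing from v0 to v1 over n time slices (cubic in t)."""
    out = []
    for i in range(n):
        t = i / (n - 1)
        out.append(v0 + (v1 - v0) * (3 * t * t - 2 * t ** 3))
    return out

def lead_profile():
    return ([50.0] * 5000                # constant-speed segment (placement assumed)
            + cubic_ramp(50, 45, 10000)  # slow deceleration
            + cubic_ramp(45, 0, 10000)   # fast deceleration
            + cubic_ramp(0, 45, 10000)   # fast acceleration
            + cubic_ramp(45, 50, 10000)) # slow acceleration
```

One pass yields 45000 samples, consistent with the stated scale of 10 producing 450000 data items in total.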
The required training set scale is set to 10, producing 450000 data items in total. Since the data-generating function is fixed, an excessive scale would be pointless: during training it would only be accessed repeatedly without increasing the coverage of the Q weights.
The test set is generated in the same way as the training set: it is extracted from the training set, and the extracted part is not used for training. In other words, one part of the generated set is used for training and another for testing: 240000 items for training and 50000 for testing.
1.3 The flow of the "training set supplement module" is shown in Fig. 4; module flow:
C1, input the parameter x and execute for all parameters x:
C11 the acceleration function of x;
C12 the constant-speed function;
C13 the deceleration function of x;
C14 the constant-speed function;
C2 if x has not reached the acceleration extreme value, increment x by 1 and repeat step C1.
C3 if x has reached the acceleration extreme value, execute for x:
C31 the random acceleration function of x;
C32 the constant-speed function;
C33 the deceleration function of x;
C34 the constant-speed function;
C4 if x has not reached the deceleration extreme value, increment x by 1 and repeat step C3.
C5 if x has reached the deceleration extreme value, randomly choose one of every 6 time slices to execute the deceleration function and execute random acceleration in the other 5.
C6, end.
In the training set supplement module, the time slices allotted to the acceleration and braking segments depend on the specific acceleration or deceleration, and the constant-speed segment lasts 15 time slices. In the random vehicle-operation-action part of the supplement, the initial speed value is half of the fast speed. Because the ratio of the acceleration extreme to the deceleration extreme is 1:5, executing one deceleration step every 6 time slices lets the vehicle speed vary randomly over the whole speed range without being biased toward an extreme.
1.4 The flow of the training module is shown in Figure 5:
D1. Input the parameters; according to the value of the flag bit, select acceleration, deceleration, or uniform motion.
D2. Execute the function of this action.
D3. Check whether a collision occurs:
D31. if a collision occurs, the module terminates;
D32. if no collision occurs, update the current Q weights through the Q weight update module.
D4. Loop until all time slices are exhausted.
The parameters of the training module are the automobile operation action set A (here, of the following vehicle), the current motoring condition S of the following vehicle, and its motoring condition Sn in the next time period. The concrete computation of Sn is determined by the automobile operation action module; before that computation, S and Sn are identical. The initial value of flag is 1: flag=1 means one acceleration automobile operation action is selected at random, flag=-1 means one deceleration automobile operation action is selected at random, and flag=0 means the uniform-speed automobile operation action is selected, keeping the speed of the current motoring condition S.
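A minimal sketch of the D1–D4 loop, assuming the executing and Q-update modules are supplied as callbacks; `execute`, `collided`, and `update_q` are illustrative names, not from the patent:

```python
import random

def run_training_episode(actions_accel, actions_decel, s0, n_slices,
                         execute, collided, update_q):
    """Sketch of the training module (steps D1-D4). The callbacks
    stand in for the automobile operation action executing module
    and the Q weight update module."""
    s, flag = s0, 1  # the flag bit starts at 1 (choose an acceleration action)
    for _ in range(n_slices):
        # D1: pick an action class from the flag bit
        if flag == 1:
            a = random.choice(actions_accel)
        elif flag == -1:
            a = random.choice(actions_decel)
        else:
            a = "uniform"  # keep the speed of the current motoring condition S
        # D2: execute the action, producing the next state and a new flag
        sn, flag = execute(s, a)
        # D31: on collision the module ends without updating the Q weights
        if collided(sn):
            return s
        # D32: otherwise update the Q weights for this transition
        update_q(s, a, sn)
        s = sn
    return s
```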
1.5 The flow of the Q weight update module is shown in Figure 6:
1. Input the parameters.
2. For each input parameter, perform:
(1) compute the value A from the number of times (s, a) has been accessed;
(2) compute the value R through the backoff value R computing module;
(3) for all actions a, compute the corresponding MAX_Q value through the MAX_Q function;
(4) update the Q weights.
3. Loop until all time slices are exhausted.
4. End.
In the Q weight update module, the input parameters are the current motoring condition S of the following vehicle, the selected automobile operation action a, and the next motoring condition Sn. Comparing the value of the next motoring condition Sn with that of the current motoring condition S determines whether the selected automobile operation action is acceleration, deceleration, or uniform speed. Once selected, the action is executed. After the automobile operation action has been executed, the system detects whether the two vehicles have collided. If they have, the Q weights are not updated; since a Q weight is an incremental value, this guarantees that executing this automobile operation action earns no return. If no collision occurs, the Q weight of this (s, a) is updated.
The value A is the reciprocal of the number of times (s, a) has been accessed; each time the Q weights are updated, the access count of the corresponding (s, a) is incremented by 1.
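Read as a table-based Q-learning update with learning rate A = 1/visits(s, a), the module can be sketched as below. The discount factor `gamma` is an assumption, since the text names only A, R, and MAX_Q:

```python
from collections import defaultdict

class QTable:
    """Minimal sketch of the Q weight update module. The update rule
    is standard incremental Q-learning; gamma is an assumption."""
    def __init__(self, gamma=0.9):
        self.q = defaultdict(float)     # Q weights, keyed by (s, a)
        self.visits = defaultdict(int)  # access counts for (s, a)
        self.gamma = gamma

    def max_q(self, s, actions):
        # MAX_Q: best Q weight over all actions a in state s
        return max(self.q[(s, a)] for a in actions)

    def update(self, s, a, sn, r, actions):
        # A is the reciprocal of the number of times (s, a) was accessed
        self.visits[(s, a)] += 1
        alpha = 1.0 / self.visits[(s, a)]
        # R comes from the backoff value R computing module
        target = r + self.gamma * self.max_q(sn, actions)
        self.q[(s, a)] += alpha * (target - self.q[(s, a)])
```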
The flow process of 1.6 " backoff values R computing modules " is as shown in Figure 7:
Backoff values R computing module flow process:
1. Input the parameters.
2. Detect whether the distance equals the maximum distance (the leading vehicle has been cast off) or equals 0 (a collision has occurred):
(1) if distance = maximum distance, or distance = 0, the return value is 0;
(2) if the distance lies between the two, continue downwards.
3. Detect whether a continuous collision has occurred; if a collision has occurred:
(1) if the speed is 0, return the larger backoff value 3;
(2) if the speed is not 0, the return value is 0.
4. If no continuous collision has occurred, detect whether the distance is continuously greater than the optimum distance:
(1) if it is continuously greater, return x/y;
(2) if not, continue.
5. Detect whether the distance is 0:
(1) if the distance is not 0, return y/x;
(2) if the distance equals 0, return 0.
6. Loop until the time slices are exhausted.
7. End.
When computing the backoff value R, the required parameters are the current motoring condition S and the next motoring condition Sn. Here x is the distance of the current motoring condition S, and y is the distance of the motoring condition Sn. x=0 and y=0 means a continuous collision; if the speed is 0 at that point, the action is a good automobile operation action, because the vehicle should not keep moving once a collision has occurred, so the larger backoff value 3 is returned. In the algorithm, the return value is x/y when the distance is greater than the optimum distance, and y/x when the distance is not 0. The last quantity to compute in the Q weight update module is the MAX_Q value.
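One consistent reading of steps 2–5 together with the remarks above is the following reward function. The ordering of the checks and the interpretation of "continuous collision" as x = 0 and y = 0 are assumptions where the flow in Figure 7 is ambiguous:

```python
def backoff_r(x, y, speed, max_dist, opt_dist):
    """Sketch of the backoff value R computing module. x and y are the
    distances of the current state S and the next state Sn."""
    # Continuous collision: both the current and the next distance are 0
    if x == 0 and y == 0:
        # Standing still while collided is a good action: larger backoff 3
        return 3.0 if speed == 0 else 0.0
    # Cast off (maximum distance) or newly collided: no return
    if y >= max_dist or y == 0:
        return 0.0
    # Continuously above the optimum distance: reward closing in (x/y)
    if x > opt_dist and y > opt_dist:
        return x / y
    # Guard against dividing by a zero current distance
    if x == 0:
        return 0.0
    # Otherwise reward opening the gap (y/x)
    return y / x
```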
1.7 The flow of the automobile operation action executing module is shown in Figure 8:
1. Input the parameters.
2. Compute the operating distance d of the following vehicle and correct the value of d:
(1) if d > the cast-off distance, rewrite d as the cast-off distance;
(2) if d < 0, rewrite d as 0.
3. Compute the new speed newspeed according to the action, and correct it:
(1) if newspeed > the maximum speed, rewrite newspeed as the maximum speed;
(2) if newspeed < 0, rewrite newspeed as 0.
4. Compute the new distance dis between the two vehicles.
5. If the leading vehicle has been cast off, set dis to the cast-off distance and set flag to 1.
6. Detect whether a collision occurs; if so, set dis to 0 and set flag to -1.
7. Compare dis with the optimum distance:
(1) if dis is greater than the optimum distance, set flag to 1;
(2) if dis is less than the optimum distance, set flag to -1;
(3) if dis equals the optimum distance, set flag to 0.
8. Assign the computed dis and newspeed to the next state Sn.
9. Loop until the time slices are exhausted.
10. End.
In the automobile operation action executing module, the input parameters are the current motoring condition S and the selected automobile operation action a. In practice the data output has been merged into the computation; adding boundary detection and an acceleration adjustment factor makes the training result converge quickly. With the acceleration (deceleration) adjustment factor, the acceleration changes linearly, increasing or decreasing continuously; when the next time slice selects an automobile operation action, the choice is then no longer purely random but is made on the basis of the previous motoring condition.
Because the whole process concerns the motoring conditions of two vehicles in a single one-way lane, the distance is set to 0 when a collision occurs. The distance could in principle be negative, but the premise is that the following vehicle cannot overtake the leading vehicle. flag is a further factor introduced for automobile operation action selection.
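The steps above can be sketched as a one-time-slice state transition. Modelling the state S as (dis, speed), treating the action as a signed acceleration, and passing the leading vehicle's per-slice travel `lead_d` are assumptions not fixed by the patent:

```python
def execute_action(s, a, lead_d, max_dist, max_speed, opt_dist):
    """Sketch of the automobile operation action executing module
    (Figure 8) over one unit time slice."""
    dis, speed = s
    # Step 2: distance covered by the following car, by average-speed
    # kinematics (an assumption), clamped to [0, cast-off distance]
    d = min(max(speed + a / 2.0, 0.0), max_dist)
    # Step 3: new speed, clamped to [0, maximum speed]
    newspeed = min(max(speed + a, 0.0), max_speed)
    # Step 4: new inter-vehicle distance
    new_dis = dis + lead_d - d
    # Steps 5-6: cast-off and collision boundaries
    if new_dis >= max_dist:
        new_dis, flag = max_dist, 1
    elif new_dis <= 0:
        new_dis, flag = 0.0, -1
    # Step 7: compare with the optimum distance
    elif new_dis > opt_dist:
        flag = 1
    elif new_dis < opt_dist:
        flag = -1
    else:
        flag = 0
    # Step 8: the computed dis and newspeed form the next state Sn
    return (new_dis, newspeed), flag
```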
1.8 The automobile operation action selection module
The input parameter of the automobile operation action selection module is the motoring condition S, and the module consists mainly of two parts:
The first part computes, under motoring condition S, the maximum Q weight over all acceleration automobile operation actions and the maximum Q weight over all deceleration automobile operation actions. Every automobile operation action corresponding to a maximum Q weight is recorded here; a maximum Q weight may correspond to several actions, which are recorded one by one. This yields the maximum Q weight of the acceleration actions and the maximum Q weight of the deceleration actions.
The second part selects the automobile operation action. The maximum Q weights of acceleration and deceleration are compared first. If the maximum Q weight of the acceleration automobile operation actions is greater than that of the deceleration actions, Flag is set to 1, one acceleration automobile operation action is selected with equal probability from the actions corresponding to this maximum Q weight, and it is supplied to the execution module. If the maximum Q weight of the acceleration actions is less than that of the deceleration actions, Flag is set to -1, one deceleration automobile operation action is selected with equal probability from the actions corresponding to this maximum Q weight, and it is supplied to the execution module. If the two maximum Q weights are equal, Flag is set to 0 and the execution module performs a uniform-speed automobile operation action, i.e., keeps the same action as in the previous time slice.
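Assuming the Q weights are held in a dict keyed by (S, action), the two parts can be sketched as:

```python
import random

def select_action(s, q, accel_actions, decel_actions):
    """Sketch of the automobile operation action selection module.
    q maps (s, a) to a Q weight; ties inside a class are broken
    with equal probability, as the text specifies."""
    def best(actions):
        # Part one: maximum Q weight and every action that attains it
        vals = [q.get((s, a), 0.0) for a in actions]
        m = max(vals)
        return m, [a for a, v in zip(actions, vals) if v == m]
    acc_max, acc_best = best(accel_actions)
    dec_max, dec_best = best(decel_actions)
    # Part two: compare the two maxima and pick equiprobably
    if acc_max > dec_max:
        return 1, random.choice(acc_best)   # Flag = 1: accelerate
    if acc_max < dec_max:
        return -1, random.choice(dec_best)  # Flag = -1: decelerate
    return 0, "uniform"                     # Flag = 0: keep current speed
```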
It should be understood that those of ordinary skill in the art can make improvements or variations in light of the above description, and all such improvements and variations shall fall within the protection scope of the appended claims of the present invention.

Claims (2)

1. An automobile cooperative adaptive cruise control system with learning ability, characterized by comprising a training set and test set generation module, a training set complementary module, a training module, a learning algorithm module, a motoring condition collection module, an automobile operation action set module, an inter-vehicle safety distance module, an evaluation module, a backoff value R computing module, a Q weight update module, an automobile operation action selection module, and an automobile operation action executing module;
the training set and test set generation module generates matching training and test sets according to the obtained motoring condition information of the leading vehicle and the present vehicle;
the training set complementary module supplements the newly generated training set to obtain a more optimized supplemented training set;
the training module performs training using the supplemented training set data;
the learning algorithm module computes n Q weights covering every situation that may arise;
the motoring condition collection module and the inter-vehicle safety distance module select data that accurately express the current vehicle condition and the inter-vehicle distance; these data are used for table lookup to find the Q weights corresponding to them, and a control action is selected from the automobile operation action set module;
the evaluation module, constrained by the evaluation conditions, enters the backoff value R computing module to obtain the backoff value R, while the Q weights are updated in the Q weight update module; the previously computed Q weights are replaced with the updated Q weights, which are used as initialization to produce new training and test sets; repeating this process yields more optimal Q weights, and the test set checks whether the Q weights are effective.
2. A control method of the automobile cooperative adaptive cruise control system with learning ability according to claim 1, characterized in that, after the system starts, the driver chooses whether to enter the learning mode: (1) if the learning mode is entered, the training set and test set generation module generates matching training and test sets according to the obtained motoring condition information of the leading vehicle and the present vehicle; the training set complementary module then supplements the newly generated training set to obtain a more optimized supplemented training set; the training module then performs training using the supplemented training set data, and the learning algorithm module computes n Q weights covering every situation that may arise; these Q weights can be interpreted as a two-dimensional table defined by vehicle state and inter-vehicle distance, the Q weights being the data in the table; the motoring condition collection module and the inter-vehicle safety distance module then select data that accurately express the current vehicle condition and the inter-vehicle distance, these data are used for table lookup to find the Q weights corresponding to them, and a control action is selected from the automobile operation action set module; the evaluation module, constrained by the evaluation conditions, then enters the backoff value R computing module to obtain the backoff value R, while the Q weights are updated in the Q weight update module; the previously computed Q weights are replaced with the updated Q weights, which are used as initialization to produce new training and test sets; repeating this process yields more optimal Q weights, and the test set checks whether the Q weights are effective; (2) if the learning mode is not entered, the optimal control action is selected in the existing automobile operation action selection module according to the obtained motoring condition information of the leading vehicle and the present vehicle; the selected action is then executed by the automobile operation action executing module, thereby completing the cooperative adaptive cruise control of the automobile.
CN201310439454.4A 2013-09-25 2013-09-25 There is Automobile cooperative type self-adaptive adaptive cruise control system and the method for learning ability Expired - Fee Related CN103496368B (en)


Publications (2)

Publication Number Publication Date
CN103496368A CN103496368A (en) 2014-01-08
CN103496368B true CN103496368B (en) 2016-04-20







Legal Events

C06 / PB01: Publication
SE01: Entry into force of request for substantive examination
C53 / CB03: Correction of patent for invention; change of inventor or designer information
Inventor after: Zhang Jindong, Gao Zhenhai, Shen Muxi, Xue Yang, Liu Liu, Wu Xingchen
Inventor before: Zhang Jindong, Gao Zhenhai
COR: Change of bibliographic data (CORRECT: INVENTOR; FROM: ZHANG JINDONG GAO ZHENHAI TO: ZHANG JINDONG GAO ZHENHAI SHEN MUXI XUE YANG LIU LIU WU XINGCHEN)
C14 / GR01: Grant of patent or utility model
DD01: Delivery of document by public notice (Addressee: Jilin University; Document name: Notification to Pay the Fees)
CF01: Termination of patent right due to non-payment of annual fee (Granted publication date: 20160420; Termination date: 20180925)