CN115755606A

CN115755606A - Carrier controller automatic optimization method, medium and equipment based on Bayesian optimization

Info

Publication number: CN115755606A
Application number: CN202211433936.4A
Authority: CN
Inventors: 苏杰; 牟剑秋; 许正昊; 李晓芸
Original assignee: Shanghai Youdao Zhitu Technology Co Ltd
Current assignee: Shanghai Youdao Zhitu Technology Co Ltd
Priority date: 2022-11-16
Filing date: 2022-11-16
Publication date: 2023-03-07
Anticipated expiration: 2042-11-16
Also published as: CN115755606B

Abstract

The invention discloses a method, medium, and device for automatically optimizing a vehicle automatic driving controller based on Bayesian optimization. Bayesian optimization is used to tune the performance of the vehicle automatic driving controller automatically, replacing the tedious and inefficient manual and grid-search parameter tuning used previously, which gives the method clear practical value. In addition, batch parallelization is used to improve the analytic acquisition function of Bayesian optimization, which raises the efficiency of optimizing the controller's performance.

Description

Carrier controller automatic optimization method, medium and equipment based on Bayesian optimization

Technical Field

The invention belongs to the technical field of automatic driving for intelligent vehicles, and in particular relates to a method, medium, and device for automatically optimizing a vehicle automatic driving controller based on Bayesian optimization.

Background

In recent years, as vehicles have rapidly become more intelligent, automatic driving technology has developed vigorously. A controller running a control algorithm is one of the essential modules of an automatic driving vehicle system; it controls the vehicle so that it tracks a reference trajectory. In the past, control algorithm parameters were tuned manually or by grid search. These methods are inefficient, cover only a limited part of the parameter space, and cannot approach the optimal control performance.

To optimize control algorithm performance and make the optimization process more efficient, researchers have explored several approaches. Marco et al., in "Automatic LQR Tuning Based on Gaussian Process Global Optimization" (IEEE International Conference on Robotics and Automation, 2016), propose an automatic LQR controller tuning method based on Gaussian process Bayesian optimization, which uses entropy search as the acquisition function and searches for the optimal parameter set of the LQR controller automatically and efficiently. Su, Jie et al., in "Autonomous vehicle control through the dynamics and controller learning" (IEEE Transactions on Vehicular Technology, 2018), further study LQR controller performance optimization with Gaussian process Bayesian optimization; to handle the time-varying characteristics of system operation, they design a time-varying lower confidence bound as the acquisition function, which suits time-varying vehicle scenarios better. Riboni, A. et al., in "Bayesian optimization and deep learning for steering wheel angle prediction" (Scientific Reports, 2022), design a controller with an LSTM backbone network for steering control of autonomous vehicles and use Bayesian optimization to search its parameters automatically.

These results improve the efficiency of controller performance optimization to some extent, but they still have limitations. All of them consider a single-process, sequential decision setting, so a Bayesian optimization procedure designed around an analytic acquisition function cannot be parallelized. The deep learning control algorithm of Riboni, A. et al. has many parameters to tune, which makes the parameter space for Bayesian optimization very high-dimensional. In addition, the value space of some control parameters is compact and continuous, so the number of parameter sets to search grows sharply, which further challenges the efficiency of the search.

Disclosure of Invention

In view of the above problems, the present invention provides a method, medium, and device for automatically optimizing a vehicle automatic driving controller based on Bayesian optimization. It uses a multi-batch parallelized expected improvement function as the acquisition function for Bayesian optimization, providing a better solution for the optimization search of automatic driving control algorithms.

In order to achieve the purpose, the invention adopts the following technical scheme:

A method for optimizing an automatic driving vehicle controller based on Bayesian optimization comprises the following steps:

S1: initializing a sampled data set;

S2: modeling the data set with a surrogate model; establishing a Bayesian optimization acquisition function, and then looping over the following steps:

S21: obtaining the mean and variance of the posterior distribution by surrogate-model regression;

S22: obtaining a parameter set to be evaluated from the Bayesian optimization acquisition function;

S23: testing the obtained parameter set on the vehicle, and adding the resulting data to the sampled data set;

S3: when the parameter set to be evaluated meets the termination condition, exiting the loop of S2 and taking the result as the parameters required by the controller.

As a further description of the invention, S1: modeling the performance index of the controller to be evaluated to obtain an evaluation function; selecting n feasible combinations as the parameter sets to be evaluated according to the reachable domain of the parameters to be optimized; running a controller performance experiment on the vehicle with each parameter set; and collecting the effect-index data set;

S2: taking the set X = {θ_1, …, θ_n} of parameter sets evaluated in S1 as input and the corresponding set of effect indices as output, and modeling the effect-index data set with a surrogate model; where θ_i, i ∈ [1, n], denotes a parameter set that has been evaluated and the corresponding output denotes its controller performance value; establishing a Bayesian optimization acquisition function, and then looping over the following steps:

S21: obtaining the posterior mean and variance of the effect-index data set by surrogate-model regression;

S22: substituting the posterior mean function and variance function obtained in S21 into the Bayesian optimization acquisition function to obtain the recommended solution θ_{n+1} predicted by the acquisition function;

S23: running a controller performance experiment on the vehicle with the parameter set given by the recommended solution from S22, collecting the effect index, and adding this data to the existing data set;

S3: when the difference between the controller effect index and the ideal index is smaller than a set threshold, or the difference between the posterior mean and its set value is smaller than the set threshold, exiting the loop; the recommended solution obtained is the parameter set required by the controller.
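The control flow of S1 to S3 can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `optimize_controller`, the callables `evaluate`, `fit_surrogate`, and `propose`, the `max_rounds` cap, and the choice to return the best observed point are all hypothetical, and only the first termination test (effect index close to the ideal index) is shown. The toy stand-ins at the bottom are placeholders, not a Gaussian process or an acquisition function.

```python
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, ...]


def optimize_controller(
    initial_params: Sequence[Params],
    evaluate: Callable[[Params], float],
    fit_surrogate: Callable[[List[Params], List[float]], object],
    propose: Callable[[object, List[Params], List[float]], Params],
    ideal_index: float,
    threshold: float,
    max_rounds: int = 50,  # assumption: a safety cap, not stated in the text
) -> Tuple[Params, float]:
    # S1: evaluate the initial parameter sets to build the sampled data set.
    X = list(initial_params)
    Y = [evaluate(theta) for theta in X]
    for _ in range(max_rounds):
        # S3: stop once the best index is within the threshold of the ideal.
        if abs(min(Y) - ideal_index) < threshold:
            break
        model = fit_surrogate(X, Y)        # S21: surrogate regression
        theta_next = propose(model, X, Y)  # S22: acquisition recommendation
        X.append(theta_next)               # S23: experiment and augment data
        Y.append(evaluate(theta_next))
    best = min(range(len(Y)), key=Y.__getitem__)
    return X[best], Y[best]


# Toy stand-ins so the loop can run; they are NOT a GP or an acquisition rule.
def toy_fit(X: List[Params], Y: List[float]) -> object:
    return None


def toy_propose(model: object, X: List[Params], Y: List[float]) -> Params:
    # Placeholder rule: midpoint of the two best points seen so far.
    order = sorted(range(len(Y)), key=Y.__getitem__)
    a, b = X[order[0]], X[order[1]]
    return tuple((u + v) / 2 for u, v in zip(a, b))


def toy_cost(theta: Params) -> float:
    return (theta[0] - 2.0) ** 2
```

In a real setup `fit_surrogate` and `propose` would wrap a Gaussian process regression and the batch acquisition function described in this document; the loop structure stays the same.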

As a further description of the present invention, the evaluation function for the vehicle system control performance index in S1 is modeled in terms of the following quantities: the control performance obtained when the parameter set θ is deployed in the system, which is a weighted fusion of control accuracy, vehicle safety, and control cost; Gaussian-distributed noise with a given variance; and the corresponding noisy evaluation measured after the parameter set θ is deployed in the system.
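One common way to write such a noisy evaluation model is shown below as a sketch. The symbols J, Ĵ, ε, and σ are assumptions introduced here for illustration, as is the zero mean of the noise; the source text does not preserve its own notation.

```latex
\hat{J}(\theta) = J(\theta) + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}\!\left(0, \sigma^{2}\right)
```

Here J(θ) stands for the control performance of parameter set θ and Ĵ(θ) for the noisy value actually measured in an experiment.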

As a further description of the present invention, the reachable domain of the parameters to be optimized in S1 is a mixed space comprising a discrete space and a continuous space: some of the parameters to be optimized belong to the discrete space and the others belong to the continuous space.
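A mixed space of this kind can be represented by tagging each parameter as continuous or discrete. The sketch below draws random candidate parameter sets from such a space; the parameter names `q_weight` and `horizon`, their domains, and the function `sample_mixed` are hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple, Union

Domain = Union[Tuple[float, float], Sequence[float]]
Space = Dict[str, Tuple[str, Domain]]

# Hypothetical mixed space: one continuous weight, one discrete setting.
example_space: Space = {
    "q_weight": ("continuous", (0.0001, 100.0)),
    "horizon": ("discrete", [5, 10, 20]),
}


def sample_mixed(space: Space, n: int, seed: int = 0) -> List[Dict[str, float]]:
    """Draw n candidate parameter sets from a mixed discrete/continuous space."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        theta = {}
        for name, (kind, domain) in space.items():
            if kind == "continuous":
                lo, hi = domain
                theta[name] = rng.uniform(lo, hi)  # any value in [lo, hi]
            else:
                theta[name] = rng.choice(list(domain))  # one of the listed values
        samples.append(theta)
    return samples
```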

As a further description of the present invention, the surrogate model in S2 is a Gaussian process, which is completely described by a mean function μ(X) and a covariance function K(X, X);

the mean function μ(X) is defined in terms of Ψ(X), a polynomial function of order p, where α_p denotes the coefficient of each order and C is a constant;

the covariance function K(X, X) is built from the kernel function k(θ_i, θ_j), i ∈ [1, n], j ∈ [1, n], in which a diagonal matrix holds the length-scale hyperparameters and λ_i, i ∈ [1, n], denotes the length scale corresponding to the parameter θ_i;

the objective function of the surrogate-model regression is the logarithm of the marginal likelihood, expressed in terms of the likelihood distribution and the set of all hyperparameters of the Gaussian process.
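For orientation, a widely used textbook form of a squared-exponential kernel with a diagonal length-scale matrix, and of the Gaussian process log marginal likelihood, is sketched below. These are assumed standard forms, not the expressions of the source, which are not preserved in this text; the symbols σ_f, Λ, σ, and ω are introduced here for illustration.

```latex
k(\theta_i, \theta_j) = \sigma_f^{2}
  \exp\!\left(-\tfrac{1}{2}\,(\theta_i - \theta_j)^{\top} \Lambda^{-1} (\theta_i - \theta_j)\right),
\qquad \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \dots)

\log p(Y \mid X, \omega) =
  -\tfrac{1}{2}\,(Y - \mu(X))^{\top} \left[K(X, X) + \sigma^{2} I\right]^{-1} (Y - \mu(X))
  -\tfrac{1}{2}\,\log\left|K(X, X) + \sigma^{2} I\right|
  -\tfrac{n}{2}\,\log 2\pi
```

Here ω collects the hyperparameters (for example σ_f, the λ values, and σ), and fitting the surrogate means choosing ω to make this quantity as large as possible.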

As a further description of the present invention, the acquisition function of the Bayesian optimization is modeled with the following quantities: AC(X) denotes the acquisition function, N the number of Monte Carlo samples, i the index of the current sample, q the total number of parallel batches, and j the current batch; X = {X_1, …, X_q} denotes the parameter sets split into q batches, where X_q is the collection of parameter sets in the q-th batch;

the j-th batch of the posterior mean function enters the expression; L(X) is obtained by Cholesky decomposition of the posterior covariance of the Gaussian process, that is, the product of L(X) and its transpose equals that covariance; each Monte Carlo sample is drawn from a standard normal distribution; and the reference value is the best value observed in the current data set, min Y.
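A Monte Carlo estimate built from these quantities can be sketched as follows. This is a sketch under assumptions, not the source's exact expression: it assumes a minimization objective, so the improvement of a sample is the best observed value minus the sampled value, and it takes the posterior mean vector and the lower-triangular Cholesky factor as given inputs. The function name `qei_monte_carlo` is hypothetical.

```python
import random
from typing import Sequence


def qei_monte_carlo(
    mean: Sequence[float],
    chol: Sequence[Sequence[float]],
    best: float,
    n_samples: int = 64,
    seed: int = 0,
) -> float:
    """Monte Carlo batch expected improvement for a minimization objective.

    mean: posterior mean at the q batch points.
    chol: lower-triangular Cholesky factor L of the q-by-q posterior covariance.
    best: best (smallest) value observed so far, min Y.
    """
    rng = random.Random(seed)
    q = len(mean)
    total = 0.0
    for _ in range(n_samples):
        # One standard normal draw per batch point.
        eps = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # Reparameterized posterior sample: mean + L @ eps.
        xi = [
            mean[j] + sum(chol[j][k] * eps[k] for k in range(j + 1))
            for j in range(q)
        ]
        # Improvement over the incumbent, taking the best of the q points.
        total += max(max(best - value, 0.0) for value in xi)
    return total / n_samples
```

With q = 1 this reduces to an ordinary Monte Carlo expected improvement; for a maximization objective the improvement term would be the sampled value minus the best observed value instead.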

As a further description of the invention, the posterior mean function is calculated from the evaluated points X_{1:n-1} = {θ_1, …, θ_{n-1}}, where I denotes the identity matrix;

the posterior covariance is calculated from the same quantities.
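The standard Gaussian process regression formulas for these two quantities are sketched below for reference. They are assumed textbook forms using the symbols already defined in this document (μ, K, I, X_{1:n-1}); the subscript "post", the noise variance σ², and the observation vector Y_{1:n-1} are notation introduced here, and the source's own expressions are not preserved in this text.

```latex
\mu_{\text{post}}(X) = \mu(X)
  + K(X, X_{1:n-1}) \left[K(X_{1:n-1}, X_{1:n-1}) + \sigma^{2} I\right]^{-1}
    \left(Y_{1:n-1} - \mu(X_{1:n-1})\right)

\Sigma_{\text{post}}(X) = K(X, X)
  - K(X, X_{1:n-1}) \left[K(X_{1:n-1}, X_{1:n-1}) + \sigma^{2} I\right]^{-1}
    K(X_{1:n-1}, X)
```

With a zero prior mean, the μ terms drop out and the posterior mean is a weighted combination of the observed values only.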

As a further description of the present invention, the Bayesian acquisition function may instead use a maximization operation, depending on whether the objective function is to be minimized or maximized: a minimization objective uses the acquisition function with the minimizing operation, and a maximization objective uses the acquisition function with the maximizing operation.

A computer-readable storage medium stores a computer program which, when executed by a processor, implements the method for optimizing an automatic driving vehicle controller based on Bayesian optimization.

A device for optimizing an automatic driving vehicle controller based on Bayesian optimization comprises a memory for storing a computer program and a processor, wherein the processor implements the method for optimizing an automatic driving vehicle controller based on Bayesian optimization when executing the computer program.

Compared with the prior art, the invention has the technical effects that:

The invention provides a method, medium, and device for automatically optimizing a vehicle automatic driving controller based on Bayesian optimization. Bayesian optimization is used to tune the performance of the vehicle automatic driving controller automatically, replacing the tedious and inefficient manual and grid-search parameter tuning used previously, which gives the method clear practical value. Batch parallelization is used to improve the analytic acquisition function of Bayesian optimization, which raises the efficiency of optimizing the controller's performance.

Drawings

FIG. 1 is a flow chart of an automated driving vehicle controller optimization method based on Bayesian optimization;

FIG. 2 is a schematic diagram of the controller performance variation corresponding to a set of candidate parameters during Bayesian optimization operation;

FIG. 3 is a schematic diagram of the trajectory tracking performance of the LQR controller designed with the parameter set obtained by Bayesian optimization.

Detailed Description

The invention is described in detail below with reference to the attached drawing figures:

Referring to FIGS. 1-3, a method for optimizing an automatic driving vehicle controller based on Bayesian optimization is disclosed, comprising the following steps:

S1: initializing a sampled data set;

S2: modeling the data set with a surrogate model;

Specifically, the steps in this embodiment are as follows:

S1: modeling the performance index of the controller to be evaluated to obtain an evaluation function; selecting n feasible combinations as the parameter sets to be evaluated according to the reachable domain of the parameters to be optimized; running a controller performance experiment on the vehicle with each parameter set; and collecting the effect-index data set.

The evaluation function for the vehicle system control performance index is modeled in terms of the control performance obtained when the parameter set θ is deployed in the system, which is a weighted fusion of control accuracy, vehicle safety, and control cost, and Gaussian-distributed noise with a given variance.

S2: taking the set X = {θ_1, …, θ_n} of parameter sets evaluated in S1 as input and the corresponding set of effect indices as output, and modeling the effect-index data set with a surrogate model;

where θ_i, i ∈ [1, n], denotes a parameter set that has been evaluated, and the corresponding output denotes its controller performance value;

in this embodiment, the surrogate model is preferably, but not necessarily, a Gaussian process, which is completely described by a mean function μ(X) and a covariance function K(X, X);

the mean function μ(X) is defined in terms of Ψ(X), a polynomial function of order p, where α_p denotes the coefficient of each order and C is a constant;

the covariance function K(X, X) is built from a kernel function in which a diagonal matrix holds the length-scale hyperparameters and λ_i, i ∈ [1, n], denotes the length scale corresponding to the parameter θ_i.

The objective function of the Gaussian process regression is the logarithm of the marginal likelihood, expressed in terms of the likelihood distribution and the set of all hyperparameters of the Gaussian process.

Further, a Bayesian optimization acquisition function is established. In this embodiment, as a preferred option, the acquisition function is modeled with the following quantities: the j-th batch of the posterior mean function; L(X), obtained by Cholesky decomposition of the posterior covariance of the Gaussian process, so that the product of L(X) and its transpose equals that covariance; samples drawn from a standard normal distribution; and the best value observed in the current data set, min Y.

After the Bayesian optimization acquisition function is determined, the effect experiment is repeated for the parameter sets to be evaluated in the following loop:

S21: obtaining the posterior mean and variance of the effect-index data set by surrogate-model regression;

the posterior mean function is calculated from the evaluated points X_{1:n-1} = {θ_1, …, θ_{n-1}}, where I denotes the identity matrix, and the posterior covariance is calculated from the same quantities;

S23: running a controller performance experiment on the vehicle with the parameter set given by the recommended solution from S22, collecting the effect index, and adding this data to the existing effect-index data set of S2;

S3: when the difference between the controller effect index and the ideal index is smaller than a set threshold, or the difference between the posterior mean and its set value is smaller than the set threshold, exiting the loop of S2; the recommended solution obtained is the parameter set required by the controller;

it should be noted that the present embodiment has no limitation on the type of the automatic driving controller, and can be used for automatic driving controllers with various parameter optimization requirements.

In one embodiment, the vehicle is modeled with a bicycle model, then linearized and discretized into the following form:

z_{k+1} = A z_k + B u_k, (1)

where z is the system state vector, whose components are e, the lateral offset error of trajectory tracking; d_e, the derivative of the lateral offset error; th_e, the angular offset error of trajectory tracking; d_th_e, the derivative of the angular offset error; and delta_v, the difference between the current speed and the planned speed.

u is the system control vector, whose components are delta, the steering angle, and acc, the longitudinal acceleration.

In the matrices A and B, dt denotes the discrete time step and v denotes the vehicle speed.

The control performance objective function is modeled as an expectation, where M is the number of experiments. The infinite-horizon optimization objective needs a finite approximation, and the expectation can only be estimated from noisy evaluations, so it is approximated by a function that includes Gaussian-distributed noise with a given variance. The controller is designed as a linear quadratic regulator (LQR), where Q and R denote the state weight matrix and the control weight matrix, respectively. The state weight corresponding to the position error term is Q[0,0] = θ[0,0], and the control weight for the corresponding control quantity is R[0,0] = θ[0,1].

The LQR controller is expressed as

u_k = -F_θ z_k, (5)

where the gain F_θ is calculated from P_θ, the solution of the algebraic Riccati equation.
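How a feedback gain follows from a Riccati solution can be illustrated with a scalar system. The sketch below assumes the standard discrete-time LQR forms, P = Q + AᵀPA − AᵀPB(R + BᵀPB)⁻¹BᵀPA and F = (R + BᵀPB)⁻¹BᵀPA, specialized to one state and one input; the source's own expressions are not preserved in this text, and the function name `scalar_dlqr` and the numeric values used with it are hypothetical.

```python
from typing import Tuple


def scalar_dlqr(
    a: float, b: float, q: float, r: float,
    iters: int = 500, tol: float = 1e-12,
) -> Tuple[float, float]:
    """Return (f, p) for x[k+1] = a*x[k] + b*u[k] with cost sum(q*x^2 + r*u^2).

    p solves the scalar discrete algebraic Riccati equation by fixed-point
    iteration, and f is the feedback gain in u[k] = -f * x[k].
    """
    p = q
    for _ in range(iters):
        # Scalar Riccati recursion: P <- Q + A'PA - A'PB (R + B'PB)^-1 B'PA.
        p_next = q + a * a * p - (a * p * b) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    # Gain: F = (R + B'PB)^-1 B'PA.
    f = (b * p * a) / (r + b * b * p)
    return f, p
```

For example, `scalar_dlqr(1.0, 0.1, q=1.0, r=0.1)` treats q and r the way the embodiment treats θ[0,0] and θ[0,1]: changing them changes the gain, which is what the Bayesian optimization loop tunes. For the full five-state model a matrix Riccati solver would be used instead.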

The LQR controller is used for trajectory tracking control. The reference trajectory is obtained by cubic spline interpolation, where pos denotes the reference position, h_1, …, h_{m+1} denote the m+1 reference anchor points, and a_1, b_1, c_1, d_1, …, a_m, b_m, c_m, d_m are the corresponding coefficients. In this example, the lateral positions of the reference trajectory anchor points are set to [0.0, 6.0, 12.5, 10.0, 17.5, 20.0, 25.0] and the longitudinal positions are set to [0.0, -3.0, -5.0, 6.5, 3.0, 0.0].

An initial data set is constructed for the Bayesian optimization surrogate model.

In this embodiment, n values of θ are taken to form X = {θ_1, …, θ_n}; each parameter set is substituted into the LQR controller for trajectory tracking to obtain the set of controller performance evaluations.

In this embodiment, each of the two parameters in θ takes the values [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000]; with 8 values per parameter there are 8 × 8 combinations, so n = 64.
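The count of initial parameter sets can be checked by enumerating the grid directly. This is a small illustrative sketch; the variable names `values` and `initial_grid` are hypothetical.

```python
from itertools import product

# The eight candidate values listed for each of the two parameters in theta.
values = [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000]

# Every (theta[0,0], theta[0,1]) combination: the Cartesian product of the
# value list with itself.
initial_grid = list(product(values, repeat=2))

print(len(initial_grid))  # number of initial parameter sets
```

Each tuple in `initial_grid` is one candidate pair of state weight and control weight to evaluate on the LQR controller.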

The Bayesian optimization main loop is then entered.

First, Gaussian process regression is used to obtain the posterior distribution of the data set. Without loss of generality, the prior mean function of the Gaussian process is taken to be zero, and the first n-1 points are used to compute the prior covariance function.

The covariance is built from the kernel function k(θ_i, θ_j), i ∈ [1, n-1], j ∈ [1, n-1], in which a diagonal matrix holds the length-scale hyperparameters and λ_i, i ∈ [1, n-1], denotes the length scale corresponding to the parameter θ_i.

The hyperparameters are all obtained by maximizing the logarithm of the marginal likelihood (equivalently, minimizing its negative), which is expressed in terms of the likelihood distribution and the set of all hyperparameters of the Gaussian process.

The posterior mean function is calculated from the evaluated points X_{1:n-1} = {θ_1, …, θ_{n-1}}, where I denotes the identity matrix, and the posterior covariance is calculated from the same quantities.

Second, the next point to be evaluated, as recommended by Bayesian optimization, is obtained with the acquisition function, where AC(X) denotes the acquisition function, N the number of Monte Carlo samples, i the index of the current sample, q the total number of parallel batches, and j the current batch. X = {X_1, …, X_q} denotes the parameter sets split into q batches, where X_q is the collection of parameter sets in the q-th batch.

The j-th batch of the posterior mean function enters the expression; L(X) is obtained by Cholesky decomposition of the posterior covariance of the Gaussian process, that is, the product of L(X) and its transpose equals that covariance; each Monte Carlo sample is drawn from a standard normal distribution; and the reference value is the best value observed in the current data set, min Y.

Third, a controller performance experiment is run on the vehicle with θ_{n+1}, the effect index is collected, this data is added to the existing data set, and the posterior distribution of the Bayesian optimization is updated.

When the difference between the controller performance index and the ideal index, or the difference between the posterior mean of the surrogate model and its set value, is smaller than the set threshold, the Bayesian optimization main loop exits and the solution is obtained.

The above algorithm is implemented and deployed on a computer medium and device. In this embodiment, the computer is a notebook with an i5-10210U CPU and 16 GB of memory, running the Windows 10 operating system with Python 3.9.6, PyTorch 1.12.1, GPyTorch 1.9.0, BoTorch 0.7.2, NumPy 1.23.3, and Matplotlib 3.6.0.

The program's operating parameters are configured as follows. The wheel diameter of the vehicle is 0.5 m and the maximum steering angle is 45 degrees. The discrete sampling time of the dynamics model is 0.1 s. The Gaussian process model is a single-task Gaussian process with 16 initialization samples. Bayesian optimization is run three times, each run searching for 16 rounds; q of the acquisition function qEI is set to 1, the number of Monte Carlo samples is set to 64, and the search bounds are all set to [0.0001, 100].

FIG. 2 shows how the objective function (that is, the position error of the LQR controller's trajectory tracking) changes during the automatic search for the optimal parameter set when the method of this embodiment (the automatic optimization method described in S1 to S3 above) is run. It shows that the method can optimize the controller's parameter set effectively and automatically.

FIG. 3 shows the trajectory tracking control performance of the LQR controller designed with the parameters obtained by running the method of this embodiment (the automatic optimization method described in S1 to S3 above).

Additionally, in other embodiments, the present invention may also provide an autonomous driving vehicle controller optimization apparatus based on bayesian optimization, comprising a memory and a processor;

the memory for storing a computer program;

the processor is configured to implement the automated vehicle driving controller optimization method based on bayesian optimization as described in S1 to S3 above when executing the computer program.

In addition, in another embodiment, the present invention may further provide a computer-readable storage medium, wherein the storage medium stores a computer program, and when the computer program is executed by a processor, the method for automatically optimizing an automatic vehicle driving controller based on bayesian optimization as described in S1 to S3 above can be implemented.

It should be noted that the memory may include a random access memory (RAM) or a non-volatile memory (NVM), such as at least one disk memory. The processor may be a general-purpose processor, including a central processing unit (CPU), a neural network processor (NPU), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The device should of course also have the components needed to run the program, such as a power supply and a communication bus.

The above embodiments only illustrate the technical solutions of the present invention and do not limit them. Other modifications or equivalent substitutions made to the technical solutions of the present invention by a person of ordinary skill in the art, without departing from the spirit and scope of those solutions, fall within the scope of the claims of the present invention.

Claims

1. A method for optimizing an automatic driving vehicle controller based on Bayesian optimization, characterized by comprising the following steps:

S1: initializing a sampled data set;

S2: modeling the data set with a surrogate model;

S3: when the parameter set to be evaluated meets the termination condition, exiting the loop of S2 and obtaining the parameters required by the controller.

2. The method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to claim 1, wherein:

S1: modeling the performance index of the controller to be evaluated to obtain an evaluation function; selecting n feasible combinations as the parameter sets to be evaluated according to the reachable domain of the parameters to be optimized; running a controller performance experiment on the vehicle with each parameter set; and collecting the effect-index data set;

the effect-index data set, taken as output, is modeled with a surrogate model; where θ_i, i ∈ [1, n], denotes a parameter set that has been evaluated, and the corresponding output denotes its controller performance value;

establishing a Bayesian optimization acquisition function, and then looping over the following steps:

S21: obtaining the posterior mean and variance of the effect-index data set by surrogate-model regression;

and adding this data to the existing data set.

3. The method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to claim 2, wherein: in S1, the evaluation function for the vehicle system control performance index is modeled with a noise term that is Gaussian-distributed with a given variance.

4. The method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to claim 2, wherein: in S1, the reachable domain of the parameters to be optimized is a mixed space comprising a discrete space and a continuous space; some of the parameters to be optimized belong to the discrete space and the others belong to the continuous space.

5. The method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to claim 2, wherein: in S2, the surrogate model is a Gaussian process, which is completely described by a mean function μ(X) and a covariance function K(X, X);

the mean function μ(X) and the covariance function K(X, X) are defined with a diagonal matrix of length-scale hyperparameters, and the regression objective is expressed in terms of the likelihood distribution and the set of all hyperparameters of the Gaussian process.

6. The method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to claim 2, wherein: the acquisition function of the Bayesian optimization is modeled with the following quantities: the j-th batch of the posterior mean function; L(X), obtained by Cholesky decomposition of the posterior covariance of the Gaussian process, so that the product of L(X) and its transpose equals that covariance; samples drawn from a standard normal distribution; and the best value observed in the current data set, min Y.

7. The method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to claim 6, wherein: the posterior mean function is calculated from the evaluated points X_{1:n-1} = {θ_1, …, θ_{n-1}}, where I denotes the identity matrix, and the posterior covariance is calculated from the same quantities.

8. The method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to claim 6, wherein: the Bayesian acquisition function may instead use a maximization operation, depending on whether the objective function is to be minimized or maximized.

9. A computer-readable storage medium, characterized in that: the storage medium stores a computer program which, when executed by a processor, implements the method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to any of claims 1-8.

10. A device for optimizing an automatic driving vehicle controller based on Bayesian optimization, characterized by: comprising a memory for storing a computer program and a processor which, when executing the computer program, carries out the method for optimizing an automatic driving vehicle controller based on Bayesian optimization according to any of claims 1-8.