CN115755606A - Carrier controller automatic optimization method, medium and equipment based on Bayesian optimization - Google Patents


Info

Publication number
CN115755606A
Authority
CN
China
Prior art keywords
function
parameter
controller
optimization
proxy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211433936.4A
Other languages
Chinese (zh)
Other versions
CN115755606B (en)
Inventor
苏杰
牟剑秋
许正昊
李晓芸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Youdao Zhitu Technology Co Ltd
Original Assignee
Shanghai Youdao Zhitu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Youdao Zhitu Technology Co Ltd
Priority to CN202211433936.4A
Publication of CN115755606A
Application granted
Publication of CN115755606B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Feedback Control In General (AREA)

Abstract

The invention discloses an automatic optimization method, medium and equipment for a vehicle automatic driving controller based on Bayesian optimization. Bayesian optimization is used to automatically optimize the performance of the automatic driving controller, replacing manual parameter tuning and grid search, which are primitive, tedious and inefficient; this has clear practical significance. A batch parallelization technique is used to improve the analytic proxy function of Bayesian optimization, which raises the efficiency of controller performance optimization; this gives the method clear technical advancement and practicability.

Description

Carrier controller automatic optimization method, medium and equipment based on Bayesian optimization
Technical Field
The invention belongs to the technical field of intelligent automobile automatic driving, and particularly relates to a carrier automatic driving controller automatic optimization method, medium and equipment based on Bayesian optimization.
Background
In recent years, as the intelligence level of vehicles has improved rapidly, automatic driving technology has developed vigorously. A controller on which a control algorithm is deployed is one of the essential modules of an automatic driving vehicle system; it controls the vehicle so that it tracks a reference trajectory. In the past, control algorithm parameters were tuned manually or by grid search; these methods are inefficient, cover only a limited parameter space, and cannot approach the optimal control performance.
To optimize the performance of control algorithms and make the optimization process more efficient, researchers have carried out a number of studies. Marco et al., in "Automatic LQR Tuning Based on Gaussian Process Global Optimization" (IEEE International Conference on Robotics and Automation, 2016), proposed an automatic LQR controller optimization method based on Gaussian process Bayesian optimization, which uses entropy search as the proxy function and can automatically and efficiently search for the optimal parameter group of the LQR controller. Su, Jie et al., in "Autonomous vehicle control through the dynamics and controller learning" (IEEE Transactions on Vehicular Technology, 2018), further considered Gaussian process Bayesian optimization of LQR controller performance and, to handle the time-varying characteristics of system operation, designed a time-varying lower confidence bound function as the proxy function of Bayesian optimization, which is better suited to time-varying vehicle scenarios. Riboni, A. et al., in "Bayesian optimization and deep learning for steering wheel angle prediction" (Scientific Reports, Nature, 2022), designed a controller with an LSTM backbone network and used Bayesian optimization to automatically search the controller parameters for steering control of autonomous vehicles.
These results improve the efficiency of controller performance optimization to some extent, but they still have limitations. All of them consider a single-process, sequential decision setting, so a Bayesian optimization process designed around an analytic proxy function cannot be parallelized. The deep learning control algorithm designed by Riboni, A. et al. has numerous parameters to be tuned, which makes the parameter space of Bayesian optimization very high-dimensional. In addition, the value space of some control parameters is compact and continuous, so the number of parameter groups to be searched increases sharply, which further challenges the efficiency of the optimization search.
Disclosure of Invention
In view of the above problems, the present invention provides an automatic optimization method, medium and equipment for a vehicle automatic driving controller based on Bayesian optimization, which uses a multi-batch parallelized expected improvement function as the proxy function of Bayesian optimization and thereby provides a better solution for the optimization search of automatic driving control algorithms.
In order to achieve the purpose, the invention adopts the following technical scheme:
An automatic driving carrier controller optimization method based on Bayesian optimization comprises the following steps:
S1: initializing a sampled data set
Figure BDA0003946135710000021
S2: modeling the data set
Figure BDA0003946135710000022
using a proxy model;
establishing a Bayesian optimization proxy function, and looping over the following steps:
S21: obtaining the mean and variance of the posterior distribution through proxy model regression;
S22: obtaining a parameter group to be evaluated through the Bayesian optimization proxy function;
S23: verifying the obtained parameter group on the vehicle, and adding the resulting data to the sampled data set
Figure BDA0003946135710000023
S3: if the parameter group to be evaluated reaches the termination condition, exiting the loop of S2 and obtaining the index parameters required by the controller.
As a further description of the invention, S1: modeling the performance index of the controller to be evaluated to obtain an evaluation function; selecting n feasible combinations as parameter groups to be evaluated according to the reachable domain of the parameter groups to be optimized in the index; performing controller performance effect experiments on the vehicle with the parameter groups to be evaluated; and collecting an effect index data set
Figure BDA0003946135710000031
S2: taking the set X = {θ_1, …, θ_n} of parameter groups to be evaluated from S1 as input and the corresponding effect index set
Figure BDA0003946135710000032
as output, the effect index data set
Figure BDA0003946135710000033
is modeled using a proxy model; wherein θ_i, i ∈ [1, n], denotes a parameter group that has been evaluated, and
Figure BDA0003946135710000034
denotes the corresponding controller performance effect value; a Bayesian optimization proxy function is established, and the following steps are then looped:
S21: using proxy model regression on the effect index data set
Figure BDA00039461357100000312
to obtain the mean and variance of its posterior distribution;
S22: substituting the mean function and the variance function of the posterior distribution obtained in S21 into the Bayesian optimization proxy function to obtain the recommended solution θ_{n+1} predicted by the proxy function;
S23: performing a controller performance effect experiment on the vehicle with the parameter group represented by the recommended solution obtained in S22, collecting the effect index
Figure BDA0003946135710000035
and adding it to the existing data set
Figure BDA0003946135710000036
S3: when the difference between the controller effect index and the ideal index is smaller than a set threshold, or the difference between the posterior distribution mean and the set threshold is smaller than the set threshold, exiting the loop; the recommended solution obtained is the index parameter required by the controller.
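The control flow of S1 to S3 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: `toy_evaluate` stands in for a vehicle experiment, `make_random_suggester` is a random-perturbation stub that takes the place of the regression and proxy-function steps S21 and S22 (it is not Bayesian optimization), and the threshold and round limit are hypothetical values.

```python
import random

def run_optimization(evaluate, suggest_next, initial_thetas,
                     ideal_index=0.0, threshold=1e-2, max_rounds=50):
    """S1-S3 control flow: evaluate, suggest, augment, test for termination."""
    # S1: initial data set of (parameter group, effect index) pairs.
    data = [(theta, evaluate(theta)) for theta in initial_thetas]
    for _ in range(max_rounds):
        best_theta, best_y = min(data, key=lambda pair: pair[1])
        # S3: stop once the best effect index is close enough to the ideal index.
        if abs(best_y - ideal_index) < threshold:
            break
        # S21-S22: placeholder for proxy-model regression and proxy-function search.
        theta_next = suggest_next(data)
        # S23: run the experiment and add the result to the data set.
        data.append((theta_next, evaluate(theta_next)))
    return min(data, key=lambda pair: pair[1]), len(data)


def toy_evaluate(theta):
    # Hypothetical stand-in for a vehicle experiment: a quadratic bowl at (1, 2).
    return (theta[0] - 1.0) ** 2 + (theta[1] - 2.0) ** 2


def make_random_suggester(seed=0, step=0.5):
    rng = random.Random(seed)

    def suggest(data):
        # Perturb the best point so far; a stub only, NOT Bayesian optimization.
        best_theta, _ = min(data, key=lambda pair: pair[1])
        return tuple(t + rng.uniform(-step, step) for t in best_theta)

    return suggest


(best_theta, best_y), n_evals = run_optimization(
    toy_evaluate, make_random_suggester(), [(0.0, 0.0), (3.0, 3.0)], max_rounds=200)
```

Replacing the stub with a Gaussian process regression and a proxy-function search leaves the outer loop unchanged, which is why the sketch separates the two.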
As a further description of the present invention, the evaluation function modeling manner of the vehicle system control performance index in S1 is:
Figure BDA0003946135710000037
wherein
Figure BDA0003946135710000038
denotes the control performance obtained after the parameter group θ is applied to the system, a weighted fusion of control accuracy, vehicle safety and control cost;
Figure BDA0003946135710000039
denotes Gaussian-distributed noise whose variance is
Figure BDA00039461357100000310
and
Figure BDA00039461357100000311
denotes the corresponding noisy evaluation measured after the parameter group θ is applied to the system.
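A noisy evaluation of this kind can be illustrated with the short Python sketch below. It assumes that the noise is additive and zero-mean and that the fusion is a plain weighted sum; the function names, the three input terms and the weights are hypothetical and are not taken from the original formula.

```python
import random

def fused_performance(accuracy_err, safety_penalty, control_cost,
                      weights=(1.0, 1.0, 1.0)):
    """Weighted fusion of the three terms; the weights are illustrative only."""
    w_acc, w_safe, w_cost = weights
    return w_acc * accuracy_err + w_safe * safety_penalty + w_cost * control_cost

def noisy_evaluation(true_value, sigma, rng):
    """Measured index = true index + zero-mean Gaussian noise with std sigma."""
    return true_value + rng.gauss(0.0, sigma)

rng = random.Random(42)
j_true = fused_performance(0.2, 0.05, 0.1, weights=(2.0, 1.0, 0.5))
samples = [noisy_evaluation(j_true, 0.01, rng) for _ in range(2000)]
sample_mean = sum(samples) / len(samples)
```

Averaging repeated measurements brings the estimate close to the noise-free value, which is the reason the proxy model treats each single experiment as a noisy observation.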
As a further description of the present invention, the reachable domain space of the parameter set to be optimized in S1 is a mixed space, the mixed space includes a discrete space and a continuous space, a part of the parameter sets to be optimized belongs to the discrete space, and a part of the parameter sets belongs to the continuous space.
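A mixed reachable domain can be represented, for example, as in the following Python sketch. The parameter names `horizon_steps` and `q_weight` and their ranges are hypothetical; they only illustrate one discrete and one continuous dimension.

```python
import random

# Hypothetical mixed space: one discrete parameter and one continuous parameter.
SPACE = {
    "horizon_steps": {"type": "discrete", "values": [5, 10, 20, 40]},
    "q_weight": {"type": "continuous", "low": 1e-4, "high": 100.0},
}

def sample_theta(space, rng):
    """Draw one parameter group, respecting each dimension's type."""
    theta = {}
    for name, spec in space.items():
        if spec["type"] == "discrete":
            theta[name] = rng.choice(spec["values"])
        else:
            theta[name] = rng.uniform(spec["low"], spec["high"])
    return theta

def is_reachable(theta, space):
    """Check that a parameter group lies inside the reachable domain."""
    for name, spec in space.items():
        if spec["type"] == "discrete":
            if theta[name] not in spec["values"]:
                return False
        elif not (spec["low"] <= theta[name] <= spec["high"]):
            return False
    return True

rng = random.Random(0)
candidates = [sample_theta(SPACE, rng) for _ in range(16)]
```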
As a further description of the present invention, the surrogate model in S2 is a gaussian process, which is completely described by a mean function μ (X) and a covariance function K (X, X);
the mean function μ (X) is:
Figure BDA0003946135710000041
where Ψ(X) denotes a polynomial function of order p, α_p denotes the coefficients of the respective orders, and C is a constant;
the covariance function K (X, X) is:
Figure BDA0003946135710000042
wherein the kernel function k(θ_i, θ_j), i ∈ [1, n], j ∈ [1, n], has the complete form:
Figure BDA0003946135710000043
wherein the diagonal matrix
Figure BDA0003946135710000044
denotes the length-scale hyperparameters, and λ_i, i ∈ [1, n], denotes the length-scale parameter corresponding to the parameter θ_i;
the objective function of the proxy model regression is the logarithm of the marginal likelihood, as follows:
Figure BDA0003946135710000045
wherein
Figure BDA0003946135710000046
denotes the likelihood distribution and
Figure BDA00039461357100000417
denotes the set of all hyperparameters of the Gaussian process.
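The kernel and the log marginal likelihood objective can be illustrated with the NumPy sketch below. It assumes a squared-exponential kernel with one length scale per input dimension and a zero prior mean; the original kernel formula is given only as an image, so this specific form, the function names and the sample data are assumptions.

```python
import numpy as np

def ard_rbf_kernel(xa, xb, lengthscales, signal_var=1.0):
    """Squared-exponential kernel with one length scale per input dimension."""
    a = xa / lengthscales
    b = xb / lengthscales
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0))

def log_marginal_likelihood(x, y, lengthscales, noise_var, signal_var=1.0):
    """log p(y | x, hyperparameters) for a zero-mean Gaussian process."""
    n = x.shape[0]
    k = ard_rbf_kernel(x, x, lengthscales, signal_var) + noise_var * np.eye(n)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(chol)))      # equals -0.5 * log det(k)
            - 0.5 * n * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(12, 2))          # hypothetical parameter groups
y = np.sin(3.0 * x[:, 0]) + 0.1 * x[:, 1]        # hypothetical effect indices
lml = log_marginal_likelihood(x, y, np.array([0.5, 2.0]), 1e-4)
```

In practice the hyperparameters would be fitted by passing this objective (or its negative) to a numerical optimizer.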
As a further description of the present invention, the proxy function of bayesian optimization is modeled as:
Figure BDA0003946135710000047
wherein AC(X) denotes the proxy function, N denotes the number of Monte Carlo sampling points, i denotes the index of the current sampling point, q denotes the total number of parallel batches, and j denotes the current batch; X = {X_1, …, X_q} denotes the partition of the parameter groups into q batches, where X_q denotes the set of parameter groups in the q-th batch;
Figure BDA0003946135710000048
denotes the posterior mean function
Figure BDA0003946135710000049
of batch j; L(X) is obtained by Cholesky decomposition of the posterior covariance of the Gaussian process
Figure BDA00039461357100000410
that is, it satisfies
Figure BDA00039461357100000411
Figure BDA00039461357100000412
denotes a sample drawn from the standard normal distribution, and
Figure BDA00039461357100000413
denotes the best value observed in the current data set, min Y.
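A Monte Carlo estimate of a batch expected improvement can be sketched in NumPy as follows. The sketch follows the commonly used estimator for minimization (draw joint posterior samples through the Cholesky factor, take the best improvement in the batch, average over N draws); since the original formula is available only as an image, this should be read as an assumed form, and `mc_qei` and the numeric inputs are hypothetical.

```python
import numpy as np

def mc_qei(mean, cov, best_observed, n_samples=64, seed=0):
    """Monte Carlo estimate of batch expected improvement for minimization."""
    rng = np.random.default_rng(seed)
    q = mean.shape[0]
    # Lower-triangular factor with chol @ chol.T == cov (small jitter for stability).
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(q))
    z = rng.standard_normal((n_samples, q))          # z_i ~ N(0, I)
    samples = mean[None, :] + z @ chol.T             # joint posterior draws
    improvement = np.maximum(best_observed - samples, 0.0)
    # Best improvement within the batch (max over j), averaged over the N draws.
    return float(np.mean(np.max(improvement, axis=1)))

mean = np.array([0.8, 1.2, 0.5])                     # posterior means of q = 3 candidates
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])                 # posterior covariance of the batch
value = mc_qei(mean, cov, best_observed=0.6, n_samples=4096)
```

Because the estimate is an average over independent draws, the N samples can be evaluated in parallel, which is what makes the batched form attractive.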
As a further description of the invention, the posterior mean function
Figure BDA00039461357100000414
is calculated as follows:
Figure BDA00039461357100000415
wherein I denotes the identity matrix, X_{1:n-1} = {θ_1, …, θ_{n-1}},
Figure BDA00039461357100000416
Figure BDA0003946135710000053
The posterior distribution covariance
Figure BDA0003946135710000051
is calculated as follows:
Figure BDA0003946135710000052
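The posterior mean and covariance can be computed, for example, with the standard Gaussian process regression equations shown in the NumPy sketch below. It assumes a zero prior mean and an isotropic squared-exponential kernel; the function names, the noise variance and the data are hypothetical.

```python
import numpy as np

def rbf(xa, xb, lengthscale=1.0):
    """Isotropic squared-exponential kernel (an assumed kernel choice)."""
    diff = xa[:, None, :] - xb[None, :, :]
    return np.exp(-0.5 * np.sum(diff**2, axis=2) / lengthscale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-6, lengthscale=1.0):
    """Posterior mean and covariance of a zero-prior-mean Gaussian process."""
    k_tt = rbf(x_train, x_train, lengthscale) + noise_var * np.eye(len(x_train))
    k_qt = rbf(x_query, x_train, lengthscale)
    k_qq = rbf(x_query, x_query, lengthscale)
    solved_y = np.linalg.solve(k_tt, y_train)        # (K + noise * I)^-1 y
    solved_k = np.linalg.solve(k_tt, k_qt.T)         # (K + noise * I)^-1 K(X, X*)
    mean = k_qt @ solved_y
    cov = k_qq - k_qt @ solved_k
    return mean, cov

x_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.array([1.0, 0.2, 0.7])
x_query = np.array([[1.0], [5.0]])                   # one seen point, one far away
mean, cov = gp_posterior(x_train, y_train, x_query)
```

At the already evaluated point the posterior variance collapses towards the noise level, while far from the data it returns to the prior variance.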
As a further description of the present invention, the Bayesian optimization proxy function may instead include a maximization operation, depending on whether the objective function is to be minimized or maximized;
a minimization requirement uses a proxy function with a minimizing operation, and a maximization requirement uses a proxy function with a maximizing operation.
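One common way to handle both cases with a single search routine is to negate the objective when it is to be maximized, as in the Python sketch below. The wrapper and the two toy objectives are hypothetical illustrations, not part of the claimed method.

```python
def as_minimization(objective, maximize):
    """Wrap an objective so that the search routine can always minimize."""
    if maximize:
        return lambda theta: -objective(theta)
    return objective

def tracking_error(theta):      # lower is better
    return (theta - 2.0) ** 2

def comfort_score(theta):       # higher is better
    return 10.0 - abs(theta - 3.0)

candidates = [0.0, 1.0, 2.0, 3.0, 4.0]
best_for_error = min(candidates, key=as_minimization(tracking_error, maximize=False))
best_for_comfort = min(candidates, key=as_minimization(comfort_score, maximize=True))
```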
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the automated driving vehicle controller optimization method based on bayesian optimization.
An automated driving carrier controller optimization device based on Bayesian optimization comprises a memory for storing a computer program and a processor, wherein the processor implements the automated driving carrier controller optimization method based on Bayesian optimization when executing the computer program.
Compared with the prior art, the invention has the technical effects that:
The invention provides an automatic optimization method, medium and equipment for a vehicle automatic driving controller based on Bayesian optimization. Bayesian optimization is used to automatically optimize the performance of the automatic driving controller, replacing manual parameter tuning and grid search, which are primitive, tedious and inefficient; this has clear practical significance. A batch parallelization technique is used to improve the analytic proxy function of Bayesian optimization, which raises the efficiency of controller performance optimization; this gives the method clear technical advancement and practicability.
Drawings
FIG. 1 is a flow chart of an automated driving vehicle controller optimization method based on Bayesian optimization;
FIG. 2 is a schematic diagram of the controller performance variation corresponding to a set of candidate parameters during Bayesian optimization operation;
FIG. 3 is a schematic diagram of the trajectory tracking effect of the LQR controller designed with the parameter group obtained by Bayesian optimization.
Detailed Description
The invention is described in detail below with reference to the attached drawing figures:
An automatic driving carrier controller optimization method based on Bayesian optimization is disclosed with reference to FIGS. 1-3, and comprises the following steps:
S1: initializing a sampled data set
Figure BDA0003946135710000061
S2: modeling the data set
Figure BDA0003946135710000062
using a proxy model;
establishing a Bayesian optimization proxy function, and looping over the following steps:
S21: obtaining the mean and variance of the posterior distribution through proxy model regression;
S22: obtaining a parameter group to be evaluated through the Bayesian optimization proxy function;
S23: verifying the obtained parameter group on the vehicle, and adding the resulting data to the sampled data set
Figure BDA0003946135710000063
S3: if the parameter group to be evaluated reaches the termination condition, exiting the loop of S2 and obtaining the index parameters required by the controller.
Specifically, the present embodiment specifically describes the steps as follows:
S1: modeling the performance index of the controller to be evaluated to obtain an evaluation function; selecting n feasible combinations as parameter groups to be evaluated according to the reachable domain of the parameter groups to be optimized in the index; performing controller performance effect experiments on the vehicle with the parameter groups to be evaluated; and collecting an effect index data set
Figure BDA0003946135710000064
The evaluation function modeling mode of the vehicle system control performance index is as follows:
Figure BDA0003946135710000065
wherein
Figure BDA0003946135710000066
denotes the control performance obtained after the parameter group θ is applied to the system, a weighted fusion of control accuracy, vehicle safety and control cost;
Figure BDA0003946135710000067
denotes Gaussian-distributed noise whose variance is
Figure BDA0003946135710000068
and
Figure BDA0003946135710000069
denotes the corresponding noisy evaluation measured after the parameter group θ is applied to the system.
S2: taking the set X = {θ_1, …, θ_n} of parameter groups to be evaluated from S1 as input and the corresponding effect index set
Figure BDA00039461357100000610
as output, the effect index data set
Figure BDA00039461357100000611
is modeled using a proxy model;
wherein θ_i, i ∈ [1, n], denotes a parameter group that has been evaluated, and
Figure BDA00039461357100000612
denotes the corresponding controller performance effect value;
In this embodiment, the proxy model is preferably, but not necessarily, a Gaussian process, which is completely described by a mean function μ(X) and a covariance function K(X, X);
the mean function μ(X) is:
Figure BDA0003946135710000071
where Ψ(X) denotes a polynomial function of order p, α_p denotes the coefficients of the respective orders, and C is a constant;
the covariance function K(X, X) is:
Figure BDA0003946135710000072
wherein the kernel function k(θ_i, θ_j), i ∈ [1, n], j ∈ [1, n], has the complete form:
Figure BDA0003946135710000073
wherein the diagonal matrix
Figure BDA0003946135710000074
denotes the length-scale hyperparameters, and λ_i, i ∈ [1, n], denotes the length-scale parameter corresponding to the parameter θ_i.
The objective function of the Gaussian process regression is the logarithm of the marginal likelihood, as follows:
Figure BDA0003946135710000075
wherein
Figure BDA0003946135710000076
denotes the likelihood distribution and
Figure BDA00039461357100000714
denotes the set of all hyperparameters of the Gaussian process.
Further, a Bayesian optimization proxy function is established. In this embodiment, the Bayesian optimization proxy function is preferably modeled as:
Figure BDA0003946135710000077
wherein AC(X) denotes the proxy function, N denotes the number of Monte Carlo sampling points, i denotes the index of the current sampling point, q denotes the total number of parallel batches, and j denotes the current batch; X = {X_1, …, X_q} denotes the partition of the parameter groups into q batches, where X_q denotes the set of parameter groups in the q-th batch;
Figure BDA0003946135710000078
denotes the posterior mean function
Figure BDA0003946135710000079
of batch j; L(X) is obtained by Cholesky decomposition of the posterior covariance of the Gaussian process
Figure BDA00039461357100000710
that is, it satisfies
Figure BDA00039461357100000711
Figure BDA00039461357100000712
denotes a sample drawn from the standard normal distribution, and
Figure BDA00039461357100000713
denotes the best value observed in the current data set, min Y.
After the Bayesian optimization proxy function is determined, the effect experiment is looped over the parameter groups to be evaluated as follows:
S21: using proxy model regression on the effect index data set
Figure BDA0003946135710000081
to obtain the mean and variance of its posterior distribution;
The posterior mean function
Figure BDA0003946135710000082
is calculated as follows:
Figure BDA0003946135710000083
wherein I denotes the identity matrix, X_{1:n-1} = {θ_1, …, θ_{n-1}},
Figure BDA00039461357100000811
Figure BDA00039461357100000812
The posterior distribution covariance
Figure BDA0003946135710000085
is calculated as follows:
Figure BDA0003946135710000086
S22: substituting the mean function and the variance function of the posterior distribution obtained in S21 into the Bayesian optimization proxy function to obtain the recommended solution θ_{n+1} predicted by the proxy function;
S23: performing a controller performance effect experiment on the vehicle with the parameter group represented by the recommended solution obtained in S22, collecting the effect index
Figure BDA0003946135710000087
and adding it to the existing effect index data set of S2
Figure BDA0003946135710000088
S3: when the difference between the controller effect index and the ideal index is smaller than a set threshold, or the difference between the posterior distribution mean and the set threshold is smaller than the set threshold, exiting the loop of S2; the recommended solution obtained is the index parameter required by the controller.
It should be noted that this embodiment places no limitation on the type of automatic driving controller and can be used for any automatic driving controller that requires parameter optimization.
In one embodiment, the vehicle is modeled using a bicycle model, which is linearized and discretized into the following form:
z_{k+1} = A z_k + B u_k, (1)
wherein
Figure BDA0003946135710000089
denotes the system state vector: e denotes the lateral offset error of trajectory tracking, d_e denotes the derivative of the lateral offset error, th_e denotes the angular offset error of trajectory tracking, d_th_e denotes the derivative of the angular offset error, and delta_v denotes the difference between the current speed and the planned speed.
Figure BDA00039461357100000810
denotes the system control vector: delta denotes the steering angle and acc denotes the longitudinal acceleration.
Matrices A and B are shown below:
Figure BDA0003946135710000091
where dt represents the discrete time step and v represents the vehicle speed.
The control performance objective function is modeled as follows:
Figure BDA0003946135710000092
wherein
Figure BDA0003946135710000093
denotes the expectation, and M is the number of experiments. The infinite-horizon optimization objective requires a finite approximation, and the expectation can only be estimated from noisy evaluations, so the following function is used as an approximation:
Figure BDA0003946135710000094
wherein
Figure BDA0003946135710000095
denotes Gaussian-distributed noise whose variance is
Figure BDA0003946135710000096
The controller is designed as a linear quadratic regulator (LQR), where Q and R denote the state weight matrix and the control weight matrix, respectively. The state weight parameter corresponding to the position error term is Q[0,0] = θ[0,0], and the control weight parameter corresponding to the control quantity is R[0,0] = θ[0,1].
The expression of the LQR controller is as follows:
u_k = -F_θ z_k, (5)
wherein F_θ is calculated as follows:
Figure BDA0003946135710000098
wherein P_θ is the solution of the algebraic Riccati equation:
Figure BDA0003946135710000099
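How a parameter group θ turns into an LQR gain can be illustrated with the NumPy sketch below, which iterates the standard discrete-time Riccati recursion to a fixed point and then forms the gain F = (R + BᵀPB)⁻¹BᵀPA. The system here is a toy double integrator rather than the vehicle model of this embodiment, and `dlqr_gain`, `theta` and the weight values are hypothetical.

```python
import numpy as np

def dlqr_gain(a, b, q, r, iterations=500, tol=1e-10):
    """Iterate the discrete Riccati recursion to a fixed point and return (F, P)."""
    p = q.copy()
    for _ in range(iterations):
        bt_p = b.T @ p
        gain = np.linalg.solve(r + bt_p @ b, bt_p @ a)   # F = (R + B'PB)^-1 B'PA
        p_next = q + a.T @ p @ (a - b @ gain)
        if np.max(np.abs(p_next - p)) < tol:
            p = p_next
            break
        p = p_next
    bt_p = b.T @ p
    return np.linalg.solve(r + bt_p @ b, bt_p @ a), p

dt = 0.1
a = np.array([[1.0, dt], [0.0, 1.0]])      # toy double integrator, not the vehicle model
b = np.array([[0.0], [dt]])
theta = np.array([10.0, 0.1])              # hypothetical [state weight, control weight]
q = np.diag([theta[0], 1.0])               # theta[0] weights the position error
r = np.array([[theta[1]]])                 # theta[1] weights the control effort
f_gain, p = dlqr_gain(a, b, q, r)
closed_loop = a - b @ f_gain               # dynamics under u_k = -F z_k
```

Each candidate θ therefore maps to one gain matrix, and the tracking experiment run with that gain yields the effect index that the optimizer sees.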
The LQR controller is used for trajectory tracking control. The reference trajectory is obtained with the following cubic spline interpolation function:
Figure BDA0003946135710000097
where pos denotes the reference position, h_1, …, h_{m+1} denote the m+1 reference anchor points, and a_1, b_1, c_1, d_1, …, a_m, b_m, c_m, d_m are the corresponding coefficients. In this example, the lateral positions of the reference trajectory anchor points are set to [0.0, 6.0, 12.5, 10.0, 17.5, 20.0, 25.0] and the longitudinal positions are set to [0.0, -3.0, -5.0, 6.5, 3.0, 0.0].
An initial data set
Figure BDA0003946135710000101
is constructed for the Bayesian optimization proxy model. In this embodiment, n values of θ are taken to form X = {θ_1, …, θ_n}; the parameter groups are substituted into the LQR controller one by one for trajectory tracking, giving the controller effect evaluation set
Figure BDA0003946135710000102
In this embodiment, each of the two parameters of θ takes the values [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000]; therefore n = 64.
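The count n = 64 follows from pairing every value of the first parameter with every value of the second, as the short Python sketch below shows. The variable names are hypothetical.

```python
from itertools import product

GRID_VALUES = [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000]

# Every (state weight, control weight) pair from the two 8-value lists.
initial_thetas = list(product(GRID_VALUES, repeat=2))
n = len(initial_thetas)
```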
The Bayesian optimization main loop is then entered.
First, Gaussian process regression is used to obtain the posterior distribution of the data set
Figure BDA0003946135710000103
Without loss of generality, the prior mean function of the Gaussian process is taken to be zero, and the first n-1 points are used to compute the prior covariance function, as follows:
Figure BDA0003946135710000104
The kernel function k(θ_i, θ_j), i ∈ [1, n-1], j ∈ [1, n-1], has the complete form:
Figure BDA0003946135710000105
wherein the diagonal matrix
Figure BDA0003946135710000106
denotes the length-scale hyperparameters, and λ_i, i ∈ [1, n-1], denotes the length-scale parameter corresponding to the parameter θ_i.
The hyperparameters are all obtained by minimizing the logarithm of the marginal likelihood:
Figure BDA0003946135710000107
wherein
Figure BDA0003946135710000108
denotes the likelihood distribution and
Figure BDA0003946135710000109
denotes the set of all hyperparameters of the Gaussian process.
The posterior mean function
Figure BDA00039461357100001010
is calculated as follows:
Figure BDA00039461357100001011
wherein I denotes the identity matrix, X_{1:n-1} = {θ_1, …, θ_{n-1}},
Figure BDA00039461357100001012
Figure BDA00039461357100001014
The posterior distribution covariance
Figure BDA00039461357100001013
is calculated as follows:
Figure BDA0003946135710000111
Second, the next point to be evaluated, as recommended by Bayesian optimization, is obtained using the following proxy function:
Figure BDA0003946135710000112
wherein AC(X) denotes the proxy function, N denotes the number of Monte Carlo sampling points, i denotes the index of the current sampling point, q denotes the total number of parallel batches, and j denotes the current batch. X = {X_1, …, X_q} denotes the partition of the parameter groups into q batches, where X_q denotes the set of parameter groups in the q-th batch.
Figure BDA0003946135710000113
denotes the posterior mean function
Figure BDA0003946135710000114
of batch j; L(X) is obtained by Cholesky decomposition of the posterior covariance of the Gaussian process
Figure BDA0003946135710000115
that is, it satisfies
Figure BDA0003946135710000116
Figure BDA0003946135710000117
denotes a sample drawn from the standard normal distribution, and
Figure BDA0003946135710000118
denotes the best value observed in the current data set, min Y.
Third, a controller performance effect experiment is carried out on the vehicle with θ_{n+1}, and the effect index
Figure BDA0003946135710000119
is collected and added to the existing data set
Figure BDA00039461357100001110
and the posterior distribution of Bayesian optimization is updated.
When the difference between the controller performance index and the ideal index, or the difference between the posterior distribution mean of the proxy model and the set threshold, is smaller than the set threshold, the Bayesian optimization main loop is exited and the solution is obtained.
The above algorithm is implemented and deployed on computer media and equipment. In this embodiment, the computer is a notebook computer with an i5-10210U CPU and 16 GB of memory, running the Windows 10 operating system with Python 3.9.6, PyTorch 1.12.1, GPyTorch 1.9.0, BoTorch 0.7.2, NumPy 1.23.3 and Matplotlib 3.6.0.
The program operating parameters are configured as follows: the wheel diameter of the vehicle is 0.5 m and the maximum steering angle is 45 degrees. The discrete sampling time of the dynamics model is 0.1 s. The Gaussian process model is a single-task Gaussian process with 16 initialization samples. Bayesian optimization is run three times, each run searching for 16 rounds; q of the proxy function qEI is set to 1, the number of Monte Carlo samples is set to 64, and the search bounds are all set to [0.0001, 100].
FIG. 2 shows how the objective function (i.e., the position error of the LQR controller's trajectory tracking) changes during the automatic search for the optimal parameter group when the method of this embodiment (i.e., the Bayesian-optimization-based automatic optimization method for the vehicle automatic driving control algorithm described in S1 to S3 above) is run. It shows that the method of this embodiment can effectively and automatically optimize the parameter group of the controller.
FIG. 3 shows the trajectory tracking control effect of the LQR controller designed with the parameters obtained by running the method of this embodiment (i.e., the Bayesian-optimization-based automatic optimization method for the vehicle automatic driving control algorithm described in S1 to S3).
Additionally, in other embodiments, the present invention may also provide an autonomous driving vehicle controller optimization apparatus based on bayesian optimization, comprising a memory and a processor;
the memory for storing a computer program;
the processor is configured to implement the automated vehicle driving controller optimization method based on bayesian optimization as described in S1 to S3 above when executing the computer program.
In addition, in another embodiment, the present invention may further provide a computer-readable storage medium, wherein the storage medium stores a computer program, and when the computer program is executed by a processor, the method for automatically optimizing an automatic vehicle driving controller based on bayesian optimization as described in S1 to S3 above can be implemented.
It should be noted that the memory may include a random access memory (RAM) or a non-volatile memory (NVM), such as at least one disk memory. The processor may be a general-purpose processor, including a central processing unit (CPU), a neural network processor (NPU), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Of course, the device should also have the components necessary for program operation, such as a power supply and a communication bus.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Other modifications or equivalent substitutions made to the technical solutions of the present invention by those of ordinary skill in the art, without departing from the spirit and scope of those technical solutions, shall fall within the scope of the claims of the present invention.

Claims (10)

1. An automatic driving carrier controller optimization method based on Bayesian optimization, characterized by comprising the following steps:
S1: initializing a sample data set
Figure FDA0003946135700000011
S2: modeling the data set
Figure FDA0003946135700000012
using a proxy model;
establishing a Bayesian optimization proxy function, and repeating the following steps in a loop:
S21: obtaining the mean and variance of the posterior distribution through proxy model regression;
S22: obtaining a parameter group to be evaluated through the Bayesian optimization proxy function;
S23: verifying the obtained parameter group to be evaluated on the carrier vehicle, and adding the resulting data to the sample data set
Figure FDA0003946135700000013
S3: if the parameter group to be evaluated meets the termination condition, exiting the loop of S2 and obtaining the index parameters required by the controller.
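The loop S1 to S3 of claim 1 can be illustrated with a minimal, self-contained sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: a single scalar parameter on a grid, a squared-exponential kernel with fixed length scale, a lower-confidence-bound rule standing in for the Bayesian optimization proxy function, and the hypothetical `tracking_error` function standing in for the on-vehicle experiment.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar parameters."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Return lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_chol(L, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(X, Y, noise=1e-4):
    """Fit a constant-mean GP to the data set; returns what prediction needs."""
    mean_y = sum(Y) / len(Y)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_chol(L, [y - mean_y for y in Y])
    return mean_y, L, alpha

def predict(model, X, theta):
    """S21: posterior mean and variance at a candidate parameter theta."""
    mean_y, L, alpha = model
    k_star = [rbf(theta, a) for a in X]
    mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_chol(L, k_star)
    var = max(1.0 - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mu, var

def tracking_error(theta):
    """Hypothetical stand-in for the on-vehicle experiment in S23."""
    return (theta - 0.3) ** 2

def optimize(n_iter=15, tol=1e-4):
    X = [0.0, 0.5, 1.0]                        # S1: initial parameter groups
    Y = [tracking_error(t) for t in X]
    candidates = [i / 100 for i in range(101)]
    for _ in range(n_iter):
        model = fit_gp(X, Y)                   # S2: model the data set
        def lcb(t):                            # S22: lower confidence bound (assumed rule)
            mu, var = predict(model, X, t)
            return mu - 2.0 * math.sqrt(var)
        theta_next = min((c for c in candidates if c not in X), key=lcb)
        y_next = tracking_error(theta_next)    # S23: run the experiment
        X.append(theta_next)                   # S23: augment the data set
        Y.append(y_next)
        if y_next < tol:                       # S3: termination condition
            break
    best = min(range(len(Y)), key=Y.__getitem__)
    return X[best], Y[best]

best_theta, best_error = optimize()
```

The grid search over candidates keeps the sketch dependency-free; a real tuning run would replace `tracking_error` with a measured effect index and would likely optimize the proxy function over a continuous or mixed domain.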
2. The automated driving vehicle controller optimization method based on Bayesian optimization according to claim 1, wherein:
S1: modeling the performance index of the controller to be evaluated to obtain an evaluation function; selecting, within the reachable domain of the parameter group to be optimized, n feasible combinations as parameter groups to be evaluated; applying the parameter groups to be evaluated on the carrier vehicle, performing controller performance effect experiments, and collecting the effect index data set
Figure FDA0003946135700000014
S2: taking the set X = {θ_1, …, θ_n} of parameter groups to be evaluated from S1 as input, and the corresponding effect index set
Figure FDA0003946135700000015
as output, modeling the effect index data set
Figure FDA0003946135700000016
using a proxy model; wherein θ_i, i ∈ [1, n], denotes a parameter group that has been evaluated, and
Figure FDA0003946135700000017
denotes a value of the controller performance effect;
establishing a Bayesian optimization proxy function, and then repeating the following steps in a loop:
S21: obtaining, through proxy model regression on the effect index data set
Figure FDA0003946135700000018
the mean and variance of its posterior distribution;
S22: substituting the mean function and the variance function of the posterior distribution obtained in S21 into the Bayesian optimization proxy function to obtain the recommended solution θ_{n+1} predicted by the proxy function;
S23: performing a controller performance effect experiment on the carrier vehicle using the parameter group represented by the recommended solution obtained in S22, collecting the effect index
Figure FDA0003946135700000021
and adding this data to the existing data set
Figure FDA0003946135700000022
S3: when the difference between the controller effect index and the ideal index is smaller than a set threshold, or the difference between the posterior distribution mean and the set threshold value is smaller than the set threshold, exiting the loop; the recommended solution obtained is the index parameter required by the controller.
3. The automated driving vehicle controller optimization method based on Bayesian optimization according to claim 2, wherein: in S1, the evaluation function of the vehicle system control performance index is modeled as follows:
Figure FDA0003946135700000023
wherein
Figure FDA0003946135700000024
represents the control performance obtained when the parameter group θ is applied to the system, expressed as a weighted fusion of control accuracy, vehicle safety and control cost;
Figure FDA0003946135700000025
represents Gaussian-distributed noise with variance
Figure FDA0003946135700000026
and
Figure FDA0003946135700000027
represents the corresponding noisy evaluation measured after the parameter group θ is applied to the system.
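As a rough illustration of the evaluation model in claim 3, the sketch below builds a weighted fusion of three terms and adds Gaussian noise to it. The weights, the input values, the zero mean of the noise, and the function names are all assumptions made for this example; the patent's own formula is in the referenced figure and is not reproduced here.

```python
import random

def control_performance(accuracy_err, safety_penalty, control_cost,
                        weights=(0.6, 0.3, 0.1)):
    """Weighted fusion of the three terms named in claim 3 (weights are assumed)."""
    w_acc, w_safe, w_cost = weights
    return w_acc * accuracy_err + w_safe * safety_penalty + w_cost * control_cost

def noisy_evaluation(j_theta, sigma, rng):
    """Measured evaluation: performance plus Gaussian noise (zero mean assumed)."""
    return j_theta + rng.gauss(0.0, sigma)

rng = random.Random(0)
j = control_performance(0.20, 0.05, 0.10)
samples = [noisy_evaluation(j, 0.01, rng) for _ in range(2000)]
mean_estimate = sum(samples) / len(samples)
```

Averaging repeated noisy evaluations recovers the underlying performance value, which is why a regression model that accounts for observation noise is a natural fit for this kind of data.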
4. The automated driving vehicle controller optimization method based on Bayesian optimization according to claim 2, wherein: in S1, the reachable domain of the parameter group to be optimized is a mixed space comprising a discrete space and a continuous space; some of the parameters to be optimized belong to the discrete space, and the others belong to the continuous space.
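A mixed reachable domain of the kind described in claim 4 can be sketched as follows. The parameter names, ranges, and the uniform random initial design are hypothetical choices for illustration; the patent does not specify them in this text.

```python
import random

# Hypothetical mixed search space: names and ranges are illustrative only.
SPACE = {
    "q_lateral": ("continuous", 0.1, 10.0),
    "q_heading": ("continuous", 0.1, 10.0),
    "preview_steps": ("discrete", [5, 10, 20, 40]),
}

def sample_parameter_group(space, rng):
    """Draw one feasible parameter group from the mixed space."""
    group = {}
    for name, spec in space.items():
        if spec[0] == "continuous":
            _, low, high = spec
            group[name] = rng.uniform(low, high)   # continuous dimension
        else:
            group[name] = rng.choice(spec[1])      # discrete dimension
    return group

def initial_design(space, n, seed=0):
    """Select n feasible combinations as the initial parameter groups (S1)."""
    rng = random.Random(seed)
    return [sample_parameter_group(space, rng) for _ in range(n)]

design = initial_design(SPACE, n=5)
```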
5. The automated driving vehicle controller optimization method based on Bayesian optimization according to claim 2, wherein: in S2, the proxy model is a Gaussian process, which is fully specified by a mean function μ(X) and a covariance function K(X, X);
the mean function μ(X) is:
Figure FDA0003946135700000028
where Ψ(X) denotes a polynomial function of order p, α_p denotes the coefficient of each order, and C is a constant;
the covariance function K(X, X) is:
Figure FDA0003946135700000029
wherein the kernel function k(θ_i, θ_j), i ∈ [1, n], j ∈ [1, n], has the complete form:
Figure FDA00039461357000000210
wherein the diagonal matrix
Figure FDA0003946135700000031
denotes the length-scale hyperparameters, and λ_i, i ∈ [1, n], denotes the length-scale parameter corresponding to the parameter θ_i;
the objective function of the proxy model regression is the logarithm of the marginal likelihood, as follows:
Figure FDA0003946135700000032
wherein
Figure FDA0003946135700000033
denotes the likelihood distribution, and
Figure FDA0003946135700000034
denotes the set of all hyperparameters of the Gaussian process.
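The formulas of claim 5 are contained in figures that are not reproduced in this text. For orientation only, one common textbook parameterization of a Gaussian process with a diagonal length-scale matrix is given below. It is an assumption and may differ from the patent's figures; the symbols d, σ_f, σ_n, η and K_η are introduced here for illustration.

```latex
% Assumed ARD squared-exponential kernel; d is the number of tuned parameters,
% \sigma_f^2 a signal variance, \Lambda the diagonal length-scale matrix.
k(\theta_i, \theta_j) = \sigma_f^2 \exp\!\left(-\tfrac{1}{2}
  (\theta_i - \theta_j)^{\top} \Lambda^{-2} (\theta_i - \theta_j)\right),
\qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_d)

% Log marginal likelihood of n observations Y under hyperparameters \eta,
% with K_\eta = K(X, X) + \sigma_n^2 I and mean vector \mu(X).
\log p(Y \mid X, \eta) = -\tfrac{1}{2} (Y - \mu(X))^{\top} K_\eta^{-1} (Y - \mu(X))
  - \tfrac{1}{2} \log \lvert K_\eta \rvert - \tfrac{n}{2} \log 2\pi
```

Maximizing this log marginal likelihood over η is the usual way to fit the hyperparameters in Gaussian process regression.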
6. The automated driving vehicle controller optimization method based on Bayesian optimization according to claim 2, wherein: the Bayesian optimization proxy function is modeled as follows:
Figure FDA0003946135700000035
wherein AC(X) denotes the proxy function, N denotes the number of Monte Carlo sampling points, i denotes the index of the current sampling point, q denotes the total number of parallel batches, and j denotes the current batch; X = {X_1, …, X_q} denotes the partition of the parameter groups into q batches, where X_q denotes the set of parameter groups of the q-th batch;
Figure FDA0003946135700000036
denotes batch j of the posterior mean function
Figure FDA0003946135700000037
L(X) is the Cholesky factor of the Gaussian process posterior covariance
Figure FDA0003946135700000038
namely, it satisfies
Figure FDA0003946135700000039
represents a sample drawn from the standard normal distribution,
Figure FDA00039461357000000310
represents the optimal value min Y observed in the current data set.
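The quantities listed in claim 6 (N Monte Carlo samples, q parallel candidates, a posterior mean, a factor L(X) of the posterior covariance, standard normal samples, and the best observed value min Y) match the ingredients of a Monte Carlo batch expected-improvement estimate, a form of what is commonly called an acquisition function. The sketch below is one such estimator for a minimization objective, reading the decomposition in claim 6 as a Cholesky factorization. It is a hedged illustration under those assumptions, not a transcription of the patent's figure.

```python
import math
import random

def cholesky(A):
    """Return lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def mc_batch_improvement(mean, cov, best_y, n_samples=2000, seed=0):
    """Monte Carlo estimate of batch expected improvement (minimization).

    mean   -- posterior means of the q candidates
    cov    -- q x q posterior covariance of the candidates
    best_y -- best (smallest) value observed so far, i.e. min Y
    """
    rng = random.Random(seed)
    L = cholesky(cov)
    q = len(mean)
    total = 0.0
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(q)]   # standard normal sample
        # Joint posterior sample: xi = mean + L @ eps
        xi = [mean[j] + sum(L[j][k] * eps[k] for k in range(j + 1))
              for j in range(q)]
        # Improvement of the best candidate in the batch, floored at zero
        total += max(max(best_y - x, 0.0) for x in xi)
    return total / n_samples

single = mc_batch_improvement([0.0], [[1.0]], best_y=0.0)
hopeless = mc_batch_improvement([5.0, 5.0], [[1e-6, 0.0], [0.0, 1e-6]], best_y=0.0)
```

For a single candidate with zero mean, unit variance and `best_y = 0`, the exact expected improvement is 1/√(2π), so `single` should land near that value; `hopeless` is zero because neither candidate has any realistic chance of beating the incumbent.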
7. The automated driving vehicle controller optimization method based on Bayesian optimization according to claim 6, wherein: the posterior mean function
Figure FDA00039461357000000311
is calculated as follows:
Figure FDA00039461357000000312
wherein I denotes the identity matrix, X_{1:n-1} = {θ_1, …, θ_{n-1}},
Figure FDA00039461357000000313
Figure FDA00039461357000000316
Figure FDA00039461357000000317
the posterior distribution covariance
Figure FDA00039461357000000314
is calculated as follows:
Figure FDA00039461357000000315
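The posterior formulas of claim 7 are likewise in figures not reproduced here. For reference, the standard Gaussian process regression posterior, written with the claim's X_{1:n-1} and identity matrix I, is given below. The symbols Y_{1:n-1}, σ², μ_post and Σ_post are assumed notation, and the patent's figures may arrange the terms differently.

```latex
% Assumed notation: Y_{1:n-1} are the observed effect indexes for X_{1:n-1},
% \sigma^2 is the observation-noise variance, I is the identity matrix.
\mu_{\text{post}}(\theta) = \mu(\theta) + K(\theta, X_{1:n-1})
  \left[ K(X_{1:n-1}, X_{1:n-1}) + \sigma^2 I \right]^{-1}
  \left( Y_{1:n-1} - \mu(X_{1:n-1}) \right)

\Sigma_{\text{post}}(\theta, \theta') = K(\theta, \theta') - K(\theta, X_{1:n-1})
  \left[ K(X_{1:n-1}, X_{1:n-1}) + \sigma^2 I \right]^{-1} K(X_{1:n-1}, \theta')
```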
8. The automated driving vehicle controller optimization method based on Bayesian optimization according to claim 6, wherein: the Bayesian optimization proxy function further comprises a minimization or maximization operation, chosen according to whether the objective function is required to be minimized or maximized;
a minimization requirement corresponds to the proxy function with a minimization operation, and a maximization requirement corresponds to the proxy function with a maximization operation.
9. A computer-readable storage medium, characterized in that: the storage medium stores a computer program which, when executed by a processor, implements the automated driving vehicle controller optimization method based on Bayesian optimization according to any one of claims 1-8.
10. An automated driving carrier controller optimization apparatus based on Bayesian optimization, characterized by comprising a memory for storing a computer program and a processor which, when executing the computer program, carries out the automated driving carrier controller optimization method based on Bayesian optimization according to any one of claims 1-8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211433936.4A CN115755606B (en) 2022-11-16 2022-11-16 Automatic optimization method, medium and equipment for carrier controller based on Bayesian optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211433936.4A CN115755606B (en) 2022-11-16 2022-11-16 Automatic optimization method, medium and equipment for carrier controller based on Bayesian optimization

Publications (2)

Publication Number Publication Date
CN115755606A (en) 2023-03-07
CN115755606B (en) 2023-07-07

Family

ID=85372191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211433936.4A Active CN115755606B (en) 2022-11-16 2022-11-16 Automatic optimization method, medium and equipment for carrier controller based on Bayesian optimization

Country Status (1)

Country Link
CN (1) CN115755606B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108573281A (en) * 2018-04-11 2018-09-25 中科弘云科技(北京)有限公司 A kind of tuning improved method of the deep learning hyper parameter based on Bayes's optimization
CN110304075A (en) * 2019-07-04 2019-10-08 清华大学 Track of vehicle prediction technique based on Mix-state DBN and Gaussian process
CN111460368A (en) * 2020-05-22 2020-07-28 南京大学 Parallel Bayesian optimization method
EP3748556A1 (en) * 2019-06-06 2020-12-09 Robert Bosch GmbH Method and device for determining a control strategy for a technical system
CN112163373A (en) * 2020-09-23 2021-01-01 中国民航大学 Radar system performance index dynamic evaluation method based on Bayesian machine learning
CN113525406A (en) * 2020-04-15 2021-10-22 百度(美国)有限责任公司 Bayesian global optimization based parameter tuning for vehicle motion controllers
CN113874865A (en) * 2019-06-06 2021-12-31 罗伯特·博世有限公司 Method and device for determining model parameters of a control strategy of a technical system by means of a Bayesian optimization method
US20220126441A1 (en) * 2020-10-28 2022-04-28 Robert Bosch Gmbh Method for optimizing a policy for a robot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Autonomous Vehicle Control Through the Dynamics and Controller Learning", 《IEEE》 *
ALONSO MARCO 等: "Automatic LQR tuning based on Gaussian process global optimization", 《IEEE》 *

Also Published As

Publication number Publication date
CN115755606B (en) 2023-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant