CN113353088A - Vehicle running speed control method, device, control equipment and storage medium

Info

Publication number: CN113353088A
Application number: CN202110804852.6A
Authority: CN
Inventors: 李垚; 唐子烨; 倪昆
Original assignee: Suzhou Zhijia Technology Co Ltd
Current assignee: Suzhou Zhijia Technology Co Ltd
Priority date: 2021-07-16
Filing date: 2021-07-16
Publication date: 2021-09-07
Anticipated expiration: 2041-07-16
Also published as: CN113353088B

Abstract

The present specification discloses a vehicle running speed control method, apparatus, control device and storage medium, the method comprising: receiving running time information and oil consumption information of a specified vehicle after running on a current road section based on given running speed information; updating a vehicle speed control algorithm by taking the running time information and the oil consumption information as learning rewards; the vehicle speed control algorithm is constructed based on a reinforcement learning model; acquiring current vehicle state information and specified road information of the specified vehicle; the specified road information at least comprises road information of a target road section to be driven in front of the specified vehicle; and processing the vehicle state information and the specified road information by using the updated vehicle speed control algorithm to obtain target running speed information of the specified vehicle on the target road section, so that the specified vehicle runs on the target road section based on the target running speed information, and the running speed control of the vehicle can be dynamically adapted to the running time and the oil consumption requirement.

Description

Vehicle running speed control method, device, control equipment and storage medium

Technical Field

The present specification relates to the field of automatic driving, and in particular, to a vehicle running speed control method, apparatus, control device, and storage medium.

Background

For autonomous vehicles that must operate around the clock, such as autonomous taxis or autonomous logistics vehicles, fuel consumption and travel time are among the performance indicators of greatest concern. At present, most methods for improving the fuel economy of an autonomous vehicle focus on vehicle hardware such as the engine and the transmission: fuel economy is improved by raising engine efficiency and refining the shift schedule of the transmission. Because these methods are tied directly to the hardware, the room for improvement is limited.

Disclosure of Invention

An object of the embodiments of the present disclosure is to provide a method, an apparatus, a control device, and a storage medium for controlling a vehicle running speed, which can improve flexibility of controlling the vehicle running speed, so as to dynamically adapt the vehicle running speed control to a running time and a fuel consumption requirement.

The present specification provides a vehicle running speed control method, device, control apparatus and storage medium, which are implemented in the following ways:

a vehicle travel speed control method applied to a vehicle control apparatus, the method comprising: receiving running time information and oil consumption information of a specified vehicle after running on a current road section based on given running speed information; updating a vehicle speed control algorithm by taking the running time information and the oil consumption information as learning rewards; the vehicle speed control algorithm is constructed based on a reinforcement learning model; acquiring current vehicle state information and specified road information of the specified vehicle; the specified road information at least comprises road information of a target road section to be driven in front of the specified vehicle; and processing the vehicle state information and the specified road information by using the updated vehicle speed control algorithm to obtain target running speed information of the specified vehicle on a target road section, so that the specified vehicle runs on the target road section based on the target running speed information.

In other embodiments of the method provided in this specification, the updating the vehicle speed control algorithm with the travel time information and the fuel consumption information as learning rewards includes: acquiring a preconfigured time weight and a preconfigured oil consumption weight; wherein the time weight is used for representing the control importance degree of the running time of the specified vehicle; the fuel consumption weight is used for representing the fuel consumption control importance of the specified vehicle; carrying out weighted average processing on the running time information and the oil consumption information by using the time weight and the oil consumption weight; and taking the weighted average processing result as learning reward, and updating the vehicle speed control algorithm.

In still other embodiments of the methods provided herein, the designated road information includes at least grade information for a target road segment to be traveled ahead of the designated vehicle.

In other embodiments of the method provided in this specification, the time weight and the fuel consumption weight are determined by: simulating the whole-course oil consumption and the whole-course running time of the specified vehicle running on the target running route under the control of the vehicle speed control algorithm based on any given time weight and oil consumption weight; and determining the time weight and the oil consumption weight corresponding to the running of the specified vehicle on the target running route based on the whole-course oil consumption and the whole-course running time.

On the other hand, the embodiments of the present specification also provide a vehicle running speed control apparatus, the apparatus including: the receiving module is used for receiving the running time information and the oil consumption information of the specified vehicle after running on the current road section based on the given running speed information; the updating module is used for updating a vehicle speed control algorithm by taking the running time information and the oil consumption information as learning rewards; the vehicle speed control algorithm is constructed based on a reinforcement learning model; the acquisition module is used for acquiring the current vehicle state information and the specified road information of the specified vehicle; the specified road information at least comprises road information of a target road section to be driven in front of the specified vehicle; and the vehicle speed determining module is used for processing the vehicle state information and the specified road information by using the updated vehicle speed control algorithm to obtain target running speed information of the specified vehicle on a target road section, so that the specified vehicle runs on the target road section based on the target running speed information.

In other embodiments of the apparatus provided in this specification, the update module is further configured to: obtain a preconfigured time weight and a preconfigured fuel consumption weight, wherein the time weight represents the importance of controlling the travel time of the specified vehicle and the fuel consumption weight represents the importance of controlling the fuel consumption of the specified vehicle; perform weighted average processing on the travel time information and the fuel consumption information using the time weight and the fuel consumption weight; and update the vehicle speed control algorithm with the weighted average processing result as the learning reward.

In still other embodiments of the apparatus provided herein, the designated road information includes at least gradient information of a target road segment to be traveled ahead of the designated vehicle.

In other embodiments of the device provided in this specification, the device further includes a weight configuration module, configured to simulate, based on any given time weight and oil consumption weight, a full-course oil consumption and a full-course travel time of the specified vehicle traveling on a target travel route under the control of the vehicle speed control algorithm; and determining the time weight and the oil consumption weight corresponding to the running of the specified vehicle on the target running route based on the whole-course oil consumption and the whole-course running time.

In another aspect, this specification further provides a control device, where the control device includes at least one processor and a memory for storing processor-executable instructions, and the instructions, when executed by the processor, implement the steps of the method according to any one or more of the foregoing embodiments.

In another aspect, the present specification provides a computer readable storage medium, on which computer instructions are stored, and the instructions, when executed, implement the steps of the method according to any one or more of the above embodiments.

The vehicle running speed control method, the vehicle running speed control device and the storage medium provided by one or more embodiments of the specification construct a vehicle control algorithm based on a reinforcement learning model, acquire actual running time and oil consumption corresponding to a vehicle running based on a running speed output by the vehicle control algorithm, and use the actual running time and the oil consumption as learning rewards of the vehicle control algorithm, so that the vehicle control algorithm timely and accurately adjusts a learning strategy based on the learning rewards, and dynamic updating of the algorithm is realized; and controlling the vehicle speed based on the updated vehicle speed control algorithm, and repeating the steps to realize the dynamic control of the vehicle speed by taking the actual requirements of the running time and the oil consumption as targets, so that the running speed control of the vehicle is adapted to the actual requirements of the oil consumption and the running time.

Drawings

In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present specification, and for those skilled in the art, other drawings can be obtained according to the drawings without any creative effort. In the drawings:

FIG. 1 is a schematic view of a mechanical model of a vehicle provided herein;

FIG. 2 is a schematic feedback flow diagram of a vehicle speed control algorithm provided herein;

FIG. 3 is a schematic flow chart of an implementation of a method for controlling a driving speed of a vehicle provided by the present disclosure;

FIG. 4 is a schematic view of a vehicle travel speed control flow provided herein;

FIG. 5 is a schematic block diagram of a vehicle travel speed control device provided in the present specification.

Detailed Description

In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in one or more embodiments of the present specification will be clearly and completely described below with reference to the drawings in one or more embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the specification, and not all embodiments. All other embodiments obtained by a person skilled in the art based on one or more embodiments of the present specification without making any creative effort shall fall within the protection scope of the embodiments of the present specification.

In one example of the scenario provided in this specification, the vehicle running speed control method may be applied to a control device, and the control device may be integrated in a vehicle or may be a device that is arranged independently of the vehicle. Preferably, the control device can be integrated in a vehicle to control the vehicle in real time, so that the control timeliness of the vehicle is improved.

A vehicle speed control algorithm may be constructed in advance and stored in the control apparatus. The control apparatus may control the running speed of the vehicle based on the vehicle speed control algorithm. Preferably, the vehicle speed control algorithm may be constructed based on a reinforcement learning model.

The vehicle speed control algorithm may, for example, employ a deep Q-learning model, i.e. a Deep Q-Network (DQN). The core variable of the DQN model is the Q value, which can be understood as the effect of taking a certain action in a certain state, including both the immediate reward and the delayed future reward. The core idea of the algorithm is to approximate the Q value with a neural network and to select an appropriate action for the current state accordingly. The training procedure closely resembles supervised learning. The labeled data of supervised learning corresponds to the state-action transitions (stored in replay memory) of reinforcement learning; unlike supervised learning, however, the labels of the data are computed by a target network (target net). Since the target network is dynamically updated, the labels of the same batch of data also change dynamically. As in supervised learning, the policy network (policy net) is fitted and iteratively updated by regression on the labeled data. The target network is generally a copy of the policy network that is updated at a lower frequency. Of course, the vehicle speed control algorithm may also be constructed using other types of reinforcement learning models.
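The relationship between replay memory, target network and policy network described above can be illustrated with a minimal sketch. It is not the implementation of this specification: a linear function stands in for the neural network, and the sizes, the discount factor `GAMMA`, the learning rate `LR`, the refresh period `SYNC_EVERY` and the toy transitions are all assumptions made for illustration.

```python
import random

GAMMA = 0.9        # assumed discount factor
LR = 0.05          # assumed learning rate
SYNC_EVERY = 20    # assumed target-net refresh period, in training steps
N_STATE, N_ACTION = 3, 2   # hypothetical state and action sizes


def q_values(weights, state):
    """Linear stand-in for the Q-network: one weight vector per action."""
    return [sum(w * x for w, x in zip(weights[a], state)) for a in range(N_ACTION)]


def td_label(transition, target_weights):
    """Label = r + gamma * max_a' Q_target(s', a'), or just r at a terminal step."""
    _, _, reward, next_state, done = transition
    if done:
        return reward
    return reward + GAMMA * max(q_values(target_weights, next_state))


def train_step(policy_weights, target_weights, batch):
    """Regress the policy weights toward the target-net labels, one sample at a time."""
    for transition in batch:
        state, action = transition[0], transition[1]
        error = q_values(policy_weights, state)[action] - td_label(transition, target_weights)
        for i, x in enumerate(state):
            policy_weights[action][i] -= LR * error * x / len(batch)


random.seed(0)
policy = [[0.0] * N_STATE for _ in range(N_ACTION)]
target = [row[:] for row in policy]

# Replay memory of toy transitions: (state, action, reward, next_state, done).
replay = [([random.random() for _ in range(N_STATE)], random.randrange(N_ACTION),
           -random.random(), [random.random() for _ in range(N_STATE)], False)
          for _ in range(200)]

for step in range(1, 101):
    train_step(policy, target, random.sample(replay, 16))
    if step % SYNC_EVERY == 0:
        target = [row[:] for row in policy]   # low-frequency copy of the policy net
```

Because the labels come from `target`, which changes only every `SYNC_EVERY` steps, the regression targets stay fixed between refreshes, which is the role the text assigns to the target network.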

On the basis of the reinforcement learning model, a vehicle speed control algorithm can be further constructed by combining a vehicle mechanical model. As shown in FIG. 1, a vehicle mechanical model can be obtained by analyzing the stress of the vehicle. The force analysis of the vehicle can be expressed by using the following formulas:

$$F_j = F_d - F_g - F_r - F_a$$

$$F_j = ma$$

$$F_g = mg\sin\theta$$

$$F_r = f\,mg\cos\theta$$

$$F_a = 0.5\,\rho_a C_d A_r v^2$$

where $F_j$ represents the total force acting on the vehicle, $F_d$ the driving force acting on the vehicle, $F_g$ the ramp resistance, $F_r$ the rolling resistance, $F_a$ the air resistance, $m$ the vehicle mass, $a$ the vehicle acceleration, $g$ the gravitational acceleration, $\theta$ the gradient, $f$ the road rolling resistance coefficient, $\rho_a$ the air density, $C_d$ the wind resistance coefficient of the vehicle, $A_r$ the frontal area of the vehicle, $v$ the travel speed of the vehicle, $T_e$ the engine torque of the vehicle, $R$ the tire radius of the vehicle, $i_f$ the final drive ratio of the vehicle, $i_g$ the transmission gear ratio of the vehicle, and $\delta$ the transmission efficiency.
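The force balance above can be evaluated numerically as in the following sketch. The function names and the numeric values in the usage lines are hypothetical and serve only to illustrate the formulas; the gradient `theta` is taken in radians.

```python
import math

G = 9.81  # gravitational acceleration g, m/s^2


def resistance_forces(m, v, theta, f, rho_a, c_d, a_r):
    """Return (F_g, F_r, F_a) in newtons for mass m, speed v and gradient theta."""
    f_g = m * G * math.sin(theta)              # ramp resistance
    f_r = f * m * G * math.cos(theta)          # rolling resistance
    f_a = 0.5 * rho_a * c_d * a_r * v ** 2     # air resistance
    return f_g, f_r, f_a


def acceleration(f_d, m, v, theta, f, rho_a, c_d, a_r):
    """Solve F_j = F_d - F_g - F_r - F_a = m * a for the acceleration a."""
    f_g, f_r, f_a = resistance_forces(m, v, theta, f, rho_a, c_d, a_r)
    return (f_d - f_g - f_r - f_a) / m


# Hypothetical vehicle on a flat road at 20 m/s.
params = dict(m=1500.0, v=20.0, theta=0.0, f=0.01, rho_a=1.2, c_d=0.3, a_r=2.2)
f_g, f_r, f_a = resistance_forces(**params)
a_cruise = acceleration(f_g + f_r + f_a, **params)   # drive force equal to total resistance
```

When the driving force exactly balances the three resistances, as in the last line, the resulting acceleration is zero and the vehicle holds its speed.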

The vehicle mass, frontal area, tire radius and the like are basic mechanical parameters of the vehicle; the rolling resistance coefficient, air density, wind resistance coefficient, road gradient and the like are external environment parameters; and the parameters related to the engine torque are power parameters of the vehicle. The speed and acceleration of the vehicle are determined jointly by the power parameters, the inherent mechanical parameters and the external environment parameters. When the basic mechanical parameters and the external environment parameters are known, the vehicle mechanical model can be used to determine the power parameters needed to meet a required running speed and acceleration, thereby controlling the running speed of the vehicle. Correspondingly, the basic mechanical parameter information of the vehicle and the external environment parameters of the road to be driven can be used as inputs of the vehicle speed control algorithm, which outputs the power parameter data of the vehicle for that road; the vehicle then drives based on the power parameter data, so that its running speed is controlled.
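As a sketch of this inverse use of the mechanical model, the required driving force follows from rearranging the force balance for a desired acceleration. Converting that force into an engine torque needs a driveline relation that the text above does not state; the sketch assumes the common form F_d = T_e · i_g · i_f · δ / R, and that relation, the function names and the example numbers are assumptions rather than part of this specification.

```python
import math

G = 9.81  # gravitational acceleration g, m/s^2


def required_drive_force(a, m, v, theta, f, rho_a, c_d, a_r):
    """F_d = m*a + F_g + F_r + F_a, rearranged from the force balance."""
    return (m * a
            + m * G * math.sin(theta)
            + f * m * G * math.cos(theta)
            + 0.5 * rho_a * c_d * a_r * v ** 2)


def required_engine_torque(f_d, r, i_g, i_f, delta):
    """ASSUMED driveline relation F_d = T_e * i_g * i_f * delta / R, solved for T_e."""
    return f_d * r / (i_g * i_f * delta)


# Hypothetical example: hold 20 m/s on a flat road.
force = required_drive_force(a=0.0, m=1500.0, v=20.0, theta=0.0,
                             f=0.01, rho_a=1.2, c_d=0.3, a_r=2.2)
torque = required_engine_torque(force, r=0.3, i_g=1.0, i_f=3.0, delta=0.9)
```

A positive desired acceleration or an uphill gradient increases `force`, and therefore the torque the engine must deliver in a given gear.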

As shown in FIG. 2, the travel time and fuel consumption of the vehicle when traveling based on the output power parameter data may be fed back to the vehicle speed control algorithm as a learning reward. The fuel consumption of the vehicle is closely related to power parameters such as engine torque and engine speed, and the travel time is closely related to the initial speed and acceleration of the vehicle. With the travel time and fuel consumption as learning rewards, the vehicle speed control algorithm can use the actual travel time and fuel consumption as evaluation feedback and adjust its learning strategy promptly and accurately, which guides the output of the algorithm. The vehicle speed is then controlled based on the updated algorithm, and these steps are repeated, so that the vehicle speed is controlled dynamically with the actual travel time and fuel consumption requirements as targets. Adjusting the vehicle speed control algorithm through real-time feedback makes the running speed of the vehicle better suited to the actual fuel consumption and travel time requirements.

Based on the above scene example, the present specification also provides a vehicle running speed control method. Fig. 3 is a schematic flow chart of an embodiment of the vehicle running speed control method provided by the present specification. As shown in fig. 3, in one embodiment of the vehicle travel speed control method provided by the present specification, the method may be applied to a control apparatus. The method may comprise the following steps.

S30: receiving the travel time information and the fuel consumption information of the specified vehicle after it has traveled on the current road section based on given travel speed information.

Any autonomous vehicle currently under analysis may be taken as the specified vehicle. In the vehicle speed control process, the travel path of the specified vehicle may be discretized, and each discrete segment obtained is referred to as a road section. The specified vehicle is assumed to travel with uniform acceleration within any road section. This discretization allows the running speed of the vehicle to be controlled more accurately and flexibly.

For example, the travel road may be discretized based on a preset distance interval. Different discretization methods may also be used for different road regions according to the complexity of the actual road; for example, curved and straight road sections may be discretized at different distance intervals. If the travel road is determined, the control device may discretize it in advance as needed and store the resulting road section information. Alternatively, the control device may obtain, in real time, the road within a preset distance ahead of the vehicle and discretize it in real time, which accommodates route changes during actual travel and makes the vehicle speed control better suited to the actual application scenario. Other discretization schemes may also be configured as needed and are not limited herein.
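A minimal sketch of fixed-interval discretization, together with the exit speed and travel time that follow from the uniform-acceleration assumption within one road section, is given below. The function names and the example lengths are hypothetical.

```python
import math


def discretize(total_length_m, interval_m):
    """Split [0, total_length_m] into consecutive (start, end) road sections."""
    sections = []
    start = 0.0
    while start < total_length_m:
        end = min(start + interval_m, total_length_m)
        sections.append((start, end))
        start = end
    return sections


def section_exit(v0, a, length_m):
    """Exit speed and travel time for one section at constant acceleration a.

    Uses v1^2 = v0^2 + 2*a*L and t = 2*L / (v0 + v1); the latter also holds for a == 0.
    """
    v1_squared = v0 ** 2 + 2.0 * a * length_m
    if v1_squared < 0.0:
        raise ValueError("vehicle would stop before the end of the section")
    v1 = math.sqrt(v1_squared)
    if v0 + v1 == 0.0:
        raise ValueError("vehicle does not move")
    return v1, 2.0 * length_m / (v0 + v1)


sections = discretize(250.0, 100.0)            # last section is shorter
v_exit, t_section = section_exit(v0=0.0, a=2.0, length_m=100.0)
```

Different intervals for curved and straight stretches could be handled by calling `discretize` separately on each stretch with its own `interval_m`.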

For convenience of description, the road section for which the control device is to determine the travel speed information is referred to as the target road section, and the road section preceding it as the current road section. The control device may receive the travel time information and fuel consumption information of the specified vehicle after it has traveled on the current road section based on given travel speed information. The travel speed information may include power parameter data of the specified vehicle. The given travel speed information may be the initial travel speed information given when the specified vehicle is started, in which case the current road section is the first road section traveled after start-up. The given travel speed information may also be the travel speed information output by the vehicle speed control algorithm, in which case the current road section is any road section other than the first one in the travel of the specified vehicle.

Traveling on the current road section based on the given travel speed information yields a corresponding travel time and fuel consumption. The time at which the specified vehicle enters the starting point of the current road section can be recorded as the starting time; when the specified vehicle reaches the end of the current road section, the corresponding time can be recorded as the current time. The difference between the current time and the starting time is the travel time of the specified vehicle on the current road section based on the given travel speed information. The control device can also obtain, from the operation data of the specified vehicle, the fuel consumed between the starting time and the current time, which gives the fuel consumption information of the current road section.
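The bookkeeping described above can be sketched as follows, assuming the operation data of the vehicle provides a timestamp and a cumulative fuel counter. The `Sample` type, its field names and the example readings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t_s: float          # timestamp in seconds
    fuel_used_l: float  # cumulative fuel used so far in litres (assumed signal)


def section_feedback(entry: Sample, exit_: Sample):
    """Travel time and fuel consumption between section entry and section exit."""
    return exit_.t_s - entry.t_s, exit_.fuel_used_l - entry.fuel_used_l


# Hypothetical readings at the start and the end of the current road section.
travel_time_s, fuel_l = section_feedback(Sample(10.0, 1.0), Sample(25.0, 1.5))
```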

S32: updating a vehicle speed control algorithm by taking the running time information and the oil consumption information as learning rewards; the vehicle speed control algorithm is constructed based on a reinforcement learning model.

For the construction of the vehicle speed control algorithm, refer to the scenario example above; details are not repeated here. At the start of learning, the vehicle speed control algorithm may have difficulty identifying the learning target, so that its learning deviates from the target and the requirement is hard to meet. In some embodiments, supervised learning may therefore be used first to steer the learning of the vehicle speed control algorithm toward the target. The preliminarily trained vehicle speed control algorithm is then applied to the actual speed control of the vehicle.

The control device may feed back, as a learning reward, the received travel time information and fuel consumption information of the specified vehicle after it has traveled on the current road section based on the given travel speed information, to guide the adjustment of the model parameters of the vehicle speed control algorithm. For example, the goal of the vehicle speed control algorithm may be for the vehicle to use as little fuel and as little travel time as possible from the start point to the end point. Correspondingly, the control device can sum the learning rewards of the discrete road sections to serve as the evaluation function for adjusting the vehicle speed control algorithm, guiding it to update network parameters such as those of the policy network and the target network and to approach the learning target.

Generally, with other external conditions being equal, fuel consumption and travel time conflict with each other: keeping the travel time as short as possible requires relatively high power to reach or maintain a relatively high speed, which in turn requires relatively high fuel consumption. Using a combination of fuel consumption and travel time as the learning reward allows both requirements to be taken into account. In some embodiments, a time weight and a fuel consumption weight may be configured in advance, wherein the time weight represents the importance of controlling the travel time of the specified vehicle and the fuel consumption weight represents the importance of controlling its fuel consumption. The travel time information and the fuel consumption information can be combined by weighted averaging with these two weights, and minimizing the weighted average is taken as the objective of the vehicle speed control algorithm. By configuring the weights, the vehicle speed control algorithm can be biased toward travel time or fuel consumption according to the requirements of the actual application scenario: if the whole-journey travel time of the specified vehicle should be as short as possible, the time weight can be set larger; if the whole-journey fuel consumption should be as low as possible, the fuel consumption weight can be set larger. This achieves an effective balance between fuel consumption and travel time and better meets practical requirements.
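One way to turn the weighted average into a learning reward is sketched below. The text only states that the weighted average is to be minimized; returning its negative as the reward, and dividing each quantity by a reference value so that seconds and litres become comparable, are assumptions of this sketch, as are all names and numbers.

```python
def learning_reward(travel_time_s, fuel_l, w_time, w_fuel,
                    time_ref_s=1.0, fuel_ref_l=1.0):
    """Negative weighted average of (normalized) travel time and fuel consumption.

    time_ref_s and fuel_ref_l are hypothetical normalization constants.
    """
    cost = (w_time * travel_time_s / time_ref_s
            + w_fuel * fuel_l / fuel_ref_l) / (w_time + w_fuel)
    return -cost   # lower cost means higher reward


def route_return(section_rewards):
    """Sum of the per-section rewards, used as the overall evaluation value."""
    return sum(section_rewards)


# Hypothetical feedback from three road sections: (time in s, fuel in l).
feedback = [(10.0, 2.0), (8.0, 3.0), (12.0, 1.0)]
time_focused = route_return(learning_reward(t, q, 0.75, 0.25) for t, q in feedback)
fuel_focused = route_return(learning_reward(t, q, 0.25, 0.75) for t, q in feedback)
```

With a larger `w_time`, a slow section is penalized more heavily than a thirsty one, which is how the weights bias the algorithm toward one requirement or the other.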

The time weight and the fuel consumption weight can be configured according to the requirements of the actual application scenario. Alternatively, for any given time weight and fuel consumption weight, the whole-journey fuel consumption and whole-journey travel time of the specified vehicle on the target travel route under the control of the vehicle speed control algorithm can be simulated, and the time weight and fuel consumption weight to be used on the target travel route can then be determined from these simulated results. In this way, the whole-journey fuel consumption and travel time under different weight combinations can be analyzed, a suitable combination of whole-journey fuel consumption and travel time can be selected as required, and the time weight and fuel consumption weight that produced it can be stored in association with the target travel route as the weights for the specified vehicle on that route. Determining the weights in this manner makes them better suited to the actual application requirements of the specified vehicle on the target travel route.
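The simulation-based choice of weights can be sketched as a search over candidate weight pairs. The simulator below is a toy stand-in with a made-up linear trade-off, not a model of any real vehicle, and the selection rule (least fuel among candidates that meet a time limit) is only one possible reading of "selected as required"; both are assumptions.

```python
def toy_simulate(w_time, w_fuel):
    """HYPOTHETICAL stand-in for a whole-route simulation under the speed controller.

    Returns (whole-journey fuel in litres, whole-journey time in seconds).
    """
    share = w_time / (w_time + w_fuel)   # 0 = fuel only, 1 = time only
    return 30.0 + 10.0 * share, 4000.0 - 600.0 * share


def choose_weights(candidates, simulate, max_time_s):
    """Among candidates that meet the time limit, pick the one using the least fuel."""
    feasible = []
    for w_time, w_fuel in candidates:
        fuel_l, time_s = simulate(w_time, w_fuel)
        if time_s <= max_time_s:
            feasible.append((fuel_l, time_s, w_time, w_fuel))
    if not feasible:
        return None
    fuel_l, time_s, w_time, w_fuel = min(feasible)
    return {"w_time": w_time, "w_fuel": w_fuel, "fuel_l": fuel_l, "time_s": time_s}


candidates = [(0.25, 0.75), (0.5, 0.5), (0.75, 0.25)]
best = choose_weights(candidates, toy_simulate, max_time_s=3700.0)
```

The returned dictionary is the kind of record that could be stored in association with the target travel route.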

S34: acquiring current vehicle state information and specified road information of the specified vehicle; the specified road information includes at least road information of a target link to be driven in front of the specified vehicle.

The control device may acquire the current vehicle state information of the specified vehicle and the specified road information of the target road section ahead, and use both as inputs of the updated vehicle speed control algorithm. The specified road information may include at least road information, wind speed information and the like for the target road section to be traveled ahead of the specified vehicle. The current vehicle state information may include at least the vehicle state information at the current time and at one or more earlier times. State variables such as the initial speed of the specified vehicle on entering the target road section and its basic mechanical parameters can be determined from the current vehicle state information, and the environment of the target road section can be determined from the specified road information. Feeding this information into the updated vehicle speed control algorithm yields the target travel speed information for the target road section.

The influence of external factors such as wind speed and road resistance coefficient on vehicle travel is generally constant and can be treated as a fixed quantity. Preferably, only the road gradient is considered as external environment information, to reduce the complexity of data processing. In some embodiments, the gradient within any road section may further be assumed to be a fixed value. Discretizing the road and fixing the road information within each road section in this way maintains the control precision of the vehicle while further reducing the complexity of data processing and improving its efficiency.

S36: processing the vehicle state information and the specified road information by using the updated vehicle speed control algorithm to obtain target travel speed information of the specified vehicle on the target road section, so that the specified vehicle travels on the target road section based on the target travel speed information.

As shown in FIG. 4, the control device may obtain, in real time, the road to be traveled within five kilometers ahead and discretize it based on a preset distance interval, assuming uniformly accelerated travel and a constant gradient within each discrete road section. The discretized road gradient information for the five kilometers ahead, together with the vehicle state information of the specified vehicle at the current time and at one or more earlier times, is used as the input of the updated vehicle speed control algorithm.

Since the dimensionality of the road information is generally much larger than that of the vehicle state information acquired by the control device, the control device may first reduce the dimensionality of the specified road information. When the specified road information is the gradient information of the road sections and the gradient within any road section is a fixed value, the only variable describing the change in road gradient is the road height, which is one-dimensional; a one-dimensional convolutional network can therefore be used for the dimension reduction, which improves its efficiency. The dimension-reduced specified road information and the vehicle state information may then be concatenated as the input of the updated vehicle speed control algorithm. For example, the concatenated data may be fed into the fully connected layer of the DQN, which then outputs the target travel speed information of the specified vehicle on the target road section, so that the specified vehicle travels on the target road section based on the target travel speed information.
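The data flow described above (one-dimensional convolution over the gradient profile, concatenation with the vehicle state, then a fully connected layer that scores candidate actions) can be sketched in plain Python. The kernel, the stride, the layer weights, the three candidate speeds and the input values are all hypothetical; a trained network would learn its own parameters.

```python
def conv1d(signal, kernel, stride):
    """Valid 1-D convolution (cross-correlation form) with the given stride."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]


def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# Gradient of each upcoming road section (hypothetical values).
gradients = [0.00, 0.01, 0.02, 0.02, 0.01, 0.00, -0.01, -0.02, -0.01, 0.00]
road_features = conv1d(gradients, kernel=[0.5, 0.5], stride=2)   # 10 values -> 5

vehicle_state = [20.0, 0.2]                  # e.g. speed (m/s), acceleration (m/s^2)
fc_input = road_features + vehicle_state     # concatenation of the two inputs

candidate_speeds = [18.0, 20.0, 22.0]        # hypothetical discrete actions
weights = [[0.1 * (a + 1)] * len(fc_input) for a in range(len(candidate_speeds))]
biases = [0.0] * len(candidate_speeds)

q = dense(fc_input, weights, biases)         # one Q value per candidate speed
best_action = max(range(len(q)), key=lambda a: q[a])
target_speed = candidate_speeds[best_action]
```

The convolution halves the length of the road input here; longer look-ahead windows or larger strides would shrink it further before it meets the much shorter vehicle state vector.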

Correspondingly, after the specified vehicle has traveled the target road section, the control device may treat that road section as the current road section and again obtain its travel time and fuel consumption, use them as the learning reward, and further optimize the vehicle speed control algorithm. The gradient data for the five kilometers ahead can then be updated and new vehicle state information obtained, the travel speed information of the specified vehicle on the next road section is determined based on the updated vehicle speed control algorithm, and so on until the destination is reached.
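The repetition of steps S30 to S36 from section to section can be sketched as a loop. The agent and the vehicle below are stubs introduced only for illustration: `StubAgent` records rewards and always returns the same speed, and `drive_section` is a toy model with a made-up fuel rate, neither of which reflects the learning model of this specification.

```python
class StubAgent:
    """HYPOTHETICAL placeholder for the vehicle speed control algorithm."""

    def __init__(self):
        self.rewards = []

    def update(self, reward):
        self.rewards.append(reward)          # a real agent would adjust its networks

    def act(self, vehicle_state, upcoming_gradients):
        return 20.0                          # constant target speed in m/s


def drive_section(length_m, speed_mps):
    """Toy vehicle: returns (travel time in s, fuel in l) for one road section."""
    return length_m / speed_mps, 0.0005 * length_m


def run_route(agent, gradients, section_length_m, initial_speed, w_time=0.5, w_fuel=0.5):
    speed = initial_speed
    for idx in range(len(gradients)):
        time_s, fuel_l = drive_section(section_length_m, speed)   # S30: feedback
        agent.update(-(w_time * time_s + w_fuel * fuel_l))        # S32: learning reward
        upcoming = gradients[idx + 1:]                            # S34: road ahead
        if not upcoming:
            break                                                 # destination reached
        speed = agent.act({"speed_mps": speed}, upcoming)         # S36: next target speed
    return agent.rewards


rewards = run_route(StubAgent(), [0.0, 0.01, -0.01], section_length_m=100.0,
                    initial_speed=10.0)
```

The first section is driven at the given initial speed, and every later section at the speed the agent returned after its most recent update, mirroring the two cases of "given travel speed information" described for step S30.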

Of course, the vehicle speed control may be performed dynamically during actual travel of the designated vehicle, or, where the travel route is known, simulated in advance by the control device. In the latter case, after obtaining the global speed curve corresponding to each known travel route, the control device can further select, as required, the travel route whose global speed curve best meets the requirement, so that the specified vehicle travels along that route following the corresponding global speed curve, saving running time and fuel consumption as far as possible.
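
One way to read "best meets the requirement" is as the lowest weighted cost over the simulated results. The sketch below assumes each route's simulation has already been summarized as a (time, fuel) pair and that both quantities have been scaled to comparable units; the function name `pick_route` and the cost definition are illustrative assumptions, not taken from the specification.

```python
def pick_route(simulated, time_weight, fuel_weight):
    """Return the route name with the lowest weighted cost.

    simulated maps a route name to (total_time, total_fuel) obtained from
    the simulated global speed curve of that route.
    """
    def cost(item):
        _, (total_time, total_fuel) = item
        return time_weight * total_time + fuel_weight * total_fuel

    return min(simulated.items(), key=cost)[0]
```

For example, `pick_route({"A": (1.0, 0.8), "B": (0.9, 1.0)}, 0.5, 0.5)` compares costs of 0.9 and 0.95 and returns `"A"`.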

Based on the vehicle running speed control method, the specification also provides a vehicle running speed control device which is applied to the control equipment. As shown in fig. 5, the apparatus may include: the receiving module 50 is used for receiving the running time information and the fuel consumption information of the specified vehicle after running on the current road section based on the given running speed information; the updating module 52 is used for updating a vehicle speed control algorithm by taking the running time information and the oil consumption information as learning rewards; the vehicle speed control algorithm is constructed based on a reinforcement learning model; an obtaining module 54, configured to obtain current vehicle state information and specified road information of the specified vehicle; the specified road information at least comprises road information of a target road section to be driven in front of the specified vehicle; and the vehicle speed determining module 56 is configured to process the vehicle state information and the specified road information by using the updated vehicle speed control algorithm to obtain target traveling speed information of the specified vehicle on the target road segment, so that the specified vehicle travels on the target road segment based on the target traveling speed information.

In other embodiments, the updating module 52 is further configured to: obtain a preconfigured time weight and a preconfigured fuel consumption weight, wherein the time weight represents the importance of controlling the running time of the specified vehicle and the fuel consumption weight represents the importance of controlling the fuel consumption of the specified vehicle; perform weighted average processing on the running time information and the fuel consumption information using the time weight and the fuel consumption weight; and update the vehicle speed control algorithm with the result of the weighted average processing as the learning reward.
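
The weighted average reward can be written out as a short sketch. Two points here are assumptions rather than statements from the specification: the result is negated so that shorter times and lower fuel consumption give a higher reward, and each quantity is divided by a reference value (`ref_time_s`, `ref_fuel_l`) so that seconds and litres are comparable before averaging.

```python
def learning_reward(time_s, fuel_l, time_weight, fuel_weight,
                    ref_time_s=1.0, ref_fuel_l=1.0):
    """Negated weighted average of normalized travel time and fuel use."""
    total_weight = time_weight + fuel_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (time_weight * time_s / ref_time_s
                + fuel_weight * fuel_l / ref_fuel_l) / total_weight
    return -weighted  # assumption: less time and less fuel means more reward
```

With equal weights, a section that took 10 time units and 2 fuel units yields a reward of -6.0; raising the time weight to 3 against a fuel weight of 1 gives -8.0, penalizing the slow section more heavily.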

In still other embodiments, the specified road information includes at least gradient information of a target road segment to be traveled ahead of the specified vehicle.

In other embodiments, the device further includes a weight configuration module, configured to simulate, based on any given time weight and fuel consumption weight, the full-course fuel consumption and the full-course driving time of the specified vehicle on a target driving route under the control of the vehicle speed control algorithm, and to determine, based on the full-course fuel consumption and the full-course driving time, the time weight and the fuel consumption weight to be used when the specified vehicle runs on the target driving route.
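
A minimal sketch of such a weight configuration step is given below. The selection rule, choosing the candidate weight pair with the lowest full-course fuel consumption among those that finish within a time limit, is only one plausible criterion and is an assumption; `simulate` is a hypothetical callable that runs the speed control algorithm over the target route for a given weight pair and returns the full-course fuel consumption and driving time.

```python
def configure_weights(simulate, candidates, max_time_s):
    """Pick the (time_weight, fuel_weight) pair that uses the least fuel
    while keeping the simulated full-course time within max_time_s."""
    best = None
    for time_weight, fuel_weight in candidates:
        fuel_l, time_s = simulate(time_weight, fuel_weight)
        if time_s > max_time_s:
            continue  # too slow for the assumed delivery requirement
        if best is None or fuel_l < best[0]:
            best = (fuel_l, time_weight, fuel_weight)
    if best is None:
        raise ValueError("no candidate weights satisfy the time limit")
    return best[1], best[2]
```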

It should be noted that the above-mentioned apparatus may also include other embodiments according to the description of the above-mentioned embodiments. The specific implementation manner may refer to the description of the related method embodiment, and is not described in detail herein.

The present specification also provides a control device comprising at least one processor and a memory for storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any one or more of the embodiments described above.

The present specification also provides a computer-readable storage medium having stored thereon computer instructions which, when executed, implement the steps of the method of any one or more of the embodiments described above. The storage medium may include a physical device for storing information; typically, the information is digitized and then stored using an electrical, magnetic, or optical medium. The storage medium may include: devices that store information using electrical energy, such as various types of memory, e.g., RAM, ROM, etc.; devices that store information using magnetic energy, such as hard disks, floppy disks, tapes, core memories, bubble memories, and USB disks; devices that store information optically, such as CDs or DVDs. Of course, there are also other types of readable storage media, such as quantum memory, graphene memory, and so forth.

It should be noted that the embodiments of the present specification are not limited to cases that necessarily comply with a standard data model/template or that match the description of the embodiments of the present specification. Implementations based on certain industry standards, or slightly modified from those described by using custom modes or examples, may also achieve effects that are the same as, equivalent or similar to, or predictable from those of the above embodiments. Embodiments that apply such modified or transformed ways of data acquisition, storage, judgment, processing, etc. may still fall within the scope of the alternative embodiments of the present specification.

The embodiments in the present specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and for relevant points reference may be made to the corresponding description of the method embodiment. In the description of the specification, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the specification. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, various embodiments or examples, and features of different embodiments or examples, described in this specification can be combined by one skilled in the art provided that they do not contradict one another.

The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims

1. A vehicle travel speed control method, applied to a control apparatus, the method comprising:

receiving running time information and oil consumption information of a specified vehicle after running on a current road section based on given running speed information;

updating a vehicle speed control algorithm by taking the running time information and the oil consumption information as learning rewards; the vehicle speed control algorithm is constructed based on a reinforcement learning model;

acquiring current vehicle state information and specified road information of the specified vehicle; the specified road information at least comprises road information of a target road section to be driven in front of the specified vehicle;

and processing the vehicle state information and the specified road information by using the updated vehicle speed control algorithm to obtain target running speed information of the specified vehicle on a target road section, so that the specified vehicle runs on the target road section based on the target running speed information.

2. The method of claim 1, wherein updating a vehicle speed control algorithm with the travel time information and the fuel consumption information as learning rewards comprises:

acquiring a preconfigured time weight and a preconfigured oil consumption weight; wherein the time weight is used for representing the control importance degree of the running time of the specified vehicle; the fuel consumption weight is used for representing the fuel consumption control importance of the specified vehicle;

carrying out weighted average processing on the running time information and the oil consumption information by using the time weight and the oil consumption weight;

and taking the weighted average processing result as learning reward, and updating the vehicle speed control algorithm.

3. The method according to claim 1, wherein the specified road information includes at least gradient information of a target section to be traveled ahead of the specified vehicle.

4. The method of claim 2, wherein the time weight and the fuel consumption weight are determined by:

simulating the whole-course oil consumption and the whole-course running time of the specified vehicle running on the target running route under the control of the vehicle speed control algorithm based on any given time weight and oil consumption weight;

and determining the time weight and the oil consumption weight corresponding to the running of the specified vehicle on the target running route based on the whole-course oil consumption and the whole-course running time.

5. A running speed control apparatus for a vehicle, characterized by comprising:

the receiving module is used for receiving the running time information and the oil consumption information of the specified vehicle after running on the current road section based on the given running speed information;

the updating module is used for updating a vehicle speed control algorithm by taking the running time information and the oil consumption information as learning rewards; the vehicle speed control algorithm is constructed based on a reinforcement learning model;

the acquisition module is used for acquiring the current vehicle state information and the specified road information of the specified vehicle; the specified road information at least comprises road information of a target road section to be driven in front of the specified vehicle;

and the vehicle speed determining module is used for processing the vehicle state information and the specified road information by using the updated vehicle speed control algorithm to obtain target running speed information of the specified vehicle on a target road section, so that the specified vehicle runs on the target road section based on the target running speed information.

6. The device of claim 5, wherein the update module is further configured to obtain a preconfigured time weight and a preconfigured fuel consumption weight; wherein the time weight is used for representing the control importance degree of the running time of the specified vehicle; the fuel consumption weight is used for representing the fuel consumption control importance of the specified vehicle; to carry out weighted average processing on the running time information and the oil consumption information by utilizing the time weight and the oil consumption weight; and to take the weighted average processing result as learning reward and update the vehicle speed control algorithm.

7. The apparatus according to claim 5, wherein the specified road information includes at least gradient information of a target section to be traveled ahead of the specified vehicle.

8. The device of claim 6, further comprising a weight configuration module configured to simulate, based on any given time weight and fuel consumption weight, a full-course fuel consumption and a full-course travel time of the specified vehicle traveling on a target travel route under control of the vehicle speed control algorithm; and determining the time weight and the oil consumption weight corresponding to the running of the specified vehicle on the target running route based on the whole-course oil consumption and the whole-course running time.

9. A control device, characterized in that the device comprises at least one processor and a memory for storing processor-executable instructions, which when executed by the processor implement the steps of the method of any one of claims 1 to 4.

10. A computer-readable storage medium having stored thereon computer instructions, wherein the instructions, when executed, implement the steps of the method of any one of claims 1-4.